I hate the application called Excel.
You can read 10,000 lines of csv into Excel, filter it, drag a function to copy it, and all the troublesome things are done. There are times when it takes a long time to load, and it doesn't work on a mac, so on the day when it freezes, I'm not motivated and will leave early.
You're going crazy! !! !!
If you are an engineer, you want to finish everything from data processing / extraction from csv import to ftp server upload from export with just a command! Excel is an evil way of evil! Reliable database! !! !!
The point is, if you have a script that creates a create table statement from csv and batches up to load data, Excel is goodbye without even saying good by. Such a python script is today's theme.
Since python has a pip that guesses the type from csv called csv_to_table and creates a create table statement with the header part of the first line as the column name, we created a script that adds the function to create up to the load data statement this time. I did.
I wrote how to use it on github, so please execute it as it is w
https://github.com/ShusukeOtani/table_stmt_generator_from_csv
gitclone
git clone https://github.com/ShusukeOtani/table_stmt_generator_from_csv
Install pip and pyenv and run it with python2.7.1.
Dependent pip installation
pip install -r requirements.txt
Prepare -d, -t, and -p as optional arguments. -d is an option to determine the type of database. -d m is mysql, -d p is postgre. The default value is mysql. -t is a bool whether to make it a Temp table. If you cut it, make it a permament table.
You can specify the primary key with -p. If -p id, the column id will be PK.
I will write an example. First of all, there is no primary key, permament table, option is not required for mysql, so simply specify the file path of csv.
exmaple1
python execute.py ./example/test.csv
result1
DROP TABLE IF EXISTS table_stmt_generator_from; CREATE TABLE table_stmt_generator_from (
id smallint,
pref text
);LOAD DATA LOCAL INFILE '/Users/200302/table_stmt_generator_from_csv/example/test.csv' INTO TABLE table_stmt_generator_from FIELDS TERMINATED BY ',' IGNORE 1 LINES;
If you want to use id column for postgre, temp table and pk, add the following option.
example2
`python execute.py -d p -t -p id ./example/test.csv
result2
DROP TABLE IF EXISTS test; CREATE TEMPORARY TABLE test (
id smallint,
pref text
, PRIMARY KEY(id));COPY test FROM '/Users/200302/table_stmt_generator_from_csv/example/test.csv' CSV HEADER;
One of the jobs of advertising is to create a feed. We create an optimized case list feed for vendors such as Criteo and Indeed. You can't join when you go to see multiple DBs. I wrote this script to solve this problem. In actual operation, this script is used to make an application that performs the following series.
Recommended Posts