Parsing CSV format data using SQL

You can use q to issue SQL directly to CSV or TSV format files.

Installation procedure

http://harelba.github.io/q/

Example of data analysis using q

When you have the following CSV (file name is orders.csv) of order data.

OrderId,OrderDate,ProductID,UserID,Quantity,TotalPrice
10000005,2015/2/13,100,10000,1,10000
10000006,2015/2/13,101,10001,2,2000
10000007,2015/2/13,102,10002,3,6000
10000008,2015/2/13,100,10003,1,10000
10000009,2015/2/13,101,10004,2,2000
10000010,2015/2/14,100,10005,3,30000
10000011,2015/2/14,101,10006,1,2000
10000012,2015/2/14,102,10007,2,4000
10000013,2015/2/14,100,10008,3,30000
10000014,2015/2/14,101,10009,1,1000

I want to extract data with sales (Total Price) of 10,000 yen or more

$ q -H -d',' "SELECT * FROM ./orders.csv WHERE TotalPrice >= 10000"

10000005,2015/2/13,100,10000,1,10000
10000008,2015/2/13,100,10003,1,10000
10000010,2015/2/14,100,10005,3,30000
10000013,2015/2/14,100,10008,3,30000

-H means skip header lines. Specify the delimiter with -d (specify "," because it is CSV)

I want to aggregate daily sales

$ q -H -d',' "SELECT OrderDate, SUM(TotalPrice) FROM ./orders.csv GROUP BY OrderDate"

2015/2/13,30000
2015/2/14,67000

I want to put out a product sales ranking

q -H -d',' "SELECT ProductId, SUM(Quantity), SUM(TotalPrice) AS total FROM ./orders.csv GROUP BY ProductId Order BY total DESC"

100,8,80000
102,5,10000
101,6,7000

Data analysis is quick.