You can use q to issue SQL directly to CSV or TSV format files.
When you have the following CSV (file name is orders.csv) of order data.
OrderId,OrderDate,ProductID,UserID,Quantity,TotalPrice
10000005,2015/2/13,100,10000,1,10000
10000006,2015/2/13,101,10001,2,2000
10000007,2015/2/13,102,10002,3,6000
10000008,2015/2/13,100,10003,1,10000
10000009,2015/2/13,101,10004,2,2000
10000010,2015/2/14,100,10005,3,30000
10000011,2015/2/14,101,10006,1,2000
10000012,2015/2/14,102,10007,2,4000
10000013,2015/2/14,100,10008,3,30000
10000014,2015/2/14,101,10009,1,1000
$ q -H -d',' "SELECT * FROM ./orders.csv WHERE TotalPrice >= 10000"
10000005,2015/2/13,100,10000,1,10000
10000008,2015/2/13,100,10003,1,10000
10000010,2015/2/14,100,10005,3,30000
10000013,2015/2/14,100,10008,3,30000
-H means skip header lines. Specify the delimiter with -d (specify "," because it is CSV)
$ q -H -d',' "SELECT OrderDate, SUM(TotalPrice) FROM ./orders.csv GROUP BY OrderDate"
2015/2/13,30000
2015/2/14,67000
q -H -d',' "SELECT ProductId, SUM(Quantity), SUM(TotalPrice) AS total FROM ./orders.csv GROUP BY ProductId Order BY total DESC"
100,8,80000
102,5,10000
101,6,7000
Data analysis is quick.
Recommended Posts