Sendai Central Wholesale Market data is finally available in csv.
But
seika.csv
Sendai City Central Wholesale Market Daily Report (Fruit and Fruit Department),,,,,,,,,,,,,,,,,,,,,,,,,,,,
January 13, 3rd year of Reiwa (Wednesday) Weather(Sunny),,,,,,,,,,,,,,,,,,,,,,,,,,,,
,,,,,,,,,,,Sendai City Central Wholesale Market,,,,,,,,,,,,,,,,,
Sales volume,(1) Vegetables,310 t,,,,,,,,,TEL (022)232−8121〜2,,,,,,,,,,,,,,,,,
,(2) Fruit,94 t,,,,,,,,,FAX (022)232−8144,,,,,,,,,,,,,,,,,
Total sales volume,404 ,t,,,,,,,,,,,,,,,,,,,,,,,,,,
,,,,,,,,,,,,,,,,,,,,,,,,,,,,
,,,,,,,,,,,,,,,,,,,,,,,,,,,,
Market price list by major item (1) Vegetables,,,,,,,,,,,,,,,,,,,,,,,,,,,,
Product name,Origin,unit,,High price,Medium price,Low price,Product name,Origin,unit,,High price,Medium price,Low price,,,,,,,,,,,,,,,
Daikon L,Chiba,10,Kg,"1,728","1,512","1,296",Tomato M,Kumamoto,4,Kg,"1,404",− ,"1,296",,,,,,,,,,,,,,,
Daikon L,Kanagawa,10,Kg,"1,728",− ,"1,296",Cherry tomato M,Chiba,200,g,140,− ,108,,,,,,,,,,,,,,,
Turnip 3L,Chiba,1,bundle,− ,194,− ,Cherry tomato M,Kumamoto,3,Kg,"1,512",− ,"1,296",,,,,,,,,,,,,,,
Turnip LL,Chiba,1,bundle,216,173,86,Bell pepper M,Ibaraki,150,g,108,97,92,,,,,,,,,,,,,,,
Carrot M,Ibaraki,10,Kg,"1,944","1,836","1,728",Peaman M,Kochi,150,g,108,97,86,,,,,,,,,,,,,,,
Carrot M,Chiba,10,Kg,"1,944","1,728","1,512",Sayaendo A,Kagoshima,1,Kg,"1,944",− ,"1,620",,,,,,,,,,,,,,,
Burdock M,Aomori,4,Kg,"1,620","1,404",756,Sweet potato L,Ibaraki,5,Kg,− ,"1,512",− ,,,,,,,,,,,,,,,
Chinese cabbage,Ibaraki,15,Kg,756,432,432,Sweet potato L,Chiba,5,Kg,− ,"1,512",− ,,,,,,,,,,,,,,,
Mizuna,Miyagi,150,g,− ,86,− ,Bareisho L,Hokkaido,10,Kg,"1,728",− ,"1,620",,,,,,,,,,,,,,,
Mizuna,Ibaraki,200,g,− ,86,− ,Make-in L,Hokkaido,10,Kg,− ,"2,376",− ,,,,,,,,,,,,,,,
Komatsuna,Miyagi,250,g,108,86,86,Taro LL,Niigata,5,Kg,"3,240","1,620",540,,,,,,,,,,,,,,,
Cabbage L,Chiba,10,Kg,− ,"1,296",− ,Long LL,Aomori,10,Kg,"3,240","2,700","2,160",,,,,,,,,,,,,,,
Cabbage L,Aichi,10,Kg,"1,404","1,296","1,080",Onion L,Hokkaido,20,Kg,"1,728",− ,"1,620",,,,,,,,,,,,,,,
spinach,Miyagi,200,g,162,119,97,Raw shiitake,Miyagi,100,g,162,130,108,,,,,,,,,,,,,,,
Small onion,Miyagi,100,g,130,108,97,Nameko S,Miyagi,100,g,54,− ,43,,,,,,,,,,,,,,,
Long onion L,Miyagi,5,Kg,"2,700","2,700","2,484",Nameko S,Yamagata,100,g,− ,65,− ,,,,,,,,,,,,,,,
Long onion L,Saitama,5,Kg,"4,320","3,780","3,456",Enoki,Miyagi,100,g,− ,38,− ,,,,,,,,,,,,,,,
Bent green onions,Miyagi,4,Kg,"1,620",− ,"1,404",Enoki,Miyagi,200,g,77,77,65,,,,,,,,,,,,,,,
Asatsuki,Fukushima,4,Kg,− ,"2,160",− ,Enoki,Niigata,200,g,70,− ,65,,,,,,,,,,,,,,,
Thread mitsuba,Miyagi,60,g,− ,130,− ,Bunashimeji,Miyagi,100,g,97,− ,86,,,,,,,,,,,,,,,
Thread mitsuba,Miyagi,100,g,108,− ,76,Bunashimeji,Niigata,170,g,130,− ,108,,,,,,,,,,,,,,,
Crowndaisy M,Miyagi,150,g,162,162,140,Chicken egg AM,Iwate,10,Kg,"1,728",− ,"1,620",,,,,,,,,,,,,,,
Seri M,Miyagi,100,g,216,162,162,,,,,,,,,,,,,,,,,,,,,,
Chinese chive L,Fukushima,100,g,108,108,86,,,,,,,,,,,,,,,,,,,,,,
Chinese chive M,Kochi,100,g,− ,130,− ,,,,,,,,,,,,,,,,,,,,,,
Cerli LL,Aichi,10,Kg,"2,376",− ,"2,160",,,,,,,,,,,,,,,,,,,,,,
Lettuce L,Kagawa,10,Kg,"3,348",− ,"3,240",,,,,,,,,,,,,,,,,,,,,,
Bok choy,Miyagi,250,g,108,65,65,,,,,,,,,,,,,,,,,,,,,,
Yukina,Miyagi,200,g,− ,76,− ,,,,,,,,,,,,,,,,,,,,,,
Yukina,Miyagi,250,g,97,− ,76,,,,,,,,,,,,,,,,,,,,,,
Cucumber M,Kochi,5,Kg,"2,160","2,160","1,944",,,,,,,,,,,,,,,,,,,,,,
Cucumber S,Miyazaki,5,Kg,"2,376",− ,"2,160",,,,,,,,,,,,,,,,,,,,,,
Pumpkin 7 balls,Mexico,10,Kg,"2,160",− ,"1,728",,,,,,,,,,,,,,,,,,,,,,
Eggplant M,Kochi,5,Kg,"2,916","2,160","1,944",,,,,,,,,,,,,,,,,,,,,,
Tomato M,Miyagi,4,Kg,"1,296",− ,"1,080",,,,,,,,,,,,,,,,,,,,,,
(2) Fruit,,,,,,,market conditions,,,,,,,,,,,,,,,,,,,,,
Product name,Origin,unit,,High price,Medium price,Low price,Vegetables,,,,,,,,,,,,,,,,,,,,,
140 lemons,America,17,Kg,"7,020",− ,"6,480",Total vegetable sales volume decreased by 201t,,,,,,,,,,,,,,,,,,,,,
Gure-36 balls,America,17,Kg,"3,780",− ,"3,240",Seri M,,,Small cheap,,,,,,,,,,,,,,,,,,
Decopon 18 balls,Wakayama,5,Kg,"3,564","3,240","3,024",Sayaendo A,,,10 to 20% cheaper,,,,,,,,,,,,,,,,,,
Another one,Miyagi,270,g,432,378,356,,,,,,,,,,,,,,,,,,,,,,
Tochiotome,Miyagi,270,g,410,410,378,,,,,,,,,,,,,,,,,,,,,,
Earl melon 6 balls,Kochi,9,Kg,− ,"9,720",− ,,,,,,,,,,,,,,,,,,,,,,
banana,Philippines,13,Kg,"3,024",− ,"2,700",,,,,,,,,,,,,,,,,,,,,,
,,,,,,,,,,,,,,,,,,,,,,,,,,,,
,,,,,,,,,,,,,,,,,,,,,,,,,,,,
,,,,,,,,,,,,,,,,,,,,,,,,,,,,
,,,,,,,,,,,,,,,,,,,,,,,,,,,,
,,,,,,,,,,,,,,,,,,,,,,,,,,,,
,,,,,,,,,,,,,,,,,,,,,,,,,,,,
,,,,,,,,,,,,,,,,,,,,,,,,,,,,
,,,,,,,Others in general,,,,,,,,,,,,,,,,,,,,,
,,,,,,,,,,,,,,,,,,,,,,,,,,,,
,,,,,,,fruit,,,,,,,,,,,,,,,,,,,,,
,,,,,,,Total fruit sales volume decreased by 62t,,,,,,,,,,,,,,,,,,,,,
,,,,,,,,,,,,,,,,,,,,,,,,,,,,
,,,,,,,,,,,,,,,,,,,,,,,,,,,,
,,,,,,,,,,,,,,,,,,,,,,,,,,,,
,,,,,,,,,,,,,,,,,,,,,,,,,,,,
,,,,,,,,,,,,,,,,,,,,,,,,,,,,
,,,,,,,,,,,,,,,,,,,,,,,,,,,,
,,,,,,,,,,,,,,,,,,,,,,,,,,,,
,,,,,,,,,,,,,,,,,,,,,,,,,,,,
,,,,,,,,,,,,,,,,,,,,,,,,,,,,
,,,,,,,,,,,,,,,,,,,,,,,,,,,,
,,,,,,,,,,,,,,,,,,,,,,,,,,,,
,,,,,,,Overall,,,,,,,,,,,,,,,,,,,,,
,,,,,,,,,,,,,,,,,,,,,,,,,,,,
And this A terrible thing that seems to have output the Excel file to CSV as it is ...
I can do anything, so I tried to normalize it for the time being.
processing_csv.py
import re, datetime
import pandas as pd
# from IPython.display import display
url='http://kei008220.webcrow.jp/seika.csv'
r=pd.read_csv(url,encoding='Shift-JIS', header=None)
#pd.set_option('display.max_rows', 100)
#Verification
r=r.dropna(how="all", axis=1)
#display(r)
#I will put it in by appearance
hyoudai=r.iloc[0,0] #title
nichiji=r.iloc[1,0] #Date and time: Wednesday, January 13, 3rd year of Reiwa Weather(Sunny)Separated later
sijou=r.iloc[2,11] #Market name
hanbairyo_vegetable=r.iloc[3,2] #Vegetable sales volume
hanbairyo_fruit=r.iloc[3,2] #Fruit sales volume, unit(t)Want to separate later
hanbairyo_full=r.iloc[5,1]+r.iloc[5,2] #Sales volume, unit(t)For the time being, stick them together
hinshu_kugiri=r[r[0].str.contains('(\d)', na=False)].index #Rows (1) and (2) of the market price table by major item[8,45]
vegetable=r.iloc[hinshu_kugiri[0]+1:hinshu_kugiri[1],:]
fruit=r.iloc[hinshu_kugiri[1]+1:,:].dropna(how="all")
#Create a table of vegetables
veg1=vegetable.iloc[:,:7]
veg2=vegetable.iloc[:,7:]
veg2.columns=[0,1,2,3,4,5,6]
veg=pd.concat([veg1,veg2[1:]],ignore_index=True).dropna(how="all") #The two rows were combined vertically. At that time, the index is changed.
veg[2]=veg[2].str.cat(veg[3], na_rep='') #Since the unit was in two rows, stick them together
veg=veg.drop(columns=3,axis=1) #Delete unnecessary columns after sticking
veg.columns=veg.iloc[0,:] #Specify column name
veg=veg.iloc[1:,:] #Delete the row used for the column name
veg['type']='Vegetables' #Vegetablesだけじゃないので列を追加
#display(veg)
#Create a table of fruits
fru=fruit.iloc[:,:7]
fru=fru.dropna(how="all", axis=0)
fru.iloc[:,2]=fru.iloc[:,2].str.cat(fru.iloc[:,3], na_rep='')
fru=fru.drop(columns=3,axis=1)
fru.columns=fru.iloc[0,:]
fru=fru.iloc[1:,:]
fru['type']='fruit'
#display(fru)
#Join
seika=pd.concat([veg,fru], ignore_index=True)
for i in ['High price','Medium price','Low price']:
seika[i]=seika[i].str.replace(',','')
seika['date']=nichiji.split()[0]
seika['weather']=re.search(r'(?<=weather.)[^\)]+',nichiji).group()
#Reiwa → AD
wareki=re.search(r'(\d+)\D+(\d+)\D+(\d+)',nichiji.split()[0]).groups()
year=str(2000+int(wareki[0])+18)
month=wareki[1]
day=wareki[2]
#print(f'{year}/{month}/{day}')
#Date conversion
seika['date']=datetime.datetime.strptime(year+'/'+month+'/'+day,'%Y/%m/%d')
#csv output
filename="seika_"+year+month+day+".csv"
header=['product_name','area','unit','high_price','middle_price','low_price','category','wareki','weather','date']
seika.to_csv(filename,header=header,index=None,encoding='UTF-8')
index | product_name | area | unit | high_price | middle_price | low_price | category | wareki | weather | date |
---|---|---|---|---|---|---|---|---|---|---|
0 | Daikon L | Chiba | 10Kg | 1728 | 1512 | 1296 | Vegetables | January 13, 3rd year of Reiwa | Sunny | 2021-01-13 |
1 | Daikon L | Kanagawa | 10Kg | 1728 | − | 1296 | Vegetables | January 13, 3rd year of Reiwa | Sunny | 2021-01-13 |
2 | Turnip 3L | Chiba | 1 bundle | − | 194 | − | Vegetables | January 13, 3rd year of Reiwa | Sunny | 2021-01-13 |
3 | Turnip LL | Chiba | 1 bundle | 216 | 173 | 86 | Vegetables | January 13, 3rd year of Reiwa | Sunny | 2021-01-13 |
4 | Carrot M | Ibaraki | 10Kg | 1944 | 1836 | 1728 | Vegetables | January 13, 3rd year of Reiwa | Sunny | 2021-01-13 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
59 | Decopon 18 balls | Wakayama | 5Kg | 3564 | 3240 | 3024 | fruit | January 13, 3rd year of Reiwa | Sunny | 2021-01-13 |
60 | Another one | Miyagi | 270g | 432 | 378 | 356 | fruit | January 13, 3rd year of Reiwa | Sunny | 2021-01-13 |
61 | Tochiotome | Miyagi | 270g | 410 | 410 | 378 | fruit | January 13, 3rd year of Reiwa | Sunny | 2021-01-13 |
62 | Earl melon 6 balls | Kochi | 9Kg | − | 9720 | − | fruit | January 13, 3rd year of Reiwa | Sunny | 2021-01-13 |
63 | banana | Philippines | 13Kg | 3024 | − | 2700 | fruit | January 13, 3rd year of Reiwa | Sunny | 2021-01-13 |
--Basically brute force
――The place where the cells are merged will be the leftmost column, so once read the column name by number
--Specify the data frame with iloc
and read each item
--Delete all lines with dropna ()
as NaN
--concat ()
when combining data frames vertically
--Since it is better to match the column names, I changed it with columns = []
.
--Because I am doing it with colaboratory, I could not use japanera
to convert the Japanese calendar, so I get the numbers withre.search ()
and convert it to the Christian era etc.
--From Reiwa, the Christian era is _ + 18_.
--When outputting to csv, I changed the header to English in consideration of importing it into Splunk.
--This should make INDEXED_EXTRACTIONS = csv
work ...
Originally there was only pdf, so I think I made progress just by giving CSV: sweat:
It's not listed in Sendai City Open Data Portal, and I wonder if I don't care.
I think that visualization is all right once it has been processed so far.
Recommended Posts