I have given talks in various places on predicting power consumption with machine learning, and I often hear comments such as "The predicted values aren't actually that accurate, are they?" or "It's only a PoC after all, right?" So I decided to publish the actual predicted values as CSV files, so that anyone can verify how well the model performs.
We have previously published Qiita articles showing examples of prediction with several methods, so please have a look at those as well.
Power consumption forecast with Keras (TensorFlow)
The task is to predict power usage two days ahead in the Chugoku Electric Power area.
The two-day-ahead forecasts are published on our website so that they can be compared with the actual electricity usage published by Chugoku Electric Power Company. The files follow this URL pattern:
https://blueomega.jp/20200811_power_prediction_challenge/yyyy-mm-dd_.csv
For example, for September 2, 2020 the URL is: https://blueomega.jp/20200811_power_prediction_challenge/2020-09-02_.csv
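As a small illustration of the URL pattern above, here is a minimal sketch that builds the forecast URL for a given date. The names `BASE_URL` and `forecast_url` are hypothetical helpers introduced only for this example; they are not part of the published script.

```python
import datetime as dt

# Base path taken from the URL pattern above
BASE_URL = "https://blueomega.jp/20200811_power_prediction_challenge/"


def forecast_url(day: dt.date) -> str:
    """Return the forecast CSV URL for a date (pattern: yyyy-mm-dd_.csv)."""
    return BASE_URL + day.strftime("%Y-%m-%d") + "_.csv"


print(forecast_url(dt.date(2020, 9, 2)))
# https://blueomega.jp/20200811_power_prediction_challenge/2020-09-02_.csv
```

Passing the resulting string to `pd.read_csv` would then load that day's forecast, provided the file exists.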
You can compare the forecasts with the actual values by running the following script on Colaboratory.
```python
import datetime as dt

import matplotlib.pyplot as plt
import pandas as pd
from sklearn.metrics import r2_score

# NOTE: the two demand column names used below appear here in translated form.
# If they do not match, use the header text from the CSV files themselves.
col_past = "Performance(10,000 kW)"
col_today = "Results on the day(10,000 kW)"

# Get actual demand up to the previous day for the Chugoku Electric Power area
url = "https://www.energia.co.jp/nw/jukyuu/sys/juyo-2020.csv"
df_juyo = pd.read_csv(url, skiprows=2, encoding="Shift_JIS")
df_juyo.index = pd.to_datetime(df_juyo["DATE"] + " " + df_juyo["TIME"])

# Get today's actual demand for the Chugoku Electric Power area
# (+9 hours converts to JST, assuming the runtime clock is UTC)
d = dt.datetime.now() + dt.timedelta(hours=9)
url = "https://www.energia.co.jp/nw/jukyuu/sys/juyo_07_" + d.strftime("%Y%m%d") + ".csv"
df_tmp = pd.read_csv(url, skiprows=13, encoding="Shift_JIS", nrows=24)
df_tmp.index = pd.to_datetime(df_tmp["DATE"] + " " + df_tmp["TIME"])

# Get forecast data from August 31st onward, one CSV file per day
frames = []
d = dt.datetime(2020, 8, 31)
while d < dt.datetime.now() + dt.timedelta(days=3):
    url = "https://blueomega.jp/20200811_power_prediction_challenge/" + d.strftime("%Y-%m-%d") + "_.csv"
    try:
        frames.append(pd.read_csv(url))
    except Exception:
        # Files for some days may not exist (yet); skip them
        print("No file:", url)
    d += dt.timedelta(days=1)
df = pd.concat(frames)
df.index = pd.to_datetime(df.pop("datetime"))

# Fill in the actual values up to the previous day (aligned on the datetime index)
df["act"] = df_juyo[col_past]

# Fill in today's actual values, using only hours that already have a value > 0
for idx in df_tmp[df_tmp[col_today] > 0].index:
    df.loc[idx, "act"] = df_tmp.loc[idx, col_today]

# Plot forecasts against actual values
df_plot = df[["act", "y2"]].copy()
df_plot.columns = ["act", "pred tuned"]
df_plot["2020-08-31":].plot(figsize=(15, 5), ylim=(300, 1200))
plt.show()

# Coefficient of determination, using only rows that have an actual value
df_scr = df[df["act"] > 0]
print("Coefficient of determination(R2 SCORE) : ", r2_score(df_scr["act"], df_scr["y2"]))
```
This is the result of running the script as of 5:00 on September 3rd. Coefficient of determination (R2 SCORE): 0.9494716417755021
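For readers unfamiliar with the coefficient of determination, here is a minimal sketch of what `r2_score` computes: one minus the ratio of the squared prediction error to the squared deviation of the actual values from their mean. The function name `r2_by_hand` and the four sample values are made up for illustration; they are not taken from the published forecasts.

```python
def r2_by_hand(actual, predicted):
    """R2 = 1 - SS_res / SS_tot, the same quantity r2_score reports."""
    mean_actual = sum(actual) / len(actual)
    # Sum of squared prediction errors
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    # Sum of squared deviations of the actual values from their mean
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Made-up demand values in units of 10,000 kW (illustration only)
actual = [600.0, 700.0, 800.0, 900.0]
predicted = [620.0, 690.0, 810.0, 880.0]

print(r2_by_hand(actual, predicted))  # roughly 0.98
```

A score of 1 means a perfect match, and a score of 0 means the forecast is no better than always predicting the mean of the actual values.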
Forecasts are updated daily at 1:00 and 12:00, so please take a look if you are interested. We also look forward to hearing from anyone who needs this kind of forecasting capability.
This is the result of running the script as of 6:00 on September 4th. Coefficient of determination (R2 SCORE): 0.9454478929760703
It seems to have gotten slightly worse ...
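To see which days pull the overall score down, one option is to compute the score separately for each day. The following is only a sketch under stated assumptions: it uses made-up hourly data in place of the real forecasts, keeps the script's column names `act` and `y2`, and the helper `r2` is a hypothetical name introduced here.

```python
import numpy as np
import pandas as pd


def r2(actual: pd.Series, predicted: pd.Series) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    ss_res = ((actual - predicted) ** 2).sum()
    ss_tot = ((actual - actual.mean()) ** 2).sum()
    return 1.0 - ss_res / ss_tot


# Made-up hourly data for two days (illustration only, not real forecasts)
idx = pd.date_range("2020-09-02", periods=48, freq=pd.Timedelta(hours=1))
hours = np.arange(48)
act = 700 + 100 * np.sin(2 * np.pi * hours / 24)
# The forecast is off by 5 on the first day and by 30 on the second day
y2 = act + np.where(hours < 24, 5.0, 30.0)
df = pd.DataFrame({"act": act, "y2": y2}, index=idx)

# One score per calendar day
daily_r2 = pd.Series(
    {day: r2(g["act"], g["y2"]) for day, g in df.groupby(df.index.date)}
)
print(daily_r2)
```

With the real data, the same grouping could be applied to the rows that already have an actual value, which would show whether the drop comes from one bad day or from a gradual drift.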