A few days ago, Facebook Research announced HiPlot, a new data visualization library.
https://github.com/facebookresearch/hiplot
The README on GitHub was so simple that I wondered how useful it could really be, but when I actually tried it, I felt like I was looking at the future, so I'm sharing it here.
HiPlot is a visualization tool that specializes in discovering correlations and patterns in data. That description alone is hard to picture, so please watch the following video.
As you can see, it does not just plot the data: you can also interactively select, filter, and exclude data points.
The official documentation has a live sample you can interact with, so please give it a try.
You can install it with pip.
pip install hiplot
To use it, just pass HiPlot either a list of dictionaries (one per row) or the path to a CSV file.
import pandas as pd
import hiplot as hip
# pandas -> list of dicts -> HiPlot
train = pd.read_csv('../input/titanic/train.csv')
# orient='records' is required so that each row becomes one dict.
import_dict = train.to_dict(orient='records')
dict_hip = hip.Experiment.from_iterable(import_dict)
dict_hip.display()
# Directly from a CSV file
csv_hip = hip.Experiment.from_csv('../input/titanic/train.csv')
csv_hip.display()
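To make the orient='records' requirement above concrete, here is a minimal sketch of what the two shapes look like. The tiny DataFrame is made-up illustration data, not the real Titanic file, and the variable names are my own.

```python
import pandas as pd

# Tiny stand-in for the Titanic table (made-up rows, for illustration only).
df = pd.DataFrame(
    {"Pclass": [3, 1], "Sex": ["male", "female"], "Survived": [0, 1]}
)

# Default orient ('dict') is column-oriented: {column: {row_index: value}}.
by_column = df.to_dict()

# orient='records' is row-oriented: a list with one dict per row, which is
# the iterable-of-dicts shape passed to hip.Experiment.from_iterable above.
records = df.to_dict(orient="records")

print(by_column)
print(records)
```

Each element of records corresponds to one line in the HiPlot parallel plot, which is why the row-oriented form is the one to pass.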
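In addition, the created plot can be exported as standalone HTML.
# 'hiplot_titanic.html' is just an example file name; without an argument, to_html() returns the HTML as a string.
dict_hip.to_html('hiplot_titanic.html')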
It is intuitive and easy to use, and I think it can be useful in many situations: not only initial data analysis, but also things like hyperparameter tuning during training.
Also, because it is extremely lightweight, it runs smoothly without any frustration.
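As a rough sketch of the hyperparameter-tuning use mentioned above, the trials of a search can be collected as one dict per trial and handed to HiPlot in the same way as the Titanic rows. Everything here is an assumption for illustration: fake_validation_loss is a hypothetical stand-in for a real training run, and the parameter names and ranges are invented.

```python
import random


def fake_validation_loss(lr, dropout):
    """Hypothetical stand-in for training a model and measuring its loss."""
    return (lr - 0.01) ** 2 * 1000 + (dropout - 0.3) ** 2


def run_random_search(n_trials=20, seed=0):
    """Sample hyperparameters at random and record one dict per trial."""
    rng = random.Random(seed)
    trials = []
    for i in range(n_trials):
        lr = 10 ** rng.uniform(-4, -1)   # log-uniform learning rate
        dropout = rng.uniform(0.0, 0.6)
        trials.append(
            {
                "trial": i,
                "lr": lr,
                "dropout": dropout,
                "val_loss": fake_validation_loss(lr, dropout),
            }
        )
    return trials


def to_experiment(trials):
    """Wrap the trial records in a HiPlot experiment (requires hiplot)."""
    import hiplot as hip  # imported lazily so the search itself has no dependency

    return hip.Experiment.from_iterable(trials)


trials = run_random_search()
# In a notebook: to_experiment(trials).display()
```

Each trial becomes one line across the lr, dropout, and val_loss axes, so selecting the low-loss range on the val_loss axis shows which hyperparameter ranges those trials came from.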
On the other hand, probably because it has only just been released, my impression is that its features are not yet complete. In particular, it is a little inconvenient that there is no way to undo the previous operation.
Even so, I think it is a useful tool.
The next time you do EDA on Kaggle, why not start by loading your data into this library?