This post is the day 3 entry of the Jupyter Notebook Advent Calendar 2016.
The online learning service Udemy offered the course "Practical Python Data Science" for 2,300 yen, so I tried it. Shingo Tsuji, the author of "Python Startbook", teaches the 17.5-hour course in Japanese, and it is very easy to follow. I haven't watched every section yet, but I would like to introduce the course here.
Since the course uses Jupyter Notebook in almost every section, I will also point out the Jupyter Notebook features that come up along the way.
The first half explains how to use Python's data analysis libraries. After that come data analysis and visualization, and the latter half covers more practical data analysis using real data.
Anaconda
This is a distribution, provided by Continuum Analytics, that bundles the libraries required for data analysis. It also includes Jupyter Notebook. Installing Anaconda also gives you pip, Python's package manager, so you can use it to install any other libraries you need. I think a complete beginner like me should install Anaconda first to set up a Python data analysis environment.
After installing Anaconda, run the command below in your working directory and the notebook opens in the browser. (The course uses the older form `ipython notebook`; current versions of Jupyter use `jupyter notebook`.)
$ jupyter notebook
<img src="lec28.png">
%matplotlib inline
From section 3 onward, the course uses Jupyter to explain the basics of data science, introducing one library at a time.
NumPy
NumPy arrays play a central role in data analysis.
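As a rough illustration of why arrays matter so much, here is a minimal sketch of basic NumPy array operations. The values and variable names are arbitrary examples of mine, not material from the course.

```python
import numpy as np

# A 2x3 array built from nested lists (arbitrary sample values)
arr = np.array([[1, 2, 3], [4, 5, 6]])

# Arithmetic is applied element by element, with no explicit loop
doubled = arr * 2

# Aggregations can run over the whole array or along one axis
total = arr.sum()
column_means = arr.mean(axis=0)  # mean of each column

print(arr.shape)
print(doubled)
print(total, column_means)
```

Operating on whole arrays like this is also how pandas handles its columns internally, which is why the course covers NumPy first.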
Pandas
A library that is used very often when analyzing data with Python.
- Series is a very commonly used data type
- DataFrame is a very commonly used data type
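To make the two data types concrete, here is a small sketch with made-up values. The labels and column names are placeholders I chose for illustration only.

```python
import pandas as pd

# A Series is a one-dimensional array of values with an index of labels
s = pd.Series([3, 6, 9], index=["a", "b", "c"])

# A DataFrame is a table; each column behaves like a Series sharing one index
df = pd.DataFrame(
    {"x": [1, 2, 3], "y": [10.0, 20.0, 30.0]},
    index=["a", "b", "c"],
)

print(s["b"])          # look up a value by its label
print(df["y"].mean())  # a single column is itself a Series
print(df.loc["c"])     # select one row by its label
```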
You can also read data directly from the clipboard, like this.
from pandas import Series, DataFrame
import pandas as pd
# Copy the table you want to read to the clipboard first
nfl_frame = pd.read_clipboard()
# Display the resulting DataFrame
nfl_frame
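Because `read_clipboard()` depends on whatever is currently on your clipboard, it is hard to reproduce later. As far as I understand, it passes the clipboard text to `read_csv` with a whitespace separator by default, so the sketch below does roughly the same thing from a string. The table contents are made-up sample data, not the table used in the course.

```python
import io

import pandas as pd

# Made-up sample text standing in for a table copied from a web page
copied_text = """Rank Team Wins
1 Alpha 10
2 Beta 8
3 Gamma 5
"""

# Parse whitespace-separated text, similar to read_clipboard()'s default behavior
frame = pd.read_csv(io.StringIO(copied_text), sep=r"\s+")

print(frame)
```

Keeping the text in the notebook this way means the cell gives the same result every time it is rerun.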
Kaggle is a platform for predictive modeling and analytics, and also the name of the company that runs it. Companies and researchers post data there, and statisticians and data analysts around the world compete to build the best model. See also: https://ja.wikipedia.org/wiki/Kaggle
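# pandas.io.data was removed from pandas; DataReader now lives in the separate pandas-datareader package
from pandas_datareader.data import DataReader
from datetime import datetime
# Ticker symbols to download
tech_list = ['AAPL','GOOG','MSFT','AMZN']
# Fetch one year of data ending today
end = datetime.now()
start = datetime(end.year - 1, end.month, end.day)
for stock in tech_list:
    # Store each DataFrame in a global variable named after its ticker (AAPL, GOOG, ...)
    # (the 'yahoo' source depends on an external service and may be unavailable)
    globals()[stock] = DataReader(stock, 'yahoo', start, end)
# Summary statistics for the Apple data
AAPL.describe()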
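Writing into `globals()` works in a notebook, but a dictionary keyed by ticker is an alternative that keeps the frames together. The sketch below is my own variation, not the course's code, and it uses synthetic prices rather than downloaded market data so that it runs without network access.

```python
import numpy as np
import pandas as pd

tickers = ["AAPL", "GOOG"]  # a subset of the tickers, for brevity
dates = pd.date_range("2016-01-04", periods=5, freq="B")  # five business days

# Synthetic values only; these are NOT real stock prices
frames = {}
for i, ticker in enumerate(tickers):
    close = 100.0 + 10 * i + np.arange(5)
    volume = np.full(5, 1000 * (i + 1))
    frames[ticker] = pd.DataFrame({"Close": close, "Volume": volume}, index=dates)

# describe() returns count, mean, std, min, quartiles and max for each numeric column
summary = frames["AAPL"].describe()
print(summary)
```

With a dictionary you can loop over `frames.items()` to process every ticker, which is harder to do with separately named global variables.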
Parts of this post stray from Jupyter Notebook itself, but I hope it is helpful. I think "Practical Python Data Science" is worth more than 2,300 yen. The content is very easy to understand even for a complete beginner like me, so I recommend taking it.