Learning record No. 20 (24th day)

Start studying: Saturday, December 7th

Teaching materials, etc.:
・ Miyuki Oshige, "Details! Python3 Introductory Note" (Sotec, 2017): completed on Thursday, December 19th
・ Progate Python course (5 courses in total): finished on Saturday, December 21st
・ Andreas C. Müller, Sarah Guido, "(Japanese title) Machine learning starting with Python" (O'Reilly Japan, 2017): completed on Saturday, December 23
・ Kaggle: Real or Not? NLP with Disaster Tweets: submitted on Saturday, December 28th, then tuned until Friday, January 3rd
・ **Wes McKinney, "(Japanese title) Introduction to data analysis by Python" (O'Reilly Japan, 2018)**: January 4th (Sat) ~

"Introduction to Data Analysis with Python"

Finished reading up to p. 346, the end of Chapter 10 (Data Aggregation and Group Operations).

Chapter 9 Plotting and Visualization

・ This chapter explains data visualization libraries such as matplotlib and seaborn. Settings such as line types can be looked up in the **docstring (function name + '?')** in IPython or Jupyter. (If matplotlib.pyplot is imported as plt, type **plt.plot?**; in a plain Python session, help(plt.plot) shows the same text.)

- Basically, use matplotlib, and bring in add-on libraries such as pandas and seaborn as needed.

Plot preparation


import matplotlib.pyplot as plt
fig = plt.figure()  # Figure object: the container that holds the plots.
ax1 = fig.add_subplot(1, 1, 1)  # Add a subplot to draw on (here a 1x1 grid, first cell).

# The figure formatting and the input data are written from here on.

・ Overview of what you can do: margin adjustment, axis sharing, titles, legends and their placement (loc='best' picks the best position), tick label rotation (rotation), annotations (annotate), adding shapes (add_patch), and changing matplotlib's default settings (the rc method).
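As a rough illustration of the items listed above, here is a minimal sketch that touches each of them once. The data, labels, and sizes are made up for the example, and the Agg backend is selected only so that it runs without a display.

```python
import io

import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no window is needed
import matplotlib.pyplot as plt
from matplotlib.patches import Rectangle

# rc changes matplotlib defaults globally; here only the default figure size.
plt.rc("figure", figsize=(6, 4))

# Two subplots that share the y axis, with a narrow margin between them.
fig, (left, right) = plt.subplots(1, 2, sharey=True)
fig.subplots_adjust(wspace=0.1)

x = [0, 1, 2, 3, 4]
y = [0, 1, 4, 9, 16]  # made-up data

# Title, legend placed automatically, and rotated tick labels.
left.plot(x, y, "k--", label="squares")
left.set_title("line with legend")
left.legend(loc="best")
left.set_xticks(x)
left.set_xticklabels(["a", "b", "c", "d", "e"], rotation=30)

# Annotation with an arrow, plus a rectangle added as a patch.
right.plot(x, y, "o")
right.annotate("last point", xy=(4, 16), xytext=(1.0, 14),
               arrowprops=dict(arrowstyle="->"))
right.add_patch(Rectangle((0.5, 2), 1.5, 4, color="g", alpha=0.3))

# Render to an in-memory PNG instead of a file on disk.
buf = io.BytesIO()
fig.savefig(buf, format="png")
```

In a notebook the savefig call is not needed, since the figure is displayed when the cell finishes.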

Attributes can be set in a batch with the set method of the axes class (AxesSubplot).


props = {'title': 'namae no ikkatsu settei', 'xlabel': 'aiueo'}
ax1.set(**props)  # ax1 is the subplot created in the plot preparation above

- DataFrame also has a plot method, so a data frame can be plotted as it is.

Visualization of value frequency


s.value_counts().plot.bar()  # use plot.barh() for horizontal bars

The seaborn package makes it easy to visualize data that needs to be aggregated or summarized before plotting. Pass the data frame as the data argument, and give its column names as x and y.

・ Histogram: a kind of bar chart that splits the values into discrete bins and shows how many values fall into each bin.

・ Density plot: an estimate of the continuous probability distribution that is presumed to have produced the observed data. Usually this distribution is approximated as a mixture of simple distributions called kernels, such as the normal distribution. For that reason the density plot is also called a "kernel density estimation (KDE) plot". (In pandas: plot.kde, also available as plot.density.)
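To make the "mixture of kernels" idea concrete, the following is a minimal sketch of a kernel density estimate computed by hand with NumPy. The function name gaussian_kde_by_hand, the sample values, and the bandwidth of 0.5 are assumptions for illustration; real plotting libraries choose the bandwidth automatically and may differ in detail.

```python
import numpy as np

def gaussian_kde_by_hand(samples, grid, bandwidth):
    """Average of normal-distribution kernels, one centred on each sample."""
    samples = np.asarray(samples, dtype=float)
    grid = np.asarray(grid, dtype=float)
    # z[i, j]: distance from grid point i to sample j, in bandwidth units.
    z = (grid[:, None] - samples[None, :]) / bandwidth
    kernels = np.exp(-0.5 * z**2) / (bandwidth * np.sqrt(2 * np.pi))
    return kernels.mean(axis=1)

samples = [1.0, 1.2, 1.9, 3.5, 3.7]   # made-up observations
grid = np.linspace(-4.0, 9.0, 1301)   # points at which to evaluate the density
density = gaussian_kde_by_hand(samples, grid, bandwidth=0.5)

# A density should integrate to about 1 (checked here with a Riemann sum).
area = density.sum() * (grid[1] - grid[0])
# Location of the highest point of the estimated density.
peak = grid[density.argmax()]
```

Plotting density against grid gives a smooth curve with one bump per cluster of samples; a smaller bandwidth makes the curve follow the individual samples more closely.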

・ Methods that are likely to be used very often:
  - seaborn.distplot: draws a histogram and a density estimate in one plot (deprecated in newer seaborn releases in favor of histplot and displot)
  - seaborn.regplot: draws a scatter plot and fits a linear regression line to it
  - seaborn.pairplot: draws a scatter plot matrix that compares every pair of variables at once

Chapter 10 Data Aggregation and Group Operations

・ pandas groupby method: lets you group the elements of a dataset and run arbitrary processing on each group (that is my rough understanding).

- Group operations follow a split-apply-combine flow.
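The three steps can be spelled out by hand and compared with a single groupby call. This is only a sketch on a made-up data frame; the column names key and value are assumptions for the example.

```python
import pandas as pd

df = pd.DataFrame({
    "key": ["a", "b", "a", "b", "a"],
    "value": [1, 2, 3, 4, 8],
})  # made-up data

# split: one sub-frame per distinct key
pieces = {key: piece for key, piece in df.groupby("key")}

# apply: reduce each piece to a single number
means = {key: piece["value"].mean() for key, piece in pieces.items()}

# combine: assemble the per-group results into one Series
by_hand = pd.Series(means, name="value").rename_axis("key")

# groupby performs all three steps in one expression
by_groupby = df.groupby("key")["value"].mean()
```

Both by_hand and by_groupby hold the mean of value for each key.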

- Multiple keys can be specified for one dataset. My mental image: pull out the values you want, process them (mean, count, etc.), and then put the results back together by group?
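A small sketch of grouping by more than one key, on made-up data (the column names key1, key2, and data are assumptions for the example):

```python
import pandas as pd

df = pd.DataFrame({
    "key1": ["a", "a", "b", "b", "a"],
    "key2": ["one", "two", "one", "two", "one"],
    "data": [1.0, 2.0, 3.0, 4.0, 5.0],
})  # made-up data

# Grouping by two keys gives a Series with a two-level index.
means = df.groupby(["key1", "key2"])["data"].mean()

# unstack moves the inner level (key2) into the columns.
table = means.unstack()

# size counts the rows in each (key1, key2) group.
sizes = df.groupby(["key1", "key2"]).size()
```

Here the selected column is processed per key combination, and unstack reshapes the grouped result into a table.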

- Grouping can also be done with mapping information given as a dictionary.
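As a sketch of grouping with a dictionary: when a dict is passed to groupby, the index labels are looked up in it and the mapped values become the group keys. The item names and the kind mapping below are made up for the example.

```python
import pandas as pd

sales = pd.Series(
    [10, 20, 30, 40, 50],
    index=["apple", "carrot", "banana", "onion", "cherry"],
)  # made-up data

# Hypothetical mapping from index label to group name.
kind = {
    "apple": "fruit", "banana": "fruit", "cherry": "fruit",
    "carrot": "vegetable", "onion": "vegetable",
}

# Each index label is replaced by its mapped value before grouping.
totals = sales.groupby(kind).sum()
```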

・ Aggregation methods of groupby (count, sum, mean, median, ...): the basic arithmetic aggregations are all covered.

- The column names given to the aggregated results can be changed by passing (name, function) tuples. Passing as_index=False to groupby returns the result without the group keys as the index.
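A minimal sketch of both points, on a made-up tips table (the column names day and tip and the result names average and largest are assumptions for the example):

```python
import pandas as pd

tips = pd.DataFrame({
    "day": ["Sat", "Sat", "Sun", "Sun"],
    "tip": [1.0, 3.0, 2.0, 6.0],
})  # made-up data

# A list of (name, function) tuples sets the names of the result columns.
renamed = tips.groupby("day")["tip"].agg([("average", "mean"), ("largest", "max")])

# as_index=False keeps the group key as an ordinary column.
flat = tips.groupby("day", as_index=False)["tip"].mean()
```

renamed has the columns average and largest with day as the index, while flat has day and tip as regular columns.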

- apply splits the object into pieces, **applies the passed function to each piece**, and then joins the results. Some imagination is required, because the function passed to apply has to be written by the programmer.
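One possible hand-written function for apply is a "top n rows per group" helper. The function top and the data below are made up for illustration, and only the tip column is selected so that the function receives just that column for each group.

```python
import pandas as pd

def top(group, n=2, column="tip"):
    """Return the n rows of a group with the largest values in `column`."""
    return group.sort_values(column, ascending=False).head(n)

tips = pd.DataFrame({
    "day": ["Sat", "Sat", "Sat", "Sun", "Sun"],
    "tip": [1.0, 3.0, 2.0, 6.0, 4.0],
})  # made-up data

# Extra keyword arguments after the function are forwarded to it.
result = tips.groupby("day")[["tip"]].apply(top, n=1)
```

The result is indexed by day and by the original row label, with one row per group because n=1.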

・ Pivot tables and cross tabulation: both can be built either with DataFrame functions (pivot_table, pandas.crosstab) or with groupby. Being able to handle these will be useful for data cleaning, modeling, and statistical analysis.
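The equivalence between the two routes can be sketched on a small made-up table (the column names day, smoker, and tip are assumptions for the example):

```python
import pandas as pd

tips = pd.DataFrame({
    "day": ["Sat", "Sat", "Sun", "Sun", "Sun"],
    "smoker": ["No", "Yes", "No", "No", "Yes"],
    "tip": [1.0, 3.0, 2.0, 4.0, 6.0],
})  # made-up data

# Pivot table: mean tip for each day (rows) and smoker value (columns).
pivot = tips.pivot_table(values="tip", index="day", columns="smoker", aggfunc="mean")

# The same table built with groupby followed by unstack.
same = tips.groupby(["day", "smoker"])["tip"].mean().unstack()

# Cross tabulation: how many rows fall into each day/smoker combination.
counts = pd.crosstab(tips["day"], tips["smoker"])
```

pivot and same contain the same numbers; crosstab is the special case where the aggregation is a count.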
