[Data analysis] Let's analyze US automobile stocks

Hello. This is Hayashi @ Ienter.

In the previous post, I introduced regression analysis with the Python data analysis library "scikit-learn".

At that time, I installed a Python distribution called Anaconda. This time, let's use Pandas together with Seaborn, a data visualization library, to analyze US automobile stocks.

Reading stock price data

First, import the basic libraries for analysis in Jupyter Notebook. (shot1.png)

Prepare the datetime module for handling dates and times, and DataReader for reading data from external sources. (shot2.png)

For example, let's write code that reads the past year of data for "[General Motors](https://ja.wikipedia.org/wiki/General Motors)" from the Yahoo site. Incidentally, the ticker symbol of General Motors is "GM". (shot3.png)
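Since the screenshots are not reproduced here, the following is only a minimal sketch of what this step might look like. It assumes the separately installed pandas-datareader package and its "yahoo" data source, which may no longer respond the way it did when this article was written. The helper names one_year_window and fetch_prices are made up for illustration.

```python
from datetime import datetime, timedelta


def one_year_window(end=None):
    """Return (start, end) covering the 365 days up to `end` (default: now)."""
    end = end or datetime.now()
    start = end - timedelta(days=365)
    return start, end


def fetch_prices(ticker, start, end):
    """Download daily prices for one ticker.

    Assumption: pandas-datareader is installed and its "yahoo" source works.
    """
    # Imported here so the date helper above runs even without the package.
    from pandas_datareader import data as pdr
    return pdr.DataReader(ticker, "yahoo", start, end)


start, end = one_year_window()
print(start.date(), "to", end.date())
# GM = fetch_prices("GM", start, end)  # needs network access; uncomment to try
```

The download call is left commented out so that the date handling can be checked on its own.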

Now, let's display the first five rows of the data. (shot4_.png)

The meaning of each column is as follows.

- Open: Opening price
- High: High price
- Low: Low price
- Close: Closing price
- Volume: Trading volume (number of shares traded per day)
- Adj Close: Adjusted closing price ([What is adjusted closing price](http://www.yahoo-help.jp/app/answers/detail/p/546/a_id/45316/~/ What is adjusted closing price))
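To try the first-five-rows display without a network connection, here is a small sketch that builds a table with the same column names and calls head() on it. All numbers are randomly generated stand-ins, not real General Motors quotes.

```python
import numpy as np
import pandas as pd

# Made-up prices for illustration only; not real GM quotes.
rng = np.random.default_rng(0)
dates = pd.bdate_range("2016-01-04", periods=20)  # 20 business days
close = 30 + rng.normal(0, 0.5, len(dates)).cumsum()

GM = pd.DataFrame(
    {
        "Open": close + rng.uniform(-0.3, 0.3, len(dates)),
        "High": close + 0.5,
        "Low": close - 0.5,
        "Close": close,
        "Volume": rng.integers(1_000_000, 5_000_000, len(dates)),
        "Adj Close": close,
    },
    index=dates,
)

top5 = GM.head()  # head() returns the first five rows by default
print(top5)
```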

Visualization of changes in stock price data

Let's check the trend of the closing price on a graph. For the closing price, use the adjusted closing price in the "Adj Close" column. (shot5_.png)
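A line chart like this can be sketched with the plot method that pandas provides on a Series. The prices below are randomly generated stand-ins, and the "Agg" backend line is only there so the snippet also runs outside a notebook.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; omit this line inside Jupyter
import numpy as np
import pandas as pd

# Made-up adjusted closing prices for about one trading year.
rng = np.random.default_rng(1)
dates = pd.bdate_range("2016-01-04", periods=250)
adj_close = pd.Series(
    30 + rng.normal(0, 0.4, len(dates)).cumsum(),
    index=dates,
    name="Adj Close",
)

# Series.plot draws a line chart with the index (dates) on the x-axis.
ax = adj_close.plot(legend=True, figsize=(10, 4))
ax.set_ylabel("Price (made-up values)")
```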

Daily price changes are a key indicator of the risk of investing in a stock. The day-over-day percentage change can be calculated with the pct_change method of Series. (shot6.png)
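As a small worked example of what pct_change computes, the sketch below uses three made-up prices. Each value is the change relative to the previous row, so the first row has no value.

```python
import pandas as pd

# Three made-up closing prices on consecutive days.
prices = pd.Series([100.0, 110.0, 99.0], name="Adj Close")

# pct_change() computes (today / yesterday) - 1 for each row.
daily_return = prices.pct_change()

# The same calculation written out by hand with shift().
by_hand = prices / prices.shift(1) - 1

print(daily_return)
```

Here 100 to 110 is a rise of about 10%, and 110 to 99 is a fall of about 10%. The first entry is NaN because there is no previous day to compare with.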

Stock price correlation comparison between companies

So far, we have focused on the stock price of General Motors. Next, let's look at the correlation of stock prices among companies in the same industry.

Let's introduce "Seaborn" to visualize the correlation. You can install it by entering the following command on the command line.

pip install seaborn

Import the module. (shot7.png)

This time, we will look at the correlation among the following five automobile manufacturers.

- General Motors (ticker symbol "GM")
- Ford Motor (ticker symbol "F")
- Toyota (ticker symbol "TM")
- Honda (ticker symbol "HMC")
- Tesla Motors (ticker symbol "TSLA")

Get the closing price data for these five companies. (shot8.png)
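One way to hold the closing prices of several companies is a single table with one column per ticker symbol. The sketch below shows that layout. The function fake_adj_close is a hypothetical stand-in that returns made-up numbers in place of a real download.

```python
import numpy as np
import pandas as pd

tickers = ["GM", "F", "TM", "HMC", "TSLA"]
dates = pd.bdate_range("2016-01-04", periods=5)
rng = np.random.default_rng(0)


def fake_adj_close(ticker):
    """Hypothetical stand-in for a real download; returns made-up prices."""
    values = 100 + rng.normal(0, 1, len(dates)).cumsum()
    return pd.Series(values, index=dates, name=ticker)


# One column per ticker, one row per trading day.
closing = pd.DataFrame({t: fake_adj_close(t) for t in tickers})
print(closing)
```

With this layout, column-wise operations such as pct_change() or corr() apply to all companies at once.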

Using the closing prices of these companies, calculate the daily percentage changes. (shot9_.png)

Let's plot it. (shot10.png)

The relationship is not really clear from this plot...

Now let's visualize it using Seaborn's pairplot function. (shot11.png)

In these plots, the more tightly the points cluster along a straight line, the higher the correlation. (Reference: [Correlation coefficient](https://ja.wikipedia.org/wiki/Correlation coefficient))
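To make "points on a straight line" concrete, here is a small sketch that computes the Pearson correlation coefficient by hand with NumPy. The function name pearson and the input values are made up for illustration.

```python
import numpy as np


def pearson(x, y):
    """Pearson correlation: covariance divided by the product of the spreads."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    dx = x - x.mean()
    dy = y - y.mean()
    return float((dx * dy).sum() / np.sqrt((dx ** 2).sum() * (dy ** 2).sum()))


x = [1.0, 2.0, 3.0, 4.0]
same_direction = pearson(x, [2.0, 4.0, 6.0, 8.0])      # points on a rising line
opposite_direction = pearson(x, [8.0, 6.0, 4.0, 2.0])  # points on a falling line
scattered = pearson(x, [3.0, 1.0, 4.0, 2.0])           # no clear line

print(same_direction, opposite_direction, scattered)
```

Points lying exactly on a rising line give a value of about 1, points on a falling line give about -1, and scattered points give a value closer to 0.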

From that point of view, you can see that the correlation is relatively high within the US pair, "GM (General Motors)" and "F (Ford Motor)", and within the Japanese pair, "TM (Toyota)" and "HMC (Honda)".

Conversely, a latecomer electric vehicle company such as "TSLA (Tesla Motors)" appears to have little correlation with the other companies.

In addition, let's use Seaborn's heatmap to make the correlation easier to see. (shot12.png) The correlation coefficient between each pair of companies is expressed by color intensity, which makes the pattern easier to grasp visually.
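A heatmap of this kind can be sketched by passing the correlation matrix from DataFrame.corr() to seaborn's heatmap function. The returns below are synthetic, and the color map and value range are choices made for this sketch, not necessarily the settings used in the screenshot.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; omit this line inside Jupyter
import numpy as np
import pandas as pd
import seaborn as sns

# Synthetic daily returns; TM and HMC share a common component by construction.
rng = np.random.default_rng(3)
n = 250
shared = rng.normal(0, 0.01, n)
returns = pd.DataFrame({
    "TM": shared + rng.normal(0, 0.005, n),
    "HMC": shared + rng.normal(0, 0.005, n),
    "TSLA": rng.normal(0, 0.02, n),
})

# Pairwise Pearson correlation between all columns.
corr = returns.corr()

# annot=True writes each coefficient into its cell; vmin/vmax fix the color scale.
ax = sns.heatmap(corr, annot=True, vmin=-1, vmax=1, cmap="coolwarm")
```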

That's all for this time!
