Continuing from yesterday, we will analyze financial data again.
When analyzing a stock portfolio, a return usually means the percentage change in an asset's price. Here we compute the percentage change of Apple's stock price using data from Yahoo! Finance.
pandas DataFrames have powerful functions for frequency conversion.
Function | Description |
---|---|
resample | Convert data to fixed frequency |
reindex | Assign data to a new index |
See the [DataFrame reference](http://pandas.pydata.org/pandas-docs/dev/generated/pandas.DataFrame.html) for other DataFrame functions.
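As a rough illustration of the two functions in the table, the sketch below applies `resample` and `reindex` to a small made-up price series. The dates and values are invented for the example, and the weekly frequency `'W'` is an arbitrary choice.

```python
import pandas as pd

# Hypothetical daily prices for ten consecutive calendar days (made-up values).
days = pd.date_range("2014-03-01", periods=10, freq="D")
prices = pd.Series([100.0, 101.0, 102.0, 103.0, 104.0,
                    105.0, 106.0, 107.0, 108.0, 109.0], index=days)

# resample: convert to a fixed weekly frequency (weeks ending on Sunday),
# keeping the last observation of each week.
weekly_last = prices.resample("W").last()

# reindex: conform the data to a new index. Labels that are missing from
# the original index become NaN unless a fill method is given.
new_index = pd.date_range("2014-03-09", periods=4, freq="D")
reindexed = prices.reindex(new_index)
reindexed_filled = prices.reindex(new_index, method="ffill")

print(weekly_last)
print(reindexed)
print(reindexed_filled)
```

`resample` always aggregates into regular time bins, while `reindex` only selects or fills values at the labels you ask for.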
The [adjusted closing price](http://www.yahoo-help.jp/app/answers/detail/p/546/a_id/45316/~/%E8%AA%BF%E6%95%B4%E5%BE%8C%E7%B5%82%E5%80%A4%E3%81%A8%E3%81%AF) is a closing price that has been adjusted so that the data stays continuous before and after a stock split or dividend. Earlier prices are restated on the post-split basis.
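To see why returns are computed from adjusted prices, here is a deliberately simplified sketch with a hypothetical 2-for-1 split. The prices, the split date, and the ratio are all made up, dividends are ignored, and this is not meant to reproduce the exact procedure Yahoo! Finance uses.

```python
import pandas as pd

dates = pd.bdate_range("2014-03-03", periods=4)
# Made-up raw closing prices; the price halves on the third day because of the split.
raw_close = pd.Series([100.0, 102.0, 51.0, 52.0], index=dates)

split_date = dates[2]   # hypothetical split date
split_ratio = 2.0       # hypothetical 2-for-1 split

# Restate every price before the split on the post-split basis.
before_split = raw_close.index < split_date
adj_close = raw_close.copy()
adj_close.loc[before_split] = raw_close.loc[before_split] / split_ratio

# The raw series shows a spurious -50% return on the split day,
# while the adjusted series shows no jump.
print(raw_close.pct_change())
print(adj_close.pct_change())
```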
The return index is time-series data that shows the performance of an investment in a stock, dividends included, expressed as the value of one unit invested. Apple's return index can be calculated with the cumprod method.
import pandas as pd
# pandas.io.data was moved out of pandas into the separate pandas-datareader package
import pandas_datareader.data as web
# Get Apple's adjusted closing prices from the end of 2009 onward
price = web.get_data_yahoo('AAPL', '2009-12-31')['Adj Close']
returns = price.pct_change()  # daily percentage change
ret_index = (1 + returns).cumprod()  # calculate the return index
ret_index.iloc[0] = 1  # the first row is NaN, so set it to 1
print(ret_index)
# =>
# Date
# 2009-12-31 1.000000
# 2010-01-04 1.015602
# 2010-01-05 1.017330
# 2010-01-06 1.001136
# 2010-01-07 0.999309
# 2010-01-08 1.005974
# 2010-01-11 0.997087
# 2010-01-12 0.985731
# 2010-01-13 0.999654
# 2010-01-14 0.993828
# 2010-01-15 0.977239
# 2010-01-19 1.020490
# 2010-01-20 1.004789
# 2010-01-21 0.987410
# 2010-01-22 0.938432
# ...
# 2014-02-19 2.653155
# 2014-02-20 2.622445
# 2014-02-21 2.593315
# 2014-02-24 2.604671
# 2014-02-25 2.577565
# 2014-02-26 2.554310
# 2014-02-27 2.605263
# 2014-02-28 2.598203
# 2014-03-03 2.605708
# 2014-03-04 2.622889
# 2014-03-05 2.628419
# 2014-03-06 2.620470
# 2014-03-07 2.618939
# 2014-03-10 2.621309
# 2014-03-11 2.646835
# Calculate monthly returns from the last return-index value of each business month
# ('BM' is spelled 'BME' in pandas 2.2 and later)
m_returns = ret_index.resample('BM').last().pct_change()
print(m_returns['2014'])  # show 2014
# =>
# Date
# 2014-01-31 -0.107696
# 2014-02-28 0.057514
# 2014-03-31 0.018718
# Monthly returns can also be calculated by aggregating the daily returns inside resample
m_returns = (1 + returns).resample('M', kind='period').prod() - 1
print(m_returns['2014'])  # show 2014 (same values)
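The agreement between the two monthly-return calculations above can be checked offline on made-up prices. The sketch below is an illustration under stated assumptions, not the author's code: the prices are invented, and months are grouped with `groupby` on `index.to_period('M')` instead of `resample`, only so that the example does not depend on how a given pandas version spells the month-end alias.

```python
import pandas as pd

# Made-up closing prices on business days spanning two calendar months.
dates = pd.bdate_range("2014-01-27", "2014-02-07")
price = pd.Series([100.0, 102.0, 101.0, 103.0, 104.0,
                   106.0, 105.0, 107.0, 108.0, 110.0], index=dates)

returns = price.pct_change()
ret_index = (1 + returns).cumprod()
ret_index.iloc[0] = 1.0   # the first row is NaN, so set it to 1

month = ret_index.index.to_period("M")

# Route 1: month-end value of the return index, then its percentage change.
via_index = ret_index.groupby(month).last().pct_change()

# Route 2: compound the daily returns within each month.
via_compounding = (1 + returns).groupby(month).prod() - 1

print(via_index)
print(via_compounding)
```

The two routes agree for every month except the first. Route 1 has no earlier month-end to compare against and yields NaN, while route 2 compounds whatever daily returns exist in that month.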
When you print() a large DataFrame or Series, the output is truncated automatically and only the beginning and end are displayed.
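The truncation is controlled by pandas display options. As a small sketch on an invented 1000-row Series, `display.max_rows` can be lowered or lifted temporarily with `pd.option_context`:

```python
import pandas as pd

long_series = pd.Series(range(1000))

# With a small row limit, the printed form keeps the head and tail
# and elides the middle.
with pd.option_context("display.max_rows", 10):
    short_text = repr(long_series)

# Setting the limit to None prints every row.
with pd.option_context("display.max_rows", None):
    full_text = repr(long_series)

print(short_text)
print(len(short_text.splitlines()), len(full_text.splitlines()))
```

If you only want to look at one end of the data, `head()` and `tail()` are usually simpler than changing the options.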
Let's plot how a portfolio of financial and IT stocks has performed since 2010, paying particular attention to the three years from the earthquake up to March 11 of this year.
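import matplotlib.pyplot as plt
def get_px(stock, start, end):
    # Return the adjusted closing prices of a single ticker
    return web.get_data_yahoo(stock, start, end)['Adj Close']
names = ['AAPL', 'GOOG', 'MSFT', 'DELL', 'GS', 'MS', 'BAC', 'C']
px = pd.DataFrame({n: get_px(n, '1/1/2010', '3/11/2014') for n in names})
px = px.asfreq('B').ffill()  # align to business days and forward-fill gaps
rets = px.pct_change()
result = (1 + rets).cumprod() - 1  # cumulative return of each stock
result.plot()  # DataFrame.plot creates its own figure
plt.savefig("image.png")  # save before show(), otherwise the saved image may be blank
plt.show()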
From here you can calculate portfolio returns over a period of time and backtest your strategy with various visualizations.
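As one possible starting point, the sketch below computes the return of an equally weighted portfolio from a price table shaped like `px`. The tickers `AAA` and `BBB`, their prices, and the 50/50 weights are hypothetical, and the calculation assumes the portfolio is rebalanced to the target weights every day.

```python
import pandas as pd

dates = pd.bdate_range("2014-03-03", periods=4)
# Made-up adjusted closing prices for two hypothetical tickers.
px = pd.DataFrame({"AAA": [100.0, 110.0, 121.0, 121.0],
                   "BBB": [50.0, 50.0, 45.0, 49.5]}, index=dates)

rets = px.pct_change().dropna()

# Hypothetical equal weights that sum to 1.
weights = pd.Series({"AAA": 0.5, "BBB": 0.5})

# Daily portfolio return: weighted average of the daily stock returns.
port_rets = rets.mul(weights, axis=1).sum(axis=1)

# Cumulative portfolio return over the whole period.
port_cum = (1 + port_rets).cumprod() - 1

print(port_rets)
print(port_cum)
```

A real backtest would also need to account for trading costs and the rebalancing schedule, which this sketch leaves out.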
Because DataFrames are easy to visualize and come with a rich set of functions, we found that handling financial data in them makes ad hoc analysis possible without relying on expensive software.
Introduction to data analysis with Python-Data processing using NumPy and pandas http://www.oreilly.co.jp/books/9784873116556/