Calling mean() on a pandas DataFrame returns the average of each column. This post checks what happens when the target columns contain missing values (None).
import pandas as pd

# One None inserted into every column
df_ExistNone = pd.DataFrame({'a': [1, 2, 1, None, 3],
                             'b': [0.4, 1.1, None, 0.1, 0.8],
                             'c': ['X', 'Y', None, 'X', 'Z'],
                             'd': ['3', None, '5', '2', '1'],
                             'e': [True, None, True, False, True]})
# Baseline: no missing values
df = pd.DataFrame({'a': [1, 2, 1, 3],
                   'b': [0.4, 1.1, 0.1, 0.8],
                   'c': ['X', 'Y', 'X', 'Z'],
                   'd': ['3', '5', '2', '1'],
                   'e': [True, True, False, True]})
# None replaced with 0 in columns a, b and d (c and e still contain None)
df_0 = pd.DataFrame({'a': [1, 2, 1, 0, 3],
                     'b': [0.4, 1.1, 0, 0.1, 0.8],
                     'c': ['X', 'Y', None, 'X', 'Z'],
                     'd': ['3', '0', '5', '2', '1'],
                     'e': [True, None, True, False, True]})
print(df)
print(df_ExistNone)
print(df_0)
print("-------------------")
# The output below comes from an older pandas. Newer versions (2.0 and later,
# to my knowledge) raise TypeError on the string columns unless
# numeric_only=True is passed.
print(df.mean())
print(df_ExistNone.mean())
print(df_0.mean())
Result:
   a    b  c  d      e
0  1  0.4  X  3   True
1  2  1.1  Y  5   True
2  1  0.1  X  2  False
3  3  0.8  Z  1   True
     a    b     c     d      e
0  1.0  0.4     X     3   True
1  2.0  1.1     Y  None   None
2  1.0  NaN  None     5   True
3  NaN  0.1     X     2  False
4  3.0  0.8     Z     1   True
   a    b     c  d      e
0  1  0.4     X  3   True
1  2  1.1     Y  0   None
2  1  0.0  None  5   True
3  0  0.1     X  2  False
4  3  0.8     Z  1   True
-------------------
a      1.75
b      0.60
d    880.25
e      0.75
dtype: float64
a    1.75
b    0.60
e    0.75
dtype: float64
a       1.40
b       0.48
d    6104.20
e       0.75
dtype: float64
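You can see that None is excluded from the calculation: columns a and b of df_ExistNone have the same mean as in df (1.75 and 0.60), whereas replacing the missing value with 0 in df_0 lowers them to 1.40 and 0.48. Note that d holds strings, so its value is not a real average. The 880.25 appears to be the concatenated string '3521' divided by 4, and the column drops out of the result entirely once it contains None.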
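The following is a minimal sketch of that behavior on single columns, not part of the original experiment. The variable names (skipped, manual, as_zero, and so on) are only illustrative. The last part is my guess at where 880.25 comes from, and shows pd.to_numeric as one way to get a real average from a column of numeric strings.

```python
import pandas as pd

# Column 'a' from df_ExistNone: the None becomes NaN in a float column.
a = pd.Series([1, 2, 1, None, 3], name="a")

skipped = a.mean()              # NaN is skipped: (1 + 2 + 1 + 3) / 4
manual = a.sum() / a.count()    # count() only counts non-missing values
as_zero = a.fillna(0).mean()    # missing value treated as 0: 7 / 5

print(skipped, manual, as_zero)

# Column 'd' from df: numeric strings, not numbers.
d = pd.Series(['3', '5', '2', '1'], name="d")

concatenated = int(''.join(d)) / len(d)   # 3521 / 4, matches the 880.25 above
real_mean = pd.to_numeric(d).mean()       # convert first: (3 + 5 + 2 + 1) / 4

print(concatenated, real_mean)
```

If missing values should count as zero, fillna(0) before mean() reproduces the df_0 numbers without editing the data by hand.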
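Whether None is skipped is controlled by the skipna argument. With skipna=True (the default) missing values are ignored; with skipna=False they are included, so any column that contains a missing value yields NaN.
print(df_ExistNone.mean(skipna=True))
print(df_ExistNone.mean(skipna=False))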
a    1.75
b    0.60
e    0.75
dtype: float64
a   NaN
b   NaN
dtype: float64
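On newer pandas versions the calls above may fail on the string columns rather than silently dropping them. As a hedged sketch (assuming only the numeric columns a and b are of interest), passing numeric_only=True restricts the calculation to numeric columns and should behave the same way across versions. The names mean_skip and mean_keep are illustrative.

```python
import pandas as pd

df_ExistNone = pd.DataFrame({'a': [1, 2, 1, None, 3],
                             'b': [0.4, 1.1, None, 0.1, 0.8],
                             'c': ['X', 'Y', None, 'X', 'Z']})

# numeric_only=True ignores the string column 'c' explicitly.
mean_skip = df_ExistNone.mean(numeric_only=True, skipna=True)
mean_keep = df_ExistNone.mean(numeric_only=True, skipna=False)

print(mean_skip)   # a and b averaged over their non-missing values
print(mean_keep)   # a and b are NaN because each contains a missing value
```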