[Python] How to create a 2D histogram with Matplotlib

There are two common ways to draw a histogram of two variables: as a three-dimensional graph, or as a two-dimensional plot in which the frequency is mapped to color or shading. The three-dimensional graph makes changes in frequency easy to read, but parts of the surface can be hidden, which makes the overall distribution hard to judge. The two-dimensional plot makes subtle differences in frequency harder to see, but the overall shape of the distribution is easy to grasp.

Data

The data is drawn from two two-dimensional normal distributions, 5,000 samples each. The resulting distribution looks like this.

import numpy as np
x, y = np.vstack((np.random.multivariate_normal([0, 0], [[10.0, 0],[0,20]], 5000) 
                 ,np.random.multivariate_normal([0,15], [[10.0, 0],[0, 5]], 5000))).T

hist2d_01.png

2D histogram

A 2D histogram is drawn with Matplotlib's hist2d. The histogram frequencies are available from the return value, which is a tuple of counts, xedges, yedges, and Image.

import numpy as np
import matplotlib.pyplot as plt
import matplotlib.cm as cm

fig = plt.figure()
ax = fig.add_subplot(111)

H = ax.hist2d(x,y, bins=40, cmap=cm.jet)
ax.set_title('1st graph')
ax.set_xlabel('x')
ax.set_ylabel('y')
fig.colorbar(H[3],ax=ax)
plt.show()

hist2d_02.png
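
The four return values described above can be inspected directly. The following is a minimal sketch, not part of the original article: the fixed seed, the variable names, and the non-interactive Agg backend are assumptions made so that the snippet runs on its own.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; omit for interactive use
import matplotlib.pyplot as plt
import numpy as np

# Same mixture of two 2D normal distributions as in the article,
# but with a fixed seed (an assumption, for reproducibility).
rng = np.random.default_rng(0)
x, y = np.vstack((
    rng.multivariate_normal([0, 0], [[10.0, 0], [0, 20]], 5000),
    rng.multivariate_normal([0, 15], [[10.0, 0], [0, 5]], 5000),
)).T

fig, ax = plt.subplots()
counts, xedges, yedges, image = ax.hist2d(x, y, bins=40)

print(counts.shape)                # (40, 40): one entry per bin
print(xedges.shape, yedges.shape)  # (41,) (41,): one more edge than bins
print(counts.sum() == x.size)      # every sample falls into some bin
```

With the default range, the outermost edges are the minimum and maximum of the data, so no sample is dropped.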

Specifying the number of bins

The number of bins is set with the bins parameter. A scalar gives the same number of bins along both axes. To set them separately, pass a two-element list, as in the first example below. As with a one-dimensional histogram, you can also pass the bin edges themselves, as in the second example.

import numpy as np
import matplotlib.pyplot as plt
import matplotlib.cm as cm

fig = plt.figure()
ax = fig.add_subplot(111)

H = ax.hist2d(x,y, bins=[40,10], cmap=cm.jet)
ax.set_title('2nd graph')
ax.set_xlabel('x')
ax.set_ylabel('y')

fig.colorbar(H[3],ax=ax)
plt.show()

hist2d_03.png

import numpy as np
import matplotlib.pyplot as plt
import matplotlib.cm as cm

fig = plt.figure()
ax = fig.add_subplot(111)

H = ax.hist2d(x,y, bins=[np.linspace(-30,30,61),np.linspace(-30,30,61)], cmap=cm.jet)
ax.set_title('3rd graph')
ax.set_xlabel('x')
ax.set_ylabel('y')

fig.colorbar(H[3],ax=ax)
plt.show()

hist2d_04.png
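
The three forms of bins can be compared without drawing anything, because hist2d delegates the binning to numpy.histogram2d. A small sketch under that assumption, with a fixed seed and variable names chosen for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed is an assumption
x, y = np.vstack((
    rng.multivariate_normal([0, 0], [[10.0, 0], [0, 20]], 5000),
    rng.multivariate_normal([0, 15], [[10.0, 0], [0, 5]], 5000),
)).T

# Scalar: the same number of bins along x and y.
same_both, _, _ = np.histogram2d(x, y, bins=40)

# Two-element list: 40 bins along x, 10 bins along y.
separate, _, _ = np.histogram2d(x, y, bins=[40, 10])

# Explicit edges: 61 edges give 60 bins of width 1 on each axis.
edges = np.linspace(-30, 30, 61)
by_edges, xe, ye = np.histogram2d(x, y, bins=[edges, edges])

print(same_both.shape, separate.shape, by_edges.shape)
# (40, 40) (40, 10) (60, 60)
```

Note that with explicit edges, samples outside the edge range are not counted.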

Normalization

To normalize the histogram, set the parameter density to True. (Older Matplotlib releases called this parameter normed; it has since been removed.)

import numpy as np
import matplotlib.pyplot as plt
import matplotlib.cm as cm

fig = plt.figure()
ax = fig.add_subplot(111)

H = ax.hist2d(x,y, bins=[np.linspace(-30,30,61),np.linspace(-30,30,61)], density=True, cmap=cm.jet)
ax.set_title('4th graph')
ax.set_xlabel('x')
ax.set_ylabel('y')

fig.colorbar(H[3],ax=ax)
plt.show()

hist2d_05.png
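
What normalization means here can be checked numerically: with density=True, each bin holds a probability density, so the bin values multiplied by the bin areas sum to 1. A minimal sketch using numpy.histogram2d (which hist2d calls internally); the seed and variable names are assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed is an assumption
x, y = np.vstack((
    rng.multivariate_normal([0, 0], [[10.0, 0], [0, 20]], 5000),
    rng.multivariate_normal([0, 15], [[10.0, 0], [0, 5]], 5000),
)).T

edges = np.linspace(-30, 30, 61)
density, xedges, yedges = np.histogram2d(x, y, bins=[edges, edges], density=True)

# Area of every bin: width along x times width along y.
bin_area = np.outer(np.diff(xedges), np.diff(yedges))

# Density times area, summed over all bins, is 1 (up to rounding).
total = (density * bin_area).sum()
print(total)
```

Because the values are densities rather than counts, they depend on the bin size as well as on the data.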

Change the color map

To change the color map, specify it in the parameter cmap as shown in the example.

import numpy as np
import matplotlib.pyplot as plt
import matplotlib.cm as cm

fig = plt.figure()
ax = fig.add_subplot(111)

H = ax.hist2d(x,y, bins=[np.linspace(-30,30,61),np.linspace(-30,30,61)], density=True, cmap=cm.gray)
ax.set_title('5th graph')
ax.set_xlabel('x')
ax.set_ylabel('y')
fig.colorbar(H[3],ax=ax)
plt.show()

hist2d_06.png
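
The cmap parameter also accepts a color map name as a string, which avoids importing matplotlib.cm. The sketch below draws the same data with two color maps side by side. The choice of "gray" and "viridis", the seed, and the Agg backend are assumptions for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; omit for interactive use
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)  # fixed seed is an assumption
x, y = np.vstack((
    rng.multivariate_normal([0, 0], [[10.0, 0], [0, 20]], 5000),
    rng.multivariate_normal([0, 15], [[10.0, 0], [0, 5]], 5000),
)).T

fig, axes = plt.subplots(1, 2, figsize=(9, 4))
images = []
for ax, name in zip(axes, ["gray", "viridis"]):
    # The last return value is the image that carries the color map.
    *_, image = ax.hist2d(x, y, bins=40, cmap=name)
    ax.set_title(name)
    fig.colorbar(image, ax=ax)
    images.append(image)

print([im.get_cmap().name for im in images])  # ['gray', 'viridis']
```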

Specify the range of the color map

When comparing multiple histograms, you may want to fix the range of values that the color map covers. In that case, call set_clim on the returned image, as in the examples.

import numpy as np
import matplotlib.pyplot as plt
import matplotlib.cm as cm

fig = plt.figure()
ax = fig.add_subplot(111)

H = ax.hist2d(x,y, bins=[np.linspace(-30,30,61),np.linspace(-30,30,61)], density=True, cmap=cm.jet)
ax.set_title('6th graph')
ax.set_xlabel('x')
ax.set_ylabel('y')
H[3].set_clim(0,0.05)
fig.colorbar(H[3],ax=ax)
plt.show()

hist2d_07.png

import numpy as np
import matplotlib.pyplot as plt
import matplotlib.cm as cm

fig = plt.figure()
ax = fig.add_subplot(111)

H = ax.hist2d(x,y, bins=[np.linspace(-30,30,61),np.linspace(-30,30,61)], density=True, cmap=cm.jet)
ax.set_title('7th graph')
ax.set_xlabel('x')
ax.set_ylabel('y')
H[3].set_clim(0,0.01)
fig.colorbar(H[3],ax=ax)
plt.show()

hist2d_08.png
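
For the comparison case mentioned above, one possible approach is to draw the histograms side by side and give them the same color limits, so that equal colors mean equal values. This is a sketch under assumptions: the two data sets are the two components of the article's mixture, and the common upper limit is taken from the larger of the two maxima.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; omit for interactive use
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)  # fixed seed is an assumption
a = rng.multivariate_normal([0, 0], [[10.0, 0], [0, 20]], 5000).T
b = rng.multivariate_normal([0, 15], [[10.0, 0], [0, 5]], 5000).T

edges = np.linspace(-30, 30, 61)
fig, axes = plt.subplots(1, 2, figsize=(9, 4))
counts_a, _, _, im_a = axes[0].hist2d(a[0], a[1], bins=[edges, edges], density=True)
counts_b, _, _, im_b = axes[1].hist2d(b[0], b[1], bins=[edges, edges], density=True)

# Use one shared range so the colors are directly comparable.
vmax = max(counts_a.max(), counts_b.max())
for im in (im_a, im_b):
    im.set_clim(0, vmax)

# A single colorbar is enough once both images share the same limits.
fig.colorbar(im_a, ax=list(axes))
```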

Log scale

To display the histogram on a log scale, pass matplotlib.colors.LogNorm() as the norm parameter.

import numpy as np
import matplotlib.pyplot as plt
import matplotlib.colors
import matplotlib.cm as cm

fig = plt.figure()
ax = fig.add_subplot(111)

H = ax.hist2d(x,y, bins=[np.linspace(-30,30,61),np.linspace(-30,30,61)], norm=matplotlib.colors.LogNorm(), cmap=cm.jet)
ax.set_title('8th graph')
ax.set_xlabel('x')
ax.set_ylabel('y')

fig.colorbar(H[3],ax=ax)
plt.show()

hist2d_09.png
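
To see what LogNorm does to the bin values, it can be applied to a few numbers by hand. A small sketch; the limits vmin=1 and vmax=100 are assumptions chosen to keep the arithmetic simple.

```python
import numpy as np
from matplotlib.colors import LogNorm

# Map the range 1..100 onto 0..1 logarithmically.
norm = LogNorm(vmin=1, vmax=100)
mapped = norm(np.array([0.0, 1.0, 10.0, 100.0]))

# 10 lands halfway between 1 and 100 on a log scale.
print(float(mapped[2]))

# A count of zero has no logarithm, so it is masked instead of colored.
print(np.ma.getmaskarray(mapped)[0])
```

This is why empty bins appear blank in a log-scaled histogram.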

Overlay contour lines (Contour) on top of the histogram

Contour lines are drawn with contour. Pay attention to the following points:

  1. Pass the minimum and maximum edge values with extent so that the horizontal and vertical positions match the histogram.
  2. Transpose counts, because it holds the x-axis data along the first (vertical) dimension and the y-axis data along the second (horizontal) dimension.
  3. Use LogNorm for a logarithmic display. This is optional, but the lines are hard to see without it.

import numpy as np
import matplotlib.pyplot as plt
import matplotlib.colors
import matplotlib.cm as cm

fig = plt.figure()
ax = fig.add_subplot(111)

counts, xedges, yedges, Image= ax.hist2d(x,y, bins=[np.linspace(-30,30,61),np.linspace(-30,30,61)], norm=matplotlib.colors.LogNorm(), cmap=cm.jet)
ax.contour(counts.transpose(),extent=[xedges.min(),xedges.max(),yedges.min(),yedges.max()])
ax.set_title('9th graph')
ax.set_xlabel('x')
ax.set_ylabel('y')
fig.colorbar(Image,ax=ax)
plt.show()

hist2d_10.png
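
The reason for the transpose in point 2 can be checked with a single sample. The sketch below uses numpy.histogram2d, which returns the same counts layout as hist2d; the tiny three-bin grid is an assumption for illustration. It also computes bin centers, which are an alternative to extent for positioning contour lines.

```python
import numpy as np

edges = np.array([0.0, 1.0, 2.0, 3.0])

# One sample at x=0.5 (first x bin), y=2.5 (third y bin).
counts, xedges, yedges = np.histogram2d([0.5], [2.5], bins=[edges, edges])

print(counts[0, 2])    # 1.0 -> first index is the x bin, second is the y bin
print(counts.T[2, 0])  # 1.0 -> after transposing, rows follow y and columns follow x

# Bin centers; with these, ax.contour(xcenters, ycenters, counts.T)
# places each contour value at the middle of its bin.
xcenters = (xedges[:-1] + xedges[1:]) / 2
ycenters = (yedges[:-1] + yedges[1:]) / 2
print(xcenters)        # [0.5 1.5 2.5]
```

Plotting functions such as contour and imshow expect rows to run along y, which is the layout that the transposed array provides.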

Make the histogram bin shape hexagonal

To make the bins hexagonal, use hexbin.

import numpy as np
import matplotlib.pyplot as plt
import matplotlib.cm as cm

fig = plt.figure()
ax = fig.add_subplot(111)

H = ax.hexbin(x,y, gridsize=20, extent=[-30, 30, -30, 30], cmap=cm.jet)
ax.set_title('10th graph')
ax.set_xlabel('x')
ax.set_ylabel('y')
fig.colorbar(H,ax=ax)
plt.show()

hist2d_11.png
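
Unlike hist2d, hexbin returns a single collection object rather than a tuple. The per-hexagon counts and centers can be read from it, as in this sketch. The mincnt=1 setting (which leaves out empty hexagons), the seed, and the Agg backend are assumptions for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; omit for interactive use
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)  # fixed seed is an assumption
x, y = np.vstack((
    rng.multivariate_normal([0, 0], [[10.0, 0], [0, 20]], 5000),
    rng.multivariate_normal([0, 15], [[10.0, 0], [0, 5]], 5000),
)).T

fig, ax = plt.subplots()
# mincnt=1 draws only hexagons that contain at least one sample.
hexes = ax.hexbin(x, y, gridsize=20, mincnt=1)
fig.colorbar(hexes, ax=ax)

counts = hexes.get_array()     # number of samples in each drawn hexagon
centers = hexes.get_offsets()  # (x, y) center of each drawn hexagon

print(len(counts), centers.shape)
```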

Smooth histogram (kernel density estimation)

A histogram looks blocky, especially when the bins are wide. Kernel density estimation (KDE) is a method for smoothing this out: it estimates the probability density function from a sample of a random variable.

Kernel density estimation is available in both scipy and scikit-learn. In scipy, the bandwidth is specified relative to the standard deviation of the data, so the estimate stays smooth even if the spread of the data changes. With bw_method = 1.0, for example, value.std(ddof=1) is used as the bandwidth (value is the data). Here ddof stands for delta degrees of freedom: the standard deviation is computed by dividing by N - ddof.

kernel = gaussian_kde(value) returns a gaussian_kde object. To get the actual density values, pass the x and y coordinates to it, as in kernel(positions).

In practice, a mesh is created with mgrid, the x and y grids are each flattened with ravel, and the two are stacked with vstack before being passed to the kernel.

The value returned by kernel(positions) is one-dimensional, so it is reshaped back to two dimensions with reshape.

Finally, the result is drawn with contourf.

import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import gaussian_kde
import matplotlib.cm as cm

xx,yy = np.mgrid[-30:30:1,-30:30:1]
positions = np.vstack([xx.ravel(),yy.ravel()])
 
value = np.vstack([x,y])

kernel = gaussian_kde(value)

f = np.reshape(kernel(positions).T, xx.shape)

fig = plt.figure()
ax = fig.add_subplot(111)

ax.contourf(xx,yy,f, cmap=cm.jet)
ax.set_title('11th graph')
ax.set_xlabel('x')
ax.set_ylabel('y')
plt.show()

hist2d_12.png

import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import gaussian_kde
import matplotlib.cm as cm

xx,yy = np.mgrid[-30:30:1,-30:30:1]
positions = np.vstack([xx.ravel(),yy.ravel()])
 
value = np.vstack([x,y])

kernel = gaussian_kde(value, bw_method=0.5)

f = np.reshape(kernel(positions).T, xx.shape)

fig = plt.figure()
ax = fig.add_subplot(111)

ax.contourf(xx,yy,f, cmap=cm.jet)
ax.set_title('12th graph')
ax.set_xlabel('x')
ax.set_ylabel('y')
plt.show()

hist2d_13.png
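
The relationship between bw_method and the bandwidth described above can be checked on the gaussian_kde object itself. A sketch, assuming the factor and covariance attributes of scipy's gaussian_kde and a fixed seed:

```python
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(0)  # fixed seed is an assumption
x, y = np.vstack((
    rng.multivariate_normal([0, 0], [[10.0, 0], [0, 20]], 5000),
    rng.multivariate_normal([0, 15], [[10.0, 0], [0, 5]], 5000),
)).T
value = np.vstack([x, y])

default_kde = gaussian_kde(value)                # Scott's rule by default
scalar_kde = gaussian_kde(value, bw_method=0.5)  # scalar is used as the factor

# Scott's rule gives n ** (-1 / (d + 4)); here n = 10000 and d = 2.
print(default_kde.factor)
print(scalar_kde.factor)

# The kernel covariance is the sample covariance (ddof=1) scaled by factor**2,
# i.e. the bandwidth is the factor times the spread of the data.
scaled = np.cov(value) * default_kde.factor ** 2
print(np.allclose(default_kde.covariance, scaled))
```

For this sample size the default factor is smaller than 0.5, so bw_method=0.5 produces the smoother of the two estimates.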

When using numpy histogram2d

You can also create the histogram with numpy's histogram2d. It returns only the frequencies and the x and y edges, so you have to draw the graph yourself. Here imshow is used for display. As with Matplotlib's hist2d, the returned array holds the x-axis direction along its first (vertical) dimension and the y-axis direction along its second (horizontal) dimension, so it is transposed and the origin is set to the lower left.

import numpy as np
import matplotlib.pyplot as plt
import matplotlib.cm as cm

fig = plt.figure()
ax = fig.add_subplot(111)

H = np.histogram2d(x,y, bins=[np.linspace(-30,30,61),np.linspace(-30,30,61)])
im = ax.imshow(H[0].T, interpolation='nearest', origin='lower', extent=[-30,30,-30,30], cmap=cm.jet)
ax.set_title('13th graph')
ax.set_xlabel('x')
ax.set_ylabel('y')
fig.colorbar(im, ax=ax)
plt.show()

hist2d_14.png
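
As an alternative to imshow, the result of histogram2d can be drawn with pcolormesh, which takes the bin edges directly, so neither extent nor origin has to be set by hand. This is a sketch of that variant, not the article's method; the seed and the Agg backend are assumptions.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; omit for interactive use
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)  # fixed seed is an assumption
x, y = np.vstack((
    rng.multivariate_normal([0, 0], [[10.0, 0], [0, 20]], 5000),
    rng.multivariate_normal([0, 15], [[10.0, 0], [0, 5]], 5000),
)).T

edges = np.linspace(-30, 30, 61)
counts, xedges, yedges = np.histogram2d(x, y, bins=[edges, edges])

fig, ax = plt.subplots()
# The edges position every cell; counts still has to be transposed
# so that rows run along y.
mesh = ax.pcolormesh(xedges, yedges, counts.T)
ax.set_xlabel('x')
ax.set_ylabel('y')
fig.colorbar(mesh, ax=ax)
```

This also works when the bins are not equally spaced, which imshow with extent cannot represent.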
