Hello.
In the previous article, "Let's feel like a material researcher with machine learning", I introduced materials research from a machine learning angle.
This time, partly for my own study, I will introduce pymatgen (Python Materials Genomics), an open source Python library. I hope it gives you a chance to get a feel for materials research.
This article is aimed at:
- Programmers who are a little interested in physics and materials
- People who work on materials and want to make their research more efficient
- People who work on materials and want to get a taste of programming
- People interested in materials informatics
As an aside, I created a Facebook group called Materials Informatics Young People's Association. There are only a few members so far, but if you are interested, please feel free to contact me (via Facebook or the email address posted on my Qiita account). I would like to share information about materials informatics and hold study sessions.
Then, let's begin.
- [What is material research in the first place? (Quotation from the above)](#What is material research in the first place)
- [What is pymatgen](#What is pymatgen)
- [Basic function](#Basic function)
- [Linkage with other tools](#Linkage with other tools)
- [Install pymatgen](#Install pymatgen)
- [Let's actually move](#Let's actually analyze)
If you want to feel like a materials researcher right away, you can jump straight to the installation section. The content follows the official pymatgen documentation, so if you are comfortable reading English, reading that will give you the same understanding.
This section is quoted from the previous article. If you have already read it, please skip ahead. → [What is pymatgen](#What is pymatgen)
First of all, you may be wondering what materials research actually is. As the name suggests, it deals with materials, and there are many kinds: ceramics, polymers, metals, and so on.
Take the iPhone, for example.
There are hundreds of small ceramic capacitors inside it.
To make a high-performance capacitor that Apple will choose, researchers have to solve difficult problems such as:
- What kinds of elements should be combined?
- What kind of process should be used to make it?
Here is one example of how such problems are solved.
That is the general idea. This is just one example, but in short, materials research means:
Making great materials by making full use of process optimization, measurement, theory, analysis, exploration, and so on.
It is a rough summary, but that is the picture.
Simply put, pymatgen is a useful tool for analyzing materials.
First, take a look at the diagram below to get a rough idea of what material analysis involves. (Figure from Wikipedia)
As you probably learned in high school, matter is made of atoms, and those atoms are arranged in a wide variety of structures. The purpose of material analysis is to use physics to answer one question: what kinds of atoms, arranged in what kind of structure, give what kind of properties?
Now for the main subject. pymatgen is an open source Python library with two advantages:
1. It has many useful tools for visualizing materials and working with analysis data.
2. It is easy to link with the tools currently used for material analysis.
Please note that it is not analysis software with a GUI.
Its specific features are listed in the official documentation.
From here, let's feel like a material researcher by actually using pymatgen.
Reference: http://pymatgen.org/pymatgen.core.html#module-pymatgen.core
First, let's look at the modules pymatgen provides for representing atoms and structures.
pymatgen.core.periodic_table module
This is the periodic table that you all know. From this module I will introduce the Element class and the Specie class. The Element class inherits from Enum and defines the atoms of the periodic table.
class Element(Enum):
    def __init__(self, symbol):
        self.symbol = "%s" % symbol
        d = _pt_data[symbol]
        ...
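Because periodic_table.json in the pymatgen library is loaded into _pt_data when the module is imported, an Element created from a symbol carries all of that element's data:
>>> from pymatgen.core.periodic_table import Element
>>> fe = Element("Fe")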
>>> fe.data
{'Superconduction temperature': 'no data K', 'Molar volume': '7.09 cm<sup>3</sup>', 'Ionic radii hs': {'2': 0.92, '3': 0.785}, 'Melting point': '1811 K', 'Atomic radius': 1.4, 'Mineral hardness': '4.0', 'Electrical resistivity': '10 10<sup>-8</sup> Ω m', 'Vickers hardness': '608 MN m<sup>-2</sup>', 'Brinell hardness': '490 MN m<sup>-2</sup>', 'Youngs modulus': '211 GPa', 'Ionic radii': {'2': 0.92, '3': 0.785}, 'Atomic no': 26, 'Mendeleev no': 61, 'Thermal conductivity': '80 W m<sup>-1</sup> K<sup>-1</sup>', 'Reflectivity': '65 %', 'Liquid range': '1323 K', 'Ionic radii ls': {'2': 0.75, '6': 0.39, '3': 0.69, '4': 0.725}, 'Rigidity modulus': '82 GPa', 'X': 1.83, 'Critical temperature': 'no data K', 'Poissons ratio': '0.29', 'Oxidation states': [-2, -1, 1, 2, 3, 4, 5, 6], 'Van der waals radius': 'no data', 'Velocity of sound': '4910 m s<sup>-1</sup>', 'Coefficient of linear thermal expansion': '11.8 x10<sup>-6</sup>K<sup>-1</sup>', 'Bulk modulus': '170 GPa', 'Common oxidation states': [2, 3], 'Name': 'Iron', 'Atomic mass': 55.845, 'Electronic structure': '[Ar].3d<sup>6</sup>.4s<sup>2</sup>', 'Density of solid': '7874 kg m<sup>-3</sup>', 'Refractive index': 'no data', 'Atomic radius calculated': 1.56, 'Boiling point': '3134 K'}
This easily gives you an object holding all kinds of information about the atom, such as its ionic radii, melting point, resistivity, mass, and electronic structure. To access an individual value, read the corresponding attribute as follows. You can find the list of attributes in the source or the documentation.
ionic_radii_fe = fe.ionic_radii
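To make the "Enum plus lookup table" pattern concrete, here is a minimal sketch that does not use pymatgen at all. `MiniElement` and `_PT_DATA` are hypothetical stand-ins invented for this illustration (the real table is far larger and is loaded from periodic_table.json). The iron values are copied from the output above, and the hydrogen entry is only illustrative.

```python
from enum import Enum

# Hypothetical stand-in for the data pymatgen loads from periodic_table.json.
_PT_DATA = {
    "H": {"Atomic no": 1, "Name": "Hydrogen", "Atomic mass": 1.008},
    "Fe": {"Atomic no": 26, "Name": "Iron", "Atomic mass": 55.845},
}


class MiniElement(Enum):
    """Toy version of the pattern: each member looks up its own data."""

    H = "H"
    Fe = "Fe"

    def __init__(self, symbol):
        # Enum passes the member's value ("H", "Fe") to __init__.
        self.symbol = symbol
        data = _PT_DATA[symbol]
        self.Z = data["Atomic no"]
        self.long_name = data["Name"]
        self.atomic_mass = data["Atomic mass"]


# Calling the Enum with a value returns the existing member.
fe = MiniElement("Fe")
print(fe.symbol, fe.Z, fe.long_name, fe.atomic_mass)
```

Because members are created once when the class is defined, `MiniElement("Fe")` always returns the same object, which makes elements cheap to compare and safe to use as dictionary keys.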
Next, let's take a look at the Specie class. In the Specie class, atoms can be represented by considering the oxidation number.
class Specie(MSONable):
    # Properties that may be attached to a Specie
    supported_properties = ("spin",)

    def __init__(self, symbol, oxidation_state, properties=None):
        self._el = Element(symbol)
        self._oxi_state = oxidation_state
        self._properties = properties if properties else {}
        for k in self._properties.keys():
            if k not in Specie.supported_properties:
                raise ValueError("{} is not a supported property".format(k))
A Specie is an element that also has an oxidation number and optional properties. You can simply think of it as an extended version of Element. The recommendation is that a Specie object carry only idealized oxidation numbers and properties. The Site object described later can express the oxidation state and spin state of elements within a crystal structure, so use Site objects to store simulation results.
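The property check in the constructor is easier to see in action. The sketch below is a toy class written for this article, not pymatgen's implementation: `MiniSpecie` is a hypothetical name, it stores the symbol as a plain string instead of an Element, and the spin value is arbitrary.

```python
class MiniSpecie:
    """Toy stand-in: a symbol plus an oxidation state and optional properties."""

    supported_properties = ("spin",)

    def __init__(self, symbol, oxidation_state, properties=None):
        self.symbol = symbol
        self.oxi_state = oxidation_state
        self.properties = properties if properties else {}
        # Reject any property name that is not explicitly supported.
        for key in self.properties:
            if key not in MiniSpecie.supported_properties:
                raise ValueError("{} is not a supported property".format(key))

    def __repr__(self):
        sign = "+" if self.oxi_state >= 0 else "-"
        return "{}{}{}".format(self.symbol, abs(self.oxi_state), sign)


# A supported property is accepted (the spin value here is arbitrary).
fe2 = MiniSpecie("Fe", 2, {"spin": 4})
print(fe2, fe2.properties)

# An unsupported property raises ValueError.
try:
    MiniSpecie("Fe", 2, {"colour": "red"})
    error_message = None
except ValueError as err:
    error_message = str(err)
print(error_message)
```

Keeping the list of allowed property names in one class attribute means a typo in a property name fails loudly at construction time instead of being silently stored.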
pymatgen.core.composition
http://pymatgen.org/pymatgen.core.composition.html#module-pymatgen.core.composition
This module represents the composition of substances such as H2O and NaCl. Here we look at its most commonly used class, Composition.
class Composition(collections.Hashable, collections.Mapping, MSONable):
    def __init__(self, *args, **kwargs):
        self.allow_negative = kwargs.pop('allow_negative', False)
        # it's much faster to recognize a composition and use the elmap than
        # to pass the composition to dict()
        if len(args) == 1 and isinstance(args[0], Composition):
            elmap = args[0]
        elif len(args) == 1 and isinstance(args[0], six.string_types):
            elmap = self._parse_formula(args[0])
        else:
            elmap = dict(*args, **kwargs)
        elamt = {}
        self._natoms = 0
        for k, v in elmap.items():
            ...
The constructor is hard to follow at a glance and hard to explain, so let's just see how to use it.
>>> from pymatgen.core.composition import Composition
>>> # Easy to define from a string such as NaCl or H2O
>>> comp = Composition("LiFePO4")
>>> # Total number of atoms
>>> comp.num_atoms
7.0
>>> # Amount of each element
>>> comp.formula
'Li1 Fe1 P1 O4'
>>> # Atomic fraction (atoms of the element / total number of atoms)
>>> comp.get_atomic_fraction(Element("Li"))
0.14285714285714285
It is easy to define, and you can create a convenient composition object. There are many other features, so please refer to the Documentation.
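To show roughly what happens when a formula string is turned into a composition, here is a small standard-library sketch. It is not how pymatgen parses formulas: `parse_formula` and `atomic_fraction` are hypothetical helpers written for this article, and they only handle flat formulas without parentheses or charges.

```python
import re


def parse_formula(formula):
    """Parse a flat formula such as 'LiFePO4' into {symbol: amount}.

    Simplified on purpose: no parentheses, charges, or hydrates.
    """
    amounts = {}
    # An element symbol is a capital letter plus optional lowercase letters,
    # optionally followed by a number; a missing number means 1.
    for symbol, count in re.findall(r"([A-Z][a-z]*)(\d+(?:\.\d+)?)?", formula):
        amounts[symbol] = amounts.get(symbol, 0.0) + (float(count) if count else 1.0)
    return amounts


def atomic_fraction(amounts, symbol):
    """Amount of one element divided by the total number of atoms."""
    return amounts.get(symbol, 0.0) / sum(amounts.values())


amounts = parse_formula("LiFePO4")
num_atoms = sum(amounts.values())
li_fraction = atomic_fraction(amounts, "Li")
print(amounts, num_atoms, li_fraction)
```

For LiFePO4 this gives 7 atoms in total and a lithium fraction of 1/7, the same numbers that `comp.num_atoms` and `comp.get_atomic_fraction` report above.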
pymatgen.core.lattice
http://pymatgen.org/pymatgen.core.lattice.html#module-pymatgen.core.lattice
"Lattice" here means a crystal lattice. Many of you will remember the unit cell from high school. (Figure from Wikipedia)
The vector defined by
$$\mathbf{R} = n_1 \mathbf{a}_1 + n_2 \mathbf{a}_2 + n_3 \mathbf{a}_3$$
is a lattice vector, where $n_1, n_2, n_3$ are integers and $\mathbf{a}_1, \mathbf{a}_2, \mathbf{a}_3$ are the basis vectors of the unit cell. The three basis vectors can be written as the rows of a nested list:
R = [[10,0,0], [20,10,0], [0,0,30]]
In pymatgen, the Lattice class makes this even more convenient.
class Lattice(MSONable):
    def __init__(self, matrix):
        m = np.array(matrix, dtype=np.float64).reshape((3, 3))
        lengths = np.sqrt(np.sum(m ** 2, axis=1))
        angles = np.zeros(3)
        for i in range(3):
            j = (i + 1) % 3
            k = (i + 2) % 3
            angles[i] = abs_cap(dot(m[j], m[k]) / (lengths[j] * lengths[k]))
        self._angles = np.arccos(angles) * 180. / pi
        self._lengths = lengths
        self._matrix = m
        self._inv_matrix = None
        self._metric_tensor = None
        self._diags = None
        self._lll_matrix_mappings = {}
        self._lll_inverse = None
        self.is_orthogonal = all([abs(a - 90) < 1e-5 for a in self._angles])
        ...
The matrix argument accepts any of the following formats, as well as a numpy array.
#Multidimensional list
[[1, 0, 0], [0, 1, 0], [0, 0, 1]]
#list
[1, 0, 0 , 0, 1, 0, 0, 0, 1]
#Tuple
(1, 0, 0, 0, 1, 0, 0, 0, 1)
All of the above describe a simple cubic lattice (a cube).
>>> l = Lattice([1,0,0,0,1,0,0,0,1])
>>> l._angles
array([ 90., 90., 90.])
>>> l.is_orthogonal
True
>>> l._lengths
array([ 1., 1., 1.])
>>> l._matrix
array([[ 1., 0., 0.],
       [ 0., 1., 0.],
       [ 0., 0., 1.]])
In this way, a Lattice object gives you access to angles, lengths, and so on. It has other useful features too, so I recommend reading the documentation.
http://pymatgen.org/pymatgen.core.lattice.html
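The length and angle calculation in the constructor can be reproduced with the standard library alone, which may help if the numpy code is hard to read. This is a sketch for illustration: `lattice_parameters` is a hypothetical helper, and the clamping step is my reading of what `abs_cap` is for (keeping the cosine inside [-1, 1] despite rounding noise).

```python
import math


def lattice_parameters(matrix):
    """Return (lengths, angles in degrees) for a 3x3 matrix of row vectors."""
    lengths = [math.sqrt(sum(x * x for x in row)) for row in matrix]
    angles = []
    for i in range(3):
        # angles[i] is the angle between the two vectors other than vector i.
        j = (i + 1) % 3
        k = (i + 2) % 3
        dot = sum(p * q for p, q in zip(matrix[j], matrix[k]))
        cosine = dot / (lengths[j] * lengths[k])
        cosine = max(-1.0, min(1.0, cosine))  # clamp rounding noise
        angles.append(math.degrees(math.acos(cosine)))
    return lengths, angles


# The non-cubic example matrix from earlier in this section.
lengths, angles = lattice_parameters([[10, 0, 0], [20, 10, 0], [0, 0, 30]])
is_orthogonal = all(abs(a - 90) < 1e-5 for a in angles)
print(lengths)
print(angles)
print(is_orthogonal)
```

For this matrix the first two angles are 90 degrees, but the angle between the first two basis vectors is not, so `is_orthogonal` comes out False. For the identity matrix used above it would be True.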
pymatgen.core.structure
http://pymatgen.org/pymatgen.core.structure.html
This module provides the classes that represent crystal structures. Here we will look at the IStructure class, which provides the most basic functionality.
class IStructure(SiteCollection, MSONable):
    def __init__(self, lattice, species, coords, validate_proximity=False,
                 to_unit_cell=False, coords_are_cartesian=False,
                 site_properties=None):
        ...
It takes various arguments, so let's look at the main ones in turn.
lattice: the lattice of the crystal. You can also pass a pymatgen.core.lattice.Lattice object.
species: the kind of atom at each site. It accepts several formats, as follows.
#List of atoms
["Li", "Fe2+", "P", ...]
#Atomic number
(3, 56, ...)
#List including occupancy
[{"Fe" : 0.5, "Mn":0.5}, ...]
coords: the coordinates of each atom.
# Example for NaCl
coords = [[0, 0, 0], [0.5, 0.5, 0.5]]
With these in mind, Structure can be defined like this:
from pymatgen import Lattice, IStructure
#CsCl structure
a = 4.209 #Å
latt = Lattice.cubic(a)
structure = IStructure(latt, ["Cs", "Cl"], [[0, 0, 0], [0.5, 0.5, 0.5]])
>>> structure.density
3.7492744897576538
>>> structure.distance_matrix
array([[ 0. , 3.64510092],
[ 3.64510092, 0. ]])
>>> structure
Structure Summary
Lattice
    abc : 4.2089999999999996 4.2089999999999996 4.2089999999999996
 angles : 90.0 90.0 90.0
 volume : 74.565301328999979
      A : 4.2089999999999996 0.0 0.0
      B : 0.0 4.2089999999999996 0.0
      C : 0.0 0.0 4.2089999999999996
PeriodicSite: Cs (0.0000, 0.0000, 0.0000) [0.0000, 0.0000, 0.0000]
PeriodicSite: Cl (2.1045, 2.1045, 2.1045) [0.5000, 0.5000, 0.5000]
In this way, you can access distances, positional relationships, density, and so on within the structure. The distance between two particular sites is also available through structure.get_distance(i, j). Next, let's see what kinds of analysis functions are available.
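As a sanity check on the density and distance values, the same quantities for the CsCl cell can be computed by hand. This sketch makes several assumptions: the atomic masses and unit conversion constants are values I supplied, `periodic_distance` is a hypothetical helper, and the minimum-image shortcut it uses is only valid for a cubic cell like this one.

```python
import math

AMU_IN_GRAMS = 1.66053906660e-24  # one atomic mass unit in grams
ANGSTROM3_IN_CM3 = 1e-24          # one cubic angstrom in cubic centimetres

a = 4.209  # cubic lattice parameter in angstroms, as in the CsCl example
atomic_mass = {"Cs": 132.90545, "Cl": 35.453}  # amu, supplied for this sketch
sites = [("Cs", (0.0, 0.0, 0.0)), ("Cl", (0.5, 0.5, 0.5))]

# Density = mass of the atoms in the cell / volume of the cell.
volume = a ** 3
total_mass = sum(atomic_mass[symbol] for symbol, _ in sites)
density = total_mass * AMU_IN_GRAMS / (volume * ANGSTROM3_IN_CM3)  # g/cm^3


def periodic_distance(frac1, frac2, edge):
    """Shortest distance between two fractional points in a cubic cell."""
    squared = 0.0
    for x1, x2 in zip(frac1, frac2):
        d = x2 - x1
        d -= round(d)  # minimum image: map the difference into [-0.5, 0.5]
        squared += (d * edge) ** 2
    return math.sqrt(squared)


cs_cl = periodic_distance(sites[0][1], sites[1][1], a)
print(round(volume, 3), round(density, 3), round(cs_cl, 4))
```

The results should land close to the `structure.density` and `structure.distance_matrix` values shown above. Small differences in the last digits are expected because the atomic masses here are not taken from pymatgen's table.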
I will briefly introduce the kinds of modules that are available, then pick a few and actually run them. The right side of this image shows the analysis functions pymatgen provides.
- Phase diagram output
- Reaction calculation
- Electronic structure analysis and visualization
- Analysis of application properties such as battery characteristics
- Structure visualization
and so on.
For example, in the case of electronic structure analysis, band structure analysis can be performed.
You can also output a phase diagram.
If there is demand, and if I feel like it, I will write articles that go through these functions in detail from the source, so please leave a comment if you have any requests.
Simulations with heavy computation are basically done with other software, but using pymatgen for the analysis and visualization that combine their data makes the work more efficient. The left side of this image shows the links to commonly used data formats and tools.
- VASP input and output can be imported
- CIF files, as used in Materials Studio and elsewhere, can also be handled
- Open Babel formats are supported
- It can be linked with the Materials Project REST API
And so on. If you use the files or software above for first-principles calculations, please consider adopting pymatgen.
As for the Materials Project REST API, I described how to use it in the previous article, so please refer to that. The Materials Project publishes a large amount of data, and with this API you can collect it freely, so it is essential if you want to do machine learning.
That was a long introduction, but now let's all use pymatgen!
The basic installation flow is to install conda first and then install pymatgen with it.
For everything up to installing conda, please refer to the article "Python environment construction for those who aim to become data scientists 2016".
Once you have got that far, run:
conda install --channel matsci pymatgen
Then check that it works. If the version is printed, the installation is complete.
>>> import pymatgen
>>> pymatgen.__version__
'4.5.4'
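If you would rather check from a script than from the interactive prompt, something like the following works with the standard library alone (Python 3.8 or later). `installed_version` is a hypothetical helper name. It reads the installed package metadata without importing pymatgen itself.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of a package, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


version = installed_version("pymatgen")
if version is None:
    print("pymatgen is not installed yet")
else:
    print("pymatgen", version)
```

This reports whatever version is present in the current environment, so it is also a quick way to confirm that you are running inside the conda environment you intended.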
Now that you're ready, let's try it out!
The band structure represents the energy dispersion of electrons in the periodic structure of a crystal. In a band structure plot the vertical axis is energy, but the horizontal axis is a path through points in reciprocal lattice space, which is quite hard to grasp. If you are not interested in the details, it is enough to take it as a picture of how the electron energies are distributed.
With pymatgen, you can visualize the analyzed band structure and process it as data. First, import the required libraries.
#module for using REST API of materials project
from pymatgen.matproj.rest import MPRester
#For plotting band structures
%matplotlib inline
from pymatgen.electronic_structure.plotter import BSPlotter
Next, get an analyzed band structure. In real research, you would convert the files produced by your own analysis software (VASP etc.) into pymatgen objects, but this time we download an already analyzed object from the Materials Project database. To use the Materials Project you need to register and obtain an API key, so please get one on the official page.
#Specify your API key
a = MPRester("My API key")
#Specify the id of the desired material and fetch it over HTTP. Here: CuAlO2 (mp-3748)
bs = a.get_bandstructure_by_material_id("mp-3748")
That completes the acquisition of the band structure! We got a BandStructure object directly from the Materials Project.
http://pymatgen.org/_modules/pymatgen/electronic_structure/bandstructure.html#BandStructure
By the way, on the Materials Project website you can view material information like this.
We will process this information in Python.
>>> bs.is_metal()
False
So it is not a metal.
>>> bs.get_band_gap()
{'energy': 1.7978000000000005, 'direct': False, 'transition': '(0.591,0.409,0.000)-\\Gamma'}
The band gap is about 1.798 eV, and it is indirect ('direct' is False).
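The value returned by get_band_gap() is an ordinary dictionary, so it can be post-processed like any other Python data. The sketch below copies the dictionary from the output above rather than calling the API, so it runs without an API key. `describe_band_gap` is a hypothetical helper written for this article.

```python
# Copied from the get_band_gap() output shown above.
band_gap = {
    "energy": 1.7978000000000005,
    "direct": False,
    "transition": "(0.591,0.409,0.000)-\\Gamma",
}


def describe_band_gap(gap):
    """Format a band gap dictionary as a short human-readable summary."""
    kind = "direct" if gap["direct"] else "indirect"
    return "{:.3f} eV ({}), transition {}".format(
        gap["energy"], kind, gap["transition"]
    )


summary = describe_band_gap(band_gap)
print(summary)
```

Rounding at display time, rather than copying numbers by hand, keeps the prose and the data from drifting apart when you collect band gaps for many materials.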
>>> bs.get_direct_band_gap()
18.0201
The band gap for the direct transition is 18.020 eV. Now let's plot the band diagram.
%matplotlib inline
from pymatgen.electronic_structure.plotter import BSPlotter
plotter = BSPlotter(bs)
plotter.get_plot().show()
This is data from the Materials Project, but you can plot the simulation results for the substances you usually study in the same way.
I hope this gives you a sense of how pymatgen objects can be useful for comparison with other materials, machine learning, exploration, and so on.
There is still much more it can do, but I will stop here for now. (That was long...) I would like to hold a pymatgen study session, a Materials Project study session, a VASP study session, or a meeting of the Materials Informatics Young People's Association, so if you are interested, please come and join us.
I will write about materials informatics again next time. Thank you for reading to the end.