Introducing an ultra-simple molecular phylogenetic tree creation method using the ETE Toolkit.
(Figure: from http://etetoolkit.org/documentation/ete-build/)
- BLAST (sequence similarity search)
- MAFFT (alignment) https://mafft.cbrc.jp/alignment/server/
- trimAl (correction of the alignment for phylogenetic tree construction; optional) http://trimal.cgenomics.org/
- RAxML (phylogenetic tree calculation) https://sco.h-its.org/exelixis/web/software/raxml/index.html
- FigTree or iTOL (phylogenetic tree drawing)
Types of phylogenetic trees: unrooted tree, rooted tree
Phylogenetic tree calculation methods: NJ method, maximum likelihood method, Bayesian method
This is how to build a phylogenetic tree in an instant using the ETE Toolkit. It worked so smoothly that it honestly scared me. I really wanted to sell this as a paid information product, and I caught myself agonizing over it quite seriously, but this time I am publishing it as a special exception.
This assumes that Python and Anaconda are already installed. If they are not, or if you are unsure, see:
macOS: Notes from installing Homebrew to building an Anaconda environment for Python with pyenv
Linux: Note on building Python's Anaconda environment with pyenv in Linux environment
$ pip install ete3  # Install the ETE Toolkit
$ conda install -c etetoolkit ete_toolchain  # Install the external tools it needs
The syntax is
$ ete3 build -w WORKFLOW_NAME -n INPUT_SEQUENCE_FILE -o OUTPUT_DIRECTORY --clearall
where INPUT_SEQUENCE_FILE is the sequence file before alignment.
Example:
$ ete3 build -w mafft_linsi-none-none-raxml_default -n input.fasta -o output_tree --clearall
That's all! This one command takes care of the multiple sequence alignment, the trimming of gap-rich regions, and the phylogenetic inference!!
The workflow name has the form aligner-trimmer-tester-builder, with the four parts separated by hyphens. For example, the workflow
mafft_linsi-none-none-raxml_default
aligns with MAFFT's L-INS-i algorithm and builds the phylogenetic tree with RAxML. If a senior in your lab insists on Clustal for the alignment, use
clustalo_default-none-none-raxml_default
and the sequences will be aligned with Clustal Omega (the successor to ClustalW). If you want to include a step that trims the gap-rich regions of the alignment with trimAl, use
mafft_linsi-trimal01-none-raxml_default
If you want to include the calculation of bootstrap values when estimating the phylogenetic tree, use
mafft_linsi-none-none-raxml_default_bootstrap
(it will take some time).
If in doubt, mafft_linsi-none-none-raxml_default_bootstrap is probably a safe choice (sorry for being vague).
To list the tools that can be used, run
$ ete3 build apps
Reference: Composing custom workflows
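The aligner-trimmer-tester-builder naming described above can be sketched in a few lines of Python. This is only an illustration under assumptions: the helper names workflow_name and build_command are hypothetical (they are not part of ete3), and the defaults simply mirror the examples in this article. The sketch only assembles the command line; it does not check that a stage name is one that ete3 actually knows.

```python
import shlex


def workflow_name(aligner, trimmer="none", tester="none", builder="raxml_default"):
    """Join the four stage names into a hyphen-separated workflow name.

    Hypothetical helper: the defaults follow the examples in the article.
    """
    parts = [aligner, trimmer, tester, builder]
    for part in parts:
        # A hyphen inside a stage name would break the four-part format.
        if not part or "-" in part:
            raise ValueError(f"invalid stage name: {part!r}")
    return "-".join(parts)


def build_command(workflow, fasta, outdir):
    """Return the argument list for the `ete3 build` call shown in the article."""
    return ["ete3", "build", "-w", workflow, "-n", fasta, "-o", outdir, "--clearall"]


if __name__ == "__main__":
    wf = workflow_name("mafft_linsi", trimmer="trimal01")
    cmd = build_command(wf, "input.fasta", "output_tree")
    print(shlex.join(cmd))
    # To actually run it (requires ete3 and ete_toolchain to be installed):
    # import subprocess; subprocess.run(cmd, check=True)
```

Building the command as a list rather than a single string avoids quoting problems if the input file or output directory name contains spaces.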
When you run the tree-building command from the example above, the result files are written to ./output_tree/mafft_linsi-none-none-raxml_default.
There are various files in it, including:
・ input.fasta.final_tree.png
→ An image that shows the phylogenetic tree together with a schematic of the aligned sequences.
・ input.fasta.final_tree.fa
→ The aligned FASTA file.
・ input.fasta.final_tree.nw
→ The estimated phylogenetic tree file. If you load this into FigTree or iTOL, you can get a beautiful figure.
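As a rough sketch, the three result files listed above can be located (and checked for existence) from Python once the run has finished. The helper names result_files and missing are hypothetical, and the file-name pattern is assumed from this article's example only; other ETE versions or workflows may name things differently.

```python
from pathlib import Path


def result_files(outdir, workflow, fasta_name):
    """Map a short label to the expected path of each main result file.

    Assumes the layout described in the article:
    OUTDIR/WORKFLOW/FASTA_NAME.final_tree.{png,fa,nw}
    """
    base = Path(outdir) / workflow
    suffixes = {
        "image": ".final_tree.png",      # tree drawn next to the alignment
        "alignment": ".final_tree.fa",   # aligned FASTA
        "newick": ".final_tree.nw",      # tree for FigTree or iTOL
    }
    return {label: base / (fasta_name + suffix) for label, suffix in suffixes.items()}


def missing(files):
    """Return the labels whose files do not exist on disk."""
    return [label for label, path in files.items() if not path.is_file()]


if __name__ == "__main__":
    files = result_files("output_tree", "mafft_linsi-none-none-raxml_default", "input.fasta")
    for label, path in files.items():
        print(label, path)
    print("missing:", missing(files))
```

If the Newick file is reported as missing, the run probably did not finish, so it is worth checking the command's output before opening anything in FigTree or iTOL.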