scanpy is a tool for analyzing scRNA-seq data in Python. Many people use R's Seurat, but I suspect a fair number would rather analyze scRNA-seq data in Python, and scanpy is the tool for them. Recently ~~(about half a year ago)~~ a tutorial on integrating datasets was added to scanpy. (As of 2020/04/18)
Since version 1.4.5, scanpy has provided a function called sc.tl.ingest for integrating newly acquired data with reference data, and a tutorial is already available. (Integrating data using ingest and BBKNN: https://scanpy-tutorials.readthedocs.io/en/latest/integrating-data-using-ingest.html)
Wow! I want to use it!
I imagine many people react that way. However, if you try to install scanpy with conda, you get the following:
.sh
$ conda install -c bioconda scanpy
Collecting package metadata (repodata.json): done
(output omitted)
The following NEW packages will be INSTALLED:
scanpy bioconda/noarch::scanpy-1.4.3-py_0
The output shows that conda would install scanpy 1.4.3. If you proceed as is, you get scanpy 1.4.3 and cannot use the data integration function (even though there is a tutorial for it!).
I expect the bioconda channel will eventually be updated to scanpy 1.4.5 or later, but since we are here anyway, let's get a head start and use the new version now.
[Caution!] **Some people insist that you should not mix conda and pip!** (Although others argue that mixing them is fine ...) Proceed with the steps below at your own risk.
According to the scanpy home page (https://scanpy.readthedocs.io/en/latest), the latest version appears to be 1.4.6. So let's pip install scanpy 1.4.6.
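If you do end up mixing the two, it can help to see which tool installed which package. Below is a minimal sketch that reads the `INSTALLER` file defined by PEP 376 from a package's metadata. The helper name `installer_of` is hypothetical, and it assumes Python 3.8 or later. pip records `pip` there; conda-built packages often record `conda`, but that is not guaranteed, so treat the result as a hint only.

```python
from importlib import metadata


def installer_of(package):
    """Return the tool recorded in the package's INSTALLER file, or None.

    None means the package is not installed or has no INSTALLER record.
    """
    try:
        dist = metadata.distribution(package)
    except metadata.PackageNotFoundError:
        return None
    text = dist.read_text("INSTALLER")  # None when the file is missing
    return text.strip() if text else None


# Package names here are only examples; use whatever your environment contains.
for name in ("scanpy", "anndata", "numpy"):
    print(name, installer_of(name))
```

A package that reports `pip` inside a conda environment is one of the mixed installs the caution above is about.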
.sh
$ pip install scanpy=="1.4.6"
Collecting scanpy==1.4.6
(output omitted)
Successfully installed anndata-0.7.1 h5py-2.10.0 matplotlib-3.2.1 scanpy-1.4.6
The pip install worked fine.
However, it is too early to relax. Just in case, let's check from Python that 1.4.6 is really the installed version.
import scanpy as sc
sc.logging.print_versions()
>scanpy==1.4.6 anndata==0.7.1 umap==0.3.10 numpy==1.17.4 scipy==1.4.1 pandas==1.0.3 scikit-learn==0.22 statsmodels==0.10.1 python-igraph==0.7.1 louvain==0.6.1
It seems that scanpy version 1.4.6 is installed successfully.
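The same check can be scripted. The sketch below compares a version string against 1.4.5, the release this article cites as the first one with sc.tl.ingest. The helpers `parse_version` and `has_ingest_support` are hypothetical names introduced for illustration, and the parser is deliberately simple: it only reads the leading numeric parts.

```python
def parse_version(version):
    """Turn '1.4.6' into (1, 4, 6), stopping at the first non-numeric part."""
    parts = []
    for piece in version.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def has_ingest_support(version):
    """True if the version is 1.4.5 or later (the threshold named in this article)."""
    return parse_version(version) >= (1, 4, 5)


try:
    import scanpy as sc
except ImportError:
    print("scanpy is not installed in this environment")
else:
    # hasattr is the most direct test: it asks whether the function is really there.
    print(sc.__version__, has_ingest_support(sc.__version__), hasattr(sc.tl, "ingest"))
```

Checking `hasattr(sc.tl, "ingest")` as well as the version guards against an older copy of scanpy being picked up from a different environment.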
However, you could still get an error when you actually call the integration function, so let's follow the tutorial (https://scanpy-tutorials.readthedocs.io/en/latest/integrating-data-using-ingest.html).
(steps omitted)
sc.tl.ingest(adata, adata_ref, obs='leiden')
>running ingest
finished (0:00:06)
The data integration function also worked! (Although not covered in this article, the data can also be integrated with the BBKNN method.)
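For intuition about what happens in a call like this, here is a conceptual sketch in plain NumPy. It is not scanpy's implementation. It only mimics the idea described in the scanpy documentation: fit a PCA on the reference data, project the new data into that space, and transfer labels with a k-nearest-neighbor vote. All function names and the toy data are hypothetical, and the UMAP embedding step that ingest also performs is left out.

```python
import numpy as np


def fit_pca(reference, n_components):
    """Fit PCA on the reference matrix (cells x genes) via SVD."""
    mean = reference.mean(axis=0)
    # Rows of vt are the principal axes, ordered by explained variance.
    _, _, vt = np.linalg.svd(reference - mean, full_matrices=False)
    return mean, vt[:n_components]


def project(data, mean, components):
    """Project cells into the PCA space that was fitted on the reference."""
    return (data - mean) @ components.T


def transfer_labels(ref_embedding, ref_labels, new_embedding, k=5):
    """Give each new cell the majority label of its k nearest reference cells."""
    ref_labels = np.asarray(ref_labels)
    transferred = []
    for cell in new_embedding:
        distances = np.linalg.norm(ref_embedding - cell, axis=1)
        nearest = np.argsort(distances)[:k]
        values, counts = np.unique(ref_labels[nearest], return_counts=True)
        transferred.append(values[np.argmax(counts)])
    return np.array(transferred)


# Toy data (made up): two well-separated clusters of 20 "cells" x 10 "genes".
rng = np.random.default_rng(0)
ref = np.vstack([rng.normal(0.0, 0.5, (20, 10)), rng.normal(5.0, 0.5, (20, 10))])
ref_labels = ["A"] * 20 + ["B"] * 20
new = np.vstack([rng.normal(0.0, 0.5, (3, 10)), rng.normal(5.0, 0.5, (3, 10))])

mean, components = fit_pca(ref, n_components=2)
new_labels = transfer_labels(
    project(ref, mean, components), ref_labels, project(new, mean, components)
)
print(new_labels)
```

The key design point is that the PCA is fitted on the reference only, so the new cells are placed into the reference's coordinate system rather than the other way around.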
If you visualize the data after integrating with this function, you can see how well the reference data and the new data overlap despite the batch effect. Figure source: https://scanpy-tutorials.readthedocs.io/en/latest/integrating-data-using-ingest.html
Whenever I do an analysis, I am reminded that bioinformatics is a field where new tools keep appearing, and if you run them you get plausible-looking results for the time being. This time I did not verify how reliable the integration is, but I would like to learn more about that. If you spot any mistakes or have advice, I would greatly appreciate your guidance.
1. Seurat: https://satijalab.org/seurat/
2. Scanpy: https://scanpy-tutorials.readthedocs.io/en/latest/integrating-data-using-ingest.html
3. About package management with conda and pip: https://qiita.com/ynakayama/items/29efebeb38604d10acef