The StringTie package ships with prepDE.py, a script that takes the expression levels calculated by StringTie and prepares them for analysis in R. However, prepDE.py is written in Python 2, which is a problem for anyone whose environment is built on Python 3. In my case, I have to switch to Python 2, even if only temporarily. This is a bit unpleasant, and it can be a real obstacle for people who aren't willing to write their own code.
There are several ways to convert Python 2 code to Python 3, but the conversion doesn't always work. From what I found, RNASeqR, an R package, uses prepDE.py in its backend to load the results into R when it runs the StringTie pipeline. The RNASeqR [manual](https://bioconductor.org/packages/release/bioc/vignettes/RNASeqR/inst/doc/RNASeqR.html) describes how it handles the Python 3 case. Quote begins:
5.9 The Reads Count Table Creator
Whether this step is executed depends on the availability of Python on your workstation.
- Input: ‘samplelst.txt’
- Output: ‘gene_count_matrix.csv’, ‘transcript_count_matrix.csv’
- The reads count table converter Python script is downloaded as prepDE.py
- Python checking
  - When Python is not available, this step is skipped.
  - When Python2 is available, prepDE.py is executed.
  - When Python3 is available, the 2to3 command will be checked. (Usually, if Python3 is installed, the 2to3 command will be installed too.)
  - When Python3 is available but the 2to3 command is unavailable, the raw read count step will be skipped.
  - When Python3 and the 2to3 command are available, prepDE.py is converted to a file that can be executed by Python3 and is automatically executed.
Quote ends.
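The checks in the quoted list can be sketched as a small decision function. This is only an illustration of the logic described in the quote, not RNASeqR's actual implementation: the function names, the returned labels, and the executable names `python2`, `python3`, and `2to3` are assumptions, and giving Python 2 priority when both interpreters are present is also an assumption, since the quote does not cover that case.

```python
import shutil


def choose_prepde_action(has_python2, has_python3, has_2to3):
    """Return which branch of the quoted list applies (hypothetical labels)."""
    if has_python2:
        # Python 2 can execute prepDE.py as shipped.
        return "run"
    if has_python3 and has_2to3:
        # Convert prepDE.py with 2to3 first, then execute it.
        return "convert_and_run"
    # No Python at all, or Python 3 without the 2to3 command.
    return "skip"


def detect_tools():
    """Look the tools up on PATH; the executable names are assumptions."""
    return (
        shutil.which("python2") is not None,
        shutil.which("python3") is not None,
        shutil.which("2to3") is not None,
    )


if __name__ == "__main__":
    print(choose_prepde_action(*detect_tools()))
```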
In other words, it looks like prepDE.py should work once it has been converted to Python 3 code with the 2to3 command. Let's try it. The environment is macOS 10.13.4 High Sierra. First, I switch the global Python version from 2.7.6 to 3.6.0 with pyenv.
suimyenbookpuro:stringtie suimye$ pyenv versions
system
* 2.7.6 (set by /Users/suimye/.pyenv/version)
3.6.0
anaconda3-4.3.1
anaconda3-4.3.1/envs/py3.6.0
suimyenbookpuro:stringtie suimye$ pyenv global 3.6.0
suimyenbookpuro:stringtie suimye$ pyenv versions
system
2.7.6
* 3.6.0 (set by /Users/suimye/.pyenv/version)
anaconda3-4.3.1
anaconda3-4.3.1/envs/py3.6.0
The following command rewrites prepDE.py in place as Python 3 code. The original code is backed up as prepDE.py.bak, so if the converted script doesn't work, you can restore it from that file.
2to3 -w ~/tools/stringtie-2.0.6.OSX_x86_64/prepDE.py
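If the converted script misbehaves, restoring the backup is just a file copy. Here is a minimal sketch of that step; the helper name `restore_backup` is hypothetical, and it assumes the backup sits next to the script with `.bak` appended to the file name, as described above.

```python
import shutil
from pathlib import Path


def restore_backup(script_path):
    """Copy <script>.bak back over <script>; return True if a backup existed."""
    script = Path(script_path)
    # Backup name assumed to be the script name plus ".bak".
    backup = script.with_name(script.name + ".bak")
    if not backup.exists():
        return False
    shutil.copyfile(backup, script)
    return True
```

For example, `restore_backup("prepDE.py")` would put the Python 2 version back in place if `prepDE.py.bak` exists in the same directory.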
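Next, I run StringTie on two test BAM files with the reference annotation, list the resulting GTF files in list.txt, and pass that list to the converted prepDE.py.
BAMFILE1=test1.sort.bam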
BAMFILE2=test2.sort.bam
REFGTF=/Users/suimye/genome/hg19.refFlat.20130205.gtf
/Users/suimye/tools/stringtie-2.0.6.OSX_x86_64/stringtie $BAMFILE1 -e -B -G $REFGTF -o ball.test1.gtf
/Users/suimye/tools/stringtie-2.0.6.OSX_x86_64/stringtie $BAMFILE2 -e -B -G $REFGTF -o ball.test2.gtf
printf "test1\tball.test1.gtf\n" >list.txt
printf "test2\tball.test2.gtf\n" >>list.txt
python ~/tools/stringtie-2.0.6.OSX_x86_64/prepDE.py -i list.txt
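With more than a couple of samples, writing the list by hand with printf gets tedious. The sketch below builds the same tab-separated file from a list of (sample ID, GTF path) pairs. The helper name `write_sample_list` is hypothetical, and the format is simply the one produced by the printf commands above.

```python
def write_sample_list(samples, out_path):
    """Write one 'sample_id<TAB>gtf_path' line per sample."""
    with open(out_path, "w") as handle:
        for sample_id, gtf_path in samples:
            handle.write(f"{sample_id}\t{gtf_path}\n")


if __name__ == "__main__":
    # Same two entries as the printf commands above.
    samples = [("test1", "ball.test1.gtf"), ("test2", "ball.test2.gtf")]
    write_sample_list(samples, "list.txt")
```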
Both count matrices were written to the current directory:
-rw-r--r-- 1 suimye staff 318K 12 25 13:32 gene_count_matrix.csv
drwxr-xr-x 32 suimye staff 1.0K 12 25 13:32 .
-rw-r--r-- 1 suimye staff 723K 12 25 13:32 transcript_count_matrix.csv
-rw-r--r-- 1 suimye staff 65M 12 25 13:03 ball.test2.gtf
-rw-r--r-- 1 suimye staff 65M 12 25 13:03 ball.test1.gtf
suimyenbookpuro:stringtie suimye$ head gene_count_matrix.csv
gene_id,test1,test2
DDX11L1,15,2
WASH7P,0,1
MIR6859-1,0,0
MIR6859-2,0,0
MIR6859-3,0,0
MIR6859-4,0,0
(remaining lines omitted)
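As a quick sanity check before moving to R, the matrix can also be read back in Python. This is a minimal sketch using only the standard library; the function name `read_count_matrix` is hypothetical, and the layout (a `gene_id` column followed by one integer column per sample) is assumed from the head output above. The first rows of that output are inlined so the example runs without the real file.

```python
import csv
import io


def read_count_matrix(handle):
    """Return (sample_names, {gene_id: [count per sample]}) from a count CSV."""
    reader = csv.reader(handle)
    header = next(reader)
    samples = header[1:]  # first column is gene_id
    counts = {row[0]: [int(v) for v in row[1:]] for row in reader if row}
    return samples, counts


# First rows of the head output shown above, inlined for illustration.
example = io.StringIO(
    "gene_id,test1,test2\n"
    "DDX11L1,15,2\n"
    "WASH7P,0,1\n"
    "MIR6859-1,0,0\n"
)
samples, counts = read_count_matrix(example)
print(samples)            # ['test1', 'test2']
print(counts["DDX11L1"])  # [15, 2]
```

To read the real file, pass `open("gene_count_matrix.csv", newline="")` instead of the inlined example.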