A note for future reference: a script that extracts only the complete sequences (ORFs marked `type:complete`) predicted by TransDecoder from a Trinity result file.
compgene_finder.sh
#!/usr/bin/env python
# -*- coding: utf-8 -*-
import sys
from Bio import SeqIO

fasta_in = sys.argv[1]  # FASTA file given as the first argument
for record in SeqIO.parse(fasta_in, 'fasta'):  # Parse the FASTA file with SeqIO, one record at a time
    desc_part = record.description  # Full header line (ID plus description)
    seq = str(record.seq)  # Sequence as a plain string
    if 'type:complete' in desc_part:
        fasta_seq = '>' + desc_part + '\n' + seq  # Assemble the record in FASTA format
        print(fasta_seq)  # Write the record to standard output
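For machines where Biopython is not installed, the same filter can be sketched with the standard library alone. This is only a minimal sketch, not part of the original script: the helper names `iter_fasta` and `filter_complete` are made up for illustration, and it assumes, as the script above does, that complete ORFs carry the string `type:complete` in their header line.

```python
import sys


def iter_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.rstrip("\r\n")
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None and line:
            # Sequence lines are joined, so wrapped sequences become one string.
            chunks.append(line.strip())
    if header is not None:
        yield header, "".join(chunks)


def filter_complete(lines, tag="type:complete"):
    """Keep only the records whose header contains the tag (hypothetical helper)."""
    return [(h, s) for h, s in iter_fasta(lines) if tag in h]


# Run as a script only when a FASTA path is given as the first argument.
if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1]) as handle:
        for header, seq in filter_complete(handle):
            print(">" + header)
            print(seq)
```

Unlike the Biopython version, this sketch prints each sequence on a single line and does no validation of the input, so it is best treated as a fallback rather than a replacement.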