Suppose that the following bedgraph file is read and some processing is performed only in the specified area.
example.bedgraph
chr1 10084 10104 2
chr1 10104 10124 4
chr1 10124 10164 6
chr1 10164 10184 11
chr1 10184 10204 14
chr1 10204 10224 16
chr1 10224 10244 14
chr1 10244 10264 15
chr1 10264 10284 14
chr1 10284 10304 17
For example, to calculate the average value in the area, the following script is applied.
averageBedgraph.py (corrected version)
#!/usr/local/bin/python3
# -*- coding: utf-8 -*-
"""
Calculates the average value of the specified area from the Bedgraph file.
"""
__version__ = "1.00"
__date__ = "7 Jun 2017"
import sys
def averageBedgraph(filename, chromosome, start, end):
"""
@function averageBedgraph();
Calculates the average value of the specified area from the Bedgraph file.
@param {string} filename :Input file name
@param {string} chromosome :Chromosome number
@param {int} start :Starting position
@param {int} end :End position
"""
total = 0
with open(filename) as lines:
for line in lines:
c, s, e, v = line.split()
if c != chromosome:
continue
s, e, v = int(s), int(e), int(v)
if s <= start and end <= e:
total += (end - start) * v
elif s <= start < e:
total += (e - start) * v
elif start < s and e <= end:
total += (e - s) * v
elif s < end <= e:
total += (end - s) * v
elif end < s:
break
print('average : %s' % (total / (end - start)))
print('done')
if __name__ == '__main__':
argvs = sys.argv
argc = len(argvs)
if (argc != 5): # Checking input
print("USAGE : python3 averageBedgraph.py <INPUT_FILE> <CHROMOSOME> <START> <END>")
quit()
averageBedgraph(str(argvs[1]),str(argvs[2]),int(argvs[3]),int(argvs[4]))
quit()
Execution example
Calculate the average value of the specified area of bedfile
$ python3 averageBedgraph.py example.bedgraph chr1 10150 10250
Standard output
average : 12.74
done
It's a muddy method that doesn't need to be mentioned, but if you don't want to assemble it, please use it.
Also, if you have a tool that is good at processing such as bedgraph, I would appreciate it if you could teach me. There is no point in reinventing the wheel ... Thank you.
that's all. Thank you very much.
2017/06/09 postscript The code that was posted has been corrected. We would like to express our deep gratitude to shiracamus for making the correction.
Recommended Posts