[Ruby] How to operate IGV using socket communication, and the story of making a Ruby gem using that method


Introduction

IGV (Integrative Genomics Viewer) is a genome browser. It lets you browse many kinds of files, such as BAM files of aligned reads, BED files that annotate positions on the genome, and GFF3 files that describe genes. If you have been involved in bioinformatics, you probably know this software.

IGV repository on GitHub

IGV.png (*Image is from the IGV homepage)

As useful as IGV is, operating it by clicking with the mouse can take some time. A typical session involves the following steps:

  • Specify the reference genome
  • Load the required files
  • Navigate to the regions you are interested in
  • Take a screenshot and save it as an image

Visually confirming the results is fun, and I think it is a very important task, but I also want to automate routine work as much as possible. So let's consider operating IGV from a programming language.

  • Recently, igv.js, which is written in JavaScript, seems to be actively developed and widely used… It is not covered in this article because I have not researched it enough. (But eventually I'd like to write a Qiita article about it.)

Operate IGV using socket communication

In fact, IGV can be controlled over a socket connection on port 60151.

In the menu bar, open View > Preferences and select the Advanced tab. If Enable port is not checked, check it.


After that, any program that can open a socket to port 60151 can send commands to IGV.
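To check that the port is really open, a minimal sketch like the following may help. It assumes IGV is running on the same machine with Enable port checked, that each command is sent as one line of text, and that IGV answers with one line. The method name `igv_command` and its keyword arguments are made up for this illustration and are not part of IGV.

```ruby
require 'socket'

# Send one command line to IGV and return the one-line reply.
# `igv_command` is an illustrative helper name, not an IGV API.
def igv_command(command, host: '127.0.0.1', port: 60151)
  TCPSocket.open(host, port) do |socket|
    socket.puts(command) # commands are newline-terminated text
    socket.gets&.chomp   # nil if the connection closed without a reply
  end
end

# With IGV running and "Enable port" checked:
#   puts igv_command('echo')
```

The `echo` command is a convenient first test because it changes nothing in the IGV session.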

List of commands that can be used to operate IGV

The list below is based on the official reference, which I originally translated into Japanese using DeepL.

Command | Description
new | Creates a new session. Removes all tracks except the default genome annotation.
load file | Loads data or session files. Specify a comma-separated list of full paths or URLs.
collapse trackName | Collapses the specified trackName. If no trackName is specified, all tracks are collapsed.
echo | Returns “echo” in the response. (For testing)
exit | Closes the IGV application.
expand trackName | Expands the specified trackName. If no trackName is specified, all tracks are expanded.
genome genomeIdOrPath | Selects a genome by id, or loads a genome (indexed fasta) from the specified path.
goto locus or listOfLoci | Scrolls to a single locus or a space-separated list of loci. If a list is provided, the loci are shown in split-screen view. Any syntax that is valid in the IGV search box can be used.
goto all | Scrolls to display the entire genome.
group option | Alignment tracks only. Groups the alignments by one of the following options: STRAND, SAMPLE, READ_GROUP, LIBRARY, FIRST_OF_PAIR_STRAND, TAG, PAIR_ORIENTATION, MATE_CHROMOSOME, SUPPLEMENTARY, MOVIE, ZMW, HAPLOTYPE, READ_ORDER, NONE, BASE_AT_POS
region chr start end | Defines a region of interest bounded by the two loci (e.g. region chr1 100 200).
maxPanelHeight height | Sets the height, in pixels, of each panel included in an image. Images created from port commands or batch scripts are not limited to the data displayed on the screen. In other words, the image can include the entire panel, not just the visible portion of the scrollable screen area. The default value is 1000; increase it to see more data, or decrease it to create smaller images.
setLogScale(true or false) |
setSleepInterval ms | Sets the delay (sleep) time in milliseconds. The sleep interval is applied between consecutive commands.
snapshotDirectory path | Sets the directory in which images are written.
snapshot filename | Saves a snapshot of the IGV window to an image file. If filename is omitted, a PNG file is written with a file name generated from the locus. If filename is specified, its extension determines the image format and must be .png, .jpg, or .svg.
sort option locus | Sorts an alignment track or a segmented copy number track. The values that can be given for option are (1) for segmented copy number tracks, AMPLIFICATION and DELETION; (2) for alignment tracks, POSITION, STRAND, BASE, QUALITY, SAMPLE, READGROUP, INSERTSIZE, FIRSTOFPAIRSTRAND, MATECHR, READORDER, and READNAME. option is not case sensitive. locus can specify a single position or a range. If locus is omitted, the sort is based on the visible region or the center of the visible region.
squish trackName | Squishes the given trackName. trackName is optional; if it is not specified, all annotation tracks are squished.
viewaspairs trackName | Sets the display mode of the alignment track to “View as pairs”.
preference key value | Temporarily sets the preference named key to the specified value. The setting lasts only until IGV is shut down.
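As a rough illustration, the commands above can be combined into the routine described in the introduction: pick a genome, load files, go to a locus, and save a snapshot. The sketch below only uses command names from the list. The helper names `snapshot_commands` and `run_igv_commands`, and every path or locus you pass in, are assumptions made for this example.

```ruby
require 'socket'

# Build the command lines for one "load, navigate, snapshot" routine.
# `snapshot_commands` is a hypothetical helper; all arguments are placeholders.
def snapshot_commands(genome:, files:, locus:, directory:, filename:)
  [
    'new',                            # start from a clean session
    "genome #{genome}",
    "load #{files.join(',')}",        # several files are comma-separated
    "goto #{locus}",
    "snapshotDirectory #{directory}",
    "snapshot #{filename}"
  ]
end

# Send each command over an already-open connection and pair it with its reply.
# `run_igv_commands` is also a hypothetical name.
def run_igv_commands(socket, commands)
  commands.map do |command|
    socket.puts(command)
    [command, socket.gets&.chomp]
  end
end

# Example, with IGV running and "Enable port" checked:
#   TCPSocket.open('127.0.0.1', 60151) do |socket|
#     commands = snapshot_commands(genome: 'hg18', files: ['na12788.bam'],
#                                  locus: 'chr1:65,827,301',
#                                  directory: '/screenshots', filename: 'image.png')
#     run_igv_commands(socket, commands).each { |cmd, reply| puts "#{cmd} -> #{reply}" }
#   end
```

Keeping the command list separate from the socket code makes it easy to print the commands first and inspect them before anything is sent to IGV.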

Control IGV from a programming language

Java

IGV is software developed in Java, so the official examples are also written in Java.

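  import java.io.BufferedReader;
  import java.io.IOException;
  import java.io.InputStreamReader;
  import java.io.PrintWriter;
  import java.net.Socket;

  // The class name is arbitrary; the statements inside follow the official example.
  public class IgvSocketExample {
    public static void main(String[] args) throws IOException {
      // IGV must be running with "Enable port" checked (port 60151).
      try (Socket socket = new Socket("127.0.0.1", 60151);
           PrintWriter out = new PrintWriter(socket.getOutputStream(), true);
           BufferedReader in = new BufferedReader(new InputStreamReader(socket.getInputStream()))) {

        // Each command is one line of text; read the one-line reply after each.
        out.println("load na12788.bam,n12788.tdf");
        String response = in.readLine();
        System.out.println(response);

        out.println("genome hg18");
        response = in.readLine();
        System.out.println(response);

        out.println("goto chr1:65,827,301");
        //out.println("goto chr1:65,839,697");
        response = in.readLine();
        System.out.println(response);

        out.println("snapshotDirectory /screenshots");
        response = in.readLine();
        System.out.println(response);

        out.println("snapshot");
        response = in.readLine();
        System.out.println(response);
      }
    }
  }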

R

What about programming languages other than Java? I am not very familiar with R, but I believe Bioconductor or similar provides a library for controlling IGV, so that is probably what you should use. Even a quick search turns up a package called igvR. It seems to use igv.js, so it may not be meant for automating the desktop IGV…

Python

For Python there is igv.py, a somewhat old script by Brent Pedersen, who has been enthusiastically developing bioinformatics tools in the Nim language. It is a small library that wraps the socket communication described above.

Ruby

I like Ruby. What's that? There are fewer tools for it? Then you can make them yourself! So, referring to Brent Pedersen's script above, I created a tool called ruby-igv that operates IGV from Ruby. Now you can easily control IGV from Ruby.

https://github.com/kojix2/ruby-igv

Usage looks like this.

require 'igv'

igv = IGV.new

igv.load 'na12788.bam'
igv.genome 'hg18'
igv.go 'chr1:65,827,301'
igv.save 'image.png'
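To give a feel for how little is needed to wrap the socket communication, here is a minimal sketch of such a wrapper. It is not the actual ruby-igv implementation: the class name `TinyIGV` and its internals are invented for this illustration, and it assumes IGV replies with one line per command.

```ruby
require 'socket'

# Hypothetical minimal wrapper around the IGV port; NOT the ruby-igv source.
class TinyIGV
  def initialize(host: '127.0.0.1', port: 60151)
    @socket = TCPSocket.new(host, port)
  end

  # Send one command line and return the one-line reply.
  def command(line)
    @socket.puts(line)
    @socket.gets&.chomp
  end

  def genome(id_or_path)
    command("genome #{id_or_path}")
  end

  # Several files can be given; they are joined with commas.
  def load(*paths)
    command("load #{paths.join(',')}")
  end

  def go(locus)
    command("goto #{locus}")
  end

  # Without a filename, IGV picks the name itself.
  def save(filename = nil)
    command(['snapshot', filename].compact.join(' '))
  end

  def close
    @socket.close
  end
end
```

Each public method only formats a command string from the list of IGV commands and passes it to `command`, so supporting another command is a matter of adding one more short method.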

It is a brand-new tool, still fresh out of the oven. If you come across rough edges or bugs, or have a feature request, please report them as an issue on GitHub. Of course, pull requests are also welcome.

In conclusion

In bioinformatics and elsewhere, attention often goes to how to combine existing tools to achieve a goal and how to master those tools. From that point of view, making a tool yourself can look like confusing the means with the end. But in a sense, I think that is a very selfish, narrow-minded idea. The more people create tools and show them to the world, the more convenient and expansive the world becomes. Let's make tools and share them more freely. (It doesn't have to be in Ruby.)