Recently, I started using PLINK, a software package for genetic statistics. I am learning how to use it with reference to this book:
Practice from scratch genetic statistics seminar
The book is written for Windows, so I am keeping notes on how to do the same things on a Mac. This time, I tried eQTL analysis with PLINK.
The basic usage of PLINK and the GWAS analysis method were covered in earlier posts: "[Linux] I tried using the genetic statistics software PLINK" and "[Linux] GWAS with genetic statistics software PLINK".
The following data are used:
SNP filtered files (SNP_QC)
BLK gene expression level data (Exp_BLK.txt)
In eQTL analysis, the gene expression level is a quantitative trait, so the association is tested with linear regression.
The options used are:
--pheno: specifies the phenotype file used for the GWAS (Exp_BLK.txt this time)
--linear: performs linear regression
--ci 0.95: outputs the 95% confidence interval
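Before running PLINK, it can help to check the phenotype file. PLINK reads a `--pheno` file as whitespace-delimited text with the family ID and individual ID in the first two columns and the phenotype in the third. The following is a minimal sketch of such a check; the helper name `count_short_rows`, the file `example_pheno.txt`, and all IDs and values in it are made up for illustration.

```shell
# Hypothetical phenotype file: FID, IID, expression value (all values are made up).
# The third row is deliberately missing its phenotype column.
cat > example_pheno.txt <<'EOF'
FAM1 IND1 0.52
FAM2 IND2 -1.13
FAM3 IND3
EOF

# Count rows that have fewer than three columns.
# (count_short_rows is a helper name assumed for this sketch.)
count_short_rows() {
  awk 'NF < 3 { n++ } END { print n + 0 }' "$1"
}

count_short_rows example_pheno.txt
```

With the real data this would be `count_short_rows Exp_BLK.txt`; a result of 0 means every row has at least three columns.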
GWAS (Linear Analysis)
$ ./plink --bfile SNP_QC --out SNP_QC_Exp_BLK --pheno Exp_BLK.txt --linear --ci 0.95
Confirm that a file named **SNP_QC_Exp_BLK.assoc.linear** has been written to the working directory, and open it with a text editor. The first column is the chromosome number, the second column is the SNP ID, the third column is the chromosome position, and the twelfth column is the *p* value.
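Instead of counting columns by eye in a text editor, the header row can be listed with column numbers. This is a minimal sketch: the helper name `show_columns` is an assumption, and the sample file only imitates the header I would expect from `--linear --ci` (its data row is a made-up placeholder, not a real result).

```shell
# Hypothetical excerpt laid out like a .assoc.linear file produced with --ci.
# The data row is a made-up placeholder, not a real result.
cat > example.assoc.linear <<'EOF'
 CHR         SNP         BP   A1       TEST    NMISS       BETA       SE      L95      U95         STAT            P
   1   rsEXAMPLE       1000    A        ADD      100     0.5000   0.1000   0.3040   0.6960        5.000    2.300e-06
EOF

# Print every header name together with its column number.
# (show_columns is a helper name assumed for this sketch.)
show_columns() {
  head -n 1 "$1" | awk '{ for (i = 1; i <= NF; i++) print i, $i }'
}

show_columns example.assoc.linear
```

Running `show_columns SNP_QC_Exp_BLK.assoc.linear` on the real output should confirm which column number holds P before that number is hard-coded into an AWK command.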
From the GWAS result, use AWK to extract the "chromosome number", "SNP ID", "chromosome position", and "*p* value" columns.
The input file is **SNP_QC_Exp_BLK.assoc.linear** and the output is the text file **SNP_QC_Exp_BLK.assoc.linear.P.txt**.
In AWK, the program is enclosed in single quotes ('') and the commands to run are written inside them.
{print $2"\t"$1"\t"$3"\t"$12} prints, in this order, the 2nd column [SNP ID], the 1st column [chromosome number], the 3rd column [chromosome position], and the 12th column [*p* value].
"\t" separates the output fields with tabs.
> writes the output to the specified text file.
Extract elements from GWAS results with AWK command and output text file
$ awk '{print $2"\t"$1"\t"$3"\t"$12}' SNP_QC_Exp_BLK.assoc.linear > SNP_QC_Exp_BLK.assoc.linear.P.txt
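The command above copies every row, including the header and any rows where no *p* value could be computed. As a hedged variant, the sketch below keeps the header, skips rows whose *p* value is NA, and lets AWK's `OFS` variable insert the tabs. The helper name `extract_p` and the sample rows are assumptions made for illustration, not real results.

```shell
# Hypothetical input in the .assoc.linear column layout (placeholder rows).
cat > example2.assoc.linear <<'EOF'
 CHR SNP BP A1 TEST NMISS BETA SE L95 U95 STAT P
 1 rsEXAMPLE1 1000 A ADD 100 0.50 0.10 0.30 0.70 5.0 2.3e-06
 1 rsEXAMPLE2 2000 G ADD 100 NA NA NA NA NA NA
EOF

# Keep the header row (NR == 1), drop rows whose 12th column is NA,
# and print SNP, CHR, BP, P separated by tabs via OFS.
# (extract_p is a helper name assumed for this sketch.)
extract_p() {
  awk 'BEGIN { OFS = "\t" } NR == 1 || $12 != "NA" { print $2, $1, $3, $12 }' "$1"
}

extract_p example2.assoc.linear > example2.P.txt
cat example2.P.txt
```

Setting `OFS` once avoids repeating "\t" between every field, which makes the column list easier to change later.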
Draw a Manhattan plot using this GWAS result.
When I drew a Manhattan plot, it looked like this.
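A Manhattan plot shows chromosome position on the x-axis and -log10(*p*) on the y-axis. The post does not say which plotting tool was used, so the sketch below only prepares the y-axis value with AWK. The helper name `add_neglog10` and the sample rows are assumptions; the *p* values were chosen so the result is easy to check by hand.

```shell
# Hypothetical rows in the SNP / CHR / BP / P layout (values chosen for easy checking).
cat > example.P.txt <<'EOF'
SNP CHR BP P
rsEXAMPLE1 1 1000 1e-12
rsEXAMPLE2 1 2000 0.01
EOF

# Append -log10(P) as a fifth column.
# Rows whose 4th column does not start with a digit (the header, NA) are skipped.
# (add_neglog10 is a helper name assumed for this sketch.)
add_neglog10() {
  awk '$4 ~ /^[0-9]/ && $4 > 0 {
    printf "%s\t%s\t%s\t%s\t%.2f\n", $1, $2, $3, $4, -log($4) / log(10)
  }' "$1"
}

add_neglog10 example.P.txt
```

AWK's `log` is the natural logarithm, so dividing by `log(10)` converts it to base 10.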
Let's extract the SNPs that showed the eQTL effect.
Extract SNP with AWK
$ awk '$4<=10^-12 {print $0}' SNP_QC_Exp_BLK.assoc.linear.P.txt
output
rs13255193 8 11309192 4.539e-13
rs13257831 8 11332964 6.545e-13
rs2736345 8 11352485 1.707e-14
rs1478898 8 11395079 6.882e-16
rs2244894 8 11448659 1.497e-13
rs2244648 8 11450422 2.068e-14
rs13273172 8 11461111 1.188e-14
The SNP with the smallest *p* value was **rs1478898**.
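The smallest *p* value can also be picked out by AWK rather than by eye. This is a minimal sketch using the seven rows listed above, saved to a scratch file; the file name `example_hits.txt` and the helper name `min_p_snp` are assumptions made for illustration.

```shell
# The seven rows from the output above, saved to a scratch file.
cat > example_hits.txt <<'EOF'
rs13255193 8 11309192 4.539e-13
rs13257831 8 11332964 6.545e-13
rs2736345 8 11352485 1.707e-14
rs1478898 8 11395079 6.882e-16
rs2244894 8 11448659 1.497e-13
rs2244648 8 11450422 2.068e-14
rs13273172 8 11461111 1.188e-14
EOF

# Remember the row with the smallest numeric p value in column 4.
# "$4 + 0" forces a numeric comparison; non-numeric rows are skipped.
# (min_p_snp is a helper name assumed for this sketch.)
min_p_snp() {
  awk '$4 ~ /^[0-9]/ && (id == "" || $4 + 0 < min) {
    min = $4 + 0; id = $1; p = $4
  } END { print id, p }' "$1"
}

min_p_snp example_hits.txt   # prints: rs1478898 6.882e-16
```

The same helper should also work on SNP_QC_Exp_BLK.assoc.linear.P.txt directly, since the header row does not match the numeric test.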