plink chromosome code
X     X chromosome                    -> 23
Y     Y chromosome                    -> 24
XY    Pseudo-autosomal region of X    -> 25
MT    Mitochondrial                   -> 26
1-22  autosome                        -> 1-22
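The mapping above can be applied programmatically. The following is a minimal sketch, assuming the chromosome label arrives as a plain string; the function name `chr_to_plink_code` is hypothetical and not part of plink.

```shell
# Map a chromosome label (X, Y, XY, MT, or 1-22) to its plink numeric code.
# chr_to_plink_code is a hypothetical helper name used for illustration.
chr_to_plink_code() {
  awk -v chr="$1" 'BEGIN {
    map["X"] = 23; map["Y"] = 24; map["XY"] = 25; map["MT"] = 26
    if (chr in map) {
      print map[chr]
    } else if (chr ~ /^[0-9]+$/ && chr >= 1 && chr <= 22) {
      print chr + 0            # autosomes keep their own number
    } else {
      print "unknown chromosome: " chr > "/dev/stderr"
      exit 1
    }
  }'
}
```

For example, `chr_to_plink_code MT` prints the mitochondrial code and `chr_to_plink_code 7` passes the autosome number through unchanged.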
In SNPTEST output, the header lines contain "#" and "alternate_ids", which I don't want to process. With awk, it is quite simple to skip these lines with a regex: cat snptest.out | awk '!/#|alternate_ids/ {print $0}' Setting OFS="\t" uses tab as the output separator. NR is the line number; if you know how many lines you don't want in… Continue reading awk regex pattern
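The two ideas in this excerpt, skipping lines by regex and setting OFS, can be sketched as follows. The function names are hypothetical, and the sample input in the usage note is made up for illustration rather than taken from real SNPTEST output.

```shell
# Drop every line containing "#" or "alternate_ids" (the SNPTEST header lines).
# Reads the files given as arguments, or stdin when none are given.
skip_snptest_header() {
  awk '!/#|alternate_ids/' "$@"
}

# Re-emit whitespace-separated fields joined by tabs.
# Assigning $1 = $1 forces awk to rebuild the record using OFS.
to_tab_separated() {
  awk 'BEGIN { OFS = "\t" } { $1 = $1; print }' "$@"
}
```

A possible use is `skip_snptest_header snptest.out | to_tab_separated`. Note that OFS only takes effect when awk rebuilds the record or prints comma-separated fields; a bare `print $0` leaves the original separators in place.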
1. Download the Aspera browser plugin and install it.
2. By default on Linux, it creates ~/.aspera/.
3. Run: ~/.aspera/connect/bin/ascp -i ~/.aspera/connect/etc/asperaweb_id_dsa.openssh -Tr -Q -l 100M -L- fasp-g1k@fasp.1000genomes.ebi.ac.uk:vol1/ftp/phase3/data/HG00133/sequence_read/SRR035484_1.filt.fastq.gz ./fastq/
Changing 100M to 300M increases the download speed from 100M/s to 300M/s. On average, the measured speed goes from 97M/s to 297M/s.
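Since only the rate cap and the remote path change between downloads, the command can be assembled by a small helper. This is a sketch under the assumption that ascp and its key live in the default ~/.aspera/ locations mentioned above; `ascp_cmd` is a hypothetical name, and it only prints the command so it can be inspected before running.

```shell
# Print (do not run) an ascp command line for a given rate cap, remote path
# and local destination. ascp_cmd is a hypothetical helper name.
ascp_cmd() {
  rate="$1"     # value passed to -l, e.g. 100M or 300M
  remote="$2"   # user@host:path of the remote file
  dest="$3"     # local destination directory
  printf '%s\n' "$HOME/.aspera/connect/bin/ascp -i $HOME/.aspera/connect/etc/asperaweb_id_dsa.openssh -Tr -Q -l $rate -L- $remote $dest"
}
```

For example, `ascp_cmd 300M fasp-g1k@fasp.1000genomes.ebi.ac.uk:vol1/ftp/phase3/data/HG00133/sequence_read/SRR035484_1.filt.fastq.gz ./fastq/` prints the command from step 3 with the higher rate.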
I have been using stow for a couple of years (I am still a very junior user), and it is a very good software management tool. But I came across module about a year ago and found it even better (I will discuss it in another post). To use module, suppose the module was installed under /usr/share/modules 1.source… Continue reading using module in HPC environment
Recently I moved to another HPC and recompiled all my programs, and suddenly gmon.out started appearing in many directories. I searched a bit and realized that it is a profiling issue, but nobody discussed it in detail. It is quite confusing for a user like me, who doesn't use profiling that often. Because when I configured my python,… Continue reading gmon.out
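gmon.out is the profile data file written by programs built with gprof-style profiling enabled (for example, gcc's -pg flag). As a first step, it can help to see where these files have accumulated. A minimal sketch, with the hypothetical helper name `list_gmon_files`:

```shell
# List every gmon.out file below a directory (default: current directory).
# list_gmon_files is a hypothetical helper name.
list_gmon_files() {
  find "${1:-.}" -type f -name gmon.out
}
```

Running `list_gmon_files ~` shows which working directories were touched by a profiled binary, which can hint at which program was compiled with profiling turned on.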
This assumes that shellfish and all other related software have been installed correctly and that shellfish.py exists. Then prepare a PBS script, here called Shellfish.pbs; my plink files are called ABC.bim, ABC.bed, and ABC.fam.
cat Shellfish.pbs
#!/bin/bash
#PBS -N shellfish
#PBS -S /bin/bash
#PBS -j oe
#PBS -l walltime=24:00:00
#PBS -l ncpus=20
#PBS -l mem=100G
hostname
cd… Continue reading PCA with shellfish
Recently I came across a problem in R (3.2.3 and 3.2.4). When I type plot(1:5, 1:5) in R, I get: X11 font -adobe-helvetica-%s-%s-*-*-%d-*-*-*-*-*-*-*, face 1 at size 12 could not be loaded and all the x-axis and y-axis labels are missing. My R sessionInfo() is: R version 3.2.3 (2015-12-10) Platform: x86_64-pc-linux-gnu (64-bit) Running under: Red Hat Enterprise… Continue reading R X11 font -adobe-helvetica-%s-%s-*-*-%d-*-*-*-*-*-*-*, face 1 at size 12 could not be loaded
cat test.txt | awk '{if(NR > 10){XXXX}}'
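In the one-liner above, XXXX stands for whatever action should run on lines after the tenth. A minimal concrete sketch, assuming the action is simply to print the line and that the number of lines to skip is passed in as an argument (`skip_first_lines` is a hypothetical name):

```shell
# Print every line after the first n lines of stdin.
# A bare pattern with no action prints the matching line.
skip_first_lines() {
  awk -v n="$1" 'NR > n'
}
```

For example, `cat test.txt | skip_first_lines 10` prints test.txt from line 11 onward.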
It is simple. Use double quotes for the sed expression. For example:
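The excerpt does not show its example, so the following is a sketch of the usual reason for this advice, assuming the goal is to use shell variables inside the sed expression. Inside single quotes the shell leaves `$old` untouched; inside double quotes it expands the variable before sed sees it. The function and variable names are hypothetical.

```shell
# Replace every occurrence of one word with another on stdin.
# Double quotes let the shell expand ${old} and ${new} inside the expression.
# Assumes neither value contains "/" or other characters special to sed.
replace_word() {
  old="$1"
  new="$2"
  sed "s/${old}/${new}/g"
}
```

For example, `echo 'hello world' | replace_word world there` substitutes the second word. With single quotes, sed would instead search for the literal text `${old}`.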
In GWAS, a common way to check whether systematic biases are present in your association results is to calculate the genomic inflation factor, also known as lambda gc (λgc). The genomic inflation factor λgc is defined as the ratio of the median of the empirically observed distribution of the test statistic… Continue reading Genomic inflation factor calculation
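As a rough sketch of the calculation, assume the test statistics are 1-degree-of-freedom chi-square values, one per line on stdin. The observed median is then divided by the median of the chi-square distribution with 1 df, which is approximately 0.4549364. The function name `lambda_gc` and the input layout are assumptions made for illustration.

```shell
# Compute lambda gc from 1-df chi-square statistics, one value per line.
# LC_ALL=C keeps "." as the decimal separator for both sort and awk.
lambda_gc() {
  LC_ALL=C sort -g | LC_ALL=C awk '
    { x[NR] = $1 }                       # values arrive in ascending order
    END {
      if (NR == 0) exit 1                # no input, no median
      if (NR % 2) med = x[(NR + 1) / 2]
      else        med = (x[NR / 2] + x[NR / 2 + 1]) / 2
      printf "%.4f\n", med / 0.4549364   # expected median of chi-square, 1 df
    }'
}
```

If the association results hold p-values rather than chi-square statistics, they would first need converting to chi-square quantiles, which this sketch does not do.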