Linux

awk regex pattern

In SNPTEST output, the header lines contain "#" and "alternate_ids", which I don't want to process. With awk, it is quite simple to skip these lines with a regex:

cat snptest.out | awk '!/#|alternate_ids/ {print $0}'

OFS="\t" sets tab as the output separator. NR is the line number; if you know how many lines you don't want in…
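As a minimal sketch of the filter described above, the following builds a tiny made-up file (not real SNPTEST output; the column names and values are hypothetical) and applies the same regex. It also shows one way OFS and NR might be used, assuming the file has exactly two header lines.

```shell
# Hypothetical miniature stand-in for a SNPTEST output file (made-up values).
tmpdir=$(mktemp -d)
sample="$tmpdir/snptest.out"
printf '%s\n' \
  '# Analysis: example' \
  'alternate_ids rsid chromosome position' \
  'id1 rs1 01 1000' \
  'id2 rs2 01 2000' > "$sample"

# Skip every line that contains "#" or "alternate_ids".
# Assigning $1 = $1 makes awk rebuild the record, so OFS (tab) is applied.
filtered=$(awk -v OFS='\t' '!/#|alternate_ids/ { $1 = $1; print }' "$sample")
printf '%s\n' "$filtered"

# Alternative when the number of header lines is known (assumed to be 2 here):
# keep only lines whose line number NR is greater than 2.
by_line_number=$(awk 'NR > 2' "$sample")
printf '%s\n' "$by_line_number"
```

Note that OFS only takes effect when awk rebuilds the record or prints comma-separated fields; a plain print $0 leaves the original spacing untouched.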

Linux

Download 1000 Genomes data using Aspera

1. Download the Aspera browser plugin and install it.
2. By default on Linux, it creates ~/.aspera/.
3. Run:

~/.aspera/connect/bin/ascp -i ~/.aspera/connect/etc/asperaweb_id_dsa.openssh -Tr -Q -l 100M -L- fasp-g1k@fasp.1000genomes.ebi.ac.uk:vol1/ftp/phase3/data/HG00133/sequence_read/SRR035484_1.filt.fastq.gz ./fastq/

Changing -l 100M to -l 300M raises the transfer rate cap from 100 Mbps to 300 Mbps. On average, I observed about 97 Mbps and 297 Mbps respectively.
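The rate change described above can be sketched as a small wrapper that only varies the -l value. This is a dry run under stated assumptions: the paths and remote file are copied from the command above, the variable names (ASCP, KEY, REMOTE, DEST, RATE) are hypothetical, and the script prints the command rather than running it, so it works even where Aspera is not installed.

```shell
# Paths and remote file copied from the command above; variable names are illustrative.
ASCP="$HOME/.aspera/connect/bin/ascp"
KEY="$HOME/.aspera/connect/etc/asperaweb_id_dsa.openssh"
REMOTE="fasp-g1k@fasp.1000genomes.ebi.ac.uk:vol1/ftp/phase3/data/HG00133/sequence_read/SRR035484_1.filt.fastq.gz"
DEST="./fastq/"

# Transfer rate cap passed to -l; defaults to 300M unless RATE is already set.
RATE="${RATE:-300M}"

# Assemble the same command as above with only the -l value changed.
cmd="$ASCP -i $KEY -Tr -Q -l $RATE -L- $REMOTE $DEST"

# Dry run: print the command instead of executing it.
echo "$cmd"

# To actually start the download, one could run:
#   mkdir -p "$DEST" && $cmd
```

Setting RATE in the environment (for example RATE=100M) before running the script reproduces the original command.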

GWAS

PCA with shellfish

This assumes that shellfish and all related software have been installed correctly and that shellfish.py exists. Prepare a PBS script; here I call it Shellfish.pbs, and my PLINK files are called ABC.bim, ABC.bed and ABC.fam.

cat Shellfish.pbs
#!/bin/bash
#PBS -N shellfish
#PBS -S /bin/bash
#PBS -j oe
#PBS -l walltime=24:00:00
#PBS -l ncpus=20
#PBS -l mem=100G
hostname
cd…

R

R X11 font -adobe-helvetica-%s-%s-*-*-%d-*-*-*-*-*-*-*, face 1 at size 12 could not be loaded

Recently I came across a problem in R (3.2.3 and 3.2.4). When I type plot(1:5, 1:5) in R, I get:

X11 font -adobe-helvetica-%s-%s-*-*-%d-*-*-*-*-*-*-*, face 1 at size 12 could not be loaded

and all the x-axis and y-axis labels are missing. My R sessionInfo() is:

R version 3.2.3 (2015-12-10)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Red Hat Enterprise…