Thursday, 9 July 2015

Pulling out individual chromosomes from fasta file

This is probably only useful in certain situations, but I thought I would share.

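Using sed, you can grab a single chromosome from a fasta file, as long as you know which chromosome follows it in the file. The range is inclusive, so it also prints the >chr2 header line; the second sed drops that last line.

sed -n '/^>chr1/,/^>chr2/p' <fasta> | sed '$d'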

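Note that you can use this to grab multiple consecutive chromosomes too. For example, this prints chr1 through chr3 (the range stops at the >chr4 header line, which the second sed removes):

sed -n '/^>chr1/,/^>chr4/p' <fasta> | sed '$d'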
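
If you don't know which chromosome comes next in the file, a small awk function can pull a record out by name instead. This is only a sketch: extract_chrom is a name I made up, and it assumes the chromosome name is the first word of the header line.

```shell
# extract_chrom NAME FASTA   (hypothetical helper, not part of any tool)
# Print the record whose header's first word is exactly ">NAME".
extract_chrom() {
    awk -v name="$1" '
        /^>/ { keep = ($1 == (">" name)) }   # header line: switch printing on or off
        keep                                 # print header and sequence lines while on
    ' "$2"
}

# Example call (genome.fa is a placeholder file name):
#   extract_chrom chr1 genome.fa > chr1.fa
```

Because the header is compared as a whole word, asking for chr1 should not pick up chr10 as well.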

Get the nth base from a certain chromosome in a fasta file

I was trying to write my own tool to do this, but I doubt I could make it run as fast as an existing tool.

Turns out samtools does the trick.

http://seqanswers.com/forums/showthread.php?t=17315

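samtools faidx <fasta.fa> <seq>:<start>-<end>

Positions are 1-based and inclusive, and samtools builds the <fasta.fa>.fai index on the first run if it does not already exist.

For example, to get the 6078th base in chr3:
samtools faidx <fasta.fa> chr3:6078-6078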
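
For a quick sanity check without samtools, the same lookup can be sketched in awk. nth_base is a hypothetical helper, not part of any tool. It assumes 1-based positions (like samtools), a position of at least 1, and plain Unix line endings, and it will be much slower than an indexed lookup on a large genome.

```shell
# nth_base FASTA NAME POS   (hypothetical helper)
# Print the base at 1-based position POS of the record named NAME.
nth_base() {
    awk -v name="$2" -v pos="$3" '
        BEGIN { pos = pos + 0 }                       # force a numeric position
        /^>/  { in_rec = ($1 == (">" name)); next }   # track whether we are inside the wanted record
        in_rec {
            if (pos <= seen + length($0)) {           # the wanted base is on this line
                print substr($0, pos - seen, 1)
                exit
            }
            seen += length($0)                        # bases consumed so far in this record
        }
    ' "$1"
}

# Example call (genome.fa is a placeholder file name):
#   nth_base genome.fa chr3 6078
```

It prints nothing if the position is past the end of the chromosome.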

Wednesday, 8 July 2015

Grab random lines from a file with Unix

It is very easy to grab a random set of lines from a file on the command line.  The shuf command, which is part of GNU coreutils rather than a BASH builtin, does this for you.

For example,
To extract 1 random line from a file:
shuf -n 1 <file>

To extract 5 random lines from a file:
shuf -n 5 <file>
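
shuf also reads standard input, so it can sit at the end of a pipeline. Here is a small sketch of both uses; the function names are made up for illustration, and the second one assumes comment lines start with #.

```shell
# random_lines N FILE   (hypothetical helper)
# Print N random lines from FILE, or the whole file shuffled if it has fewer than N lines.
random_lines() {
    shuf -n "$1" "$2"
}

# random_noncomment_lines N FILE   (hypothetical helper)
# Drop lines starting with "#", then sample N lines from what is left.
random_noncomment_lines() {
    grep -v '^#' "$2" | shuf -n "$1"
}

# Example call (data.txt is a placeholder file name):
#   random_noncomment_lines 5 data.txt
```

Without the -r option, shuf samples without replacement, so the same line of the file is never picked twice.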

Tuesday, 7 July 2015

Pulling sections out of FastQC output file

This is fairly straightforward.  The output file generated by FastQC already puts dividers around each section of the data.  They look like the following:
>>Basic Statistics
>>END_MODULE

This is useful if we want to separate the data into sections.  You can do so with the following script.
https://github.com/agduncan94/BioinformaticTools/blob/master/grabFastQCOutput.pl


I'm sure there are better ways, but if you just want to do it quickly then this will work.
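
For a single section, the dividers also make a one-off awk extraction possible. This is a rough sketch and not the linked Perl script: fastqc_module is a hypothetical function name, and it assumes each section starts with a line beginning with >> plus the module name and ends at the next >>END_MODULE line.

```shell
# fastqc_module "Module name" FILE   (hypothetical helper)
# Print one section of a FastQC data file, from its ">>Module name" line
# through the matching ">>END_MODULE" line.
fastqc_module() {
    awk -v mod=">>$1" '
        index($0, mod) == 1   { on = 1 }   # line starts with ">>Module name"
        on                    { print }    # print every line while inside the section
        on && /^>>END_MODULE/ { exit }     # stop at the closing divider
    ' "$2"
}

# Example call (the file name is a placeholder):
#   fastqc_module "Basic Statistics" fastqc_data.txt
```

The module name is matched as a plain prefix rather than a regular expression, so spaces in the name are fine as long as the argument is quoted.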

Wednesday, 1 July 2015

Using tmux to split terminal windows

Before using tmux, I often had 5 or 6 terminal windows open (I don't like using tabs). This made it very hard to navigate between windows, especially since I only have two small screens.  Having other programs like Firefox open just added to the clutter.  With tmux I can reduce the number of windows to just 1 or 2 by splitting each window into panes.

I find having multiple panes in a window very useful: one dedicated to writing code, another for running code, and a third for processes.  You may find different combinations which work best for you.  Typically I have one pane on the top half, and the bottom half is split into two panes side by side.

tmux has a lot of features besides splitting windows, but this tutorial only covers splitting.

Useful commands
ctrl-b " - split horizontally (one pane above the other)
ctrl-b % - split vertically (panes side by side)
ctrl-b x - close pane (asks for confirmation)
ctrl-b <arrow key> - navigate between panes

1. Split horizontally
ctrl-b "

2. Split vertically
ctrl-b %

3. Split into three
ctrl-b "
To get a split row on the bottom, navigate to the bottom pane and do
ctrl-b %
The process is similar if you want to split the top row instead.
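
The same three-pane layout can also be set up from a script using tmux's command-line interface. This is a minimal sketch: three_pane_layout and the session name are my own placeholders, and it assumes the default tmux behaviour where a newly created pane becomes the active one.

```shell
# three_pane_layout SESSION   (hypothetical helper)
# Create a detached session with one pane on top and two side by side below.
three_pane_layout() {
    tmux new-session -d -s "$1"      # start a detached session with a single pane
    tmux split-window -v -t "$1"     # like ctrl-b ": top and bottom panes, bottom becomes active
    tmux split-window -h -t "$1"     # like ctrl-b %: split the bottom pane into left and right
    tmux select-pane -U -t "$1"      # move back up to the top pane
}

# Example call ("work" is a placeholder session name):
#   three_pane_layout work
#   tmux attach -t work
```

Note that tmux's own flags use the opposite naming from the list above: -v stacks panes top and bottom, and -h puts them side by side.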