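This is probably only useful in certain cases, but I thought I would share.
Using sed, you can grab a single chromosome from a FASTA file, assuming chr2 directly follows chr1 in the file. Note that sed ranges are inclusive, so the output ends with the >chr2 header line, which you may want to trim.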
sed -n '/>chr1/,/>chr2/p' <fasta>
You can use the same approach to grab multiple consecutive chromosomes too, here everything from the chr1 header up to the chr4 header.
sed -n '/>chr1/,/>chr4/p' <fasta>
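If the trailing header or the file order is a problem, the same idea can be sketched with awk. This is only a rough alternative, not a tested tool: the function name extract_chrom is made up for illustration, and it assumes the sequence ID is the first word after ">" on the header line. It matches the ID exactly, so asking for chr1 does not also pull in chr10.

```shell
# extract_chrom NAME FASTA   (hypothetical helper, not part of any tool)
# Print the record whose header ID (first word after ">") is exactly NAME.
extract_chrom() {
    awk -v name="$1" '
        /^>/ {
            id = substr($1, 2)      # header ID without the leading ">"
            keep = (id == name)     # exact match, so chr1 does not match chr10
        }
        keep                        # print header and sequence lines of the kept record
    ' "$2"
}

# Usage: extract_chrom chr1 genome.fa > chr1.fa
```

Unlike the sed range, this does not depend on which chromosome comes next in the file, and it does not print the following header.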
Showing posts with label Bioinformatics. Show all posts
Thursday, 9 July 2015
Get the nth base from a certain chromosome in a fasta file
I was trying to write my own tool to do this, but I doubt I could make it run as fast as, let alone faster than, an existing tool.
Turns out samtools does the trick.
http://seqanswers.com/forums/showthread.php?t=17315
samtools faidx <fasta.fa> <seq>:<pos>-<pos>
For example, to get the 6078th base in chr3 (positions are 1-based):
samtools faidx <fasta.fa> chr3:6078-6078
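For a quick sanity check of the 1-based coordinates on a small file, the lookup can be sketched in plain awk. This is not a replacement for samtools (it reads the file line by line instead of using an index), and the function name nth_base is invented here for illustration. It assumes the sequence ID is the first word after ">" on the header line.

```shell
# nth_base FASTA SEQ POS   (hypothetical helper for small files)
# Print the base at 1-based position POS of sequence SEQ.
nth_base() {
    awk -v name="$2" -v pos="$3" '
        /^>/ {
            if (found) exit               # past the target record, stop reading
            found = (substr($1, 2) == name)
            seen = 0
            next
        }
        found {
            len = length($0)
            if (seen + len >= pos) {      # target base lies on this line
                print substr($0, pos - seen, 1)
                exit
            }
            seen += len                   # bases consumed so far in this record
        }
    ' "$1"
}

# Usage: nth_base genome.fa chr3 6078
```

It prints nothing if the sequence is not found or the position is past its end.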
Tuesday, 7 July 2015
Pulling sections out of FastQC output file
This is fairly straightforward. The output file generated by FastQC puts dividers in the data already. They look like the following:
>>Basic Statistics
>>END_MODULE
This is useful if we want to split the data into its separate modules. You can do so with the following script:
https://github.com/agduncan94/BioinformaticTools/blob/master/grabFastQCOutput.pl
I'm sure there are better ways, but if you just want to do it quickly then this will work.
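If you only need one module and do not want to run the script, the dividers can also be used directly from the shell. The following is a minimal sketch rather than a port of the linked script: the function name fastqc_module is made up for illustration, and it matches the module name as a prefix of the divider line so that anything after the name on that line (such as a pass/fail status) is tolerated.

```shell
# fastqc_module NAME FILE   (hypothetical helper)
# Print one module, from its ">>NAME" divider through ">>END_MODULE".
fastqc_module() {
    awk -v name="$1" '
        index($0, ">>" name) == 1 { inside = 1 }   # module header starts with ">>NAME"
        inside                                      # print lines while inside the module
        inside && /^>>END_MODULE/ { inside = 0 }    # stop after the closing divider
    ' "$2"
}

# Usage: fastqc_module "Basic Statistics" fastqc_data.txt > basic_statistics.txt
```

Give the full module name, since a shorter prefix could match more than one module.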