Throughout the blog, I'll be releasing Stacy's copy of each gene under investigation. The first gene available for download is FGD4.zip. Each zip file I make available contains a .bam file and a .bai file. The .bai file holds the index data for the .bam file, and it needs to be in the same directory as the .bam file. The index is necessary to visualize the bam file in IGV. If you have a .bam file and would like to create your own index, type the following samtools command:
> samtools index FGD4.bam
You will obviously want to replace "FGD4.bam" with the name of the bam file you want indexed.
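If you end up with several bam files, indexing them one by one gets tedious. Here is a minimal sketch of a helper that indexes only the files that don't have an index yet. The function names needs_index and index_missing are my own for illustration, not part of samtools, and the sketch assumes the index sits next to the bam file as either name.bam.bai or name.bai.

```shell
# Succeed (exit 0) when a bam file has no index next to it.
# Checks both common naming styles: name.bam.bai and name.bai.
needs_index() {
    f="$1"
    [ ! -e "${f}.bai" ] && [ ! -e "${f%.bam}.bai" ]
}

# Index every .bam in a directory (default: current one) that lacks an index.
index_missing() {
    dir="${1:-.}"
    for bam in "$dir"/*.bam; do
        [ -e "$bam" ] || continue      # the glob matched nothing
        if needs_index "$bam"; then
            echo "indexing $bam"
            samtools index "$bam"
        fi
    done
}

# Usage (hypothetical directory name):
# index_missing genes
```

Nothing runs until you call index_missing yourself, so you can paste the functions into a shell session and try needs_index on a single file first.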
Known Causes of CMT4
The following genes are shown in the IPN database (linked above) to be causal for CMT4:
- FGD4 (Download Stacy's copy: FGD4.zip)
- FIG4
- GDAP1
- MTMR2
- PRX
- SBF2
- SH3TC2
How I extracted Stacy's gene from the overall sequence data
Samtools lets you work with bam files, which are the files containing the sequence reads.
The first thing to do is install samtools. On Ubuntu Linux, the command is:
> sudo apt-get install samtools
Using samtools, you can extract individual chromosomes or regions from the bam file:
> samtools view -bh exome.bam 22 > chromosomes/22.bam
This creates a new bam file containing only the reads mapped to chromosome 22.
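One thing that can trip you up: the chromosome name has to match the name used in the bam file's header, and some references call it "22" while others call it "chr22". As a rough sketch, the helper below pulls the sequence names out of the header's @SQ lines. The name list_contigs is my own for illustration. It reads header text on stdin, which you can get from samtools view -H.

```shell
# Print the sequence names (the SN: fields) from SAM header text on stdin.
list_contigs() {
    awk -F'\t' '/^@SQ/ {
        for (i = 2; i <= NF; i++)
            if ($i ~ /^SN:/) print substr($i, 4)   # drop the "SN:" prefix
    }'
}

# Usage:
# samtools view -H exome.bam | list_contigs
```

If the list shows "chr22" rather than "22", use that spelling in the extraction commands.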
If you'd like to extract a single region, you do the following:
> samtools view -bh exome.bam 22:1000-9000 > chromosomes/22_1000_9000.bam
This creates a new bam file containing the reads on chromosome 22 that overlap base pairs 1000-9000.
The -b flag indicates that the output will be another bam file as opposed to a sam file. A sam file is the human-readable (not binary) format. The -h flag indicates that the header should be included in the output.
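Putting the pieces together, here is a minimal sketch that extracts a region and indexes the result in one step, so the new file is ready for IGV. The function names region_string and extract_region are hypothetical, not samtools commands. Note that region queries like these only work when the input bam file is coordinate-sorted and already has its own index.

```shell
# Build a samtools region string such as "22:1000-9000".
region_string() {
    printf '%s:%s-%s\n' "$1" "$2" "$3"
}

# Extract one region into its own bam file, then index it.
# Arguments: input.bam chromosome start end output.bam
extract_region() {
    in_bam="$1"
    out_bam="$5"
    region=$(region_string "$2" "$3" "$4")
    samtools view -bh "$in_bam" "$region" > "$out_bam" &&
        samtools index "$out_bam"
}

# Usage, with the values from the example above:
# mkdir -p chromosomes
# extract_region exome.bam 22 1000 9000 chromosomes/22_1000_9000.bam
```

The index step only runs if the extraction succeeds, so a failed samtools view does not leave you indexing an empty or broken file.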