
Thursday, May 10, 2012

Database of CMT causing mutations

Stacy had her exome sequenced with 23andMe. I was fortunate enough to be included in the 80x exome pilot study, and they were nice enough to let me switch places with Stacy. We received her exome data this past weekend. The first order of business is finding a list of CMT4 mutations to check. Here is a database of CMT-causing mutations I found, maintained by the Molecular Genetics Department at the University of Antwerp. In addition, I'll be using the HGMD (Human Gene Mutation Database). These are great resources and will provide dozens of possible CMT4-causing mutations to analyze. I'll check all of the mutations known to cause CMT4 in these databases and proceed from there. There is always the possibility that a novel mutation is causing the CMT4.

Throughout the blog, I'll be releasing Stacy's copy of each gene as it is investigated. The first gene available for download is FGD4.zip. Each zip file I make available contains a .bam and a .bai file. The .bai file holds the index data for the .bam file and needs to be in the same directory as the .bam file. The index is necessary to visualize the bam file in IGV. If you have a .bam file and would like to create your own index, type the following samtools command:

> samtools index FGD4.bam
You will obviously want to replace "FGD4.bam" with the name of the bam file you want indexed.
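
If you end up with several bam files, a small shell loop can index only the ones that are still missing an index. This is just a sketch: needs_index and index_missing are helper names I made up, and it assumes the index sits next to the bam file as either NAME.bam.bai (what samtools index writes by default) or NAME.bai.

```shell
# Sketch only: needs_index and index_missing are hypothetical helper names,
# not part of samtools.

needs_index() {
    # Succeeds when neither NAME.bam.bai nor NAME.bai exists next to the bam.
    [ ! -e "$1.bai" ] && [ ! -e "${1%.bam}.bai" ]
}

index_missing() {
    # $1 = directory to scan (defaults to the current directory)
    dir="${1:-.}"
    for f in "$dir"/*.bam; do
        [ -e "$f" ] || continue      # the glob matched no .bam files
        if needs_index "$f"; then
            samtools index "$f"
        fi
    done
}
```

After pasting the two functions into your shell, running index_missing with no argument scans the current directory.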

Known Causes of CMT4

The following genes are shown in the IPN database (linked above) to be causal for CMT4:

  1. FGD4 (download Stacy's copy: FGD4.zip)
  2. FIG4
  3. GDAP1
  4. MTMR2
  5. PRX
  6. SBF2
  7. SH3TC2

How I extracted Stacy's gene from the overall sequence data

Samtools allows you to work with bam files, which are the files containing the sequence reads.

The first thing to do is install samtools. On Ubuntu Linux, the command is:
> sudo apt-get install samtools

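Using samtools, one can extract individual chromosomes or regions from the bam file:
> samtools view -bh exome.bam 22 > chromosomes/22.bam
This creates a new bam file containing only chromosome 22. The source bam file must be indexed (see the samtools index command above), and the chromosomes directory must already exist.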

If you'd like to extract a single region, do the following:
> samtools view -bh exome.bam 22:1000-9000 > chromosomes/22_1000_9000.bam
This creates a new bam file containing only the reads that overlap positions 1000-9000 on chromosome 22.
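
If you plan to pull out more than one region, the two steps (extract, then index) can be wrapped in a small helper. The following is a minimal sketch, not the exact script I ran: region_to_filename and extract_region are names I invented, and the output naming simply mirrors the 22_1000_9000.bam pattern used above.

```shell
# Sketch only: region_to_filename and extract_region are hypothetical helpers.

region_to_filename() {
    # Turn a region such as "22:1000-9000" into "22_1000_9000.bam".
    printf '%s.bam\n' "$(printf '%s' "$1" | sed 's/[:-]/_/g')"
}

extract_region() {
    # $1 = indexed source bam, $2 = region, $3 = output directory
    out="$3/$(region_to_filename "$2")"
    mkdir -p "$3"
    # Write the region to its own bam file, then index it for IGV.
    samtools view -bh "$1" "$2" > "$out" && samtools index "$out"
}
```

For example, extract_region exome.bam 22:1000-9000 chromosomes would write chromosomes/22_1000_9000.bam along with its index. Note that the chromosome name has to match the one used in the bam header (for instance "22" versus "chr22").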

The -b flag indicates that the output will be another bam file as opposed to a sam file. A sam file is the human-readable (not binary) format. The -h flag indicates that the header should be included in the output.
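
If you ever forget the -b flag and are not sure which format ended up on disk, a quick look at the first two bytes of the file helps. This is a rough sketch with a made-up function name, guess_format. It relies on the fact that bam files are gzip-style compressed and therefore start with the bytes 1f 8b, so it will also report "compressed" for any other gzip file.

```shell
# Sketch only: guess_format is a hypothetical helper, not a samtools command.

guess_format() {
    # Read the first two bytes of the file and print them as hex, e.g. "1f8b".
    magic=$(head -c 2 "$1" | od -An -tx1 | tr -d ' \n')
    case "$magic" in
        1f8b) echo "compressed" ;;   # gzip magic bytes: consistent with bam
        *)    echo "text" ;;         # anything else: possibly a sam file
    esac
}
```

Running guess_format chromosomes/22.bam should print "compressed" if the -b flag was used.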

In the following posts we will examine each of the CMT4 causing genes in detail...
