Analyzing and Interpreting Genome Sequences
Finding the mutations behind genetic disorders involves not only the study of genes and their modes of inheritance but also cooperation within the scientific community and attention to how genomic data is collected and sequenced. Communication between scientific communities is crucial to advancing the understanding of genetic disorders. Because the human genome is largely shared across individuals, gene expression and mutations observed in a few individuals can give information about genes that are found in all humans.
Source: Lander, ES. Initial sequencing and analysis of the human genome. Nature 2001.
Genomic sequencing of vertebrates such as humans and mice, as shown in Figure 1, has been paramount in learning more about genes. Since the sequencing of the human genome was completed, researchers have worked to sequence exons, which are DNA sequences that code for proteins (Lyon 2012). This is called exome sequencing, where the exome is the set of exons in a genome (Ng 2008). Sequencing genomes was very expensive initially, and work in the field was held back by costs. As technology improved, a cheaper way to sequence was found in 2007, making the field of exome sequencing easier to research (Albert 2007). In 2009, exome sequencing was used to reach a genetic diagnosis for a patient suspected of having Bartter syndrome (Choi 2009). Genome and exome sequencing became more widely available through companies founded with the goal of sequencing at low cost. Genome or exome sequencing is used to find genetic causes of disorders such as diabetes and autism, to diagnose patients, and to study pedigrees (Bonnefond 2010, Hedges 2009).
Unfortunately, exome sequencing cannot be used to learn about all genetic disorders, because some disorders are caused by mutations in the non-coding regions of DNA (Cartault 2012). Exome sequencing is estimated to find a genetic cause in only 10% to 50% of cases, but the true rate is difficult to determine because most researchers who fail to find a genetic cause through exome sequencing do not publish the negative result (Lyon 2012).
Genome sequencing has become more readily available, but it produces a tremendous amount of data that needs to be analyzed and interpreted. The field of bioinformatics deals with analyzing and interpreting genomic data. Current software tools need to be improved to aid in analyzing these data. Many tools are limited because they can analyze only one type of data from one type of sequencing experiment (Lyon 2012). However, there is not enough support to improve these tools. As a result, a genome that costs $1,000 to sequence is in reality much more expensive (Lyon 2012): analyzing the sequence costs $20,000 to $100,000 (McPherson 2009). These costs led those working in genomics to seek better technology, and newer software programs were released (Lyon 2012). The problem remains, however, that larger amounts of data need to be analyzed more quickly and accurately at a practical cost.
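The cost gap described above can be made concrete with a little arithmetic. The following is only a minimal sketch using the dollar figures quoted in the text; the function and constant names are hypothetical and chosen for illustration.

```python
# Dollar figures quoted in the text (Lyon 2012, McPherson 2009).
SEQUENCING_COST = 1_000        # cost to sequence one genome
ANALYSIS_COST_LOW = 20_000     # lower estimate for analyzing that genome
ANALYSIS_COST_HIGH = 100_000   # upper estimate for analyzing that genome


def total_cost_range(sequencing, analysis_low, analysis_high):
    """Return the (low, high) total cost of sequencing plus analysis."""
    return sequencing + analysis_low, sequencing + analysis_high


def sequencing_share(sequencing, total):
    """Return the fraction of the total cost spent on sequencing itself."""
    return sequencing / total


low, high = total_cost_range(SEQUENCING_COST, ANALYSIS_COST_LOW, ANALYSIS_COST_HIGH)
print(f"Total cost per genome: ${low:,} to ${high:,}")
print(f"Share spent on sequencing: {sequencing_share(SEQUENCING_COST, high):.1%} "
      f"to {sequencing_share(SEQUENCING_COST, low):.1%}")
```

Under these figures the total comes to $21,000 to $101,000 per genome, so sequencing itself accounts for only about 1% to 5% of the cost, and analysis for the rest.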
There is a more practical way of analyzing and interpreting genomic data. Instead of each individual or team analyzing an entire genome, several groups can analyze one genome and share data with each other. In addition, genomic data and their analyses can be made available to the entire scientific community to spread knowledge. Collaboration in the scientific community produced remarkable results when the human genome was sequenced for the first time, and it can work again for analyzing the genome. There may be privacy concerns, because there are several ways of thinking about who “owns” a genomic sequence from an individual (Lyon 2012). The companies that store, analyze and interpret the sequence may own it, or the individual the sequence came from may own it. Currently, the consensus is that the individual the genomic sequence came from owns the sequence (Lyon 2012). There may also be other concerns about data sharing among scientists. Nevertheless, collaborating to analyze and interpret genomic sequences is a sound way to find the genetic basis of genetic disorders.
Works Cited
Albert TJ, Molla MN, Muzny DM, Nazareth L, Wheeler D, Song X, Richmond TA, Middle CM, Rodesch MJ, Packard CJ, Weinstock GM, Gibbs RA: Direct selection of human genomic loci by microarray hybridization. Nat Methods. 2007, 4: 903-905. 10.1038/nmeth1111.
Bonnefond A, Durand E, Sand O, De Graeve F, Gallina S, Busiah K, Lobbens S, Simon A, Bellanné-Chantelot C, Létourneau L, Scharfmann R, Delplanque J, Sladek R, Polak M, Vaxillaire M, Froguel P: Molecular diagnosis of neonatal diabetes mellitus using next-generation sequencing of the whole exome. PLoS One. 2010, 5: e13630. 10.1371/journal.pone.0013630.
Cartault F, Munier P, Benko E, Desguerre I, Hanein S, Boddaert N, Bandiera S, Vellayoudom J, Krejbich-Trotot P, Bintner M, Hoarau JJ, Girard M, Génin E, de Lonlay P, Fourmaintraux A, Naville M, Rodriguez D, Feingold J, Renouil M, Munnich A, Westhof E, Fähling M, Lyonnet S, Henrion-Caude A: Mutation in a primate-conserved retrotransposon reveals a noncoding RNA as a mediator of infantile encephalopathy. Proc Natl Acad Sci USA. 2012, 109: 4980-4985. 10.1073/pnas.1111596109.
Choi M, Scholl UI, Ji W, Liu T, Tikhonova IR, Zumbo P, Nayir A, Bakkaloglu A, Ozen S, Sanjad S: Genetic diagnosis by whole exome capture and massively parallel DNA sequencing. Proc Natl Acad Sci USA. 2009, 106: 19096-19101. 10.1073/pnas.0910672106.
Hedges DJ, Burges D, Powell E, Almonte C, Huang J, Young S, Boese B, Schmidt M, Pericak-Vance MA, Martin E, Zhang X, Harkins TT, Züchner S: Exome sequencing of a multigenerational human pedigree. PLoS One. 2009, 4: e8232. 10.1371/journal.pone.0008232.
Lander ES, Linton LM, Birren B, Nusbaum C, Zody MC, Baldwin J, Devon K, Dewar K, Doyle M, FitzHugh W, Funke R, Gage D, Harris K, Heaford A, Howland J, Kann L, Lehoczky J, LeVine R, McEwan P, McKernan K, Meldrim J, Mesirov JP, Miranda C, Morris W, Naylor J, Raymond C, Rosetti M, Santos R, Sheridan A, Sougnez C, et al: Initial sequencing and analysis of the human genome. Nature. 2001, 409: 860-921. 10.1038/35057062.
Lyon GJ, Wang K: Identifying disease mutations in genomic medicine settings: current challenges and how to accelerate progress. Genome Medicine. 2012, 4: 58. 10.1186/gm359.
McPherson JD: Next-generation gap. Nat Methods. 2009, 6: S2-5. 10.1038/nmeth.f.268.
Ng PC, Levy S, Huang J, Stockwell TB, Walenz BP, Li K, Axelrod N, Busam DA, Strausberg RL, Venter JC: Genetic variation in an individual human exome. PLoS Genet. 2008, 4: e1000160. 10.1371/journal.pgen.1000160.