<div align="left"><span style="font-size: 13.5pt;">Personal genomics, bioinformatics, and variomics</span><span style="font-size: 9pt;"> <br />
<br />
</span><strong><span style="font-size: 9pt;">Jong Bhak<sup>1</sup>, Ho Ghang<sup>1</sup>, Rohit Reja<sup>1</sup>, and Sangsoo Kim<sup>2</sup>*</span></strong><span style="font-size: 9pt;"><br />
<br />
<strong><sup>1</sup></strong>KOBIC (Korean Bioinformation Center), KRIBB, Daejeon 305-806, Korea. <strong><sup>2</sup></strong>Dept. of Bioinformatics, Soongsil Univ., Seoul 156-743, Korea.<br />
<br />
*Correspondence to: E-mail <a href="mailto:sskimb@ssu.ac.kr"><font color="#000080">sskimb@ssu.ac.kr</font></a> Tel +82-2-820-0457 Fax +82-2-824-4383 or E-mail </span><span style="FONT-SIZE: 9pt"><a href="mailto:jongbhak@yahoo.com"><font color="#0000ff">jongbhak@yahoo.com</font></a> Tel +82-42-879-8500 Fax +82-42-879-8519</span><span style="FONT-SIZE: 9pt"><br /></span><strong><span style="FONT-SIZE: 9pt"><br />Abstract</span></strong><span style="FONT-SIZE: 9pt"><br />In 2008, at least five complete genome sequences are available. The dbSNP database is known to contain over 15,000,000 genetic variants called SNPs. The cost of sequencing a full genome in 2009 is claimed to be less than $5,000 USD. The genomics era has arrived in 2008. This review introduces technologies, bioinformatics, genomics visions, and variomics projects. Variomics is the study of the total genetic variation in individuals and populations. Research on genetic variation is the most valuable among the many branches of genomics research. Genomics and variomics projects will change biology and society so dramatically that biology will become an everyday technology, like personal computers and the internet. 'BioRevolution' is the term that can adequately describe this change.<br />
<br />
Running title: Genomics revolution achieved by cheap sequencing for common people<br />
<br />
</span><strong><span style="font-size: 9pt;">Introduction<br /></span></strong><span style="font-size: 9pt;">Since the launch of the Human Genome Project (HGP) in 1990 by the NIH of the USA, researchers have been developing faster DNA sequencers (Chan, 2005; Gupta, 2008; Mardis, 2008; Metzker, 2005; Shendure et al., 2004). The HGP is said to have been led by James Watson, who modeled DNA in Cambridge, UK, in 1953. In 2003, the International Human Genome Sequencing Consortium held a press conference to announce the completion of the human genome (IHGSC, 2004). In 2008, 55 years after the DNA model, Watson's complete genome sequence was made public; it was produced using 454 DNA sequencers developed by a company rather than a research institute (Wheeler et al., 2008). In 2007, Craig Venter, a founder of Celera, published his own personal genome in PLoS Biology (Levy et al., 2007). We are entering the personalized biology era with the advent of next generation sequencing technologies.<br />
<br />
<br />
<br />
<br />
</span><strong><span style="FONT-SIZE: 9pt">Genomes and Personalized Medicine<br /></span></strong><span style="FONT-SIZE: 9pt">The 'BioRevolution', in which scientists use genomic information to engineer all kinds of biological processes, including evolution itself, will bring us personalized medicine. The essence of personalized medicine is that enzymes in our tissues, such as cytochrome P450, differ distinctly among individuals and populations. As a result, certain drugs produce different responses in different individuals.<br /><br /><strong>Cytochrome P450 family example</strong><br />The cytochrome P450 (CYP) family of liver enzymes is responsible for breaking down more than 30 different classes of drugs during Phase I of drug metabolism. Structural and SNP variations in the genes that code for these enzymes can influence their ability to metabolize certain drugs. Based upon this, a population can be categorized into four major types of drug metabolizers: <br />• Extensive metabolizers: Individuals who can be given the normal drug dosage. <br />• Intermediate metabolizers: Individuals who metabolize drugs at a slower than normal rate. <br />• Poor metabolizers: Individuals with poor metabolizing rates. Drugs may accumulate and cause serious adverse effects.<br />• Ultra-rapid metabolizers: Individuals with metabolizing rates even faster than those of extensive metabolizers. They may experience no effect of drug activity. <br />
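The four metabolizer categories above can be sketched as a small lookup. This is only an illustration: the idea of a single "relative metabolic rate" number (1.0 meaning the normal, extensive rate) and the three cutoff values are hypothetical assumptions made for the example, not clinical thresholds and not something stated in this review.

```python
from enum import Enum


class Metabolizer(Enum):
    POOR = "poor"
    INTERMEDIATE = "intermediate"
    EXTENSIVE = "extensive"
    ULTRA = "ultra"


# Hypothetical, illustrative cutoffs on a relative metabolic rate,
# where 1.0 stands for the normal (extensive) rate. Not clinical values.
POOR_MAX = 0.25
INTERMEDIATE_MAX = 0.75
EXTENSIVE_MAX = 1.5

# Consequences as described in the text for each category.
NOTES = {
    Metabolizer.POOR: "drug may accumulate and cause serious adverse effects",
    Metabolizer.INTERMEDIATE: "drug is metabolized more slowly than normal",
    Metabolizer.EXTENSIVE: "normal drug dosage can be given",
    Metabolizer.ULTRA: "drug may be cleared so fast that it shows no effect",
}


def classify(relative_rate: float) -> Metabolizer:
    """Map a relative metabolic rate to one of the four categories."""
    if relative_rate < 0:
        raise ValueError("relative_rate must be non-negative")
    if relative_rate < POOR_MAX:
        return Metabolizer.POOR
    if relative_rate < INTERMEDIATE_MAX:
        return Metabolizer.INTERMEDIATE
    if relative_rate <= EXTENSIVE_MAX:
        return Metabolizer.EXTENSIVE
    return Metabolizer.ULTRA


if __name__ == "__main__":
    for rate in (0.1, 0.5, 1.0, 2.0):
        category = classify(rate)
        print(rate, category.value, "-", NOTES[category])
```

In practice the category would be inferred from a person's CYP genotype rather than from a single number; the numeric rate is used here only to keep the sketch short.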
<br />
<br />
<br />
<br />
<br />
<br />
</span></p><ul type="disc"> <li style="TEXT-ALIGN: left"><span style="font-size: 9pt;">Algorithms in Molecular Biology (http://www.almob.org/) </span></li> <li style="TEXT-ALIGN: left"><span style="font-size: 9pt;">Bioinformatics (http://bioinformatics.oxfordjournals.org/) </span></li> <li style="TEXT-ALIGN: left"><span style="font-size: 9pt;">BMC Bioinformatics (http://www.biomedcentral.com/bmcbioinformatics) </span></li> <li style="TEXT-ALIGN: left"><span style="font-size: 9pt;">Briefings in Bioinformatics (http://bib.oxfordjournals.org/) </span></li> <li style="TEXT-ALIGN: left"><span style="font-size: 9pt;">Genome Research (http://genome.cshlp.org/) </span></li> <li style="TEXT-ALIGN: left"><span style="font-size: 9pt;">Genomics and Informatics (<a href="http://www.genominfo.org/"><font color="#0000ff">http://www.genominfo.org</font></a>) </span></li> <li style="TEXT-ALIGN: left"><span style="font-size: 9pt;">The International Journal of Biostatistics (http://www.bepress.com/ijb/) </span></li> <li style="TEXT-ALIGN: left"><span style="font-size: 9pt;">Journal of Computational Biology (http://www.liebertpub.com/Products/Product.aspx?pid=31&AspxAutoDetectCookieSupport=1) </span></li> <li style="TEXT-ALIGN: left"><span style="font-size: 9pt;">Cancer Informatics (http://www.la-press.com/journal.php?pa=description&journal_id=10) </span></li> <li style="TEXT-ALIGN: left"><span style="font-size: 9pt;">Molecular Systems Biology (http://www.nature.com/msb/index.html) </span></li> <li style="TEXT-ALIGN: left"><span style="font-size: 9pt;">PLoS Computational Biology (<a href="http://www.ploscompbiol.org/home.action">http://www.ploscompbiol.org/home.action</a>)</span> </li> <li style="TEXT-ALIGN: left"><span style="font-size: 9pt;">International Journal of Bioinformatics Research and Applications 
(http://www.inderscience.com/browse/index.php?journalcode=ijbra)</span> </li>
</ul>
<span style="font-size: 9pt;"><br />
<strong>Sequencing DNA, Metagenomics, and Ecogenomics</strong><br />
Next generation sequencing methods will not only map genomes; they will also be used to map the environment. For humans, the environment can mean the various microbial, plant, and animal interactions around us. Microbial interaction is especially critical to our health: gut bacteria are a natural environment within us. Metagenomics is a methodology that sequences the whole set of microbes in a sample, such as those in our digestive tract. Researchers are realizing that the human genome is complemented by such environmental genomes, and a new term, 'ecogenomics', is now used to describe these concepts. Metagenomics and ecogenomics both aim to map the variation of environmental genetic factors.<br />
<br />
<strong><span style="FONT-SIZE: 9pt">Mapping Expression using DNA sequencing</span></strong><span style="FONT-SIZE: 9pt"><br />Next generation DNA sequencing technologies were at first mostly used for mapping genotypes. However, they are now also used to map RNA expression levels in cells. Cells produce various types of RNA, and mRNA is the most important for measuring gene expression. In the past, microarrays and DNA chips were used to measure expression levels. They are less accurate and require many bioinformatic adjustments before they produce reliable expression data. New sequencing technologies can measure expression levels much more accurately. By sequencing the whole set of RNAs, we can now quantify expression levels while knowing the RNA sequences precisely. Sequencing technologies will restructure expression analyses in the future.<br />
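As a rough illustration of quantifying expression from sequenced RNA, the sketch below counts reads per gene and scales the counts to reads per million. It assumes the reads have already been assigned to genes by some upstream mapping step; the function name and the gene labels are hypothetical, and reads per million is only one simple normalization among several in use.

```python
from collections import Counter
from typing import Dict, Iterable


def reads_per_million(read_assignments: Iterable[str]) -> Dict[str, float]:
    """Count reads per gene and scale to reads per million (RPM).

    `read_assignments` holds one gene label per sequenced read, as
    produced by a mapping step that is assumed to have run already.
    """
    counts = Counter(read_assignments)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {gene: n * 1_000_000 / total for gene, n in counts.items()}


if __name__ == "__main__":
    # Toy input: four reads assigned to three hypothetical genes.
    reads = ["geneA", "geneA", "geneB", "geneC"]
    for gene, rpm in sorted(reads_per_million(reads).items()):
        print(gene, rpm)
```

Scaling by the total read count makes samples sequenced to different depths comparable, which is one of the adjustments an expression analysis would need.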
<br />
<strong>Linking Genome information On-line </strong><br />
Sequencing a genome is basically the production of data, whereas analyzing the whole genome requires human minds to network their hypotheses, proofs, and discoveries; that is, genomics is a scientific endeavor beyond mechanical sequencing. Therefore, a worldwide effort is required to link all the genome information for proper management and utilization. The internet is the best infrastructure for genome information exchange. Bioinformatics resources should be available as freely as possible to all nations, including underdeveloped and developing ones. In certain instances, genome sequencing and the associated analyses should be done free of charge with the support of local governments and international organizations. For maximum efficiency, an adequate data and information license should also be required. Some researchers propose an openfree sharing of bioinformatics analysis tools, as well as of the genome sequences (under proper permission). One such movement is Free Genomics (http://freegenomics.org). <br />
<br />
<br />
</span><ul> <li><span style="font-size: 9pt;">Genomics portal: http://genomics.org</span></li> <li><span style="font-size: 9pt;">Personal Genome Project: http://personalgenomes.org</span></li> <li><span style="font-size: 9pt;">openfree Genomics Project: http://personalgenome.net</span></li> <li><span style="font-size: 9pt;">Personal Genome sequencing company: http://www.knome.com</span></li> <li><span style="font-size: 9pt;">Personal Genome SNP typing: http://decodeme.com</span></li> <li><span style="font-size: 9pt;">Google's Personal Genome Typing: http://23andme.com</span></li> <li><span style="font-size: 9pt;">The Sanger Centre: http://sanger.ac.uk</span></li> <li><span style="font-size: 9pt;">General Omics site: http://omics.org</span></li> <li><span style="font-size: 9pt;">Korean Genome Data Site: http://koreagenome.org</span></li> <li><span style="font-size: 9pt;">Korean Bioinformation Center: http://kobic.kr</span></li></ul><span style="font-size: 9pt;"><br /><strong>Conclusion</strong><br />We have examined the current trends in genomics and variomics. 
In 2009 and onwards, personal genome projects will produce an unprecedented amount of biological data. New bioinformatics technologies will be required to handle them. New sequencing technologies will drive the next decades of biology and transform medical practices. Fast sequencing brought us interesting and unexpected applications such as metagenomics and ecogenomics. <br />
<br />
<strong>Acknowledgements </strong><br />SK was supported by Soongsil University Research Fund. JB, GH, and RR were supported by KRIBB/KOBIC fund from the MEST of Korea. The authors thank Maryana Bhak for editing the manuscript.<br /><br /><strong>References</strong><br />Anderson, S., A. T. Bankier, B. G. Barrell, M. H. de Bruijn, A. R. Coulson, J. Drouin, I. C. Eperon, D. P. Nierlich, B. A. Roe, F. Sanger, P. H. Schreier, A. J. Smith, R. Staden, and I. G. Young. (1981). Sequence and organization of the human mitochondrial genome. Nature 290:457-65.<br />Chan, E. Y. (2005). Advances in sequencing technology. Mutat Res 573:13-40.<br />Church, G. M. (2005). The personal genome project. Mol Syst Biol 1:2005.0030.<br />Gupta, P. K. (2008). Single-molecule DNA sequencing technologies for future genomics research. Trends Biotechnol 26:602-11.<br />IHGSC. (2004). Finishing the euchromatic sequence of the human genome. Nature 431:931-45.<br />Kim, T.-M., S.-H. Yim, and Y. Chung. (2008). Copy Number Variations in the Human Genome: Potential Source for Individual Diversity and Disease Association Studies. Genomics & Informatics 6(1):1-7.<br />Levy, S., G. Sutton, P. C. Ng, L. Feuk, A. L. Halpern, B. P. Walenz, N. Axelrod, J. Huang, E. F. Kirkness, G. Denisov, Y. Lin, J. R. MacDonald, A. W. Pang, M. Shago, T. B. Stockwell, A. Tsiamouri, V. Bafna, V. Bansal, S. A. Kravitz, D. A. Busam, K. Y. Beeson, T. C. McIntosh, K. A. Remington, J. F. Abril, J. Gill, J. Borman, Y. H. Rogers, M. E. Frazier, S. W. Scherer, R. L. Strausberg, and J. C. Venter. (2007). The diploid genome sequence of an individual human. PLoS Biol 5:e254.<br />Mardis, E. R. (2008). The impact of next-generation sequencing technology on genetics. Trends Genet 24:133-41.<br />Metzker, M. L. (2005). Emerging technologies in DNA sequencing. Genome Res 15:1767-76.<br />Park, H., K. J-H., S.-I. Cho, J. Sung, H.-L. Kim, Y. S. Ju, G. Bayasgalan, M.-K. Lee, and J.-S. Seo. (2008). Genome-wide Linkage Study for Plasma HDL Cholesterol Level in an Isolated Population of Mongolia. Genomics & Informatics 6(1):8-13.<br />Porreca, G. J., J. Shendure, and G. M. Church. (2006). Polony DNA sequencing. Curr Protoc Mol Biol Chapter 7:Unit 7 8.<br />Ring, H. Z., P. Y. Kwok, and R. G. Cotton. (2006). Human Variome Project: an international collaboration to catalogue human genetic variation. Pharmacogenomics 7:969-72.<br />Sanger, F., G. M. Air, B. G. Barrell, N. L. Brown, A. R. Coulson, C. A. Fiddes, C. A. Hutchison, P. M. Slocombe, and M. Smith. (1977). Nucleotide sequence of bacteriophage phi X174 DNA. Nature 265:687-95.<br />Shendure, J., R. D. Mitra, C. Varma, and G. M. Church. (2004). Advanced sequencing technologies: methods and goals. Nat Rev Genet 5:335-44.<br />Sung, J., M. K. Lee, and J.-S. Seo. (2008). Inbreeding Coefficients in Two Isolated Mongolian Populations - GENDISCAN Study. Genomics & Informatics 6(1):14-17.<br />Wheeler, D. A., M. Srinivasan, M. Egholm, Y. Shen, L. Chen, A. McGuire, W. He, Y. J. Chen, V. Makhijani, G. T. Roth, X. Gomes, K. Tartaro, F. Niazi, C. L. Turcotte, G. P. Irzyk, J. R. Lupski, C. Chinault, X. Z. Song, Y. Liu, Y. Yuan, L. Nazareth, X. Qin, D. M. Muzny, M. Margulies, G. M. Weinstock, R. A. Gibbs, and J. M. Rothberg. (2008). The complete genome of an individual by massively parallel DNA sequencing. Nature 452:872-6.<br />
<br />
</div>