Difference between revisions of "LECTURES"

From Biolecture.org
imported>Byeongeun Lee

Revision as of 14:11, 17 June 2016

 

BIOINFORMATICS


Bioinformatics is an interdisciplinary field that develops methods and software tools for understanding biological data. As an interdisciplinary field of science, bioinformatics combines computer science, statistics, mathematics, and engineering to analyze and interpret biological data. Bioinformatics has been used for in silico analyses of biological queries using mathematical and statistical techniques.

## Analysis ##

1) Analysis of gene expression

The expression of many genes can be determined by measuring mRNA levels with multiple techniques including microarrays, expressed cDNA sequence tag (EST) sequencing, serial analysis of gene expression (SAGE) tag sequencing, massively parallel signature sequencing (MPSS), RNA-Seq, also known as "Whole Transcriptome Shotgun Sequencing" (WTSS), or various applications of multiplexed in-situ hybridization. All of these techniques are extremely noise-prone and/or subject to bias in the biological measurement, and a major research area in computational biology involves developing statistical tools to separate signal from noise in high-throughput gene expression studies. Such studies are often used to determine the genes implicated in a disorder: one might compare microarray data from cancerous epithelial cells to data from non-cancerous cells to determine the transcripts that are up-regulated and down-regulated in a particular population of cancer cells.
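
The cancer-versus-normal comparison above can be sketched as a log2 fold-change calculation. This is only a minimal illustration: the gene names, intensity values, pseudocount, and two-fold cutoff are all invented, and a real analysis would also need normalization and a statistical test across replicates.

```python
import math

def log2_fold_changes(cancer, normal, pseudocount=1.0):
    """Return {gene: log2 ratio of cancer to normal} for genes measured in both samples.

    The pseudocount (an assumption of this sketch) avoids division by zero.
    """
    return {
        gene: math.log2((cancer[gene] + pseudocount) / (normal[gene] + pseudocount))
        for gene in cancer
        if gene in normal
    }

def classify(fold_changes, threshold=1.0):
    """Split genes into up- and down-regulated lists using a log2 cutoff."""
    up = sorted(g for g, fc in fold_changes.items() if fc >= threshold)
    down = sorted(g for g, fc in fold_changes.items() if fc <= -threshold)
    return up, down

# Hypothetical expression intensities for three genes
cancer = {"GENE_A": 799.0, "GENE_B": 99.0, "GENE_C": 49.0}
normal = {"GENE_A": 199.0, "GENE_B": 99.0, "GENE_C": 399.0}

fc = log2_fold_changes(cancer, normal)
up, down = classify(fc)
print(up, down)
```

Working on the log scale makes a doubling and a halving symmetric around zero, which is why the same threshold can be used in both directions.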

2) Analysis of protein expression

 

Protein microarrays and high-throughput (HT) mass spectrometry (MS) can provide a snapshot of the proteins present in a biological sample. Bioinformatics is heavily involved in making sense of protein microarray and HT MS data. The former approach faces problems similar to those of microarrays targeted at mRNA. The latter involves matching large amounts of mass data against predicted masses from protein sequence databases, along with the complicated statistical analysis of samples in which multiple but incomplete peptides from each protein are detected.
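
The mass-matching problem mentioned above can be illustrated with a much simplified sketch. It assumes the proteins are cut after every K or R (a simplified trypsin rule), uses standard monoisotopic residue masses for a subset of amino acids, and matches within a fixed tolerance. The two-protein "database", the observed masses, and the tolerance are hypothetical.

```python
# Monoisotopic residue masses in daltons (subset of the 20 amino acids, for brevity)
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "L": 113.08406, "K": 128.09496,
    "E": 129.04259, "R": 156.10111,
}
WATER = 18.01056  # added once per peptide for the free N- and C-termini

def peptide_mass(peptide):
    """Predicted monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER

def tryptic_peptides(protein):
    """Cut after every K or R (simplified rule; ignores the proline exception)."""
    peptides, start = [], 0
    for i, aa in enumerate(protein):
        if aa in "KR":
            peptides.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        peptides.append(protein[start:])
    return peptides

def match_masses(observed, database, tolerance=0.01):
    """Return {observed mass: [(protein id, peptide), ...]} within the tolerance."""
    matches = {}
    for mass in observed:
        matches[mass] = [
            (pid, pep)
            for pid, seq in database.items()
            for pep in tryptic_peptides(seq)
            if abs(peptide_mass(pep) - mass) <= tolerance
        ]
    return matches

# Hypothetical sequence database and observed masses
database = {"prot1": "GASKVTLR", "prot2": "PEAKGG"}
observed = [361.196, 443.238, 500.0]
hits = match_masses(observed, database)
print(hits)
```

Only some peptides of each protein produce a hit, which is the "multiple but incomplete peptides" situation that the statistical analysis has to handle.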

3) Analysis of regulation

Regulation is the complex orchestration of events starting with an extracellular signal such as a hormone and leading to an increase or decrease in the activity of one or more proteins. Bioinformatics techniques have been applied to explore various steps in this process. For example, promoter analysis involves the identification and study of sequence motifs in the DNA surrounding the coding region of a gene. These motifs influence the extent to which that region is transcribed into mRNA. Expression data can be used to infer gene regulation: one might compare microarray data from a wide variety of states of an organism to form hypotheses about the genes involved in each state. In a single-cell organism, one might compare stages of the cell cycle, along with various stress conditions. One can then apply clustering algorithms to that expression data to determine which genes are co-expressed. For example, the upstream regions (promoters) of co-expressed genes can be searched for over-represented regulatory elements. 
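
The search for over-represented regulatory elements can be sketched as a comparison of motif frequency between the promoters of a co-expressed cluster and a background set. The sequences and the motif below are made up for illustration, and a simple ratio stands in for the statistical test a real analysis would use.

```python
def fraction_with_motif(promoters, motif):
    """Fraction of promoter sequences that contain the motif at least once."""
    if not promoters:
        return 0.0
    return sum(motif in seq for seq in promoters) / len(promoters)

def enrichment(cluster_promoters, background_promoters, motif):
    """Return (cluster fraction, background fraction, ratio of the two)."""
    fg = fraction_with_motif(cluster_promoters, motif)
    bg = fraction_with_motif(background_promoters, motif)
    ratio = fg / bg if bg > 0 else float("inf")
    return fg, bg, ratio

# Hypothetical upstream regions of co-expressed genes and of background genes
cluster = ["ACGTGACGTCA", "TTTGACGTCAA", "GGTGACGTCAT", "CCCCCCCCCCC"]
background = [
    "ACGTACGTACG", "TTTGACGTCAA", "GGGGGGGGGGG", "ATATATATATA",
    "CGCGCGCGCGC", "TTTTTTTTTTT", "ACACACACACA", "GTGTGTGTGTG",
]

fg, bg, ratio = enrichment(cluster, background, "TGACGTCA")
print(fg, bg, ratio)
```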

 

GENOMICS


 

TRANSCRIPTOMICS


Transcriptomics is the study of the transcriptome, the complete set of RNA transcripts produced under specific circumstances in one cell or a population of cells, using high-throughput methods such as microarray analysis.

 

EPIGENOMICS


Epigenomics is the study of the complete set of epigenetic modifications on the genetic material of a cell, known as the epigenome.

{ Epigenetic modifications = genomic modifications that alter gene expression, that cannot be attributed to changes in the primary DNA sequence, and that are heritable mitotically and meiotically }

## Two major types of epigenomic modifications ##

1) DNA methylation

DNA methylation is the process by which a methyl group is added to DNA by DNA methyltransferases (DNMTs), the enzymes that catalyze this reaction. In eukaryotes, methylation is most commonly found at the carbon 5 position of cytosine residues (5mC) adjacent to guanine. DNA methylation patterns vary greatly between species and even within the same organism.
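
Since methylation is most commonly found on cytosines adjacent to guanine, candidate sites can be located by scanning a sequence for CG dinucleotides. The sketch below does this and computes the fraction of those sites that are methylated. The sequence and the set of methylated positions are hypothetical.

```python
def cpg_positions(sequence):
    """Return 0-based indices of every cytosine that is followed by a guanine."""
    seq = sequence.upper()
    return [i for i in range(len(seq) - 1) if seq[i:i + 2] == "CG"]

def methylation_level(sequence, methylated):
    """Fraction of CpG cytosines whose index appears in the methylated set."""
    sites = cpg_positions(sequence)
    if not sites:
        return 0.0
    return sum(1 for i in sites if i in methylated) / len(sites)

# Hypothetical sequence with three CpG sites, two of them methylated
seq = "ATCGGCGTACGA"
sites = cpg_positions(seq)
level = methylation_level(seq, methylated={2, 9})
print(sites, level)
```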

2) Histone modification

In eukaryotes, genomic DNA is coiled into protein-DNA complexes called chromatin. Histones, which are the most prevalent type of protein found in chromatin, function to condense the DNA; the net positive charge on histones facilitates their bonding with DNA, which is negatively charged. The basic repeating units of chromatin, nucleosomes, consist of an octamer of histone proteins. Many different types of histone modification are known, including acetylation, methylation, phosphorylation, and ubiquitination. The effect of a histone modification can differ depending on the DNA region where it occurs. Histone modifications regulate gene expression by two mechanisms: by disrupting the contact between nucleosomes and by recruiting chromatin-remodeling ATPases.

## Epigenomic methods ##

1) Histone modification assay

The cellular processes of transcription, DNA replication, and DNA repair involve interactions between genomic DNA and nuclear proteins. It had long been known that certain regions within chromatin are extremely susceptible to digestion by DNase I, which cleaves DNA with low sequence specificity. Such hypersensitive sites were thought to be transcriptionally active regions, as evidenced by their association with RNA polymerase and topoisomerases I and II. It is now known that DNase I-sensitive regions correspond to regions of chromatin with loose DNA-histone association. Hypersensitive sites most often represent promoter regions, where the DNA must be accessible for the DNA-binding transcriptional machinery to function.

ChIP-Chip and ChIP-Seq

Histone modification was first detected on a genome-wide level by coupling chromatin immunoprecipitation (ChIP) technology with DNA microarrays, an approach termed ChIP-Chip. Here, however, instead of a DNA-binding transcription factor or enhancer protein being isolated through chromatin immunoprecipitation, the proteins of interest are the modified histones themselves.

① Histones are cross-linked to DNA in vivo through light chemical treatment.

② The cells are then lysed, allowing the chromatin to be extracted and fragmented, either by sonication or by treatment with a non-specific nuclease.

③ Modification-specific antibodies are then used to immunoprecipitate the DNA-histone complexes.

④ Following immunoprecipitation, the DNA is purified from the histones, amplified via PCR and labeled with a fluorescent tag.

⑤ The final step involves hybridization of the labeled DNA, both immunoprecipitated and non-immunoprecipitated, onto a microarray containing immobilized gDNA.

⑥ Analysis of the relative signal intensity allows the sites of histone modification to be determined. 
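
The relative-signal analysis in the last step can be sketched as a per-probe log2 ratio of the immunoprecipitated signal to the non-immunoprecipitated (control) signal. The probe names, intensities, and cutoff are invented for illustration, and the sketch leaves out the normalization and peak-calling statistics a real ChIP-Chip analysis would need.

```python
import math

def log2_enrichment(ip_signal, control_signal):
    """Return {probe: log2(IP / control)} for probes with positive signal in both channels."""
    return {
        probe: math.log2(ip_signal[probe] / control_signal[probe])
        for probe in ip_signal
        if probe in control_signal
        and ip_signal[probe] > 0
        and control_signal[probe] > 0
    }

def enriched_probes(ratios, cutoff=1.0):
    """Probes whose log2 ratio meets the cutoff (1.0 means at least two-fold)."""
    return sorted(p for p, r in ratios.items() if r >= cutoff)

# Hypothetical fluorescence intensities per microarray probe
ip = {"probe_1": 1600.0, "probe_2": 210.0, "probe_3": 900.0}
control = {"probe_1": 200.0, "probe_2": 200.0, "probe_3": 300.0}

ratios = log2_enrichment(ip, control)
candidates = enriched_probes(ratios)
print(candidates)
```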

2) DNA methylation arrays

Techniques for characterizing primary DNA sequences could not be directly applied to methylation assays. For example, when DNA was amplified by PCR or by bacterial cloning, the methylation pattern was not copied and the information was therefore lost. The DNA hybridization technique used in DNA assays, in which radioactive probes were used to map and identify DNA sequences, could not distinguish between methylated and non-methylated DNA.

Non genome-wide approaches

The earliest methylation detection assays used methylation-sensitive restriction endonucleases. Genomic DNA was digested with both a methylation-sensitive and a methylation-insensitive restriction enzyme recognizing the same restriction site. The idea was that whenever the site was methylated, only the methylation-insensitive enzyme could cleave at that position. By comparing the restriction fragment sizes generated by the methylation-sensitive enzyme with those generated by the methylation-insensitive enzyme, it was possible to determine the methylation pattern of the region. This analysis step was done by amplifying the restriction fragments via PCR, separating them through gel electrophoresis, and analyzing them via Southern blot with probes for the region of interest. Different regions of the gene were known to be expressed at different stages of development. Consistent with a role for DNA methylation in gene repression, regions associated with high levels of DNA methylation were not actively expressed.

This method was not suitable for studies of the global methylation pattern, or methylome. Even within specific loci it was not fully representative of the true methylation pattern, as only those restriction sites with corresponding methylation-sensitive and methylation-insensitive restriction assays could provide useful information. Further complications could arise when incomplete digestion of DNA by restriction enzymes generated false negative results.
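
The fragment-size comparison described above can be simulated in a few lines. The text does not name specific enzymes. As an assumption, this sketch uses the well-known pair HpaII (blocked by methylation of the internal cytosine) and MspI (not blocked by it), which both recognize CCGG and cut between the two cytosines. The sequence and the methylated site are hypothetical.

```python
SITE = "CCGG"    # recognition site shared by HpaII and MspI
CUT_OFFSET = 1   # both enzymes cut after the first C: C^CGG

def digest(sequence, methylated_sites=(), methylation_sensitive=False):
    """Return fragment lengths after digestion.

    methylated_sites holds the start indices of CCGG sites whose internal
    cytosine is methylated; a sensitive enzyme skips those sites.
    """
    cuts = []
    start = sequence.find(SITE)
    while start != -1:
        blocked = methylation_sensitive and start in methylated_sites
        if not blocked:
            cuts.append(start + CUT_OFFSET)
        start = sequence.find(SITE, start + 1)
    bounds = [0] + cuts + [len(sequence)]
    return [end - begin for begin, end in zip(bounds, bounds[1:])]

# Hypothetical 23 bp region with CCGG sites at indices 5 and 15; the second is methylated
seq = "AAAAACCGGTTTTTTCCGGAAAA"
methylated = {15}

insensitive = digest(seq, methylated, methylation_sensitive=False)
sensitive = digest(seq, methylated, methylation_sensitive=True)
print(insensitive, sensitive)
```

A fragment that is longer in the sensitive digest than in the insensitive one indicates a methylated site inside it.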

Genome-wide approaches

DNA methylation profiling on a large scale was first made possible by the Restriction Landmark Genome Scanning (RLGS) technique. Like the locus-specific DNA methylation assay, the technique identified methylated DNA via digestion with methylation-sensitive enzymes. It was the use of two-dimensional gel electrophoresis, however, that allowed methylation to be characterized on a broader scale. Truly high-resolution, genome-wide DNA methylation profiling became possible only with the advent of microarray and next-generation sequencing technology. As with RLGS, the endonuclease component is retained in the method, but it is coupled to the new technologies. One such approach is differential methylation hybridization (DMH), in which one set of genomic DNA is digested with methylation-sensitive restriction enzymes and a parallel set of DNA is not digested. Both sets of DNA are subsequently amplified, each labelled with a fluorescent dye, and used in two-colour array hybridization. The level of DNA methylation at a given locus is determined by the relative intensity ratio of the two dyes. Adapting next-generation sequencing to DNA methylation assays provides several advantages over array hybridization: sequence-based technology provides higher resolution for allele-specific DNA methylation, can be performed on larger genomes, and does not require the creation of DNA microarrays, which need adjustments based on CpG density to function properly.
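
The two-dye ratio used in DMH can be sketched as follows. The sketch assumes that fragments cut by the methylation-sensitive enzyme fail to amplify, so that the digested-to-undigested intensity ratio is near 1 at methylated loci and near 0 at unmethylated ones. The locus names and intensities are invented, and clipping to the range 0 to 1 is a simplification.

```python
def methylation_scores(digested, undigested):
    """Return {locus: digested / undigested intensity, clipped to [0, 1]}.

    Loci that are missing from the digested channel or have a non-positive
    undigested (reference) signal are skipped.
    """
    scores = {}
    for locus, reference in undigested.items():
        if locus in digested and reference > 0:
            ratio = digested[locus] / reference
            scores[locus] = min(max(ratio, 0.0), 1.0)
    return scores

# Hypothetical dye intensities per array locus
digested = {"locus_1": 950.0, "locus_2": 100.0, "locus_3": 500.0}
undigested = {"locus_1": 1000.0, "locus_2": 1000.0, "locus_3": 1000.0}

scores = methylation_scores(digested, undigested)
print(scores)
```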

 

PROTEOMICS