Alignment

From Biolecture.org
Revision as of 00:53, 11 December 2015 by imported>Eunjin RYU

Sequence Alignment

 

 

What is sequence alignment?

 

Sequence alignment is a way of arranging DNA, RNA, or protein sequences to identify regions of similarity, in order to study the functional, structural, or evolutionary relationships between the sequences.

 

How are sequences aligned?

 

Many researchers have worked on algorithms that produce high-quality sequence alignments. Computational alignment methods generally fall into two categories: global alignments and local alignments. A global alignment finds a globally optimized alignment, forcing it to span the entire length of all query sequences. Local alignments, on the other hand, find similar regions within long sequences that are divergent overall. Local alignments are often preferred, but they can be more difficult to calculate because the similar regions must also be identified.

Many computational algorithms have been applied to sequence alignment. For example, dynamic programming is slow, but it is a formal method that is guaranteed to find an optimal alignment under a given scoring scheme. There are also more efficient heuristic or probabilistic methods designed for large-scale database searching, but they do not guarantee the best alignment.

Hybrid methods, known as semi-global or "glocal" (short for global-local) methods, try to find the best alignment that includes the start and end of one or the other sequence. They are useful when the downstream part of one sequence overlaps the upstream part of the other. In this case, neither a global nor a local alignment is appropriate: a global alignment would force the alignment to extend beyond the overlap, while a local alignment might not fully cover the overlap.
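The difference between global, local, and overlap (semi-global) alignment can be illustrated with a small dynamic-programming sketch. The function below is not taken from any particular tool: the name `alignment_score`, the `mode` values, and the scoring parameters (match +1, mismatch -1, linear gap -1) are assumptions chosen for illustration. It returns only the best score, not the alignment itself, and its "overlap" mode covers only the case described above, where the end of the first sequence overlaps the start of the second.

```python
def alignment_score(a, b, mode="global", match=1, mismatch=-1, gap=-1):
    """Best alignment score of a and b under a linear gap penalty.

    mode="global":  both sequences aligned end to end (Needleman-Wunsch style).
    mode="local":   best-scoring pair of substrings (Smith-Waterman style).
    mode="overlap": a suffix of `a` aligned to a prefix of `b`
                    (one semi-global variant; illustrative assumption).
    """
    if mode not in ("global", "local", "overlap"):
        raise ValueError("unknown mode: " + mode)
    n, m = len(a), len(b)
    # score[i][j] holds the best score for a[:i] against b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        # Only global alignment pays for skipping a prefix of `a`.
        score[i][0] = i * gap if mode == "global" else 0
    for j in range(1, m + 1):
        # Local alignment may also skip a prefix of `b` for free.
        score[0][j] = 0 if mode == "local" else j * gap

    best_local = 0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            cell = max(diag, score[i - 1][j] + gap, score[i][j - 1] + gap)
            if mode == "local":
                cell = max(cell, 0)  # a local alignment may restart anywhere
                best_local = max(best_local, cell)
            score[i][j] = cell

    if mode == "global":
        return score[n][m]
    if mode == "local":
        return best_local
    # overlap: all of `a` is consumed, any unaligned suffix of `b` is free
    return max(score[n])


# Toy sequences (made up): the end of `left` overlaps the start of `right`.
left, right = "CCCCACGT", "ACGTGGGG"
for mode in ("global", "overlap", "local"):
    print(mode, alignment_score(left, right, mode))
```

With these toy inputs the overlap mode scores the shared `ACGT` region without penalizing the unaligned ends, whereas the global mode has to account for every residue in both sequences. The table uses time and memory proportional to the product of the sequence lengths, which is why dynamic programming is considered slow for database-scale searches.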

 

How is sequence alignment used?

 

Phylogenetics and sequence alignment are closely related, since both fields need to evaluate sequence relatedness. Sequence alignment is useful for constructing and interpreting phylogenetic trees, which are used to classify the evolutionary relationships between homologs in the genomes of diverged species. Roughly, high sequence identity suggests that the sequences share a comparatively recent common ancestor, while low identity suggests that the divergence is more ancient.
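As a rough illustration of how identity is read off an alignment, the sketch below computes percent identity between two already-aligned sequences. The function name `percent_identity` and the toy sequences are hypothetical. Definitions of identity differ in the denominator they use; this version assumes that columns containing a gap are ignored.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of identical residues over columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    # Keep only columns where both sequences have a residue.
    pairs = [(x, y) for x, y in zip(aligned_a, aligned_b) if x != gap and y != gap]
    if not pairs:
        return 0.0
    matches = sum(1 for x, y in pairs if x == y)
    return 100.0 * matches / len(pairs)


# Made-up, pre-aligned toy sequences (not real data).
reference = "ACGTACGT"
close_relative = "ACGTACGA"
distant_relative = "ACCTTCGA"

print(percent_identity(reference, close_relative))
print(percent_identity(reference, distant_relative))
```

Under the rough rule described above, the pair with the higher identity would be read as sharing a more recent common ancestor. Actual phylogenetic inference uses more careful distance or likelihood models than raw identity.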

 
