Sequence alignment<br/> In [https://en.wikipedia.org/wiki/Bioinformatics bioinformatics], a '''sequence alignment''' is a way of arranging the sequences of [https://en.wikipedia.org/wiki/DNA DNA], [https://en.wikipedia.org/wiki/RNA RNA], or protein to identify regions of similarity that may be a consequence of functional, [https://en.wikipedia.org/wiki/Structural_biology structural], or [https://en.wikipedia.org/wiki/Evolution evolutionary] relationships between the sequences.<sup id="cite_ref-mount_1-0">[https://en.wikipedia.org/wiki/Sequence_alignment#cite_note-mount-1 [1]]</sup><sup id="cite_ref-2">[https://en.wikipedia.org/wiki/Sequence_alignment#cite_note-2 [2]]</sup> Aligned sequences of [https://en.wikipedia.org/wiki/Nucleotide nucleotide] or [https://en.wikipedia.org/wiki/Amino_acid amino acid] residues are typically represented as rows within a [https://en.wikipedia.org/wiki/Matrix_(mathematics) matrix]. Gaps are inserted between the [https://en.wikipedia.org/wiki/Residue_(chemistry) residues] so that identical or similar characters are aligned in successive columns.<br/> <br/> Interpretation<br/> If two sequences in an alignment share a common ancestor, mismatches can be interpreted as [https://en.wikipedia.org/wiki/Point_mutation point mutations] and gaps as [https://en.wikipedia.org/wiki/Indel indels] (that is, insertion or deletion mutations) introduced in one or both lineages in the time since they diverged from one another.
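To make the row-and-gap representation concrete, the following is a minimal sketch that classifies each column of a pairwise alignment as a match, a mismatch (a candidate point mutation), or a gap (a candidate indel). The function name <code>column_summary</code>, the use of <code>-</code> as the gap character, and the toy sequences are assumptions made for illustration and do not come from any particular tool.

```python
def column_summary(row_a, row_b, gap="-"):
    """Count match, mismatch, and gap columns in a pairwise alignment.

    Both rows must already be aligned, i.e. padded with the gap
    character so that they have the same length.
    """
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have equal length")
    counts = {"match": 0, "mismatch": 0, "gap": 0}
    for a, b in zip(row_a, row_b):
        if a == gap or b == gap:
            counts["gap"] += 1        # candidate insertion or deletion
        elif a == b:
            counts["match"] += 1      # identical residues in this column
        else:
            counts["mismatch"] += 1   # candidate point mutation
    return counts


# Hypothetical toy alignment, one sequence per row.
print(column_summary("GATTACA-", "GACT-CAT"))
```

Whether a mismatch or gap actually reflects a mutation depends on the sequences sharing a common ancestor, which the counts alone cannot establish.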
In sequence alignments of proteins, the degree of similarity between [https://en.wikipedia.org/wiki/Amino_acid amino acids] occupying a particular position in the sequence can be interpreted as a rough measure of how [https://en.wikipedia.org/wiki/Conservation_(genetics) conserved] a particular region or [https://en.wikipedia.org/wiki/Sequence_motif sequence motif] is among lineages.<br/> <br/> Alignment methods<br/> Human knowledge is applied in constructing algorithms to produce high-quality sequence alignments, and occasionally in adjusting the final results to reflect patterns that are difficult to represent algorithmically (especially in the case of nucleotide sequences). Computational approaches to sequence alignment generally fall into two categories: ''global alignments'' and ''local alignments''. Calculating a global alignment is a form of [https://en.wikipedia.org/wiki/Global_optimization global optimization] that "forces" the alignment to span the entire length of all query sequences. By contrast, local alignments identify regions of similarity within long sequences that are often widely divergent overall. Local alignments are often preferable, but can be more difficult to calculate because of the additional challenge of identifying the regions of similarity.<sup id="cite_ref-Polyanovsky2011_5-0">[https://en.wikipedia.org/wiki/Sequence_alignment#cite_note-Polyanovsky2011-5 [5]]</sup> A variety of computational algorithms have been applied to the sequence alignment problem. These include slow but formally correct methods such as [https://en.wikipedia.org/wiki/Dynamic_programming dynamic programming].
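The difference between global and local alignment can be sketched with a small dynamic programming routine. The global branch follows the Needleman-Wunsch recurrence and the local branch follows the Smith-Waterman recurrence. The function name <code>alignment_score</code> and the scoring values (match +1, mismatch -1, linear gap penalty -1) are assumptions chosen for illustration, not recommended parameters. The routine returns only the optimal score and omits the traceback needed to recover the alignment itself.

```python
def alignment_score(a, b, match=1, mismatch=-1, gap=-1, local=False):
    """Optimal pairwise alignment score by dynamic programming.

    local=False: global alignment (Needleman-Wunsch recurrence),
                 both sequences are aligned end to end.
    local=True:  local alignment (Smith-Waterman recurrence),
                 the best-scoring pair of substrings is scored.
    """
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]

    if not local:
        # A global alignment pays for leading gaps in either sequence.
        for i in range(1, rows):
            score[i][0] = i * gap
        for j in range(1, cols):
            score[0][j] = j * gap

    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = score[i - 1][j] + gap      # gap inserted in b
            left = score[i][j - 1] + gap    # gap inserted in a
            cell = max(diag, up, left)
            if local:
                cell = max(cell, 0)         # a local alignment may restart anywhere
                best = max(best, cell)
            score[i][j] = cell

    # Global: score of aligning all of a with all of b.
    # Local: highest score found anywhere in the matrix.
    return best if local else score[-1][-1]


# Toy sequences that share only the short region "ACG".
print(alignment_score("TTTACGTTT", "GGACGGG"))              # global score
print(alignment_score("TTTACGTTT", "GGACGGG", local=True))  # local score
```

Both branches fill a table with one cell per pair of positions, so time and memory grow with the product of the sequence lengths, which is why this exact approach is slow for large inputs.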
Other approaches are efficient [https://en.wikipedia.org/wiki/Heuristic_algorithm heuristic algorithms] or [https://en.wikipedia.org/wiki/Probability probabilistic] methods designed for large-scale database search, which are not guaranteed to find the best matches.<br/> <br/> Structural alignment<br/> '''Structural alignment''' attempts to establish [https://en.wikipedia.org/wiki/Sequence_homology homology] between two or more [https://en.wikipedia.org/wiki/Polymer polymer] structures based on their shape and three-dimensional [https://en.wikipedia.org/wiki/Tertiary_structure conformation]. This process is usually applied to [https://en.wikipedia.org/wiki/Protein protein] [https://en.wikipedia.org/wiki/Tertiary_structure tertiary structures] but can also be used for large [https://en.wikipedia.org/wiki/RNA RNA] molecules. In contrast to simple structural superposition, where at least some equivalent residues of the two structures are known, structural alignment requires no ''a priori'' knowledge of equivalent positions. Structural alignment is a valuable tool for the comparison of proteins with low sequence similarity, where evolutionary relationships between proteins cannot be easily detected by standard [https://en.wikipedia.org/wiki/Sequence_alignment sequence alignment] techniques. Structural alignment can therefore be used to imply [https://en.wikipedia.org/wiki/Evolution evolutionary] relationships between proteins that share very little common sequence. However, caution is warranted when using the results as evidence for shared evolutionary ancestry, because of the possible confounding effects of [https://en.wikipedia.org/wiki/Convergent_evolution convergent evolution], by which multiple unrelated [https://en.wikipedia.org/wiki/Amino_acid amino acid] sequences converge on a common [https://en.wikipedia.org/wiki/Tertiary_structure tertiary structure].