Changes

From Biolecture.org

Algorithm

3,139 bytes removed, 03:02, 7 December 2018
no edit summary
<p>&lt;p&gt;{{Technical|date=September 2013}} {{Infobox algorithm |name= &lt;<!-- Defaults to article name --&gt; > |class= [[Sequence alignment]] |image= |caption= |data= |time= O(mn) |best-time= |average-time= |space= O(mn) }} The &amp;#39;&amp;#39;&amp;#39;Needleman&amp;ndash;Wunsch algorithm&amp;#39;&amp;#39;&amp;#39; is an [[algorithm]] used in [[bioinformatics]] to [[sequence alignment|align]] [[protein]] or [[nucleotide]] sequences. It was one of the first applications of [[dynamic programming]] to compare biological sequences. The algorithm was developed by Saul B. Needleman and Christian D. Wunsch and published in 1970.{{cite journal | journal=Journal of Molecular Biology | volume=48 | issue=3 | pages=443&amp;ndash;53 | year=1970 |author1=Needleman, Saul B. |author2=Wunsch, Christian D. |last-author-amp=yes | title=A general method applicable to the search for similarities in the amino acid sequence of two proteins | url=http://linkinghub.elsevier.com/retrieve/pii/0022-2836(70)90057-4 | pmid=5420325 | doi = 10.1016/0022-2836(70)90057-4 }} The algorithm essentially divides a large problem (e.g. the full sequence) into a series of smaller problems and uses the solutions to the smaller problems to reconstruct a solution to the larger problem.{{cite web|title=bioinformatics |url=http://www.britannica.com/EBchecked/topic/1334661/bioinformatics/285871/Goals-of-bioinformatics#ref1115380|accessdate=10 September 2014}} It is also sometimes referred to as the [[optimal matching]] algorithm and the [[Sequence alignment#Global and local alignments|global alignment]] technique. The Needleman&amp;ndash;Wunsch algorithm is still widely used for optimal global alignment, particularly when the quality of the global alignment is of the utmost importance. [[File:Needleman-Wunsch pairwise sequence alignment.png|framed|right|Figure 1: Needleman-Wunsch pairwise sequence alignment&lt;/p&gt;</p>
<p>&lt;pre&gt;<br />Results:</p>
<p>Sequences &nbsp; &nbsp; Best alignments<br />--------- &nbsp; &nbsp; ----------------------<br />GCATGCU &nbsp; &nbsp; &nbsp; GCATG-CU &nbsp; &nbsp; &nbsp; GCA-TGCU &nbsp; &nbsp; &nbsp; GCAT-GCU<br />GATTACA &nbsp; &nbsp; &nbsp; G-ATTACA &nbsp; &nbsp; &nbsp; G-ATTACA &nbsp; &nbsp; &nbsp; G-ATTACA</p>
<p>Interpretation of the initialization step:</p>
<p>One can interpret the leftmost column in the above figure like this (putting a &amp;quot;handle&amp;quot; before each sequence, annotated as + here):</p>
<p>Alignment: &nbsp; +GCATGCU<br />&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; +GATTACA<br />Score: &nbsp; &nbsp; &nbsp; 0 &nbsp; // Handle matches handle, doesn&amp;#39;t win any score</p>
<p>Alignment: &nbsp; +GCATGCU<br />&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; +GATTACA<br />Score: &nbsp; &nbsp; &nbsp; 0 &nbsp; // 1 gap, &nbsp; score -1</p>
<p>Alignment: &nbsp; +GCATGCU<br />&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; +GATTACA<br />Score: &nbsp; &nbsp; &nbsp; 0 &nbsp; // 2 gaps, score -2</p>
<p>Alignment: &nbsp; +GCATGCU<br />&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; +GATTACA<br />Score: &nbsp; &nbsp; &nbsp; 0 &nbsp; // 3 gaps, score -3</p>
<p>Alignment: &nbsp; +GCATGCU<br />&nbsp; &nbsp; &nbsp; &nbsp; +GATTACA<br />Score: &nbsp; &nbsp; &nbsp; 0 &nbsp; // 4 gaps, score -4</p>
<p>...</p>
<p>The same thing can be done for the uppermost row.<br />&lt;/pre&gt;</p>
<<
<pblockquote>&lt;blockquote&gt;A penalty factor, a number subtracted for every gap made, may be assessed as a barrier to allowing the gap. The penalty factor could be a function of the size and/or direction of the gap. [page 444]&lt;</blockquote&gt;</p>
<p>&lt;p&gt;A better dynamic programming algorithm with quadratic running time for the same problem (no gap penalty) was first introduced{{cite journal | doi=10.1073/pnas.69.1.4 | journal=Proceedings of the National Academy of Sciences of the USA | volume=69 | issue=1 | pages=4&amp;ndash;6 | year=1972 | author=Sankoff D | title=Matching sequences under deletion/insertion constraints | url=http://www.pnas.org/content/69/1/4.abstract | pmid=4500555 | pmc=427531}} by David Sankoff in 1972. Similar quadratic-time algorithms were discovered independently by T. K. Vintsyuk{{cite journal | journal=Kibernetika | volume=4 | pages=81&amp;ndash;88 | year=1968 | author=Vintsyuk TK | title=Speech discrimination by dynamic programming|url=https://link.springer.com/content/pdf/10.1007/BF01074755.pdf}} in 1968 for speech processing ([[Dynamic time warping|&amp;quot;time warping&amp;quot;]]), and by Robert A. Wagner and [[Michael J. Fischer]]{{cite journal |vauthors=Wagner RA, Fischer MJ | journal = [[Journal of the ACM]] | title=The string-to-string correction problem | volume=21 | issue=1 | year=1974 | pages=168&amp;ndash;173 | doi=10.1145/321796.321811}} in 1974 for string matching. Needleman and Wunsch formulated their problem in terms of maximizing similarity. Another possibility is to minimize the [[Levenshtein distance|edit distance]] between sequences, introduced by [[Vladimir Levenshtein]]. Peter H. Sellers showed{{cite journal | doi=10.1137/0126070 | title=On the theory and computation of evolutionary distances | author=Sellers PH | journal = SIAM Journal on Applied Mathematics | volume = 26 | issue = 4 | pages = 787&amp;ndash;793 | year = 1974}} in 1974 that the two problems are equivalent. The Needleman&amp;ndash;Wunsch algorithm is still widely used for optimal [[Sequence alignment#Global and local alignments|global alignment]], particularly when the quality of the global alignment is of the utmost importance. However, the algorithm is expensive with respect to time and space, proportional to the product of the length of two sequences and hence is not suitable for long sequences. Recent development has focused on improving the time and space cost of the algorithm while maintaining quality. For example, in 2013, a Fast Optimal Global Sequence Alignment Algorithm (FOGSAA),{{cite journal|last1=Chakraborty|first1=Angana|last2=Bandyopadhyay|first2=Sanghamitra|title=FOGSAA: Fast Optimal Global Sequence Alignment Algorithm|journal=Scientific Reports|date=29 April 2013|volume=3|doi=10.1038/srep01746|url=http://www.nature.com/srep/2013/130429/srep01746/full/srep01746.html|accessdate=11 September 2014|pmid=23624407|pmc=3638164}} suggested alignment of nucleotide/protein sequences faster than other optimal global alignment methods, including the Needleman&amp;ndash;Wunsch algorithm. The paper claims that when compared to the Needleman&amp;ndash;Wunsch algorithm, FOGSAA achieves a time gain of 70&amp;ndash;90% for highly similar nucleotide sequences (with &amp;gt; 80% similarity), and 54&amp;ndash;70% for sequences having 30&amp;ndashndash;80% similarity. ==Global alignment tools using the Needleman&amp;ndash;Wunsch algorithm== * [http://www.ebi.ac.uk/Tools/psa EMBOSS Needle and EMBOSS Stretcher Global Alignment Tools] * [https://blast.ncbi.nlm.nih.gov/Blast.cgi?PAGE_TYPE=BlastSearch&amp;amp;PROG_DEF=blastn&amp;amp;BLAST_PROG_DEF=blastn&amp;amp;BLAST_SPEC=GlobalAln&amp;amp;LINK_LOC=BlastHomeLink Needleman-Wunsch alignment for two nucleotide sequences] * [http://www.mathworks.com.au/help/bioinfo/ref/nwalign.html MathWorks - Globally align two sequences using Needleman-Wunsch algorithm] * [https://github.com/bitkeeper-scm/bitkeeper/blob/master/src/libdiff.c#L948 BitKeeper &amp;ndash; Source Control Management Software] ==Applications outside bioinformatics== ===[[Computer stereo vision]]=== Stereo matching is an essential step in the process of 3D reconstruction from a pair of stereo images. When images have been rectified, an analogy can be drawn between aligning nucleotide and protein sequences and matching [[pixels]] belonging to [[scan lines]], since both tasks aim at establishing optimal correspondence between two strings of characters. The &amp;lsquo;right&amp;rsquo; image of a stereo pair can be seen as a mutated version of the &amp;lsquo;left&amp;rsquo; image: noise and individual camera sensitivity alter pixel values (i.e. character substitutions); and different view angle reveals previously occluded data and introduces new occlusions (i.e. insertion and deletion of characters). As consequence, minor modifications of the Needleman&amp;ndash;Wunsch algorithm make it suitable for stereo matching.Dieny R., Thevenon J., Martinez-del-Rincon J., Nebel J.-C. (2011) &amp;quot;[http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.220.9699 Bioinformatics inspired algorithm for stereo correspondence]&amp;quot;. International Conference on Computer Vision Theory and Applications, March 5&amp;ndash;7, Vilamoura - Algarve, Portugal. Although performances in terms of accuracy are not state-of-the-art, the relative simplicity of the algorithm allows its implementation on [[embedded systems]].Madeo S., Pelliccia R., Salvadori C., Martinez-del-Rincon J., Nebel J.-C. (2014) &amp;quot;[https://pure.qub.ac.uk/portal/files/12586298/stereovision_end.pdf An optimized stereo vision implementation for embedded systems: application to RGB and Infra-Red images]&amp;quot;. Journal of Real-Time Image Processing. Although in many applications image rectification can be performed, e.g. by [[camera resectioning]] or calibration, it is sometimes impossible or impractical since the computational cost of accurate rectification models prohibit their usage in [[Real-time computing|real-time]] applications. Moreover, none of these models is suitable when a camera lens displays unexpected [[distortions]], such as those generated by raindrops, weatherproof covers or dust. By extending the Needleman&amp;ndash;Wunsch algorithm, a line in the &amp;lsquo;left&amp;rsquo; image can be associated to a curve in the &amp;lsquo;right&amp;rsquo; image by finding the alignment with the highest score in a three-dimensional array (or matrix). Experiments demonstrated that such extension allows dense pixel matching between unrectified or distorted images.Martinez-del-Rincon J., Thevenon J., Dieny R., Nebel J.-C. (2012) &amp;quot;[https://www.researchgate.net/profile/Jean-Christophe_Nebel/publication/257928290_Dense_pixel_matching_between_unrectified_and_distorted_images_using_dynamic_programming/links/0c9605266f2d98c719000000.pdf Dense Pixel Matching Between Unrectified and Distorted Images Using Dynamic Programming]&amp;quot;. International Conference on Computer Vision Theory and Applications, 24&amp;ndash;26 February, Rome, Italy. ==See also== * [[Smith&amp;ndash;Waterman algorithm]] * [[Sequence mining]] * [[Levenshtein distance]] * [[Dynamic time warping]] * [[Sequence alignment]] ==References== {{Reflist}} ==External links== {{External links|date=May 2017}} * [http://zhanglab.ccmb.med.umich.edu/NW-align NW-align: A protein sequence-to-sequence alignment program by Needleman-Wunsch algorithm (online server and source code)] * [[:File:ParallelNeedlemanAlgorithm.pdf | Parallel Needleman-Wunsch Algorithm for Grid]] - Implementation by Tahir Naveed, Imitaz Saeed Siddiqui and Shaftab Ahmed - Bahria University * [http://chneukirchen.org/blog/archive/2006/03/dynamic-programming-in-haskell.html Needleman-Wunsch Algorithm as Haskell Code] * [http://ds9a.nl/nwunsch A live Javascript-based demo of Needleman&amp;ndash;Wunsch] * [http://alfehrest.org/sub/nwa/index.html An interactive Javascript-based visual explanation of Needleman-Wunsch Algorithm] * [http://baba.sourceforge.net/ B.A.B.A.] &amp;mdash; an applet (with source) which visually explains the algorithm. * [https://web.archive.org/web/20140729133238/http://www.science.marshall.edu/murraye/Clearer%20Matrix%20slide%20show.pdf A clear explanation of NW and its applications to sequence alignment] * [http://technology66.blogspot.com/2008/08/sequence-alignment-techniques.html Sequence Alignment Techniques at Technology Blog] * [https://web.archive.org/web/20091106224548/http://svitsrv25.epfl.ch/R-doc/library/Biostrings/html/00Index.html Biostrings] R package implementing Needleman&amp;ndash;Wunsch algorithm among others * [https://gist.github.com/jonasmalacofilho/5226596 Needleman-Wunsch Algorithm as Haxe Code] {{Strings}} {{DEFAULTSORT:Needleman-Wunsch Algorithm}} [[Category:Bioinformatics algorithms]] [[Category:Sequence alignment algorithms]] [[Category:Computational phylogenetics]] [[Category:Dynamic programming]] [[Category:Articles with example pseudocode]]&lt;/p&gt;</p>
<p>&lt;p&gt;&amp;nbsp;&lt;/p&gt;<br />&nbsp;</p>
Anonymous user

Navigation menu