Difference between revisions of "DNA Alignment Algorithm"

From Biolecture.org
imported>Joowon Yoon

Revision as of 19:07, 23 November 2018

Before we start coding, we need to think about the algorithm.

"Algorithms are very important in computer Science. The best chosen algorithm makes sure computer will do the given task at best possible manner. In cases where efficiency matter a proper algorithm is really vital to be used. An algorithm is important in optimizing a computer program according to the available resources. " (https://www.linkedin.com/pulse/importance-algorithm-its-types-shibaji-debnath)

 

It was hard to come up with an alignment algorithm myself, so I searched on Google.

I found a very powerful algorithm. It scores candidate alignments and reports the alignment with the best score.

This algorithm is called the "Needleman-Wunsch algorithm". For details, follow the links below.

https://en.wikipedia.org/wiki/Needleman%E2%80%93Wunsch_algorithm (Wikipedia)

http://web.skhu.ac.kr/~mckim1/Lecture/DS/dna/class13/class13_04.html 
Video Lecture : https://www.youtube.com/watch?v=-0gG_rOhcT8

 

Let's start with our homework: align these 3 DNA sequences.

'AAGAATAGTATTTCGCTTTTTTATA';
'AGAAATAGTATTTCGGTTAATTATA';
'AGACATACTAATTCGGTCCATTATA';

(1) Start with the 5' end base of each sequence.

(2) Compare it with the corresponding base of one other sequence to check whether they are the same.

(3) If they are the same, add 1 to "Score". If they are different, subtract 1 from "Score". Continue until the 3' end base.

(4) Repeat (2) and (3) with gaps inserted. Each inserted gap subtracts 2 from "Score".

(5) Find the case with the maximum score. Those gapped sequences are the best alignment.
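
The steps above can be sketched in Python for two sequences at a time. This is only a minimal illustration, not the PerlMonks code linked on this page: the function names ungapped_score and needleman_wunsch are made up for this example, and the scores (match +1, mismatch -1, gap -2) are the ones listed in steps (3) and (4). Aligning all three homework sequences together would need an extra step that is not shown here.

```python
# Scoring values from steps (3) and (4): match +1, mismatch -1, gap -2.
MATCH, MISMATCH, GAP = 1, -1, -2


def pair_score(x, y):
    """Score for one aligned pair of bases."""
    return MATCH if x == y else MISMATCH


def ungapped_score(a, b):
    """Steps (1)-(3): compare two equal-length sequences base by base."""
    if len(a) != len(b):
        raise ValueError("ungapped scoring needs sequences of equal length")
    return sum(pair_score(x, y) for x, y in zip(a, b))


def needleman_wunsch(a, b):
    """Steps (4)-(5): best-scoring global alignment when gaps are allowed.

    Returns (score, aligned_a, aligned_b).
    """
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]

    # First column and first row: a prefix aligned against gaps only.
    for i in range(1, rows):
        score[i][0] = i * GAP
    for j in range(1, cols):
        score[0][j] = j * GAP

    # Each cell is the best of: align two bases, gap in b, gap in a.
    for i in range(1, rows):
        for j in range(1, cols):
            diag = score[i - 1][j - 1] + pair_score(a[i - 1], b[j - 1])
            up = score[i - 1][j] + GAP
            left = score[i][j - 1] + GAP
            score[i][j] = max(diag, up, left)

    # Trace back from the bottom-right corner to recover one best alignment.
    out_a, out_b = [], []
    i, j = len(a), len(b)
    while i > 0 or j > 0:
        if (i > 0 and j > 0
                and score[i][j] == score[i - 1][j - 1] + pair_score(a[i - 1], b[j - 1])):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + GAP:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1

    return score[-1][-1], "".join(reversed(out_a)), "".join(reversed(out_b))


if __name__ == "__main__":
    seq1 = "AAGAATAGTATTTCGCTTTTTTATA"
    seq2 = "AGAAATAGTATTTCGGTTAATTATA"
    best, aligned1, aligned2 = needleman_wunsch(seq1, seq2)
    print("ungapped score:", ungapped_score(seq1, seq2))
    print("best score    :", best)
    print(aligned1)
    print(aligned2)
```

Because the ungapped comparison is itself one of the candidate alignments, the gapped score can never be lower than the ungapped score for sequences of the same length.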

 

Perl code for this algorithm was available through Google (https://www.perlmonks.org/?node_id=819506).

But it did not work because of some errors, so I fixed some parts of it.
Code for alignment