PERL

From Biolecture.org

Basics of Perl


1) Variable

A variable is a place to store a value so that we can refer to it or manipulate it throughout a program. Perl has three types of variables: scalars, arrays, and hashes.

Scalar ($) 

A scalar variable stores a single (scalar) value. Perl scalar names are prefixed with a dollar sign ($); for example, $x, $y, $z, $username, and $url are all scalar variable names. A scalar can hold a string, a number, or a reference.

ex) 

$name = "Byeongeun Lee";
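
The following is a minimal sketch of how one scalar type can hold both strings and numbers. The variables $count, $ratio, $greeting, $total, and $label are made up for illustration and are not part of the original example.

```perl
use strict;
use warnings;

# A scalar can hold a string or a number; Perl converts between them as needed.
my $name  = "Byeongeun Lee";
my $count = 3;       # hypothetical integer value
my $ratio = 0.25;    # hypothetical floating-point value

my $greeting = "Hello, $name";         # double quotes interpolate variables
my $total    = $count + $ratio;        # numeric addition
my $label    = $count . " samples";    # string concatenation with "."

print "$greeting\n";    # Hello, Byeongeun Lee
print "$total\n";       # 3.25
print "$label\n";       # 3 samples
```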

Array (@)

An array stores a list of values. While a scalar variable can only store one value, an array can store many. Perl array names are prefixed with an at-sign (@). In Perl, array indices start with 0, so to refer to the first element of the array @colors, you use $colors[0]. Note that when you're referring to a single element of an array, you prefix the name with a $ instead of the @. The $-sign again indicates that it's a single (scalar) value; the @-sign means you're talking about the entire array.

ex)

@Grades = ("A","B","C");
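
As a short sketch of the indexing rules described above, the code below reads single elements with the $ prefix and works on the whole array with the @ prefix. The extra grade "D" is an assumption added only to show push.

```perl
use strict;
use warnings;

my @Grades = ("A", "B", "C");

my $first = $Grades[0];        # first element: indices start at 0
my $last  = $Grades[-1];       # a negative index counts from the end
my $count = scalar @Grades;    # number of elements in the array

push @Grades, "D";                    # append one more value (illustrative)
my $joined = join(",", @Grades);      # join the whole array into one string

print "$first $last $count\n";    # A C 3
print "$joined\n";                # A,B,C,D
```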

 

Hash (%)

A hash is an associative array: an unordered collection of key-value pairs. Perl hash names are prefixed with a percent sign (%). Each element pairs a unique key with a data value, and a single value is looked up by its key using the $ prefix and curly braces, for example $courses{"Micro"}.

ex)

my %courses = (
    "Cell bio" => "prof.P",
    "Micro" => "prof.M",
);
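
A small sketch of common hash operations on the example above: looking up a value by key, adding a pair, testing whether a key exists, and iterating over the keys. The "Genetics" and "Physics" entries are hypothetical and used only for illustration.

```perl
use strict;
use warnings;

my %courses = (
    "Cell bio" => "prof.P",
    "Micro"    => "prof.M",
);

my $teacher = $courses{"Micro"};    # look up one value by its key
$courses{"Genetics"} = "prof.G";    # add a new key-value pair (hypothetical)

# exists checks for a key without creating it
my $has_physics = exists $courses{"Physics"} ? "yes" : "no";

# Hashes are unordered, so sort the keys for a stable listing
my @listing = map { "$_: $courses{$_}" } sort keys %courses;

print "$teacher\n";        # prof.M
print "$has_physics\n";    # no
print "$_\n" for @listing;
```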

2) Function

Functions are named blocks of code that we can call whenever needed. They help organize code into pieces that are easy to understand and work with, and they let us build a program step by step, testing the code along the way.
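
Besides built-in functions, Perl lets us define our own with sub. The sketch below assumes a hypothetical helper named gc_content, which is not part of the original page, to show how a named block of code can be written once and tested on its own.

```perl
use strict;
use warnings;

# Hypothetical helper: fraction of G and C bases in a DNA string.
sub gc_content {
    my ($seq) = @_;                       # arguments arrive in @_
    return 0 if length($seq) == 0;        # avoid dividing by zero
    my $gc = ($seq =~ tr/GCgc//);         # tr returns the number of matched characters
    return $gc / length($seq);
}

my $fraction = gc_content("ATGCATGC");
print "$fraction\n";    # 0.5
```

Because the logic lives in one named function, it can be checked with a few short inputs before it is used in a larger program.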

substr( )

The substr() function returns a substring of the expression supplied as its first argument. It takes a variable number of arguments: the offset can be counted from either end of the expression (a negative offset counts from the end), an optional replacement string replaces that part of the expression while the original substring is still returned, and substr() can also be assigned to.

substr() takes up to four arguments: (string, offset counted from 0, length of the substring to extract). An optional fourth argument gives the replacement string to put in place of the extracted substring.

ex1) 

use feature 'say';                   # say requires Perl 5.10 or later

my $str = "The black cat climbed the green tree";

say substr $str, 4, 5;               # black

ex2)

my $z = substr $str, 14, 7, "jumped from";

say $z;                   # climbed

say $str;                 # The black cat jumped from the green tree
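
The two remaining forms mentioned above, a negative offset and assignment to substr(), can be sketched as follows. The variable names $tail and $word and the replacement word "white" are chosen only for illustration.

```perl
use strict;
use warnings;

my $str = "The black cat climbed the green tree";

my $tail = substr($str, -4);         # negative offset counts from the end
my $word = substr($str, -10, 5);     # 5 characters starting 10 from the end

substr($str, 4, 5) = "white";        # substr() used as the target of an assignment

print "$tail\n";    # tree
print "$word\n";    # green
print "$str\n";     # The white cat climbed the green tree
```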

 

Assignment study


Translate base triplets (codons) into amino acids

 

$text = "aaatgaccgatcagctacgatcagctataaaaaccccggagctacgatcatcg";

%convertor = (
    'TCA' => 'S',    # Serine
    'TCC' => 'S',    # Serine
    'TCG' => 'S',    # Serine
    'TCT' => 'S',    # Serine
    'TTC' => 'F',    # Phenylalanine
    'TTT' => 'F',    # Phenylalanine
    'TTA' => 'L',    # Leucine
    'TTG' => 'L',    # Leucine
    'TAC' => 'Y',    # Tyrosine
    'TAT' => 'Y',    # Tyrosine
    'TAA' => '_',    # Stop
    'TAG' => '_',    # Stop
    'TGC' => 'C',    # Cysteine
    'TGT' => 'C',    # Cysteine
    'TGA' => '_',    # Stop
    'TGG' => 'W',    # Tryptophan
    'CTA' => 'L',    # Leucine
    'CTC' => 'L',    # Leucine
    'CTG' => 'L',    # Leucine
    'CTT' => 'L',    # Leucine
    'CCA' => 'P',    # Proline
    'CCC' => 'P',    # Proline
    'CCG' => 'P',    # Proline
    'CCT' => 'P',    # Proline
    'CAC' => 'H',    # Histidine
    'CAT' => 'H',    # Histidine
    'CAA' => 'Q',    # Glutamine
    'CAG' => 'Q',    # Glutamine
    'CGA' => 'R',    # Arginine
    'CGC' => 'R',    # Arginine
    'CGG' => 'R',    # Arginine
    'CGT' => 'R',    # Arginine
    'ATA' => 'I',    # Isoleucine
    'ATC' => 'I',    # Isoleucine
    'ATT' => 'I',    # Isoleucine
    'ATG' => 'M',    # Methionine
    'ACA' => 'T',    # Threonine
    'ACC' => 'T',    # Threonine
    'ACG' => 'T',    # Threonine
    'ACT' => 'T',    # Threonine
    'AAC' => 'N',    # Asparagine
    'AAT' => 'N',    # Asparagine
    'AAA' => 'K',    # Lysine
    'AAG' => 'K',    # Lysine
    'AGC' => 'S',    # Serine
    'AGT' => 'S',    # Serine
    'AGA' => 'R',    # Arginine
    'AGG' => 'R',    # Arginine
    'GTA' => 'V',    # Valine
    'GTC' => 'V',    # Valine
    'GTG' => 'V',    # Valine
    'GTT' => 'V',    # Valine
    'GCA' => 'A',    # Alanine
    'GCC' => 'A',    # Alanine
    'GCG' => 'A',    # Alanine
    'GCT' => 'A',    # Alanine
    'GAC' => 'D',    # Aspartic Acid
    'GAT' => 'D',    # Aspartic Acid
    'GAA' => 'E',    # Glutamic Acid
    'GAG' => 'E',    # Glutamic Acid
    'GGA' => 'G',    # Glycine
    'GGC' => 'G',    # Glycine
    'GGG' => 'G',    # Glycine
    'GGT' => 'G',    # Glycine
    );


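# Translate the sequence in each of the three forward reading frames.
for ($s = 0; $s < 3; $s++) {
    $scrap = substr($text, 0, $s);    # bases skipped before the frame starts
    $main  = substr($text, $s);       # rest of the sequence, read in this frame
    # Replace each complete codon with its amino acid, or "?" if it is not in the table.
    $main =~ s/(...)/$convertor{uc $1} || "?"/eg;
    print "$scrap$main\n";
}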

 

%convertor = ( ... );

= the lookup table for translation: it maps each codon to its one-letter amino acid code ('_' marks a stop codon)

for ($s=0; $s<3; $s++)

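= loops over the three reading frames, starting the translation at offset 0, 1, and 2; the triplets themselves are matched by the pattern (...) in the substitution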

substr($text,0,$s);

= returns a substring that starts at a given position

= here it returns the first $s characters of the string $text, starting at position 0 (the bases that come before the reading frame)
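
The same frame-by-frame idea can also be written with an explicit loop over substr() instead of a substitution. This is only a sketch under stated assumptions: the helper translate_frame, the short test sequence, and the four-entry %codon_subset are hypothetical and stand in for the full %convertor table. Unlike the script above, it returns only the amino acids and drops the leading and trailing leftover bases.

```perl
use strict;
use warnings;

# Tiny illustrative subset of the codon table (not the full %convertor).
my %codon_subset = (
    'ATG' => 'M',    # Methionine
    'ACC' => 'T',    # Threonine
    'GAT' => 'D',    # Aspartic Acid
    'TAA' => '_',    # Stop
);

# Hypothetical helper: translate $seq starting at offset $frame (0, 1, or 2).
sub translate_frame {
    my ($seq, $frame, $table) = @_;
    my $protein = '';
    # Step through the sequence three bases at a time; stop before a partial codon.
    for (my $i = $frame; $i + 3 <= length($seq); $i += 3) {
        my $codon = uc substr($seq, $i, 3);
        $protein .= exists $table->{$codon} ? $table->{$codon} : '?';
    }
    return $protein;
}

my $dna    = "aatgaccgattaa";    # made-up test sequence
my @frames = map { translate_frame($dna, $_, \%codon_subset) } 0 .. 2;

print "frame $_: $frames[$_]\n" for 0 .. 2;
# frame 0: ????
# frame 1: MTD_
# frame 2: ???
```

Only frame 1 lines up with codons present in the small table, which is why the other two frames print "?" for every triplet.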