BuKyung Create a flat text file database of protein sequences with hash function in Perl
Source code:
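#!/usr/bin/perl
use strict;
use warnings;

# Output goes to outerl.txt; FASTA input is read from the files named on the
# command line (or from STDIN if none are given).
open(my $out, ">", "outerl.txt") or die "Cannot open outerl.txt: $!\n";

my %sequence;    # header line => sequence
my $seq_name;    # header of the record currently being read

while (my $line = <>) {
    chomp $line;
    if ($line =~ /^>/) {
        # Header line: start a new, empty record.
        $seq_name = $line;
        $sequence{$seq_name} = "";
    }
    elsif (defined $seq_name) {
        # Sequence line: append, so records spanning several lines stay whole.
        $sequence{$seq_name} .= $line;
    }
}

# Write one record per line in the form "header : sequence".
foreach my $key (sort keys %sequence) {
    print $out $key, " : ", $sequence{$key}, "\n";
}
close($out) or die "Cannot close outerl.txt: $!\n";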
Result
Running 10.pl with outer.fasta as input generates the file outerl.txt.
The contents of outer.fasta are:
>0
LIEYMVYQVHECCMKNIKKSQVSARMRARGHMVQLYYEDWEPIISDQRNSAANRSDDRVIESQSKQNVKHSNWEQCMCWFKILINMWLGQMREPPIYEDI
>1
KHGGRDNLQSMPSLMNDNERRSMRSQRDWHGFWQVLRFMPFHGNNNMHQDCNSHSDQGFIRMDHCKHHRVNGLVISRRRPDHPNQFISWRYGDDSIQFYQ
>2
YWCYISQDNRAERASYYKEVQPNPPNGNRGFPWEPFDQCGVALNAMWKLCIHVNGNRPQNPGQGPYLKHMRVAVDELRSDPAVYFKEDKVDCRHEKFGDK
>3
KAHIQRVRQNNKRSIWGCKRAHGCQEWYNGMFWNHKCIWCREGGEESRPHNNEQIRPDMSGQRKAISPELAPLEGWMEYQCFRKDPKANEMRVNLEMAHM
>4
SRVRVCFKPMYGMIKHHSVHQECGIKDPSYGWLGRPEASHICIWGQHGNNINFMYGKIYRQSYRIPCEDKCPPAPAPLVIQEVWLAPAHRNNKLHKRRGR
This input file was generated by the program from the assignment "BuKyung Randomly generate five 100 AA long protein sequences and store them in a FASTA file".
The contents of outerl.txt are:
>0 : LIEYMVYQVHECCMKNIKKSQVSARMRARGHMVQLYYEDWEPIISDQRNSAANRSDDRVIESQSKQNVKHSNWEQCMCWFKILINMWLGQMREPPIYEDI
>1 : KHGGRDNLQSMPSLMNDNERRSMRSQRDWHGFWQVLRFMPFHGNNNMHQDCNSHSDQGFIRMDHCKHHRVNGLVISRRRPDHPNQFISWRYGDDSIQFYQ
>2 : YWCYISQDNRAERASYYKEVQPNPPNGNRGFPWEPFDQCGVALNAMWKLCIHVNGNRPQNPGQGPYLKHMRVAVDELRSDPAVYFKEDKVDCRHEKFGDK
>3 : KAHIQRVRQNNKRSIWGCKRAHGCQEWYNGMFWNHKCIWCREGGEESRPHNNEQIRPDMSGQRKAISPELAPLEGWMEYQCFRKDPKANEMRVNLEMAHM
>4 : SRVRVCFKPMYGMIKHHSVHQECGIKDPSYGWLGRPEASHICIWGQHGNNINFMYGKIYRQSYRIPCEDKCPPAPAPLVIQEVWLAPAHRNNKLHKRRGR
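Because each line of outerl.txt holds one "header : sequence" record, the file can be loaded back into a hash for lookups by header. The following is a minimal sketch of that idea, not part of the original assignment. The subroutine name load_flat_db is hypothetical, and the two short records passed to it are made-up stand-ins for the lines of outerl.txt.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (an assumption, not in the original program):
# parse lines of the form ">name : SEQUENCE" into a hash keyed by header.
sub load_flat_db {
    my (@lines) = @_;
    my %db;
    foreach my $line (@lines) {
        chomp $line;
        # Split on the first " : " only, so the rest of the line is kept intact.
        my ($name, $seq) = split / : /, $line, 2;
        next unless defined $seq;    # skip blank or malformed lines
        $seq =~ s/\s+$//;            # drop any trailing whitespace
        $db{$name} = $seq;
    }
    return %db;
}

# Made-up short records stand in for the real contents of outerl.txt.
my %db = load_flat_db(">0 : LIEYMV\n", ">1 : KHGGRD\n");
print "$_ => $db{$_}\n" for sort keys %db;
```

With a real file, the lines could be read from a filehandle opened on outerl.txt and passed to the same subroutine. A single sequence is then available as, for example, $db{">0"}.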