Data extraction from FASTA

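FASTA is a plain-text format for storing nucleotide or protein sequence data.

A FASTA record starts with a header line whose first character is >.

This header line contains basic information such as the sequence ID; the sequence itself follows on the lines after it.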
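To make the header line concrete, here is a minimal Perl sketch that splits one into its ID and description. It assumes the ID is the text between > and the first whitespace, which is a common convention rather than something this page states. The helper name parse_fasta_header and the sample header are made up for illustration.

```perl
use strict;
use warnings;

# Split a FASTA header line into its ID and free-text description.
# parse_fasta_header is a hypothetical helper name, not part of any library.
# Assumption: the ID is everything between '>' and the first whitespace.
sub parse_fasta_header {
    my ($line) = @_;
    chomp $line;
    # A header must begin with '>'; return an empty list otherwise
    return unless $line =~ /^>(\S+)\s*(.*)$/;
    return ($1, $2);
}

# Made-up header used only for illustration
my ($id, $description) = parse_fasta_header(">seq1 example sequence\n");
print "ID: $id\n";
print "Description: $description\n";
```

For the sample header above, the ID is seq1 and the description is "example sequence". A line that does not start with > gives back an empty list, so the caller can tell header lines from sequence lines.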

1. Go to the NCBI (National Center for Biotechnology Information) website: http://www.ncbi.nlm.nih.gov/

2. Choose the database you want and search for the record you want to find

Ex) Database: Gene / Search: Homo sapiens chromosome 2 / Choose the record you want to look at

3. Choose FASTA from the 'Go to nucleotide' section

4. After saving the FASTA record to a file, read the filename from <STDIN> in a Perl program and open the file with open

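Ex)

use strict;
use warnings;

# Ask the user for the name of the FASTA file
print "Please type the filename to extract: ";
my $DNAfilename = <STDIN>;
chomp $DNAfilename;    # remove the trailing newline from the typed name

# Open the file for reading; stop with a message if it cannot be opened
open(my $DNAFILE, '<', $DNAfilename)
    or die "Cannot open file \"$DNAfilename\": $!\n";

# Read every line of the file into an array, then close the file
my @FileData = <$DNAFILE>;
close $DNAFILE;

# Print the file contents (header line and sequence lines)
print @FileData;

exit;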
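Once the lines are in @FileData, a natural next step is to separate the header from the sequence. The sketch below is one possible way to do that, not part of the program above. It assumes the file holds a single FASTA record, and it uses a made-up array in place of data read from a file. The helper name extract_sequence is hypothetical.

```perl
use strict;
use warnings;

# Separate the header from the sequence in an array of FASTA lines.
# extract_sequence is a hypothetical helper name used for illustration.
# Assumption: the lines describe a single FASTA record.
sub extract_sequence {
    my @lines = @_;
    my $header   = '';
    my $sequence = '';
    foreach my $line (@lines) {
        if ($line =~ /^>/) {
            # Keep the header text without the leading '>'
            ($header = $line) =~ s/^>//;
            $header =~ s/\s+$//;    # drop the trailing newline
        }
        else {
            # Every other line is part of the sequence
            $sequence .= $line;
        }
    }
    # Remove newlines and any other whitespace from the joined sequence
    $sequence =~ s/\s//g;
    return ($header, $sequence);
}

# Made-up lines standing in for @FileData
my @FileData = (">seq1 example\n", "ACGT\n", "TTGA\n");
my ($header, $sequence) = extract_sequence(@FileData);
print "Header: $header\n";
print "Sequence: $sequence\n";
print "Length: ", length($sequence), "\n";
```

With the sample lines, the sequence is joined into ACGTTTGA (length 8). If a file contains several records, this sketch would run their sequences together, so it would need to start a new sequence at each header line.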
