Data extraction from FASTA
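FASTA is a plain-text format for storing nucleotide or protein sequence data.
Each record in a FASTA file starts with a header line whose first character is >.
This line contains basic information such as the sequence ID, and the lines that follow it hold the sequence itself.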
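To make the header line concrete, here is a minimal sketch in Perl that splits one header line into an ID and a description. The helper name parse_header and the sample header are made up for illustration. It also assumes the ID is the first word after >, which is a common convention but not something every FASTA file follows.

```perl
use strict;
use warnings;

# Hypothetical helper: split a FASTA header line into ID and description.
# Assumption: the ID is the first whitespace-delimited word after '>'.
sub parse_header {
    my ($line) = @_;
    chomp $line;
    # A header must begin with '>'; return an empty list otherwise.
    return unless $line =~ /^>\s*(\S+)\s*(.*)$/;
    return ($1, $2);
}

# Made-up header line, used only as an example
my ($id, $desc) = parse_header(">seq1 Homo sapiens example sequence\n");
print "ID: $id\n";
print "Description: $desc\n";
```

For a line that does not start with >, parse_header returns an empty list, so the caller can tell header lines and sequence lines apart.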
1. Go to the NCBI (National Center for Biotechnology Information) website: http://www.ncbi.nlm.nih.gov/
2. Choose the database you want and search for the information you want to find
Ex) Database - Gene / Search - Homo sapiens chromosome 2 / Choose the entry you want to look at
3. Choose FASTA from the 'Go to nucleotide' section
1) Display settings: FASTA
2) Send to: File
This gives a download link like the following:
http://www.ncbi.nlm.nih.gov/sviewer/viewer.fcgi?tool=portal&sendto=on&log$=seqview&db=nuccore&dopt=fasta&val=51466650&extrafeat=0&maxplex=1
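4. Read the file name from <STDIN> and open the saved file in a Perl program
Ex)
use strict;
use warnings;

# Ask for the name of the saved FASTA file
print "Please type the filename to extract: ";
my $DNAfilename = <STDIN>;
chomp $DNAfilename;
# Open the file for reading, stopping with a message if that fails
open(my $DNAFILE, '<', $DNAfilename)
    or die "Cannot open file \"$DNAfilename\": $!\n";
my @FileData = <$DNAFILE>;    # read every line into an array
close $DNAFILE;
print @FileData;
exit;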
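The program above only prints the file back. To actually extract data, the lines in @FileData can be separated into header lines and sequence lines. The following is a rough sketch of one way to do that, not the only one. The helper name extract_records, the record layout (a hash with header and sequence keys), and the sample lines are all assumptions made for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: turn the lines of a FASTA file into a list of
# records, each a hash reference with 'header' and 'sequence' keys.
sub extract_records {
    my @lines = @_;
    my @records;
    for my $line (@lines) {
        $line =~ s/\s+$//;      # strip the newline and trailing whitespace
        next if $line eq '';    # skip blank lines
        if ($line =~ /^>(.*)/) {
            # A line starting with '>' begins a new record
            push @records, { header => $1, sequence => '' };
        }
        elsif (@records) {
            # Any other line is sequence data for the current record
            $records[-1]{sequence} .= $line;
        }
    }
    return @records;
}

# Made-up lines standing in for the contents of @FileData
my @FileData = (
    ">example1 made-up sequence\n",
    "ACGTAC\n",
    "GGTTAA\n",
    ">example2 another made-up sequence\n",
    "TTTT\n",
);

# Print each header with the length of its joined sequence
for my $record (extract_records(@FileData)) {
    printf "%s: %d bases\n", $record->{header}, length $record->{sequence};
}
```

Sequence lines are joined without line breaks, so each record holds its sequence as one string. Lines that appear before the first > are ignored in this sketch.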