Data extraction from FASTA

From Biolecture.org

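FASTA is a plain-text format for storing nucleotide or protein sequences.

Each record in a FASTA file starts with a header row that begins with >.

This row contains basic information such as the sequence ID; the rows that follow it contain the sequence itself.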
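As a rough illustration of this layout, the Perl sketch below splits one small in-memory FASTA record into its ID, description, and sequence. The record text, the ID seq1, and the helper name parse_fasta_record are made up for the example. Treating the first word after > as the ID is a common convention, assumed here.

```perl
use strict;
use warnings;

# Hypothetical helper: split the rows of one FASTA record into
# (sequence ID, description, sequence).
sub parse_fasta_record {
    my @lines = @_;
    chomp @lines;

    # The first row must start with '>'
    my $header = shift @lines;
    die "Not a FASTA record: first row does not start with '>'\n"
        unless defined($header) && $header =~ /^>/;

    # Assumption: the ID is the text after '>' up to the first whitespace;
    # whatever follows is treated as a free-text description.
    my ($id, $description) = $header =~ /^>(\S*)\s*(.*)$/;

    # The remaining rows are the sequence; join them and drop whitespace
    my $sequence = join '', @lines;
    $sequence =~ s/\s+//g;

    return ($id, $description, $sequence);
}

# Made-up example record (not real NCBI data)
my @record = (
    ">seq1 example sequence\n",
    "ACGTACGTAC\n",
    "GGTTAA\n",
);

my ($id, $description, $sequence) = parse_fasta_record(@record);
print "ID:          $id\n";
print "Description: $description\n";
print "Sequence:    $sequence\n";
print "Length:      ", length($sequence), "\n";
```

The sequence rows are joined into one string because line breaks inside a FASTA sequence are only for layout and carry no meaning.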

1. Go to NCBI, the National Center for Biotechnology Information (http://www.ncbi.nlm.nih.gov/)

2. Choose the database you want and search for the information you want to find

Ex) Database - Gene / Search - Homo sapiens chromosome 2 / Choose what you want to look at

3. Choose FASTA from the 'Go to nucleotide' section

1) Display settings: FASTA

2) Send to: File

You get a link like this:

http://www.ncbi.nlm.nih.gov/sviewer/viewer.fcgi?tool=portal&sendto=on&log$=seqview&db=nuccore&dopt=fasta&val=51466650&extrafeat=0&maxplex=1

 

4. In a Perl program, we can read the file name from <STDIN> and then open the file

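Ex)

#!/usr/bin/perl
use strict;
use warnings;

# Ask the user for the name of the FASTA file
print "Please type the filename to extract: ";
my $DNAfilename = <STDIN>;
chomp $DNAfilename;

# Open the file for reading; stop with an error message if that fails
open(my $DNAFILE, '<', $DNAfilename)
    or die "Cannot open file '$DNAfilename': $!\n";

# Read every row of the file into an array, then close the file
my @FileData = <$DNAFILE>;
close $DNAFILE;

# Print the file contents (header row followed by the sequence rows)
print @FileData;

exit;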
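The program above prints the file unchanged. To pull the data out of it, one possible next step is to separate the header rows from the sequence rows. The sketch below is a minimal version of that idea, not a complete FASTA parser. The helper name read_fasta_file and the two records seqA and seqB are made up, and the sketch writes its own small temporary file so that it can run without a download from NCBI. It assumes the first word after > is the sequence ID and ignores the rest of the header row.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: read a FASTA file and return a hash reference
# that maps each sequence ID to its sequence.
sub read_fasta_file {
    my ($filename) = @_;
    open(my $fh, '<', $filename)
        or die "Cannot open file '$filename': $!\n";

    my (%sequences, $current_id);
    while (my $line = <$fh>) {
        $line =~ s/\s+$//;              # strip the newline and trailing whitespace
        next if $line eq '';            # skip blank rows
        if ($line =~ /^>(\S*)/) {       # header row: start a new record
            $current_id = $1;
            $sequences{$current_id} = '';
        }
        elsif (defined $current_id) {   # sequence row: append to the current record
            $sequences{$current_id} .= $line;
        }
    }
    close $fh;
    return \%sequences;
}

# Write a small made-up FASTA file so the sketch can run on its own
my ($tmp_fh, $tmp_name) = tempfile(SUFFIX => '.fasta', UNLINK => 1);
print {$tmp_fh} ">seqA first made-up record\nACGT\nACGT\n",
                ">seqB second made-up record\nTTGGCC\n";
close $tmp_fh;

my $sequences = read_fasta_file($tmp_name);
for my $id (sort keys %$sequences) {
    printf "%s\t%d bases\t%s\n",
        $id, length($sequences->{$id}), $sequences->{$id};
}
```

To use it on a file saved from NCBI, pass that file's name to read_fasta_file in place of the temporary file.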

Reference

1) http://seqanswers.com/forums/showthread.php?t=18354

 

KSH_0608 Bioinformatics with Bioperl