Header line of FASTA format

The header line, which begins with '>', gives a name and/or a unique identifier for the sequence, and often carries additional descriptive information as well. Many sequence databases use standardized header formats, which makes it easier to extract information from the header automatically. A single header line may contain more than one header, separated by a ^A (Control-A) character (as in [1]).
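To make the Control-A convention concrete, here is a minimal Python sketch that splits one header line into its individual headers. The function name `parse_header_line` is hypothetical, and the rule that the identifier is the first whitespace-delimited token is an assumption that holds for many databases but not all of them.

```python
def parse_header_line(line):
    """Split a FASTA header line into a list of (identifier, description) pairs."""
    if not line.startswith(">"):
        raise ValueError("a FASTA header line must begin with '>'")
    entries = []
    # Control-A is the byte 0x01; it separates multiple headers on one line.
    for header in line[1:].rstrip("\r\n").split("\x01"):
        # Assumption: the identifier is the first whitespace-delimited token,
        # and everything after it is free-text description.
        parts = header.strip().split(None, 1)
        identifier = parts[0] if parts else ""
        description = parts[1] if len(parts) > 1 else ""
        entries.append((identifier, description))
    return entries


# Invented example: two headers sharing one line, joined by Control-A.
example = ">id1 first description\x01id2 second description"
for identifier, description in parse_header_line(example):
    print(identifier, "|", description)
```

A header line without any Control-A character simply yields a one-element list, so the same function covers the common single-header case.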

In the original Pearson FASTA format, one or more comment lines, each beginning with a semicolon, may follow the header. Most databases and bioinformatics applications do not recognize these comments and instead follow the NCBI FASTA specification. An example of a multiple-sequence FASTA file follows:
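As a rough illustration, the Python sketch below embeds a small two-record file and reads it back, treating lines that start with a semicolon as comments. The record names and sequences are invented for the example, and `read_fasta` is a hypothetical helper rather than part of any library. It also assumes, for simplicity, that a comment may appear anywhere inside a record, not only directly after the header.

```python
def read_fasta(lines):
    """Yield (header, comments, sequence) for each record in an iterable of lines."""
    header, comments, chunks = None, [], []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                yield header, comments, "".join(chunks)
            header, comments, chunks = line[1:].strip(), [], []
        elif line.startswith(";"):
            # Pearson-style comment; ignored if it precedes the first header.
            if header is not None:
                comments.append(line[1:].strip())
        elif line.strip() and header is not None:
            # Sequence data may span several lines; join them in order.
            chunks.append(line.strip())
    if header is not None:
        yield header, comments, "".join(chunks)


# Invented sample data: two records, the first with one comment line.
SAMPLE = """>seq1 hypothetical first record
; a comment line in the original Pearson style
ACGTACGT
ACGT
>seq2 hypothetical second record
TTGACA
"""

for header, comments, sequence in read_fasta(SAMPLE.splitlines()):
    print(header, comments, sequence)
```

A tool that follows the NCBI convention would typically not accept the semicolon line, so stripping comments before passing a file on is the safer choice.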
