

How BLAST Works


We will take a high-level view of the steps performed by BLAST to generate an alignment, with an emphasis on the "words" used to seed BLAST alignments, and we'll briefly discuss Expect values.

For more detail, see this explanation of the BLAST process.

Global versus local alignments

Comparison of Global versus Local Alignments
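To make the global/local distinction concrete, here is a minimal dynamic-programming sketch that returns only the best alignment score. The function name `alignment_score` and the scoring values (match +1, mismatch -1, linear gap cost -2) are assumptions chosen for illustration; BLAST itself uses heuristics rather than full dynamic programming. A global alignment must span both sequences end to end, while a local alignment may start and stop anywhere and never scores below zero.

```python
def alignment_score(a, b, local, match=1, mismatch=-1, gap=-2):
    """Best alignment score of a and b (linear gap cost, illustrative scoring)."""
    # Row 0: a global alignment pays for leading gaps, a local one does not.
    prev = [0 if local else j * gap for j in range(len(b) + 1)]
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0 if local else i * gap]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            score = max(diag, prev[j] + gap, curr[j - 1] + gap)
            if local:
                # A local alignment can restart anywhere, so never go below 0.
                score = max(score, 0)
                best = max(best, score)
            curr.append(score)
        prev = curr
    # Local: best cell anywhere. Global: bottom-right cell (both ends fixed).
    return best if local else prev[-1]


long_seq, short_seq = "TTTACGTAAA", "ACGT"
local_score = alignment_score(long_seq, short_seq, local=True)
global_score = alignment_score(long_seq, short_seq, local=False)
```

With these sequences the local score rewards only the shared ACGT region, while the global score is dragged down by the gaps needed to cover the unmatched flanks.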

BLAST overview

    • read in the query, database, and search parameters
    • apply query filters, e.g., low complexity and repeats
    • make a lookup table of query “words”
Preliminary search
    • scan the database for word matches
    • gap-free extensions
    • gapped extensions, score only (insertions/deletions are not recorded)
    • gapped extensions with traceback, to calculate the insertions/deletions
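The first steps above (lookup table, database scan, gap-free extension) can be sketched roughly as follows. This is a simplified illustration, not BLAST's implementation: the function names, the scoring values (match +1, mismatch -3), and the drop-off cutoff `x_drop` are all assumptions, and only exact word matches are used as seeds.

```python
from collections import defaultdict


def build_lookup(query, w):
    """Map every length-w word in the query to its start positions."""
    table = defaultdict(list)
    for i in range(len(query) - w + 1):
        table[query[i:i + w]].append(i)
    return table


def find_seeds(table, subject, w):
    """Scan the subject; yield (query_pos, subject_pos) for each exact word match."""
    for j in range(len(subject) - w + 1):
        for i in table.get(subject[j:j + w], ()):
            yield i, j


def extend_ungapped(query, subject, i, j, w, match=1, mismatch=-3, x_drop=5):
    """Extend a seed in both directions without gaps.

    Extension in a direction stops once the running score falls more than
    x_drop below the best score seen. Returns (best_score, query_start, query_end).
    """
    score = best = w * match
    right = 0
    k = w
    while i + k < len(query) and j + k < len(subject):
        score += match if query[i + k] == subject[j + k] else mismatch
        if score > best:
            best, right = score, k - w + 1
        elif best - score > x_drop:
            break
        k += 1
    score = best  # restart from the best right-extended score
    left = 0
    k = 1
    while i - k >= 0 and j - k >= 0:
        score += match if query[i - k] == subject[j - k] else mismatch
        if score > best:
            best, left = score, k
        elif best - score > x_drop:
            break
        k += 1
    return best, i - left, i + w + right


query, subject, w = "ACGTACGTGG", "TTACGTACGTCC", 4
seeds = list(find_seeds(build_lookup(query, w), subject, w))
hit = extend_ungapped(query, subject, 0, 2, w)
```

Building the table from the query rather than the database keeps the table small and lets the database be scanned in a single pass.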

Nucleotides: Word size and Summary

Nucleotide query word size

BLASTn Overview image

Proteins: Word size and Summary

Protein query word size

Protein 2 word match requirement image

Protein neighborhood word matches
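As a rough illustration of neighborhood words: for each query word, a protein search also accepts similar words whose score against the query word meets a threshold. The sketch below uses a hypothetical toy scoring function over a ten-letter alphabet, not a real substitution matrix such as BLOSUM62, and the names `toy_score`, `neighborhood`, and `threshold` are assumptions made for this example.

```python
from itertools import product

# Hypothetical toy scoring: NOT a real substitution matrix such as BLOSUM62.
# Residues in the same group are treated as conservative substitutions.
GROUPS = ["ILVM", "KR", "DE", "ST"]
ALPHABET = "".join(GROUPS)


def toy_score(a, b):
    """Score one residue pair: identity 5, same group 2, otherwise -2."""
    if a == b:
        return 5
    if any(a in g and b in g for g in GROUPS):
        return 2
    return -2


def neighborhood(word, threshold):
    """All words over ALPHABET scoring >= threshold against `word`, best first."""
    hits = []
    for cand in product(ALPHABET, repeat=len(word)):
        s = sum(toy_score(a, b) for a, b in zip(word, cand))
        if s >= threshold:
            hits.append(("".join(cand), s))
    return sorted(hits, key=lambda h: -h[1])


neighbors = neighborhood("LKD", threshold=12)
```

Raising the threshold shrinks the neighborhood toward the exact word only; lowering it admits more distant words, which increases sensitivity at the cost of more seeds to extend.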

BLASTp overview

Expect values

E = the number of database hits with a score ≥ S that you expect to find by chance

Read about:  The Statistics of Sequence Similarity Scores

Example E-value that is found by chance.

Example E-value not due to chance

BLAST Expect Value (In a Nutshell)
  • E = number of database hits you expect to find by chance
  • As the database size increases .... E increases
  • As the score increases .... E decreases 
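Both trends follow from the Karlin-Altschul formula E = K * m * n * exp(-lambda * S), where m is the query length, n is the database length, and S is the raw alignment score. The sketch below is illustrative only: the default `lam` and `k` values are assumed placeholders, since the real parameters depend on the scoring matrix and gap penalties, and BLAST applies length corrections that are omitted here.

```python
import math


def expect_value(score, query_len, db_len, lam=0.318, k=0.134):
    """E = K * m * n * exp(-lambda * S), with assumed illustrative lam and k."""
    return k * query_len * db_len * math.exp(-lam * score)


e_small_db = expect_value(score=50, query_len=100, db_len=1_000_000)
e_large_db = expect_value(score=50, query_len=100, db_len=2_000_000)
e_high_score = expect_value(score=60, query_len=100, db_len=1_000_000)
```

E grows linearly with database size but falls exponentially with score, so a modest gain in score outweighs a large increase in database size.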

Limits, Errors and Warnings

Web BLAST Search Limits
  • 5,000 - maximum number of target sequences
  • 1,000,000 - maximum sequence length for nucleotide queries
  • 100,000 - maximum sequence length for protein queries
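A query can be checked against these limits before submission. The helper below is a hypothetical convenience function, not part of any NCBI tool; its name, arguments, and messages are assumptions, and only the three numeric limits come from the list above.

```python
# Limits taken from the Web BLAST Search Limits list above.
WEB_BLAST_MAX_QUERY_LENGTH = {"nucleotide": 1_000_000, "protein": 100_000}
WEB_BLAST_MAX_TARGET_SEQS = 5_000


def check_web_blast_limits(sequence, molecule, max_target_seqs=100):
    """Return a list of limit violations (an empty list means none found)."""
    if molecule not in WEB_BLAST_MAX_QUERY_LENGTH:
        raise ValueError("molecule must be 'nucleotide' or 'protein'")
    problems = []
    limit = WEB_BLAST_MAX_QUERY_LENGTH[molecule]
    if len(sequence) > limit:
        problems.append(
            f"{molecule} query is {len(sequence):,} residues; limit is {limit:,}"
        )
    if max_target_seqs > WEB_BLAST_MAX_TARGET_SEQS:
        problems.append(
            f"max_target_seqs is {max_target_seqs:,}; "
            f"limit is {WEB_BLAST_MAX_TARGET_SEQS:,}"
        )
    return problems


ok = check_web_blast_limits("ACGT", "nucleotide")
too_long = check_web_blast_limits("A" * 100_001, "protein")
```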

BLAST News Feed    |   NCBI Insights Blog about BLAST settings

Error Messages
Error message for CPU limit exceeded

Error message for Process limit exceeded

Warning Message
Warning message: IP address - large amount of server CPU time


Last Reviewed: September 30, 2022