BLAST


BLAST (Basic Local Alignment Search Tool) is a bioinformatics algorithm for comparing primary biological sequence information, such as the amino acid sequences of proteins or the nucleotide sequences of DNA. A BLAST search lets a researcher compare a query sequence against a library or database of sequences and identify the library sequences whose similarity to the query exceeds a chosen threshold.

Functionality

BLAST is one of the most widely used bioinformatics tools, due to its speed and versatility. It can be used for several kinds of sequence comparisons, including:

  • Nucleotide BLAST (blastn): for comparing nucleotide sequences.
  • Protein BLAST (blastp): for comparing protein sequences.
  • blastx: for comparing a nucleotide query sequence, translated in all six reading frames, against a protein sequence database.
  • tblastn: for comparing a protein query sequence against a nucleotide sequence database that is dynamically translated in all six reading frames.
  • tblastx: for comparing the six-frame translations of a nucleotide query sequence against the six-frame translations of a nucleotide sequence database.
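
The list above can be illustrated with a minimal Python sketch. It is not part of BLAST itself: the names `select_program` and `six_frame_translations` are hypothetical helpers introduced here, and the translation assumes the standard genetic code with unambiguous uppercase DNA input. It shows how the choice of program follows from the query and database types, and what "translated in all reading frames" means in practice.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# (query type, database type, both sides translated?) -> BLAST program name.
PROGRAMS = {
    ("nucleotide", "nucleotide", False): "blastn",
    ("protein", "protein", False): "blastp",
    ("nucleotide", "protein", False): "blastx",
    ("protein", "nucleotide", False): "tblastn",
    ("nucleotide", "nucleotide", True): "tblastx",
}


def select_program(query_type, db_type, translated=False):
    """Hypothetical helper: pick the program name for a query/database pair."""
    return PROGRAMS[(query_type, db_type, translated)]


def reverse_complement(dna):
    """Return the reverse complement of an uppercase DNA string."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(dna):
    """Translate complete codons only; unknown codons become 'X'."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frame_translations(dna):
    """Translate three forward frames and three reverse-complement frames."""
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames


if __name__ == "__main__":
    print(select_program("nucleotide", "protein"))
    print(six_frame_translations("ATGGCC"))
```

In this sketch a translated search simply compares each of the six conceptual protein sequences; the real programs perform the translation internally.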

Algorithm

The BLAST algorithm has several key features that make it faster than exhaustive sequence comparison strategies:

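  • It looks for short, fixed-length matches (words) shared between the query and the database sequences, rather than aligning every pair of sequences in full.
  • For protein searches, it uses a scoring matrix (such as BLOSUM62 or a PAM matrix) to score word pairs and keeps those that score above a threshold as seeds.
  • It extends these seeds in both directions using a heuristic to form high-scoring segment pairs (HSPs), which are longer local alignments that might be biologically significant.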
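
The seed-and-extend idea can be sketched in a few lines of Python. This is a simplified illustration, not the actual BLAST implementation. It assumes exact word matches as seeds, a flat match/mismatch score instead of a substitution matrix, ungapped extension, and a simple drop-off rule for stopping. The function names and the default parameter values (`w`, `match`, `mismatch`, `x_drop`, `min_score`) are chosen here for illustration only.

```python
from collections import defaultdict


def build_word_index(seq, w):
    """Map every word of length w in seq to the positions where it starts."""
    index = defaultdict(list)
    for i in range(len(seq) - w + 1):
        index[seq[i:i + w]].append(i)
    return index


def extend_hit(query, subject, q_pos, s_pos, w, match=1, mismatch=-3, x_drop=5):
    """Extend an exact word hit without gaps in both directions.

    Returns (score, query_start, subject_start, length).
    """
    def walk(step, q, s):
        # Move outward one position at a time, remembering the best score
        # seen so far; stop once the score falls more than x_drop below it.
        score = best = best_len = n = 0
        while 0 <= q < len(query) and 0 <= s < len(subject):
            score += match if query[q] == subject[s] else mismatch
            n += 1
            if score > best:
                best, best_len = score, n
            elif best - score > x_drop:
                break
            q += step
            s += step
        return best, best_len

    right_score, right_len = walk(+1, q_pos + w, s_pos + w)
    left_score, left_len = walk(-1, q_pos - 1, s_pos - 1)
    score = w * match + left_score + right_score
    return score, q_pos - left_len, s_pos - left_len, left_len + w + right_len


def find_hsps(query, subject, w=4, min_score=6):
    """Seed on shared words, extend each seed, keep segments scoring >= min_score."""
    index = build_word_index(subject, w)
    hsps = set()  # different seeds often extend to the same segment
    for q_pos in range(len(query) - w + 1):
        for s_pos in index.get(query[q_pos:q_pos + w], []):
            hsp = extend_hit(query, subject, q_pos, s_pos, w)
            if hsp[0] >= min_score:
                hsps.add(hsp)
    return sorted(hsps, reverse=True)


if __name__ == "__main__":
    print(find_hsps("GATTACA", "CCGATTACACC"))
```

The speed-up comes from the index: only positions that share a word with the query are examined, so most of the database is never aligned at all.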

Applications

BLAST is used extensively in molecular biology, for tasks such as:

  • Identifying species
  • Locating domains
  • Establishing phylogeny
  • Comparing genes across different organisms

Limitations

While BLAST is powerful, it has limitations:

  • It is less effective for sequences that have diverged significantly.
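  • Because it is a heuristic, it does not guarantee finding the optimal alignment, and it may miss significant alignments that contain no word match strong enough to seed an extension.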
  • It is not designed for finding structural similarities or alignments involving long gaps.
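
The effect of divergence on a word-based search can be shown with a toy Python example. The sequences and the helper names `shared_words` and `percent_identity` are made up for illustration, and exact word matching is assumed. Two sequences can be 75% identical and still share no word of the chosen length, in which case a seed-based search has nothing to extend.

```python
def shared_words(query, subject, w):
    """Return the set of length-w words that occur in both sequences."""
    query_words = {query[i:i + w] for i in range(len(query) - w + 1)}
    subject_words = {subject[j:j + w] for j in range(len(subject) - w + 1)}
    return query_words & subject_words


def percent_identity(a, b):
    """Percentage of identical positions between two equal-length sequences."""
    return 100.0 * sum(x == y for x, y in zip(a, b)) / len(a)


if __name__ == "__main__":
    query = "ACGTACGTACGTACGT"
    subject = "ACGAACGAACGAACGA"  # every fourth base substituted
    print(percent_identity(query, subject))
    print(shared_words(query, subject, 4))  # no seeds at word length 4
    print(shared_words(query, subject, 3))  # a shorter word still matches
```

Shorter words recover sensitivity at the cost of more seeds to extend, which is the basic trade-off between speed and sensitivity in this kind of search.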
