BLAST
BLAST (Basic Local Alignment Search Tool) is a bioinformatics algorithm for comparing primary biological sequence information, such as the amino-acid sequences of proteins or the nucleotide sequences of DNA. A BLAST search lets a researcher compare a query sequence against a library or database of sequences and identify the library sequences that resemble the query above a certain threshold.
Functionality
BLAST is one of the most widely used bioinformatics tools, due to its speed and versatility. It can be used for several kinds of sequence comparisons, including:
- Nucleotide BLAST (blastn): for comparing nucleotide sequences.
- Protein BLAST (blastp): for comparing protein sequences.
- blastx: for comparing a nucleotide query sequence, translated in all six reading frames, against a protein database.
- tblastn: for comparing a protein query sequence against a nucleotide sequence database that is dynamically translated in all six reading frames.
- tblastx: for comparing the six-frame translations of a nucleotide query sequence against the six-frame translations of a nucleotide sequence database.
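The six-frame translation that the translated search modes rely on can be illustrated with a minimal sketch. It assumes the standard genetic code and plain DNA input containing only A, C, G, and T. The function names are hypothetical and chosen for illustration; they are not part of any BLAST program.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of a DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def translate(dna):
    """Translate complete codons from position 0; unknown codons become 'X'."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frame_translations(dna):
    """Return translations for frames +1..+3 and -1..-3 (hypothetical helper)."""
    dna = dna.upper()
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames


if __name__ == "__main__":
    for name, protein in six_frame_translations("ATGGCCTAA").items():
        print(name, protein)
```

In this sketch, blastx would correspond to comparing each of the six translated query frames against protein sequences, while tblastx would translate both the query and the database sequences this way.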
Algorithm
The BLAST algorithm has several key features that make it faster than exhaustive sequence comparison strategies:
- It looks for matches between word-length sequences (words) in the query and the database sequences.
- It uses a scoring matrix (such as BLOSUM62 or a PAM matrix) to score word matches and keeps those that score above a threshold as seeds.
- It extends these seeds heuristically in both directions to form high-scoring segment pairs (HSPs), longer local alignments that might be biologically significant.
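The word-matching and extension steps above can be sketched as a toy seed-and-extend search. This is a simplified illustration under stated assumptions, not the BLAST implementation: it seeds on exact word matches only, scores with a fixed +1/-1 scheme instead of a substitution matrix such as BLOSUM62, and extends without gaps until the score falls a fixed amount (`x_drop`) below the best seen. All function names, parameters, and default values are hypothetical.

```python
from collections import defaultdict

MATCH, MISMATCH = 1, -1  # toy scores standing in for a substitution matrix


def index_words(seq, w):
    """Map every length-w word in seq to the positions where it starts."""
    index = defaultdict(list)
    for i in range(len(seq) - w + 1):
        index[seq[i:i + w]].append(i)
    return index


def extend(query, subject, q, s, w, x_drop):
    """Extend an exact word hit at (q, s) without gaps in both directions.

    Extension in a direction stops once the running score falls x_drop
    below the best score seen. Returns (score, q_start, s_start, length).
    """
    score = best = w * MATCH
    best_right = 0
    i = 0
    while q + w + i < len(query) and s + w + i < len(subject):
        score += MATCH if query[q + w + i] == subject[s + w + i] else MISMATCH
        i += 1
        if score > best:
            best, best_right = score, i
        elif best - score >= x_drop:
            break

    score = best  # restart from the best right-hand extension
    best_left = 0
    i = 1
    while q - i >= 0 and s - i >= 0:
        score += MATCH if query[q - i] == subject[s - i] else MISMATCH
        if score > best:
            best, best_left = score, i
        elif best - score >= x_drop:
            break
        i += 1

    return best, q - best_left, s - best_left, w + best_left + best_right


def find_hsps(query, subject, w=4, x_drop=3, min_score=6):
    """Return distinct extended hits scoring at least min_score, best first."""
    index = index_words(subject, w)
    hsps = set()
    for q in range(len(query) - w + 1):
        for s in index.get(query[q:q + w], []):
            hsp = extend(query, subject, q, s, w, x_drop)
            if hsp[0] >= min_score:
                hsps.add(hsp)
    return sorted(hsps, reverse=True)


if __name__ == "__main__":
    print(find_hsps("TTGACCTGA", "AAAGACCTGCCC"))
```

Indexing the words of one sequence up front is what lets the search skip most of the comparison space: only positions that share a word are ever extended, which is the source of the speed advantage over exhaustive alignment and also the reason a similarity without any shared word goes undetected.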
Applications
BLAST is used extensively in molecular biology, for tasks such as:
- Identifying species
- Locating domains
- Establishing phylogeny
- Comparing genes across different organisms
Limitations
While BLAST is powerful, it has limitations:
- It is less effective for sequences that have diverged significantly.
- Because it is a heuristic, it is not guaranteed to find the optimal alignment and may miss significant alignments that contain no sufficiently high-scoring word match.
- It is not designed for finding structural similarities or alignments involving long gaps.