Position-specific scoring matrix

From WikiMD's Wellness Encyclopedia

A position-specific scoring matrix (PSSM), also known as a position weight matrix (PWM), is a commonly used representation of motifs (patterns) in biological sequences. PSSMs are widely used in bioinformatics for tasks such as sequence alignment, motif finding, and protein structure prediction.

Overview

A PSSM is a matrix that assigns a score to each possible residue (nucleotide or amino acid) at each position in a sequence motif. Each column of the matrix corresponds to a position in the motif, and each row corresponds to one of the possible residues. The values in the matrix are typically log-odds scores: the logarithm of the ratio of the observed frequency of a residue at a position to the expected (background) frequency of that residue.

Construction

To construct a PSSM, one typically starts with a set of aligned sequences that are believed to contain the motif of interest. The frequency of each residue at each position is calculated, and these frequencies are converted into scores. The scores can be calculated using the formula:

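\[ S_{ij} = \log_2 \left( \frac{f_{ij}}{p_i} \right) \]

where \(S_{ij}\) is the score for residue \(i\) at position \(j\), \(f_{ij}\) is the frequency of residue \(i\) at position \(j\), and \(p_i\) is the background frequency of residue \(i\). A positive score means the residue occurs at that position more often than expected by chance; a negative score means it occurs less often.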
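
The construction steps above can be sketched in Python. This is a minimal illustration, not a reference implementation: the function name `build_pssm`, the uniform background, and the pseudocount of 1 are assumptions introduced here. The pseudocount is not part of the formula above; it is added so that a residue never seen at a position gets a finite negative score instead of \(\log_2 0\).

```python
import math
from collections import Counter


def build_pssm(aligned_seqs, alphabet="ACGT", background=None, pseudocount=1.0):
    """Return {residue: [log2-odds score at each position]}.

    Hypothetical helper for illustration; assumes gap-free sequences
    of equal length drawn from `alphabet`.
    """
    if not aligned_seqs:
        raise ValueError("need at least one aligned sequence")
    length = len(aligned_seqs[0])
    if any(len(seq) != length for seq in aligned_seqs):
        raise ValueError("all aligned sequences must have the same length")
    if background is None:
        # Assumption: every residue is equally likely in the background.
        background = {res: 1.0 / len(alphabet) for res in alphabet}

    n = len(aligned_seqs)
    pssm = {res: [] for res in alphabet}
    for j in range(length):
        # Count how often each residue appears in column j.
        counts = Counter(seq[j] for seq in aligned_seqs)
        for res in alphabet:
            # Smoothed frequency f_ij (pseudocount is an added assumption).
            f = (counts[res] + pseudocount) / (n + pseudocount * len(alphabet))
            # Log-odds score S_ij = log2(f_ij / p_i).
            pssm[res].append(math.log2(f / background[res]))
    return pssm


# Toy alignment, invented for this example.
sites = ["ACGT", "ACGA", "ACGT", "TCGT"]
pssm = build_pssm(sites)
for res in "ACGT":
    print(res, [round(score, 2) for score in pssm[res]])
```

With `pseudocount=0`, any residue absent from a column would make `math.log2` raise a `ValueError`, which is why some form of smoothing is commonly applied when the alignment is small.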

Applications

PSSMs are used in various applications in bioinformatics:

  • Sequence Alignment: PSSMs are used in PSI-BLAST and other profile-based alignment tools to score alignments based on the likelihood of observing certain residues at specific positions.
  • Motif Finding: PSSMs are used to identify conserved motifs in DNA, RNA, or protein sequences, which can be indicative of functional or structural elements.
  • Protein Structure Prediction: PSSMs built from alignments of related proteins are used as input features by methods that predict secondary and tertiary structure, because conserved patterns often correspond to structural features.
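
As an illustration of the motif-finding use, the following sketch slides a PSSM along a sequence and reports windows whose total score meets a cutoff. The function names `score_window` and `scan`, the toy matrix values, and the threshold are all hypothetical and chosen only to keep the example small; real tools choose thresholds statistically.

```python
def score_window(pssm, window):
    """Sum the per-position scores of the residues in `window`."""
    return sum(pssm[res][j] for j, res in enumerate(window))


def scan(pssm, sequence, threshold=0.0):
    """Return (start, score) for every window scoring >= threshold."""
    width = len(next(iter(pssm.values())))  # number of motif positions
    hits = []
    for start in range(len(sequence) - width + 1):
        score = score_window(pssm, sequence[start:start + width])
        if score >= threshold:
            hits.append((start, score))
    return hits


# Hypothetical 3-position DNA PSSM favouring "ACG";
# the log-odds values are invented for illustration.
toy_pssm = {
    "A": [1.0, -1.0, -1.0],
    "C": [-1.0, 1.0, -1.0],
    "G": [-1.0, -1.0, 1.0],
    "T": [-1.0, -1.0, -1.0],
}

hits = scan(toy_pssm, "TTACGTT", threshold=2.0)
print(hits)  # [(2, 3.0)]
```

Only the window starting at index 2 ("ACG") reaches the threshold; every other window scores -3.0.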

Advantages and Limitations

PSSMs provide a simple yet powerful way to represent sequence motifs. They are easy to interpret and can be used to score sequences quickly. However, PSSMs assume independence between positions, which may not always be the case in biological sequences. More complex models, such as hidden Markov models (HMMs), can capture dependencies between positions but are computationally more intensive.
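
The independence assumption can be written out explicitly. As a sketch, let \(x_j\) denote the residue at position \(j\) of a candidate sequence \(x\) of length \(L\) (this notation is introduced here for illustration). The total score of \(x\) is then a plain sum of per-position entries from the matrix:

\[ S(x) = \sum_{j=1}^{L} S_{x_j j} \]

Each term depends only on the residue at its own position, so the matrix has no way to reward or penalize a particular combination of residues at two different positions.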

Contributors: Prab R. Tumpati, MD