Generalized linear model

From WikiMD's Wellness Encyclopedia

A generalized linear model (GLM) is a flexible generalization of ordinary linear regression that allows the response variable to follow an error distribution other than the normal distribution. It generalizes linear regression in two ways: the linear model is related to the response variable through a link function, and the variance of each measurement is allowed to be a function of its predicted value.

Introduction

Generalized linear models extend the linear model so that the target variable, Y, can have a non-normal distribution. They are used in fields such as biostatistics, machine learning, and econometrics to model relationships between variables. A GLM consists of three components:

  • The random component specifies the probability distribution of the response variable (e.g., normal, binomial, Poisson).
  • The systematic component specifies the linear predictor, which is a linear combination of unknown parameters and known covariates.
  • The link function specifies the relationship between the linear predictor and the mean of the response distribution.
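
The three components can be illustrated with a minimal Python sketch for a binary response. The function names, the coefficient values, and the covariate value are hypothetical and chosen only for illustration; the logit link is assumed.

```python
import math

def linear_predictor(beta, x):
    """Systematic component: eta = beta_0 + beta_1*x_1 + ... + beta_p*x_p."""
    return beta[0] + sum(b * xj for b, xj in zip(beta[1:], x))

def inverse_logit(eta):
    """Inverse of the logit link: maps the linear predictor to a mean in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-eta))

def bernoulli_log_pmf(y, mu):
    """Random component: log-probability of a binary outcome y with mean mu."""
    return y * math.log(mu) + (1 - y) * math.log(1.0 - mu)

beta = [-1.0, 0.5]  # hypothetical coefficients (intercept, slope)
x = [2.0]           # hypothetical value of a single covariate

eta = linear_predictor(beta, x)      # -1.0 + 0.5 * 2.0 = 0.0
mu = inverse_logit(eta)              # mean of the response, here 0.5
log_prob = bernoulli_log_pmf(1, mu)  # log-probability of observing y = 1
```

Swapping the inverse link and the log-probability function for those of another distribution gives a different GLM while the linear predictor stays the same.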

Mathematical Formulation

Given a dataset with n observations \((y_1, x_1), (y_2, x_2), \dots, (y_n, x_n)\), where \(y_i\) is the response and \(x_i\) is a vector of \(p\) covariates for the ith observation, each pair is treated as a realization of a response \(Y\) and covariates \(X = (X_1, \dots, X_p)\). The GLM posits that:

\[g(E(Y|X)) = \beta_0 + \beta_1X_1 + \dots + \beta_pX_p\]

where \(E(Y|X)\) is the expected value of \(Y\) given \(X\), \(g(\cdot)\) is the link function, and \(\beta_0, \beta_1, \dots, \beta_p\) are the coefficients to be estimated.
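
As a worked illustration, write \(\eta = \beta_0 + \beta_1X_1 + \dots + \beta_pX_p\) for the linear predictor (the symbol \(\eta\) is introduced here only for convenience). Assuming the link function is invertible, as the commonly used links are, the mean follows by applying the inverse link:

\[E(Y|X) = g^{-1}(\eta)\]

For example, with the logit link \(g(\mu) = \log\frac{\mu}{1 - \mu}\), this gives:

\[E(Y|X) = \frac{1}{1 + e^{-\eta}}\]

so the predicted mean always lies between 0 and 1, whatever the value of the linear predictor.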

Types of GLMs

GLMs can be categorized based on the distribution of the response variable and the link function used. Common types include:

  • Linear Regression: Normal distribution with identity link.
  • Logistic Regression: Binomial distribution with logit link, used for binary outcomes.
  • Poisson Regression: Poisson distribution with log link, used for count data.
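
The three links listed above can be written out together with their inverses. The following is a small sketch in Python; the `LINKS` table and the `round_trip` helper are hypothetical names used only for illustration.

```python
import math

# Each entry maps a link name to (g, g_inverse):
# g takes the mean mu to the linear predictor eta, g_inverse goes back.
LINKS = {
    "identity": (lambda mu: mu,
                 lambda eta: eta),
    "logit": (lambda mu: math.log(mu / (1.0 - mu)),
              lambda eta: 1.0 / (1.0 + math.exp(-eta))),
    "log": (lambda mu: math.log(mu),
            lambda eta: math.exp(eta)),
}

def round_trip(name, mu):
    """Apply a link and then its inverse; the result should equal mu."""
    g, g_inverse = LINKS[name]
    return g_inverse(g(mu))

identity_check = round_trip("identity", 3.7)  # any real-valued mean
logit_check = round_trip("logit", 0.25)       # mean must lie in (0, 1)
log_check = round_trip("log", 4.0)            # mean must be positive
```

The domain of each link matches the range of the mean for its distribution: any real number for the normal, a probability for the binomial, and a positive rate for the Poisson.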

Estimation

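The parameters of a GLM are usually estimated by maximum likelihood estimation (MLE): the goal is to find the parameter values that maximize the likelihood of the observed data. The maximization generally has no closed-form solution outside the normal case and is carried out numerically, typically by iteratively reweighted least squares (IRLS).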
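
One common way to compute the maximum likelihood estimate numerically is iteratively reweighted least squares (IRLS). The sketch below applies it to a Poisson regression with a log link and a single covariate, using only the Python standard library. The function name `fit_poisson_irls` and the data values are hypothetical, and the code is a minimal illustration under these assumptions rather than a reference implementation.

```python
import math

def fit_poisson_irls(x, y, n_iter=25, tol=1e-10):
    """Fit log(E[y]) = b0 + b1*x by iteratively reweighted least squares."""
    b0 = math.log(sum(y) / len(y))  # start from the intercept-only fit
    b1 = 0.0
    for _ in range(n_iter):
        eta = [b0 + b1 * xi for xi in x]
        mu = [math.exp(e) for e in eta]
        # Working response and weights for the log link with Poisson variance.
        z = [e + (yi - m) / m for e, yi, m in zip(eta, y, mu)]
        w = mu
        # Weighted least squares of z on (1, x): solve the 2x2 normal equations.
        s_w = sum(w)
        s_wx = sum(wi * xi for wi, xi in zip(w, x))
        s_wxx = sum(wi * xi * xi for wi, xi in zip(w, x))
        s_wz = sum(wi * zi for wi, zi in zip(w, z))
        s_wxz = sum(wi * xi * zi for wi, xi, zi in zip(w, x, z))
        det = s_w * s_wxx - s_wx * s_wx
        new_b0 = (s_wxx * s_wz - s_wx * s_wxz) / det
        new_b1 = (s_w * s_wxz - s_wx * s_wz) / det
        converged = abs(new_b0 - b0) + abs(new_b1 - b1) < tol
        b0, b1 = new_b0, new_b1
        if converged:
            break
    return b0, b1

# Hypothetical count data that grows with the covariate.
x = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [1, 2, 4, 7, 12]
b0, b1 = fit_poisson_irls(x, y)
fitted = [math.exp(b0 + b1 * xi) for xi in x]  # estimated means
```

At the maximum likelihood estimate the residuals `y - fitted` sum to zero, both unweighted and weighted by `x`, which gives a simple check that the iteration has converged.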

Applications

GLMs have a wide range of applications, including in biostatistics, machine learning, and econometrics.

Advantages and Limitations

The main advantage of GLMs is their flexibility in modeling different types of data, such as continuous, binary, and count responses. However, they also have limitations: they assume that the link-transformed mean of the response is linear in the predictors, and they require the link function and the response distribution to be specified correctly.

Credits:Most images are courtesy of Wikimedia commons, and templates Wikipedia, licensed under CC BY SA or similar.

Contributors: Prab R. Tumpati, MD