Logistic Regression

Logistic regression is a variation of ordinary regression that is used when the dependent (response) variable is dichotomous (i.e., takes only two values). The dichotomous variable represents the occurrence or non-occurrence of some outcome event, usually coded as 0 or 1 (e.g., in a medical study, the patient survives or does not survive). The independent (input) variables may be continuous, categorical, or both.

Unlike ordinary linear regression, logistic regression does not assume that the relationship between the independent variables and the dependent variable is linear; it assumes only that the log odds of the outcome are a linear function of the predictors. Nor does it assume that the dependent variable or the error terms are normally distributed. In logistic regression, a binary logistic model is used to estimate the probability of a binary response based on one or more predictor (independent) variables.

The binary logistic model can be written as follows:

    log(p/(1-p)) = b₀ + b₁X₁ + b₂X₂ + ... + b_kX_k

where p is the probability that Y = 1 and X₁, X₂, ..., X_k are the independent variables (predictors). The values b₀, b₁, b₂, ..., b_k are the regression coefficients, which are estimated from the data. Logistic regression estimates the probability that a given event occurs.
Logistic regression thus forms a predictor variable, log(p/(1-p)) (the natural log of the odds, also called the logit), that is a linear combination of the explanatory variables. The values of this predictor variable are then transformed into probabilities by a logistic function, p = 1/(1 + e^(-z)), where z is the value of the predictor variable. Plotted with the predictor variable on the horizontal axis and the probability on the vertical axis, this function has the shape of an S.
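
The transformation from the linear combination to a probability can be illustrated with a minimal Python sketch. The coefficient values and the function names (linear_predictor, logistic, logit) are hypothetical, chosen only for illustration; in practice the coefficients would be estimated from data.

```python
import math

def linear_predictor(x, coefs, intercept):
    """Log odds: b0 + b1*x1 + ... + bk*xk."""
    return intercept + sum(b * xi for b, xi in zip(coefs, x))

def logistic(z):
    """Map a log-odds value z to a probability between 0 and 1."""
    # Two branches avoid overflow in exp() for large |z|.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def logit(p):
    """Inverse of the logistic function: probability -> log odds."""
    return math.log(p / (1.0 - p))

# Hypothetical coefficients (assumed values, not estimated from any data).
b0 = -1.0
b = [0.5, 2.0]

z = linear_predictor([2.0, 0.0], b, b0)  # -1.0 + 0.5*2.0 + 2.0*0.0 = 0.0
p = logistic(z)                          # log odds of 0 correspond to p = 0.5
print(z, p)
```

Large negative values of z give probabilities near 0 and large positive values give probabilities near 1, which produces the S shape described above.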

Logistic regression produces an odds ratio (OR) for each predictor variable. The odds of an event are defined as the probability of the event occurring divided by the probability of the event not occurring, p/(1-p). In general, an OR is one set of odds divided by another. The odds ratio for a predictor is the factor by which the odds of the outcome increase (OR greater than 1.0) or decrease (OR less than 1.0) when the value of that predictor is increased by one unit and the other predictors are held fixed. In other words, OR = (odds at PV+1)/(odds at PV), where PV is the value of the predictor variable.
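
Because the log odds are linear in each predictor, raising a predictor by one unit adds its coefficient to the log odds, so the odds ratio for that predictor works out to e raised to the coefficient. The following minimal Python sketch checks this numerically for a model with a single predictor; the values of b0 and b1 and the helper names are hypothetical, chosen only for illustration.

```python
import math

def logistic(z):
    """Map a log-odds value z to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))

def odds(p):
    """Odds of an event: P(event) / P(no event)."""
    return p / (1.0 - p)

# Hypothetical single-predictor model (assumed values, not fitted to data).
b0, b1 = -2.0, 0.7

def odds_at(pv):
    """Odds of the outcome when the predictor takes the value pv."""
    return odds(logistic(b0 + b1 * pv))

pv = 3.0
odds_ratio = odds_at(pv + 1.0) / odds_at(pv)

# The ratio does not depend on the starting value pv; it matches exp(b1).
print(odds_ratio, math.exp(b1))
```

Since b1 is positive in this sketch, the odds ratio is greater than 1.0, meaning the odds of the outcome increase as the predictor increases; a negative coefficient would give an odds ratio below 1.0.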