Since the basic equation for a logistic regression does not include an error term, where does the standard error come from in software that calculates it? Is this a matter of no error in theory, but an error in the method used to approximate the estimate?
From the SAS web site, this might help:
Interesting, but doesn’t seem to be on topic. Or at least I cannot see how.
It’s been a while since I did any, but my recollection is that a logistic regression is just a linear regression on transformed data. Usually the probability, p, of something happening is estimated as
ln(p/(1-p)) = b[sub]0[/sub] + b[sub]1[/sub]X[sub]1[/sub] + … + b[sub]n[/sub]X[sub]n[/sub] + e
e is the error term, and its standard error is used to determine the standard errors of the b estimates.
There may be other transformations as well, but the basic idea is to use some function f(p) which maps p in (0,1) to (-inf, +inf). Whatever transformation you use, the standard errors of the regression coefficients, b, come from basic OLS theory (though you could use other regression methods like weighted least squares or GLS).
The standard error of an estimated p would be different, as the estimated p would not have a normal distribution; rather, z = ln(p/(1-p)) would. But you could still do all your significance tests directly on z. The rejection probabilities would be the same as in the standard case.
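If it helps to see that mechanically, here is a minimal Python sketch of the transform-then-OLS idea described above. It assumes grouped data (many trials at each X level, so an observed proportion is available at each level); the predictor levels and counts are made up purely for illustration.
[code]
import numpy as np

# hypothetical grouped data: predictor levels, trials, successes
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
n = np.array([50, 50, 50, 50, 50])   # trials at each level
k = np.array([5, 12, 25, 38, 45])    # successes at each level

p = k / n                            # observed proportions in (0, 1)
z = np.log(p / (1 - p))              # logit transform: (0,1) -> (-inf, +inf)

# ordinary least squares of z on x
X = np.column_stack([np.ones_like(x), x])
b, *_ = np.linalg.lstsq(X, z, rcond=None)

# OLS theory then gives the standard errors of b in the usual way
resid = z - X @ b
s2 = resid @ resid / (len(z) - 2)    # residual variance estimate
cov = s2 * np.linalg.inv(X.T @ X)    # covariance matrix of (b0, b1)
se = np.sqrt(np.diag(cov))
print(b, se)
[/code]
(In practice weighted least squares is preferred here, since the variance of the empirical logit depends on p and n, which is the WLS/GLS point above.)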
Oldguy is correct; I was just trying to provide a larger overview of looking at model validity.
That sounds more like a probit model to me.
Simple logistic regression models the probability of success of an event as a function of predictors. You can write out the model just like you would a normal-theory regression model:
Y_i = beta0 + beta1*X_i + e_i.
The e_i are random variables (the betas and X are assumed to be constants); they’re just not normally distributed.
Missed the edit window:
I meant to include that usually, we specify the model as E(Y_i) = p_i = P(Y_i = 1) = exp(beta0 + beta1*X_i)/(1 + exp(beta0 + beta1*X_i)). The parameters can be estimated using numerical maximum likelihood estimation. Maximum likelihood estimators, in turn, have limiting normal distributions, so the estimated variances of the parameter estimates can be found.
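To illustrate where the standard errors come from under this maximum likelihood view, here is a minimal Python sketch: it simulates Bernoulli data, maximizes the log-likelihood numerically, and reads the estimated variances off the inverse Hessian. The data and variable names are invented for illustration; this just shows the standard asymptotic-normality argument, not any particular package’s internals.
[code]
import numpy as np
from scipy.optimize import minimize

# simulate illustrative data: Bernoulli outcomes with a logit link
rng = np.random.default_rng(0)
x = rng.normal(size=200)
p = 1 / (1 + np.exp(-(-0.5 + 1.2 * x)))   # true beta0 = -0.5, beta1 = 1.2
y = rng.binomial(1, p)

def negloglik(b):
    eta = b[0] + b[1] * x                  # linear predictor
    # Bernoulli log-likelihood with logit link: sum of y*eta - log(1+exp(eta))
    return -np.sum(y * eta - np.log1p(np.exp(eta)))

fit = minimize(negloglik, x0=np.zeros(2), method="BFGS")

# MLEs are asymptotically normal with covariance given by the inverse of
# the information matrix; BFGS's approximate inverse Hessian estimates it
se = np.sqrt(np.diag(fit.hess_inv))
print(fit.x, se)
[/code]
So the reported standard errors come from the curvature of the log-likelihood at its maximum, not from an explicit error term in the model equation.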
No, the “logit” refers to the log of the odds, Z = log(p/(1-p)).
A probit regression is:
N[sup]-1[/sup](p) = b[sub]0[/sub] + b[sub]1[/sub]x[sub]1[/sub] + … + b[sub]n[/sub]x[sub]n[/sub]
where N() is the cumulative normal distribution.
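For anyone who wants to see the two links side by side, a quick Python sketch (assuming scipy is available; its norm.ppf is the inverse cumulative normal N[sup]-1[/sup]):
[code]
import numpy as np
from scipy.stats import norm

p = np.array([0.1, 0.3, 0.5, 0.7, 0.9])

logit = np.log(p / (1 - p))   # logit link: log odds
probit = norm.ppf(p)          # probit link: inverse normal CDF

# both map (0, 1) onto the whole real line; the probit values are roughly
# the logit values divided by about 1.6 to 1.8
print(np.column_stack([p, logit, probit]))
[/code]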