Since the basic equation for a logistic regression does not include an error term, where does the standard error come from in software that calculates it? Is this a matter of no error in theory, but an error in the method used to approximate the estimate?
From the SAS web site, this might help:
Interesting, but doesn’t seem to be on topic. Or at least I cannot see how.
It’s been a while since I did any, but my recollection is that a logistic regression is just a linear regression on transformed data. Usually the probability, p, of something happening is estimated as
ln(p/(1-p)) = b[sub]0[/sub] + b[sub]1[/sub]X[sub]1[/sub] + … + b[sub]n[/sub]X[sub]n[/sub] + e
e is the error term, and its standard error is used to determine the standard errors of the b estimates.
There may be other transformations as well, but the basic idea is to use some function f(p) which maps p in (0,1) to (-inf, +inf). Whatever transformation you use, the standard errors of the regression coefficients, b, come from basic OLS theory (though you could use other regression methods like weighted least squares or GLS).
The standard error of an estimated p would be different, as the estimated p would not have a normal distribution; rather, z = ln(p/(1-p)) would. But you could still do all your significance tests directly on z. The rejection probabilities would be the same as in the standard case.
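If it helps to see that mechanically, here is a minimal Python sketch of the transform-then-OLS idea described above. It assumes grouped data (many trials at each X level, so an observed proportion is available at each level); the predictor levels and counts are made up purely for illustration.
[code]
import numpy as np

# hypothetical grouped data: predictor levels, trials, successes
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
n = np.array([50, 50, 50, 50, 50])   # trials at each level
k = np.array([5, 12, 25, 38, 45])    # successes at each level

p = k / n                            # observed proportions in (0, 1)
z = np.log(p / (1 - p))              # logit transform: (0,1) -> (-inf, +inf)

# ordinary least squares of z on x
X = np.column_stack([np.ones_like(x), x])
b, *_ = np.linalg.lstsq(X, z, rcond=None)

# OLS theory then gives the standard errors of b in the usual way
resid = z - X @ b
s2 = resid @ resid / (len(z) - 2)    # residual variance estimate
cov = s2 * np.linalg.inv(X.T @ X)    # covariance matrix of (b0, b1)
se = np.sqrt(np.diag(cov))
print(b, se)
[/code]
(In practice weighted least squares is preferred here, since the variance of the empirical logit depends on p and n, which is the WLS/GLS point above.)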
Oldguy is correct; I was just trying to provide a larger overview of looking at model validity.
That sounds more like a probit model to me.
Simple logistic regression models the probability of success of an event as a function of predictors. You can write out the model just like you would a normal-theory regression model:
Y_i = beta0 + beta1*X_i + e_i.
The e_i are random variables (the betas and X are assumed to be constants); they’re just not normally distributed.
Missed the edit window:
I meant to include that usually, we specify the model as E(Y_i) = p_i = P(Y_i = 1) = exp(beta0 + beta1*X_i)/(1 + exp(beta0 + beta1*X_i)). The parameters can be estimated using numerical maximum likelihood estimation. Maximum likelihood estimators, in turn, have limiting normal distributions, so the estimated variances of the parameter estimates can be found.
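To illustrate where the standard errors come from under this maximum likelihood view, here is a minimal Python sketch: it simulates Bernoulli data, maximizes the log-likelihood numerically, and reads the estimated variances off the inverse Hessian. The data and variable names are invented for illustration; this just shows the standard asymptotic-normality argument, not any particular package’s internals.
[code]
import numpy as np
from scipy.optimize import minimize

# simulate illustrative data: Bernoulli outcomes with a logit link
rng = np.random.default_rng(0)
x = rng.normal(size=200)
p = 1 / (1 + np.exp(-(-0.5 + 1.2 * x)))   # true beta0 = -0.5, beta1 = 1.2
y = rng.binomial(1, p)

def negloglik(b):
    eta = b[0] + b[1] * x                  # linear predictor
    # Bernoulli log-likelihood with logit link: sum of y*eta - log(1+exp(eta))
    return -np.sum(y * eta - np.log1p(np.exp(eta)))

fit = minimize(negloglik, x0=np.zeros(2), method="BFGS")

# MLEs are asymptotically normal with covariance given by the inverse of
# the information matrix; BFGS's approximate inverse Hessian estimates it
se = np.sqrt(np.diag(fit.hess_inv))
print(fit.x, se)
[/code]
So the reported standard errors come from the curvature of the log-likelihood at its maximum, not from an explicit error term in the model equation.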
No, the “logit” refers to the log of the odds, Z = log(p/(1-p)).
A probit regression is:
N[sup]-1[/sup](p) = b[sub]0[/sub] + b[sub]1[/sub]x[sub]1[/sub] + … + b[sub]n[/sub]x[sub]n[/sub]
where N() is the cumulative normal distribution.
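For anyone who wants to see the two links side by side, a quick Python sketch (assuming scipy is available; its norm.ppf is the inverse cumulative normal N[sup]-1[/sup]):
[code]
import numpy as np
from scipy.stats import norm

p = np.array([0.1, 0.3, 0.5, 0.7, 0.9])

logit = np.log(p / (1 - p))   # logit link: log odds
probit = norm.ppf(p)          # probit link: inverse normal CDF

# both map (0, 1) onto the whole real line; the probit values are roughly
# the logit values divided by about 1.6 to 1.8
print(np.column_stack([p, logit, probit]))
[/code]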