# Correlation coefficient in polynomial regression

I’ve figured out how to do multiple linear regression, i.e. regression to find the best-fit polynomial (of a certain degree). For example, you can run quadratic regression and cubic regression.

For linear regression, there’s a pretty straightforward way of finding the correlation coefficient. How does it work with polynomial regression? Sinusoidal regression? Logistic regression?

Unless you’re talking about something unusual when you say “correlation coefficient”, it’s exactly the same for any pair of variables.

I’m guessing he’s wondering how you handle it when you’ve got a regression with multiple regressors. As best as I understand coefficients of correlation, however, by definition they only measure correlation between exactly two variables. I think there are other tests that can be used, for example, to test whether all of the coefficients on your regressors are jointly insignificant, but it’s been a little while since I’ve taken any statistics courses.

However, with a multiple regression, I’m pretty sure you would have to calculate r for each regressor against the regressand. That’s my poor memory of it, anyway. I guess it will have to suffice (with a massive grain of salt) until someone with a little fresher knowledge of statistics happens by.

Oh yeah, it would not, however, change anything if you did a logarithmic transformation or anything like that. r for X and log(Y) is still r, but of course is different from the coefficient of correlation between X and Y. Just imagine that all you did was swap out “Y” for “Q” except Q happens to be log(Y).
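To make the point about transformations concrete, here is a minimal sketch in plain Python. The data are made up for illustration (Y is exactly exp(X)), and `pearson_r` is a hypothetical helper name, not something from a particular library. It computes the usual Pearson r once for (X, Y) and once for (X, Q) with Q = log(Y).

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Made-up data: Y grows exponentially with X.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [math.exp(x) for x in xs]
qs = [math.log(y) for y in ys]  # Q = log(Y)

r_xy = pearson_r(xs, ys)  # X against Y: relationship is not linear
r_xq = pearson_r(xs, qs)  # X against Q: Q is linear in X
```

The same formula is used both times; only the second variable changes, which is why the two values differ.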

Ah, so the OP is looking to test the hypothesis that $\beta_i = 0$ for each $i$. There is a test for that, but it’s been a while since I did much of this, so I’m not sure I can find it.

Here’s my question, stated a little more clearly:

When you do linear regression, you find a “least-squares function” that is the “best fit” line. There’s a formula that tells you how well this “best fit line” correlates to the actual data (“R”).

I’m wondering if there’s a variable like this if you’re running regression for, say, a quintic function. Or a sinusoidal function, or a logistic function, etc.

That is, “R” is the coefficient, and there’s a formula to find R.

You’d use $R^2$ in every case, or at least that’s what my linear regression professor said. It wasn’t an advanced class, so maybe there’s something better, but I don’t know of it.

Rather than thinking of the equation as $y = ax^2 + bx + c$, think of it as $y = ax_1 + bx_0 + c$, with $x_1 = x_0^2$.

Oh, I see. And to calculate R^2 you do (SSdev-SSres)/SSdev, right? Thanks!
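As a rough illustration of both ideas (treating each power of x as its own regressor, then computing R^2 as (SSdev-SSres)/SSdev), here is a sketch in plain Python. The data are made up, the function names are hypothetical, and solving the normal equations directly is just one simple way to get the least-squares coefficients; a numerical library would normally do this step for you.

```python
def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= factor * m[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def fit_polynomial(xs, ys, degree):
    """Least-squares coefficients [c0, c1, ...] for c0 + c1*x + c2*x^2 + ..."""
    # Each power of x is treated as a separate regressor (1, x, x^2, ...).
    rows = [[x ** p for p in range(degree + 1)] for x in xs]
    k = degree + 1
    # Normal equations: (X^T X) c = X^T y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(k)]
    return solve_linear_system(xtx, xty)

def r_squared(ys, predictions):
    """R^2 = (SSdev - SSres) / SSdev."""
    mean_y = sum(ys) / len(ys)
    ss_dev = sum((y - mean_y) ** 2 for y in ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, predictions))
    return (ss_dev - ss_res) / ss_dev

# Made-up data, roughly y = 2x^2 + 1 with small perturbations.
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
ys = [1.1, 2.9, 9.2, 18.8, 33.1, 50.9]

coeffs = fit_polynomial(xs, ys, degree=2)
predictions = [sum(c * x ** p for p, c in enumerate(coeffs)) for x in xs]
r2 = r_squared(ys, predictions)
```

Note that `r_squared` only needs the observed values and the fitted predictions, so the same function can be applied to predictions from any other fitted curve.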

Sorry to bump this old thread, but can anyone help me figure out how to do sinusoidal regression?

This makes some sense but I don’t know how to apply it.

I’d guess that you just pick the amplitude and phase of the sinusoid to minimize the RMS error between the sinusoid and the data. That’s how you write a Fourier series, at any rate.

You’d pick the sinusoid $a\cos(bx + c) + d$.

But how do you do that? The EE in me would just find the Fourier series representation, find the lowest frequency peak, and grab the constants from that. But I’m sure there’re other ways.

What you need to do is to take the expression $\sum_{i=1}^{n} \left(a\cos(bx_i + c) + d - y_i\right)^2$, take the partial derivatives with respect to each coefficient, and set them simultaneously equal to 0. Solve, and you’ll have the least-squares coefficients.

Please notice how you don’t see me doing this. It’s gonna be nasty.
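For anyone who wants numbers rather than algebra, here is one numerical sketch in plain Python. It is not the solve-the-partial-derivatives-by-hand route described above. Instead it relies on the identity a*cos(bx + c) + d = p*cos(bx) + q*sin(bx) + d, with p = a*cos(c) and q = -a*sin(c): for a fixed b the problem is ordinary linear least squares in (p, q, d), so the sketch scans a grid of candidate b values and keeps the one with the smallest sum of squared errors. The data, the grid, and the function names are all assumptions made for illustration, and the answer can only be as fine as the grid.

```python
import math

def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= factor * m[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def fit_sinusoid(xs, ys, frequencies):
    """Fit y ~ a*cos(b*x + c) + d; returns (sse, a, b, c, d).

    Candidate b values must keep cos(b*x), sin(b*x) and the constant column
    linearly independent on the data (so b = 0 is not allowed).
    """
    best = None
    for b in frequencies:
        # For fixed b the model is linear in (p, q, d).
        rows = [[math.cos(b * x), math.sin(b * x), 1.0] for x in xs]
        xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
        xty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(3)]
        p, q, d = solve_linear_system(xtx, xty)
        sse = sum((p * r[0] + q * r[1] + d - y) ** 2 for r, y in zip(rows, ys))
        if best is None or sse < best[0]:
            # Convert (p, q) back to amplitude a and phase c.
            best = (sse, math.hypot(p, q), b, math.atan2(-q, p), d)
    return best

# Made-up noiseless data from y = 2*cos(1.5x + 0.5) + 3.
xs = [0.1 * i for i in range(100)]
ys = [2.0 * math.cos(1.5 * x + 0.5) + 3.0 for x in xs]
frequencies = [k / 20 for k in range(10, 61)]  # hypothetical grid: 0.5 to 3.0

sse, a, b, c, d = fit_sinusoid(xs, ys, frequencies)
```

With real, noisy data the grid result would usually serve only as a starting guess for a general nonlinear least-squares routine.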

OK, so just for fun, I tried to find the coefficients. You’ll need a computer algebra system of some kind to handle it; it’s too complicated to do by hand.