Straight Dope Message Board

#1
12-02-2009, 04:12 PM
 mischievous Charter Member Join Date: Mar 2001 Posts: 1,307
Fairly basic statistics question.

Okay, so as a professional biologist it pains me to admit this, but I know little to nothing about statistics. I can do a Chi-squared analysis on Mendelian ratios and a Mann-Whitney U-test for a single parameter, but that's about it.

So I currently have a data set that has two parameters - age of the mouse embryo, and length of a particular tissue - with three different mutant genotypes. When I do a scatter plot of the data, it looks like this. The different colors represent different genotypes (I had Excel add "Trend lines", whatever the hell those are.)

So, to my eye, there is no difference between the different genotypes (colors), but I need someone to point me towards the right test to use to actually get a p-value out of that subjective judgment. Anyone want to lend a hand?

Thanks,
mischievous
#2
12-02-2009, 04:30 PM
 cjepson Guest Join Date: Oct 2007 Posts: 3,303
I assume you're trying to determine whether the association between age and tissue length differs across the three genotypes. If that's correct, you want to do a regression model (probably a linear regression) in which the dependent variable (the thing you are trying to predict) is presumably tissue length, and the predictor variables are age, genotype, and the interaction of age by genotype. (The interaction is the effect you are interested in.) The one thing that is a bit tricky is that the genotype variable has to be represented by a set of two indicator, or "dummy", variables, which means that the interaction effect will also be represented by two variables (age multiplied by each dummy). It might be better to find someone who can walk you through that in person.

(You could avoid that problem by running a set of three models, each comparing one pair of genotypes -- A vs. B, B vs. C, and A vs. C... that's not the standard way of doing it, though, because it raises the issue of what is called "multiple comparisons". Generally, when you do that, you correct for it by using a more stringent criterion of significance than you otherwise would.)
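To make the dummy-variable coding concrete, here is a minimal sketch in Python of how each embryo's row of predictors could be laid out before it goes into a regression routine. The genotype labels A, B and C, the choice of A as the reference, and the name `design_row` are all assumptions for illustration, not anything taken from the actual data set.

```python
# Hypothetical genotype labels; "A" is arbitrarily chosen as the reference.
GENOTYPES = ("A", "B", "C")
REFERENCE = "A"

def design_row(age, genotype):
    """Return the predictor values for one embryo.

    Columns: intercept, age, one dummy per non-reference genotype,
    then age multiplied by each dummy (the interaction terms).
    """
    if genotype not in GENOTYPES:
        raise ValueError(f"unknown genotype: {genotype!r}")
    dummies = [1.0 if genotype == g else 0.0
               for g in GENOTYPES if g != REFERENCE]
    interactions = [age * d for d in dummies]
    return [1.0, float(age)] + dummies + interactions

# Made-up example: one embryo of each genotype at age 11.5 days.
for g in GENOTYPES:
    print(g, design_row(11.5, g))
```

The reference genotype gets zeros in all four extra columns, so its line is described by the intercept and age coefficients alone; the other columns measure how far each remaining genotype departs from that line.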

Hope this helps...
#3
12-02-2009, 04:42 PM
 mischievous Charter Member Join Date: Mar 2001 Posts: 1,307
Actually, the thing I'm trying to establish is whether genotype makes a difference, i.e. whether the length of the tissue is longer or grows faster in some mutants than in others.

I think (if I understand it correctly) that linear regression will tell me how fast the tissue is growing in each genotype (i.e. the slope of the line), but will not tell me if the lines for each genotype are the same. Is that correct?
#4
12-02-2009, 04:57 PM
 cjepson Guest Join Date: Oct 2007 Posts: 3,303
Quote:
 Originally Posted by mischievous:
 Actually, the thing I'm trying to establish is whether genotype makes a difference, i.e. whether the length of the tissue is longer or grows faster in some mutants than in others. I think (if I understand it correctly) that linear regression will tell me how fast the tissue is growing in each genotype (i.e. the slope of the line), but will not tell me if the lines for each genotype are the same. Is that correct?
If you do the linear regression model I outlined -- i.e., with the interaction of genotype by age included, as well as the main effects of genotype and age -- then the main effect of genotype will tell you if the tissue length is greater for one genotype than another (averaged across ages, provided age is centered at its mean; with uncentered age it is the difference at age zero), and the interaction effect will tell you if the speed of growth (i.e., the slope of tissue length against age) is greater for one genotype than another.
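For the original question (are the three lines the same?), one common way to get a single p-value is an F test that compares the full model against a model with one shared line for all genotypes. The sketch below relies on the fact that the full model with genotype dummies and interactions fits exactly the same lines as three separate straight-line fits, so no matrix algebra is needed. The measurements and function names are made up for illustration.

```python
def line_rss(points):
    """Residual sum of squares of a least-squares line through (age, length) pairs.

    Assumes at least two distinct ages, otherwise the slope is undefined.
    """
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return sum((y - (intercept + slope * x)) ** 2 for x, y in points)

def genotype_f_statistic(groups):
    """F statistic for 'one shared line' versus 'one line per genotype'."""
    pooled = [p for pts in groups.values() for p in pts]
    rss_pooled = line_rss(pooled)                           # reduced model
    rss_full = sum(line_rss(pts) for pts in groups.values())  # full model
    k = len(groups)
    df_num = 2 * (k - 1)         # extra intercepts and slopes in the full model
    df_den = len(pooled) - 2 * k  # observations minus full-model parameters
    f_stat = ((rss_pooled - rss_full) / df_num) / (rss_full / df_den)
    return f_stat, df_num, df_den

# Hypothetical measurements: genotype -> list of (age, tissue length).
groups = {
    "A": [(10, 2.0), (11, 2.3), (12, 2.4), (13, 2.9)],
    "B": [(10, 2.1), (11, 2.2), (12, 2.6), (13, 2.8)],
    "C": [(10, 1.9), (11, 2.4), (12, 2.5), (13, 3.0)],
}
f_stat, df_num, df_den = genotype_f_statistic(groups)
print(f"F = {f_stat:.3f} on {df_num} and {df_den} degrees of freedom")
```

The p-value is the upper tail of an F distribution with those degrees of freedom; if SciPy is installed, `scipy.stats.f.sf(f_stat, df_num, df_den)` should return it. Note that this tests intercepts and slopes together. To test the interaction alone, the comparison model would be parallel lines with a separate intercept per genotype.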

Last edited by cjepson; 12-02-2009 at 04:58 PM.
#5
12-02-2009, 06:00 PM
 ultrafilter Guest Join Date: May 2001 Location: In another castle Posts: 18,988
cjepson's suggestion to use dummy variables is the simplest way to approach this, but you can't do multiple regression in Excel. What else do you have access to?
#6
12-02-2009, 06:08 PM
 mischievous Charter Member Join Date: Mar 2001 Posts: 1,307
I'm not sure - I don't even know what to look for. I work at the NIH, which has site licenses to a fair amount of software, if you could suggest some names I could go looking for.
#7
12-02-2009, 06:35 PM
 footballisplayedwithyourfeet Guest Join Date: Oct 2008 Location: Netherlands Posts: 761
Look for Stata, SPSS, SAS, or R. The problem is, you need to know what you are doing (i.e., know the software). For the record, I would also try the proposed regression model; just make sure you have one dummy fewer than you have categories (so in your case only 2 dummies, plus their interaction effects). The result you get will tell you how that particular category does compared to the one you didn't include. This also means that one model will not tell you whether the two categories that you did give a dummy are different from each other. In order to know this, you need to run another model where one of the other categories is the base (so not with a dummy). I must say that at a glance there seems to be little difference, but if your sample is large enough there might still be significant outcomes.
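As a rough illustration of the point about the base category: in a model with dummies and interactions, each genotype's coefficients are simply its own line's intercept and slope minus those of the base genotype. The sketch below uses made-up, noise-free numbers and hypothetical function names to show how the available comparisons change when the base is switched.

```python
def fit_line(points):
    """Least-squares (intercept, slope) for a list of (age, length) pairs."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def contrasts(groups, base):
    """Dummy and interaction coefficients for every non-base genotype.

    Returns {genotype: (intercept difference, slope difference)}, each
    measured relative to the base genotype's own fitted line.
    """
    lines = {g: fit_line(pts) for g, pts in groups.items()}
    base_intercept, base_slope = lines[base]
    return {g: (b0 - base_intercept, b1 - base_slope)
            for g, (b0, b1) in lines.items() if g != base}

# Made-up, noise-free data: A and B share a slope, C grows faster.
groups = {
    "A": [(10, 2.0), (11, 2.5), (12, 3.0)],
    "B": [(10, 2.5), (11, 3.0), (12, 3.5)],
    "C": [(10, 2.0), (11, 3.0), (12, 4.0)],
}
print(contrasts(groups, base="A"))  # B vs. A and C vs. A
print(contrasts(groups, base="B"))  # A vs. B and C vs. B
```

With A as the base, nothing in the output compares C with B directly; switching the base to B produces that comparison, which is why a second run of the model is suggested. The p-value attached to each coefficient would still have to come from the regression software.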

P.S. I think I once heard somebody talk about doing regressions in Excel, so it might be possible; don't ask me how, though.
#8
12-02-2009, 07:16 PM
 xash Ogministrator Charter Member Join Date: Jan 2001 Location: Palo Alto, CA Posts: 4,133
Quote:
 Originally Posted by mischievous:
 I'm not sure - I don't even know what to look for. I work at the NIH, which has site licenses to a fair amount of software, if you could suggest some names I could go looking for.
Look for SPSS or JMP.

Last edited by xash; 12-02-2009 at 07:18 PM.
#9
12-02-2009, 07:21 PM
 ultrafilter Guest Join Date: May 2001 Location: In another castle Posts: 18,988
Quote:
 Originally Posted by mischievous:
 I'm not sure - I don't even know what to look for. I work at the NIH, which has site licenses to a fair amount of software, if you could suggest some names I could go looking for.
We academic statisticians use R. There's a fairly steep learning curve, but it's extremely powerful and infinitely extensible.
#10
12-02-2009, 07:23 PM
 footballisplayedwithyourfeet Guest Join Date: Oct 2008 Location: Netherlands Posts: 761
Quote:
 Originally Posted by ultrafilter:
 We academic statisticians use R. There's a fairly steep learning curve, but it's extremely powerful and infinitely extensible.
Don't forget to mention it's open source. You can get it for free anywhere and everywhere.
#11
12-02-2009, 09:42 PM
 mischievous Charter Member Join Date: Mar 2001 Posts: 1,307
Okay, tomorrow I'll go looking for some software. I'm sure I'll have a million questions once I get that far.

Dammit, isn't there an easy way?
#12
12-02-2009, 10:00 PM
 thelurkinghorror Guest Join Date: Jun 2006 Location: Venial Sin City Posts: 12,376
You might have trouble looking for new versions of SPSS. It's called PASW now.
#13
12-02-2009, 10:16 PM
 CookingWithGas Charter Member Join Date: Mar 1999 Location: Tysons Corner, VA, USA Posts: 11,477
Quote:
 Originally Posted by ultrafilter:
 cjepson's suggestion to use dummy variables is the simplest way to approach this, but you can't do multiple regression in Excel. What else do you have access to?
You can do multiple regression in Excel, unless I misunderstand what you mean (multiple independent variables, one dependent variable, right?). I did this in a course I took in forecasting a few years back but haven't used it since. There is a regression tool built into the Analysis ToolPak, which ships with Excel but isn't installed by default. It's more powerful than the TREND function and will give you a sheet with all the parameters for the model, like R². I think you can even do multiple regression with TREND if you set the columns up right. But frankly this is a little like removing a screw with a pair of pliers.
#14
12-03-2009, 10:02 AM
 mischievous Charter Member Join Date: Mar 2001 Posts: 1,307
Okay, well, I'm an idiot. It turns out that my facility has full-time statistics support.

I have an appointment in an hour with a statistician who says he'll lead me through the process step-by-step. Bless him.

Thanks for all of the ideas, guys, and I'll keep them in mind for the next time I run into trouble.
