Me and SPSS, we do not get along. I have some number dyslexia, which makes thinking through numerical math wicked hard. SPSS is not my friend. I am about to try and answer my own question via reading through some books here, but if anyone can help me understand faster / answer the question so I can check my research skills, that would be wicked helpful.
Question: I have two columns of data to be used in a one-way ANOVA. Column 1 contains the numbers 11, 12, 21, 22, 31, 32, 41, 42. Each of these numbers is a group designation. This means I have 1 IV with 4 levels. Column 2 contains scores on a measure. How do I tell SPSS to do a one-way ANOVA to compare these 4 groups on the measure scores?
Well, I suppose there is more than one way to skin a cat. Or computer statistics. I’d figure GQ was for more… fact-y questions. Like, “How much ear wax do elephants produce?”
From memory, you definitely go to Analyze first, and then I would think you need the Compare Means submenu. There you should be able to click on One-Way ANOVA and get a dialogue box where you can enter which variable is your Dependent and which is your Factor (the groups). Your factor variable may need to be identified as Nominal or Ordinal.
SPSS still seems like it was designed for Windows 3.1. They need to update their shit. Like if you want to run repeated measures data, you need to completely reformat your data. Or if you paste text in a new column, it forces it to be a string and you can’t paste anything else without changing the type first, and more annoyingly, it forces the number of displayed figures to be arbitrarily small.
For your question: I am not sure what you are doing. In column 1, is the first number a level, and the second number a subject/sample? If so, don’t. Column 3 can be your sample number. You may not use this column, but it can be FYI. The first column should only contain your level. You can use 1-4; 0-3; 1, 2, 3, 10, it doesn’t matter. You can define which ones you want later.
So instead of Column 1 = 11, 12, 21, 22, 31, 32, 41, 42
It should be:
Column 1 = 1 1 2 2 3 3 4 4
Column 3 = 1 2 1 2 1 2 1 2
Run the one-way ANOVA with Column 2 as the data and Column 1 as the groups (I forget the terminology).
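If it helps to see the recode spelled out, here is a rough Python sketch of the split described above. It assumes the codes really are "level digit followed by sample digit", which is only a guess about the data; the variable names are made up for illustration.

```python
# Group codes as they appear in Column 1 of the original post.
codes = [11, 12, 21, 22, 31, 32, 41, 42]

# divmod(code, 10) splits a two-digit code into (tens digit, ones digit).
# Assumption: tens digit = level, ones digit = sample number.
level = [divmod(c, 10)[0] for c in codes]   # new Column 1
sample = [divmod(c, 10)[1] for c in codes]  # new Column 3

print(level)   # [1, 1, 2, 2, 3, 3, 4, 4]
print(sample)  # [1, 2, 1, 2, 1, 2, 1, 2]
```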
If I understand correctly, you have four cases (I’m going to call them “cases”, although you call them “groups”) with two observations each (i.e., each case provided two data points on your outcome variable). If that’s true, then you have a repeated measures design. When you have repeated measures, you can set your data up in either of two ways – “wide” format (one row of data per case, with observations 1 and 2 in separate variables), or “long” format (one row of data per case per observation, with all the data in one variable). Your data are set up in “long” format. As long as they are set up that way, most SPSS procedures can’t handle them. The one that can is the MIXED procedure, which I think is available only in the Advanced Models module, so you may or may not be able to use that procedure. The MIXED procedure is very flexible, which means it can be a pain to work with.
If you can set up your data in “wide” format, you can use MANOVA, which is a bit simpler, but still a little tricky – basically, no repeated measures procedures are really simple. Wide format would look like this:
1 5 2
2 3 5
3 12 10
4 2 2
The first column is the group number; the second is the value of the outcome variable from observation 1 (i.e., from rows 11, 21, 31, and 41 in the original dataset); the third is the value of the outcome variable from observation 2 (i.e., rows 12, 22, 32, and 42).
There is an SPSS procedure (CASESTOVARS) that is used to convert data from long to wide format.
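To make the long-versus-wide idea concrete, here is a small Python sketch of the reshaping. It is not how CASESTOVARS works internally, just the same idea in plain Python, and it assumes the first digit of each code is the case and the second digit is the observation number. The scores are the made-up ones from the wide example above.

```python
# Long format: one (code, score) row per case per observation.
long_rows = [(11, 5), (12, 2), (21, 3), (22, 5),
             (31, 12), (32, 10), (41, 2), (42, 2)]

def long_to_wide(rows):
    """Collect each case's observations into a single row."""
    wide = {}
    for code, score in rows:
        case, obs = divmod(code, 10)  # assumed: tens = case, ones = observation
        wide.setdefault(case, {})[obs] = score
    # One tuple per case: (case, observation 1, observation 2).
    return [(case, obs_scores[1], obs_scores[2])
            for case, obs_scores in sorted(wide.items())]

wide_rows = long_to_wide(long_rows)
for row in wide_rows:
    print(*row)
```

Each printed row is one case, with its two observations side by side.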
Do you only have two data points per level of the IV? If so, the uncertainty in your inferences is going to swamp whatever signal you have unless the latter is so large that you don’t need statistics to pick it up.
So what is the second number? e.g. what makes 11 different from 12? If it’s truly 8 different conditions, then run as is. If the 2nd number is a different variable (so 4x2 ANOVA), then you’d need to split it into two columns, and run a 2-way ANOVA or Repeated/Mixed design as needed.
OK. So, next question: given that there are actually eight groups, is this really a one-way ANOVA? I would think of it as a two-way ANOVA, where there’s a first independent variable that takes four levels, and a second that takes two.
Rereading the OP, apparently you have 4 groups, so you need to recode your ‘group’ variable so that you actually have 4 different groups. Then go to Analyze and do a one-way ANOVA; it’s just clicking the right button.
I think what I need help with is organizing the spreadsheet to get it all set up for the ANOVA to run properly. I’m going to try what Polar suggested, but here’s what it is:
Column A    Column B
11 {insert 234 rows of scores on a measure here}
12
21
22
31
32
41
42
{etc- imagine a total of 234 rows, each of which belongs to one of these 8 types}
I think I’m beginning to get it. Tell me if I’m right:
There are about 30 horizontal rows with the value “11” in Column A, about another 30 rows with the value “12”, etc., so the total number of rows is 234.
If this is right, then it should be a simple matter to do an ANOVA, assuming that both Column A and Column B are numeric variables (i.e., contain no non-numeric characters). If all you care about is whether any of the eight groups differ from each other (or more specifically, if any group mean on the outcome variable differs significantly from the overall mean across all groups), you just do a one-way ANOVA specifying “Column_B” (or whatever name you assign to that variable) as the DV, and “Column_A” as the factor.
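For anyone who wants to see what the one-way ANOVA is actually computing, here is a minimal hand-rolled sketch in Python. The data are invented toy numbers, not the real 234 rows, and the function name is hypothetical. It only produces the F ratio and degrees of freedom; SPSS would also give you the p-value.

```python
from collections import defaultdict

def one_way_anova(groups, scores):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    by_group = defaultdict(list)
    for g, y in zip(groups, scores):
        by_group[g].append(y)

    n = len(scores)
    k = len(by_group)
    grand_mean = sum(scores) / n

    # Between-groups sum of squares: group means versus the grand mean.
    ss_between = sum(len(ys) * (sum(ys) / len(ys) - grand_mean) ** 2
                     for ys in by_group.values())
    # Within-groups sum of squares: each score versus its own group mean.
    ss_within = sum((y - sum(ys) / len(ys)) ** 2
                    for ys in by_group.values() for y in ys)

    df_between, df_within = k - 1, n - k
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within

# Toy data (made up): Column_A as the factor, Column_B as the DV.
column_a = [11, 11, 11, 12, 12, 12, 21, 21, 21]
column_b = [1, 2, 3, 2, 3, 4, 5, 6, 7]

f_ratio, df_b, df_w = one_way_anova(column_a, column_b)
print(f_ratio, df_b, df_w)
```

Note that the group codes are treated purely as labels here, which is exactly what "Column_A as the factor" means: 11 and 12 are just two different groups, and the digits carry no meaning.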
As pointed out by other posters, however, the values in Column A suggest that your eight groups are actually defined by two orthogonal factors, one with four different levels and the other with two different levels. For example, suppose that the first factor was educational level (grade school, high school, college, grad school) and the second was sex (male, female). Group 11 would then consist of grade school males, Group 12 would be grade school females, Group 21 would be high school males, Group 22 would be high school females, and so on. In this case, you might want to look at the effects of each factor separately, plus their interaction. To do this, you would basically have to split Column A into two columns (putting the first digit into the first column and the second digit into the second column). For each case, the value in the first column now represents their value on factor 1, and the second represents factor 2. Now you can do a two-way ANOVA.
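Here is a rough Python sketch of that split, using made-up scores and assuming the first digit of each code is factor 1 and the second digit is factor 2. It stops at the marginal and cell means, which are the quantities a two-way ANOVA compares; it does not run the ANOVA itself.

```python
from collections import defaultdict

def mean_by(keys, scores):
    """Average the scores within each distinct key."""
    buckets = defaultdict(list)
    for k, y in zip(keys, scores):
        buckets[k].append(y)
    return {k: sum(ys) / len(ys) for k, ys in sorted(buckets.items())}

# Toy (code, score) rows; the scores are invented for illustration.
rows = [(11, 1), (11, 3), (12, 2), (12, 4),
        (21, 5), (21, 7), (22, 6), (22, 8)]

factor1 = [code // 10 for code, _ in rows]  # first digit of Column A
factor2 = [code % 10 for code, _ in rows]   # second digit of Column A
scores = [score for _, score in rows]

factor1_means = mean_by(factor1, scores)                   # main effect of factor 1
factor2_means = mean_by(factor2, scores)                   # main effect of factor 2
cell_means = mean_by(list(zip(factor1, factor2)), scores)  # interaction cells

print(factor1_means)
print(factor2_means)
print(cell_means)
```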
Normally, in datasets like this, there’s another column containing ID numbers, so that you know which row of data came from which specific case. You don’t really need that to do this kind of analysis, however. (You would need it if you had more than one row of data per case – for example, if Groups 11 and 12 were actually the same people – but from your description it sounds like that’s not true.)
The levels of your independent variable look a lot like someone took two independent variables, one with four levels and one with two, and concatenated them together. That may not be the case, but if it is, the one-way ANOVA is the wrong model. You’ll need to either find whoever recorded the data and ask them what happened, or tell us what the levels of the independent variable mean, and we can make suggestions.