Statistical Confusion

Alternate Title: Girl looks to be deeply impressed by mathematical wizard.

Here’s the situation. This looks much better in flow chart form, but I will do my best with text.

                                   Group A<---Total Group N(t)---->Group B
                                    |        |                                          |        |
                                    X(1) Y(1)                                     X(2)   Y(2)

Of the total group, N(t), some fall into Group A and some fall into Group B. Within Group A, a proportion, X(1), answers ‘yes’ to a question. Within Group B, a proportion, X(2), also answers ‘yes’ to the question (or rather, 600 questions, but that is another tale of woe and spreadsheets). The Y variables are the proportions that answer ‘no’ to the question. There are no other options.

My questions here are (1) What is the p-value for X(1) vs Y(1)? and (2) What is the p-value for X(1) vs X(2)?

I tried to use a chi-squared test, but I wasn’t sure how to determine the expected values. My data is stuff like “did you use lotion A? did you use soap B?” as well as basic demographic data.
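For what it's worth, here is a minimal sketch of where the expected values come from in a 2x2 yes/no table: each expected count is (row total × column total) / grand total. The counts below are made up purely for illustration, and the function name `chi2_2x2` is my own, not from any library. It uses the plain Pearson statistic with no continuity correction, which is only one of several reasonable choices.

```python
import math

def chi2_2x2(table):
    """Pearson chi-squared test (no continuity correction) on a 2x2 table.

    table = [[yes_A, no_A], [yes_B, no_B]] holds observed counts.
    Returns (expected counts, chi-squared statistic, p-value).
    """
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = [a + b, c + d]
    col_totals = [a + c, b + d]
    # Expected count under independence: row total * column total / grand total
    expected = [[r * col / n for col in col_totals] for r in row_totals]
    stat = sum(
        (obs - exp) ** 2 / exp
        for obs_row, exp_row in zip(table, expected)
        for obs, exp in zip(obs_row, exp_row)
    )
    # A 2x2 table has 1 degree of freedom; for 1 df the chi-squared
    # upper-tail probability equals erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return expected, stat, p_value

# Hypothetical counts, NOT real data: Group A 180 yes / 120 no, Group B 90 yes / 110 no
table = [[180, 120], [90, 110]]
expected, stat, p_value = chi2_2x2(table)
print(expected, round(stat, 3), p_value)
```

This compares the 'yes' proportion of Group A against Group B, i.e. question (2). The approximation is usually considered shaky when any expected count is small (a common rule of thumb is below 5).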

The other situation I’m looking at is:

                       a(1) Group A <---Total Group N(s) --> b(1) Group B

In this situation, N(s) is a subgroup of N(t). For example, N(s) would be how many African Americans answered yes to Question X in Group A vs Group B. I’m also trying to determine the p-value for this situation.

My n is well over 50 for Total Group, although possibly n<50 for some of the subgroups. Which also confuses me…I can’t use the Fisher exact if n<50, right?
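As a point of reference, Fisher's exact test is generally the one recommended when counts are small, since it does not rely on a large-sample approximation. Below is a rough sketch of the two-sided version for a 2x2 table, written from the hypergeometric definition. The function name `fisher_exact_2x2` and the tiny example table are assumptions for illustration only, not real data.

```python
from math import comb

def fisher_exact_2x2(table):
    """Two-sided Fisher exact test for a 2x2 table of counts.

    table = [[yes_A, no_A], [yes_B, no_B]].
    Sums the probabilities of all tables with the same margins that are
    no more likely than the observed one.
    """
    (a, b), (c, d) = table
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def prob(x):
        # Hypergeometric probability of seeing x 'yes' answers in row 1
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    lo = max(0, col1 - row2)   # smallest feasible top-left cell
    hi = min(row1, col1)       # largest feasible top-left cell
    p_obs = prob(a)
    # Small relative tolerance so ties with the observed table are included
    total = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-7))
    return min(1.0, total)

# Hypothetical small subgroup, NOT real data: 3 yes / 1 no vs 1 yes / 3 no
p_small = fisher_exact_2x2([[3, 1], [1, 3]])
print(p_small)
```

Because every feasible table is enumerated, this stays cheap for small subgroups; for very large counts a library routine would be the more sensible choice.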

Essentially, I’m not sure which method I should use to calculate my p-values. I’ve been looking and reading some materials, but I’m very concerned I will choose the wrong test.

And before anyone gets the wrong idea, this isn’t a marketing study of some kind. If I had some kind of corporate funding (or hell, any kind of funding) I would track down an actual statistician to help me out. Life isn’t quite that kind though.

Thank you to anyone who replies.

Darn. That didn’t work. Okay, just pretend X(1) and Y(1) are subgroups of Group A and X(2) and Y(2) are subgroups of Group B. If it was a flow sheet, little arrows would connect them together.

Sorry… :(

  1. How can we figure out a p-value without the data? Experimental results have a p-value, not an experimental setup.

  2. I’ve read it three times, but I still don’t understand what your setup is. All I can tell is that you asked a lot of yes-no questions to some overlapping groups. Maybe a little less algebra and a few more specific nouns?

  1. Oh…I don’t want you to do my work for me. I really do have quite a few of these things to calculate. I was just hoping for some suggestions regarding Chi2 vs Fisher vs T-tests. I can throw around some example numbers if that would be helpful?

  2. Hmm. Okay, well I can give it a try. Let’s say 500 people are in Total Group. 300 of them are then part of Group A and 200 are part of Group B. Then the Total Group is posed a question, like…do you like soup? Part of Group A likes soup and part of Group B likes soup. Everyone else is anti-soup. So, how do I determine the p-value for whether the soup-loving subset of Group A is significant, and the p-value for whether the difference between the soup-lovers of Group A and Group B is significant?
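One possible reading of the first part (soup-lovers vs. anti-soup within Group A alone) is a test of whether the yes/no split differs from some reference proportion. The sketch below assumes a 50/50 null, which is purely an assumption on my part since the post does not say what the split should be compared against. The 300 comes from the example above; the 180 soup-lovers is a made-up number, and `binomial_test_two_sided` is a hypothetical helper name.

```python
from math import comb

def binomial_test_two_sided(k, n, p0=0.5):
    """Exact two-sided binomial test of k successes in n trials against p0.

    Sums the probabilities of all outcomes no more likely than the observed
    one. Fine for n in the hundreds; very large n would overflow floats.
    """
    def pmf(x):
        return comb(n, x) * p0 ** x * (1 - p0) ** (n - x)

    p_obs = pmf(k)
    # Small relative tolerance so outcomes tied with the observed one count
    total = sum(pmf(x) for x in range(n + 1) if pmf(x) <= p_obs * (1 + 1e-7))
    return min(1.0, total)

# Hypothetical: 180 of the 300 people in Group A like soup (NOT real data)
p_soup = binomial_test_two_sided(180, 300)
print(p_soup)
```

If the real question is instead whether Group A likes soup more than Group B, that is a comparison of two proportions, and a 2x2 chi-squared or Fisher exact test would be the more natural fit.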

The second scenario is only the soup lovers. How many of the soup-lovers are over 40? How many use lotion B? Is any of this significant?

Does this help?

Thanks, that helps a lot. In your soup scenario, what are you trying to figure out? Is it whether Group A likes soup more than Group B?