I have run into a problem creating a survey instrument. The question I intend to pose to the respondents is sensitive, so I thought I’d use a randomized response technique. The sensitive question has at least four choices, so the recorder notes only a) b) c) or d).
I thought I could use seasonal birth variations as the coin flip question, thinking data for the respondents would be easy to find. It isn’t!
So my questions would be:
Is there a better set of four values out there for my purposes?
If not, know of any whiz-bang sources (yes, I’ve gone through a number of library databases) for seasonal birth patterns?
I would have thought your best bet would be to use a genuinely random variable rather than complicate your life with something that probably has a statistical skew you will have to adjust for later. How are you administering the survey? Can you just have them roll a die or something? And why not just use a coin?
If you want something simple, how about taking the birthdates and converting them to the day of the week? It’d be pretty easy to spot any skew there, and if there isn’t any, you’re in business.
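For what it's worth, here is a rough sketch of that day-of-week check in Python, assuming the birthdates are already available as `date` objects. The function names and the sample dates are made up for illustration; the statistic is an ordinary Pearson chi-square against equal expected counts.

```python
from collections import Counter
from datetime import date


def weekday_counts(birthdates):
    """Count birthdates per weekday (0 = Monday ... 6 = Sunday)."""
    counts = Counter(d.weekday() for d in birthdates)
    return [counts.get(i, 0) for i in range(7)]


def chi_square_uniform(counts):
    """Pearson chi-square statistic against equal expected counts."""
    expected = sum(counts) / len(counts)
    return sum((c - expected) ** 2 / expected for c in counts)


# Hypothetical sample: 14 consecutive days, so each weekday appears twice.
sample = [date(2001, 1, day) for day in range(1, 15)]
counts = weekday_counts(sample)
stat = chi_square_uniform(counts)
print(counts, stat)
```

With seven categories there are 6 degrees of freedom, so a statistic above roughly 12.59 would point to skew at the 5% level.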
Alternatively, why not number your target roster? I.e., if you want 5 categories, number each person in the roster 1, 2, 3, 4, 5, 1, 2, 3, 4, 5, etc. That numbering will have zero correlation with either their willingness to participate in the survey or the answers they give.
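The roster numbering could look something like this in Python. It is only a sketch: `assign_categories` and the placeholder names are hypothetical, and it assumes the roster is a plain list in whatever order you already have it.

```python
def assign_categories(roster, k):
    """Number people 1..k cyclically in roster order."""
    return {name: i % k + 1 for i, name in enumerate(roster)}


# Hypothetical roster of seven placeholder names.
roster = ["P1", "P2", "P3", "P4", "P5", "P6", "P7"]
assignment = assign_categories(roster, 5)
print(assignment)
```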
Okay, here’s the deal, since I wasn’t explicit enough, sorry.
Say the survey is set up so respondents sit down and in response to a coin flip, answer either the sensitive question or another question about which data is already available. The researcher only receives the answer a) b) c) or d), and has no idea which question the respondent answered. This allows respondents (presumably) to be more frank if they receive the sensitive question. (For the sake of example, say it’s something like: “I fantasize about having sex with a) sheep b) dogs c) children d) adults of my own gender.”)
However, the researcher is able to compare the known data for the other question against the overall pattern of responses, and should be able to see where responses to the sensitive question come in.
For instance, say (completely bogus) 40% of the population is born in spring, with winter, summer, and autumn weighing in at 20% each. If spring is b) and I get a huge number of endorsements for a), those responses were more likely triggered by the sensitive question, which means Hal Briston is one of my respondents.
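To spell out the arithmetic behind that comparison, here is a minimal Python sketch. It assumes a fair coin (half the respondents get the sensitive question), reuses the bogus 40/20/20/20 season split with spring as b), and invents the observed shares purely for illustration. Each observed share is a mix, observed = p * sensitive + (1 - p) * known, so the sensitive share is recovered by rearranging.

```python
def estimate_sensitive(observed, known, p_sensitive=0.5):
    """Back out sensitive-question shares from the mixed responses.

    observed = p * sensitive + (1 - p) * known, solved for sensitive.
    """
    return {
        choice: (observed[choice] - (1 - p_sensitive) * known[choice]) / p_sensitive
        for choice in observed
    }


# Bogus known distribution from the example: spring is b) at 40%.
known = {"a": 0.20, "b": 0.40, "c": 0.20, "d": 0.20}
# Made-up observed shares across all respondents (assumption, not real data).
observed = {"a": 0.35, "b": 0.25, "c": 0.20, "d": 0.20}

estimate = estimate_sensitive(observed, known)
for choice, share in estimate.items():
    print(choice, round(share, 2))
```

With a real sample the estimates are noisy and can even come out slightly negative, so the bigger the sample the better.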
If you add all this to my OP, it makes more sense. I hope.