Social Desirability
Dear Dr. Pete:
I read that it is possible to overcome social desirability bias by getting people to answer one of two questions where the researcher doesn't know which question they asked. How can this be?
~D. P. Topper
Social desirability bias occurs when research respondents give answers they feel are socially desirable or acceptable rather than answers that are true. It also covers wanting to look good in front of an interviewer. The resulting survey answers may be inaccurate and can bias the research. The effect is stronger in face-to-face and telephone interviews, and less pronounced in postal and online surveys.
Dr. Pete replies...
Dear D. P.:
This is not quite as strange as you might imagine. The "Randomized Response Technique" comes in a variety of flavors. The easiest one to understand is the "unrelated question with the known distribution" design (Greenberg et al., 1969).
The researcher presents two possible questions to the respondent and only one answer list. One of the questions is the sensitive one in which you are interested; the other is a question whose answer distribution you already know (birthday month, last digit of telephone number, etc.). The answer list is Yes/No or True/False.
The respondent uses a randomization technique (e.g. flipping a coin) to decide which question they should answer and ticks the appropriate box. The researcher has the answer, but no idea which question was answered. The respondent is aware the researcher doesn't know which question is being answered and can be 100% honest, even if they have to answer the sensitive question.
Since I've used the word "random" in the above explanation, you know that somewhere in the next few lines the word "probability" is going to rear its ugly head. And here it is - it comes down to knowing the two essential probabilities, which are 1) the probability of answering the sensitive question and 2) the probability of saying "yes" if the non-sensitive question is answered.
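For those who like to see it written down, here is one way to put those two probabilities to work. The symbols are my own shorthand, not taken from the Greenberg paper: p is the probability of answering the sensitive question, q is the known probability of a "yes" to the non-sensitive question, \pi is the unknown share who would say "yes" to the sensitive question, and \lambda is the overall share of "yes" answers we see in the data.

```latex
\lambda = p\,\pi + (1 - p)\,q
\qquad\Longrightarrow\qquad
\hat{\pi} = \frac{\hat{\lambda} - (1 - p)\,q}{p}
```

The first expression says the "yes" answers come from two sources mixed together; the second simply rearranges it to recover the share we actually care about.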
Here's an example:
We ask a sample of 100 people to toss a coin. If they get heads, they answer the sensitive question; if they get tails, they answer the non-sensitive question. The non-sensitive question has a known probability of a "yes" answer of 60%. We get our data back - 70 "yes," 30 "no." We expect a tail in 50% of cases, so about 50 people answered the non-sensitive question. That question has a 60% probability of being a "yes," so about 30 of the "yes" answers are answers to the non-sensitive question and the remaining 40 "yes" answers are to the sensitive question. Voila! With about 50 people answering the sensitive question, 40 out of 50 equals 80% "yes" answers to the sensitive question.
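If you'd rather let a computer do the subtraction, the arithmetic in this example can be sketched in a few lines of Python. This is only an illustration: the function name and its parameter names are ones I made up, and it assumes the expected split between the two questions holds exactly.

```python
def estimate_sensitive_yes(total_yes, n, p_sensitive, p_yes_nonsensitive):
    """Estimate the share of "yes" answers to the sensitive question.

    Hypothetical helper for illustration only. It assumes the expected
    number of respondents landed on each question.
    """
    # Respondents we expect to have answered each question.
    expected_sensitive = n * p_sensitive
    expected_nonsensitive = n * (1 - p_sensitive)
    # "Yes" answers we expect to have come from the non-sensitive question.
    expected_nonsensitive_yes = expected_nonsensitive * p_yes_nonsensitive
    # Whatever "yes" answers remain are attributed to the sensitive question.
    return (total_yes - expected_nonsensitive_yes) / expected_sensitive


# The numbers from the example: 100 people, a fair coin, 60% known "yes" rate.
estimate = estimate_sensitive_yes(
    total_yes=70, n=100, p_sensitive=0.5, p_yes_nonsensitive=0.6
)
print(f"{estimate:.0%}")
```

Swapping in a different randomizer (say, a die where only a six sends you to the non-sensitive question) is just a matter of changing p_sensitive.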
Of course this approach isn't without its drawbacks. We don't actually know which of our 100 people answered the sensitive question, so we can't do any sub-analyses or use this answer as an analysis variable. In addition, the technique has been shown to increase non-response, since people don't trust either how it works or how it protects their anonymity. People could also still be lying to themselves to protect their own self-esteem. Finally, only about half the respondents answer the sensitive question, so the technique reduces the effective sample size by 50% and, because of the number of random elements involved, the variance is high - meaning we can have less confidence in the answers.
If the last point confuses you, remember from statistics that there is no guarantee that 50 of the people tossed heads. None of them might have -- unlikely but possible. It's also not guaranteed that the people answering the non-sensitive question split 60:40. By chance they could split 50:50. If that had happened and everything else stayed the same, then what would our data say? We'd see 25 "yes" answers from the non-sensitive question plus the 40 from the sensitive question, or 65% "yes" overall, and we'd still take out the 30 "yes" answers we expect from the non-sensitive question. This would leave 35 assumed "yes" answers to the sensitive question and we'd proudly state that the answer to the sensitive question was 70% rather than 80%! Whoops.....
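To get a feel for how much the answer can wobble, here is a rough simulation sketch. Everything in it is an assumption for illustration: the function name is made up, the "true" 80% rate for the sensitive question is simply picked to match the example, and every respondent is assumed to answer honestly.

```python
import random
import statistics


def simulate_survey(n, true_sensitive_rate, p_yes_nonsensitive, rng):
    """Run one simulated survey and return the analyst's estimate.

    Hypothetical helper: each respondent tosses a fair coin, answers the
    chosen question honestly, and the analyst sees only the "yes" total.
    """
    total_yes = 0
    for _ in range(n):
        if rng.random() < 0.5:
            # Heads: the sensitive question is answered.
            total_yes += rng.random() < true_sensitive_rate
        else:
            # Tails: the non-sensitive question is answered.
            total_yes += rng.random() < p_yes_nonsensitive
    # The analyst subtracts the expected non-sensitive "yes" answers and
    # divides by the expected number of sensitive-question respondents.
    return (total_yes - n * 0.5 * p_yes_nonsensitive) / (n * 0.5)


rng = random.Random(42)  # fixed seed so the run is repeatable
estimates = [simulate_survey(100, 0.8, 0.6, rng) for _ in range(2000)]

mean_estimate = statistics.mean(estimates)
spread = statistics.stdev(estimates)
print(f"mean estimate: {mean_estimate:.3f}")
print(f"standard deviation: {spread:.3f}")
print(f"lowest: {min(estimates):.2f}, highest: {max(estimates):.2f}")
```

Under these assumptions the estimates should centre on the true rate, but individual surveys of 100 people can land a long way from it, which is the variance problem in a nutshell.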
Dr. Pete