Multi-Country Studies: Assessing Data Validity

By Pete Cape, Global Knowledge Director

I train all the new employees that SSI take on in Europe. Every one of them, irrespective of the position they are taking up – from finance through media buying to sales or project management – gets an exposure to Market Research. After all, it’s what 99.9999% of our clients are engaged in.

You’ll notice I don’t say one-hundred-percent – one of the things I warn my trainees about when looking at market research data is always to be wary of a statistic that says one-hundred-percent (or zero-percent). There is always someone who does, or someone who doesn’t, even if only in error. I also warn them to be wary of newspaper or magazine articles that start “we all do X, Y or Z nowadays…” – normally that just means the journalist and a few of their close friends. But that’s a post for another day.

Anyway, I teach them all an outline of market research; that includes an introduction to validity in estimates and why it is that researchers can “sort of” tell when a number isn’t right in their results. To illustrate this I ask the class a question. Since most of the trainings take place in Timisoara, Romania I say to them: “If I told you that eighty-percent of Romanians drive a BMW would that be the truth?” Immediately, they all tell me that would definitely be false; it’s a crazy number and they know it. I then lower the percentage to a more reasonable ten-percent. Now some of the room is not so sure, and some are still positive that it is wrong. Those that are “positive” are much more likely to be Romanian than those that are “not so sure.” You have to get to a number below about one-percent before the Romanians in the room are “not sure” if it is true or not.

This exercise illustrates two things. Firstly, you only need eyes in your head to recognise an egregious error in data. Good researchers tend to use those eyes to do dumb things like count BMWs in the supermarket car park, or voraciously read anything with a percentage sign in it and, importantly, remember it. As I tell the delegates, it makes us researchers good for pub quiz teams but not often invited back to dinner parties…

Secondly, and importantly for today’s topic, the task of assessing the validity of a number is very much easier when it is closer to home. Assessing the validity of “foreign” data is much harder.

So how can you? Certainly it is best not to rely on prejudice and stereotype. Sure, a lot of people do drive Dacia cars in Romania, but there are plenty of Volkswagens, Skodas and yes, even BMWs on the streets. So the answer is never “everyone.”

Of course we don’t have access to an infinite resource of market research studies; all we have is the internet. And so the intrepid researcher “Googles” it, just like everyone else. Two things set the good “Googler” apart from the rest of the world. The first is the construction of the search term in the first place. I have learnt you have to avoid terms like “sales” lest you be inundated with offers for fridges, automobiles, luxury shirts or whatever else it is you are searching for. The second, and perhaps more important, skill is deciding which of the myriad answers you get is right, or how to adjust the answers you get to be closer to what you want.

As I tell my trainees, this judgement of rightness is a skill we researchers have always had. We know, almost instinctively it seems, whether our survey results are right – whether they have validity. We need to apply the same judgement to our search results. Wanting to know the incidence of Ford cars in the UK, I can search for “Ford market share in the UK.” This brings up a number of results. Three of the results give the answers sixteen-percent, sixteen-percent and fourteen-percent. So that seems fairly clear. But a closer inspection reveals that one of the figures is from 1997 and the smaller figure doesn’t include Van sales (that’s English Van by the way, a Cargo Van). It was lucky that all three bits of data were dated; the absence of dates on webpages that quote statistics is a major problem in web-sourcing statistics.

But is around fifteen-percent right? The data is for current vehicle sales shares, that is, shares of all cars sold in the last period. And that is new car sales, which account for only about four-percent of all cars in the UK. We want shares of all cars on the road. A brand could be launched today and gain a fifteen-percent market share, but because a car is a durable good its share of the installed base would be tiny for many years. The real number must be lower – but how much lower? Since Ford is a long-established, market-leading brand that enjoys a fifteen-percent sales share, and has done for many years, I would estimate not much lower. Certainly it cannot be higher. I would be confident going to a client and saying Ford has a share of around ten-percent of cars on the road, and if that is what my survey said, I’d be happy.
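For anyone who likes to see the arithmetic, the gap between a sales share and a share of cars on the road can be roughed out in a few lines of Python. This is only a sketch under assumptions of my own: the total number of cars stays constant, the four-percent of cars replaced each year are scrapped evenly across all brands, and the brand's sales share never changes. The function name and its parameters are made up for illustration; real fleets are messier than this.

```python
def installed_base_share(sales_share, turnover, years, start_share=0.0):
    """Estimate a brand's share of all cars on the road after `years`.

    Assumptions (illustrative only, not a real fleet model):
    - the fleet size is constant;
    - each year a fraction `turnover` of the fleet is scrapped, evenly
      across brands, and replaced by new cars;
    - the brand takes a constant `sales_share` of those new cars.
    """
    share = start_share
    for _ in range(years):
        # Surviving cars keep the old mix; new cars arrive at the sales share.
        share = (1.0 - turnover) * share + turnover * sales_share
    return share


if __name__ == "__main__":
    # A brand launched today with a fifteen-percent share of new car sales,
    # where new cars are about four-percent of the fleet each year.
    for years in (1, 5, 10, 25):
        share = installed_base_share(0.15, 0.04, years)
        print(f"after {years:2d} years: {share:.1%} of cars on the road")
```

Under these assumptions the share on the road creeps up towards the sales share but never passes it, which is the point of the paragraph above: a new brand stays tiny for years, while a brand that has held its sales share for a long time sits only a little below it.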

Notice what it has been necessary to do to interpret these statistics. One is to work out what the base of the sample is; the other is to work out what “question” has been asked. I’d essentially found the answer to “what is the brand of the new car you bought in the past 12 months?” asked of a sample of new car buyers. What I’d wanted was “what is the brand of your car?” asked of all car drivers. It is this leap from what you can find to what you want that is most difficult. Sometimes it is impossible. When mobile phone statistics tell you that a country has a one-hundred-twenty-percent penetration of mobile phones, you know they are talking a different language to us…