Welcome to the Brave New World of AI Opinion Polling
Opinion polling is dead. Long live opinion polling!
I’ve never been all that interested in opinion polls. But I do love data. So after reading a recent paper on using generative AI engines to simulate survey results, my only thought was: “Where do I sign up?”
Keep in mind the growing problems traditional survey takers have been facing, including how to deal with the fact that most people now use only mobile phones, and that more and more of us refuse to answer any call coming from unfamiliar numbers.
Also, remember that traditional public opinion polls are not synonymous with public opinion. Polls use the stated opinions of a tiny slice of a population as a proxy for public opinion. But professional pollsters understand that the people who responded to their polls don’t perfectly represent their neighbors or compatriots. So the pros use their knowledge of demographics to adjust survey responses using statistical weights to estimate a better match for the real world.
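To make that weighting step concrete, here is a minimal sketch of one common approach, post-stratification. The groups, responses, and population shares are all made up for illustration, and real pollsters weight on many more variables at once.

```python
# Post-stratification sketch: reweight respondents so each demographic
# group counts in proportion to its share of the real population.
# All numbers below are invented for illustration.

def poststratified_support(respondents, population_share):
    """respondents: list of (group, answer) pairs, answer is 1 (yes) or 0 (no).
    population_share: dict mapping group -> share of the true population."""
    n = len(respondents)
    counts = {}
    for group, _ in respondents:
        counts[group] = counts.get(group, 0) + 1
    # Weight = population share divided by the group's share of the sample.
    weights = {g: population_share[g] / (counts[g] / n) for g in counts}
    weighted_yes = sum(weights[g] * answer for g, answer in respondents)
    total_weight = sum(weights[g] for g, _ in respondents)
    return weighted_yes / total_weight

# Hypothetical sample: older people are over-represented (8 of 10 respondents).
sample = [("older", 1)] * 6 + [("older", 0)] * 2 + [("younger", 0)] * 2
shares = {"older": 0.5, "younger": 0.5}  # assumed true population split

raw_support = sum(answer for _, answer in sample) / len(sample)
adjusted_support = poststratified_support(sample, shares)
print(raw_support, adjusted_support)
```

In this toy case the raw sample overstates support because the over-represented group happens to be the more supportive one. The weights pull the estimate back toward what a balanced sample would have shown.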
The idea behind this new AI-driven approach is that we can drop the non-representative survey responses altogether and focus exclusively on the demographics. Will an AI simulation be perfect? Of course not. But it might sometimes give us quality comparable to what telephone surveys delivered in their glory years. And at the very least, it’ll provide a new data resource.
Oh, and it’s free. And fun.
My first resource for demographics was Elections Canada election results from both the 2019 and 2021 federal elections. I then grabbed the huge Statistics Canada 2021 census dataset. That gives us 2,631 data points for each of the 338 electoral districts (ridings) across Canada. For this experiment, I selected these measures:
Median income
Median age
Percentage living in detached home
Percentage married
Average number of children
Gini index (income inequality)
Percentage owning their home
Percentage living in government housing
Percentage immigrants
Percentage non-permanent residents
Percentage with a bachelor’s degree
Percentage employed
Percentage speak English at work
Percentage speak French at work
Percentage work at home
Once I’d dropped all that data into a spreadsheet, I had robust profiles for each riding across the country. My next step was to feed everything to the friendly AI of my choice.
From here, simulating a highly representative survey of a wide swath of Canadians was easy. Based on a recent post in The Audit, I began with this survey question:
Over the past 15 years, Canada has spent more than three billion dollars funding the international organization: 'Global Fund to Fight AIDS, Tuberculosis and Malaria'. The organization has had some success, but their job isn't complete. Do you think we should: increase our funding, decrease our funding, or leave the funding where it is?
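For anyone curious about the mechanics, here is a rough sketch of how a single riding's profile and the question above might be combined into one prompt. This is only an assumed approach: the function name, the prompt wording, and the profile values are hypothetical, and the sketch stops short of calling any particular AI service.

```python
# Sketch: turn one riding's demographic profile into a survey prompt.
# build_prompt and the sample values are hypothetical illustrations.

QUESTION = (
    "Over the past 15 years, Canada has spent more than three billion dollars "
    "funding the international organization: 'Global Fund to Fight AIDS, "
    "Tuberculosis and Malaria'. The organization has had some success, but "
    "their job isn't complete. Do you think we should: increase our funding, "
    "decrease our funding, or leave the funding where it is?"
)

def build_prompt(profile, question):
    """profile: dict mapping a measure name to its value for one riding."""
    lines = [f"- {name}: {value}" for name, value in profile.items()]
    return (
        "Here is a demographic profile of one Canadian federal riding:\n"
        + "\n".join(lines)
        + "\n\nAnswer the following survey question the way the typical "
        "resident of this riding would. Reply with exactly one of: "
        "increase, decrease, leave.\n\n"
        + question
    )

# Made-up values for a single riding, using three of the measures listed above.
example_profile = {
    "Median income": 41000,
    "Median age": 41,
    "Percentage owning their home": 66.5,
}

prompt = build_prompt(example_profile, QUESTION)
print(prompt)
```

Running the same function once per riding would yield 338 prompts, one answer each, which is the shape of the tallies reported here.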
The results suggested that 121 “ridings” wanted Canada to increase program funding, 119 wanted funding reduced, and 98 preferred things the way they are.
But as you all know, the way you ask the question largely determines the answers you get. How might people respond when they’ve got some more to chew on? Here’s an alternate version of our survey question:
Over the past 15 years, Canada has spent more than three billion dollars funding the international organization: 'Global Fund to Fight AIDS, Tuberculosis and Malaria'. The organization has had some success, but they've consistently failed to meet many of their objectives. Critics suggest that our money could be put to better use. Do you think we should: increase our funding, decrease our funding, or leave the funding where it is?
The “increase funding” crowd shrank from 121 down to 82, while “leave it where it is” climbed from 98 to 137. Interestingly, “decrease funding” remained steady at 119.
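The shift between the two wordings is easy to check with a few lines of arithmetic. The riding counts below come straight from the results above; the helper names are just for illustration.

```python
# Compare riding tallies for the two versions of the survey question.
# Counts are the ones reported in this post.

first_wording = {"increase": 121, "decrease": 119, "leave": 98}
second_wording = {"increase": 82, "decrease": 119, "leave": 137}

def shift(before, after):
    """Change in the number of ridings choosing each answer."""
    return {answer: after[answer] - before[answer] for answer in before}

def percentages(counts):
    """Each answer's share of all ridings, as a percentage."""
    total = sum(counts.values())
    return {answer: round(100 * n / total, 1) for answer, n in counts.items()}

changes = shift(first_wording, second_wording)
print(changes)
print(percentages(first_wording))
print(percentages(second_wording))
```

Both tallies add up to all 338 ridings, and the 39 ridings that left “increase” are matched exactly by the 39 that “leave it where it is” gained.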
How did responses to the second question vary between provinces? Take a look for yourself:
“Decrease funding” seems to dominate in Nova Scotia and “leave funding where it is” appears to have a higher proportion of support in Quebec than anywhere else.
Do those results make sense? For whatever it’s worth, I suspect that if we were to dive deeply into results from a few more survey questions we’d discover a reasonably strong relationship with real-world opinions. Of course, it might help things if I beefed up the demographics a bit. I could, for instance, add metrics representing urban vs rural ridings and dominant ethnic population groups.
I’m only beginning to appreciate where this could go. For example, I could imagine these tools being helpful for targeting public service, political, or commercial communications. Policy officials could use the data to more effectively deliver government services. And this kind of data could reduce the costs and resources required for important research projects.
Excellent read...predicting is always a fun sport. Even for AI.
Your experiment demonstrates how sensitive the "responses" are to the way the question is asked. But more fundamentally: Having built this more or less comprehensive socio-economic profile of each riding, where exactly do the "opinions" come from? Something's missing here.