How the Data Kinetics Consumer Insights engine weights and combines multiple sources

I first released Data Kinetics in 2021. It was the first Audience Intelligence system to attempt a ‘multi-modal’ approach to insight analysis, bringing multiple sources of insight together and using machine learning and AI to find a pattern through diverse (and often conflicting) data sets.

I'm often asked how we weight different data sets and combine them, so this month I'm going to go into a little detail about the sources of information and those weights, using a real-world example deployed for a global consumer company.

The key difference between Data Kinetics and most other marketing insight systems is that we start from the basis that all human opinion is ‘kinetic’, or fluid. That is, our opinions shift slightly depending on who we are in front of and the platform we are on, and we can hold different opinions at more or less the same time.

Traditional research tends to ignore those contradictions and focus on one or two sources. Our approach is to accept that these differences exist, to find the patterns in the unconscious bias that depends on the platform and the audience we are in front of, and to weight these opinions up or down, seeking the most powerful path through those divergent motivations.

This enables us to work with multiple sources of consumer insight and data. Below are the key sources we work with.

While multiple sources are excellent for getting us that ‘Total Human Insight’, you need to understand that each source has different strengths and weaknesses, accuracies and inaccuracies, which we will go into in a minute.

They also require different levels of volume to find accurate patterns. For example, ‘ethnography’, that is, observing consumers in real time, delivers accurate data about product use as it happens. However, it is low in volume and can miss trends due to lack of scale. People also know they are being observed, so behaviour does change.

Survey data has scale, but the answers are skewed by the necessity to use consumers who have the inclination and time to fill out surveys. The act of payment also encourages panellists to give answers even if they don’t have an opinion. And again, they know they are being judged, and nothing changes behaviour more than the knowledge that we are being judged by our peers.

Here is an overview of the minimum volume we seek to obtain from each source.


Strengths and weaknesses of consumer research data sources

When it comes to the strengths and weaknesses of sources, we understand these and weight them depending on the client and campaign. For example, there are sources where consumers do not realise they are being monitored or reviewed, so they ask questions that they are perhaps embarrassed to ask on social media, or highlight fears or barriers that they won’t talk about in public. These include the links that individuals click on, and also the questions they ask in Google and other search services.

When it comes to understanding trends, social media and social listening are a good source for understanding the direction of travel. But you have to understand that whilst the loudest voices on social media are important and tend to define the general public perception of a product or service, each is still only one person’s opinion. In other words, one person saying the same thing 100 times to thousands of people is still only one person’s opinion. Most social listening systems are not built to take this into account.

Here are the strengths, as we see them, of different sources.

And here are the weaknesses.

Weighting the data

In the end, however, you still need to weight these sources. This weighting can change per project. Below is an example of a basic weighting for a real research project.

And this is where it starts to come together.
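To make the idea of per-project weighting concrete, here is a minimal sketch of one way a weighted combination could work. It is not the Data Kinetics implementation: the source names, the weights, the scores and the `combine_scores` helper are all invented for illustration, and a plain weighted average is simply the most basic technique that fits the description.

```python
from typing import Dict

# Hypothetical per-project weights. These numbers are made up for the
# example and are NOT the weights used on the client project.
SOURCE_WEIGHTS: Dict[str, float] = {
    "social": 0.35,
    "ethnography": 0.25,
    "survey": 0.20,
    "search": 0.10,
    "reviews_forums": 0.10,
}


def combine_scores(scores: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted average of per-source scores.

    Only sources that actually reported a score are used, and their
    weights are renormalised so a missing source does not drag the
    result towards zero.
    """
    used = {source: w for source, w in weights.items() if source in scores}
    total_weight = sum(used.values())
    if total_weight == 0:
        raise ValueError("no weighted source reported a score")
    return sum(scores[source] * w for source, w in used.items()) / total_weight


# Invented scores in [-1, 1]: how strongly each source supports one opportunity.
# Sources are allowed to disagree (the survey here is mildly negative).
opportunity_scores = {"social": 0.6, "ethnography": 0.2, "survey": -0.1, "search": 0.4}
combined = combine_scores(opportunity_scores, SOURCE_WEIGHTS)
print(round(combined, 3))
```

Changing the weights per client or campaign only means passing a different dictionary, which is the point of keeping the weights separate from the scores.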

Sources and their strengths and weaknesses

Social media as a source: for example, you can see how we explain that social media is excellent for consumers identifying themselves as category participants, enabling us to find them and extract them. Using AI, we are able to give a highly accurate analysis of these consumers’ affinities, psychology, opportunities and issues, and channel usage. Income level, however, is inferred and is not accurate on this channel.
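The point that one person repeating themselves is still only one opinion can be sketched in a few lines. This is an illustrative assumption about how such a correction might look, not a description of our pipeline: the `(author_id, opinion_label)` post format and the `count_distinct_voices` function are hypothetical, and the sample posts are invented.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple


def count_distinct_voices(posts: Iterable[Tuple[str, str]]) -> Dict[str, int]:
    """Count how many distinct authors expressed each opinion.

    Each post is an (author_id, opinion_label) pair. Repeats of the same
    opinion by the same author are counted once.
    """
    authors_by_opinion: Dict[str, Set[str]] = defaultdict(set)
    for author_id, opinion in posts:
        authors_by_opinion[opinion].add(author_id)
    return {opinion: len(authors) for opinion, authors in authors_by_opinion.items()}


# Invented data: one loud account posts the same complaint 100 times,
# while two different people each praise the product once.
posts = [("user_a", "too expensive")] * 100 + [
    ("user_b", "love the taste"),
    ("user_c", "love the taste"),
]
voices = count_distinct_voices(posts)
print(voices)  # {'too expensive': 1, 'love the taste': 2}
```

A raw mention count would rank the complaint 100 to 2. Counting distinct voices reverses that ranking, although reach may still matter separately when judging how far the loud account shapes public perception.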

Ethnography provides strong observational and direct analysis, but it is not relied on for trends, as its low volume inhibits accurate pattern detection and prediction of trend direction.

Survey enables strong answers to specific questions, but it is skewed by the respondent’s knowledge that they are being evaluated, so it is used as an overlay to support or detract from the Social and Ethnography sources.

Google Trends and AdWords are not used as primary sources, but provide colour on hidden motivations and reweight the opportunities and issues found from Social, Ethnography and Focus group studies.
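One simple way to picture a secondary source reweighting a primary finding is a bounded nudge. The sketch below is an assumption made for illustration only: the `reweight_with_overlay` function, the 0 to 1 score scale, the -1 to 1 overlay signal and the `max_shift` cap are all hypothetical and are not taken from the engine.

```python
def reweight_with_overlay(
    primary_score: float, overlay_signal: float, max_shift: float = 0.2
) -> float:
    """Nudge a primary score up or down using a secondary (overlay) signal.

    primary_score  -- strength of an opportunity or issue from primary
                      sources, on a 0..1 scale (assumed scale).
    overlay_signal -- support from a secondary source such as search
                      data, from -1 (contradicts) to +1 (supports).
    max_shift      -- cap on how far the overlay may move the score, so
                      a secondary source can never dominate.
    """
    if not -1.0 <= overlay_signal <= 1.0:
        raise ValueError("overlay_signal must be between -1 and 1")
    adjusted = primary_score + max_shift * overlay_signal
    # Clamp so the result stays on the same 0..1 scale as the input.
    return min(1.0, max(0.0, adjusted))


# Invented example: search data strongly supports an issue seen on social.
print(reweight_with_overlay(0.5, 1.0))
# Invented example: search data contradicts it.
print(reweight_with_overlay(0.5, -1.0))
```

The cap reflects the idea that these sources add colour rather than act as primary evidence: they can move a finding, but only within limits.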

Advertising data, provided by Meta, shows who clicks on, has an interest in, and engages with certain ads. It is highly accurate for demographic, income and interest data. However, Meta controls all validation and does all the language analysis, so this is second-order extraction rather than first-order analysis based on primary data.

Reviews and forums such as Reddit and Quora discuss issues and opportunities in volume, and these affect public perception due to ‘influencers’. However, it is difficult to know whether all participants are in category, so this data is used only to validate trends and to colour opportunities and issues.

Conclusion

Research really is about understanding the complexity of the human condition, but also the general direction of travel of consumer opinion. My belief is that the more we ingest, even if contradictory, the more accurate we can be in predicting the opinions that will dominate culture.