What is Machine Learning?

The term machine learning has become increasingly popular in recent years. At its core, machine learning is a field of computer science in which computers learn from data without being explicitly programmed.

Machine learning can be traced back to the 1950s, when Arthur Samuel wrote a program that learned to play checkers and improved its strategy the more it played. Machine learning has since been applied to many disciplines and has far-ranging applications including data security, finance, healthcare, search algorithms, and even smart cars.

Machine Learning for Market Research

Artem Timoshenko, a doctoral student at MIT, and his advisor John Hauser have developed a new methodology that uses a form of machine learning called "convolutional neural networks" to find customer needs and insights in user-generated content (UGC). The details of this approach are published in the study "Identifying Customer Needs from User-Generated Content" in the peer-reviewed academic journal Marketing Science.

This innovative machine learning algorithm harvests readily available UGC to identify key insights in existing data. It draws on the natural language processing literature and involves convolutional neural networks (CNNs) as well as dense word and sentence representations.



How it Works: Using Machine Learning in Market Research

In simple terms, the machine learning algorithm is able to mine big data for insights. It can transform an abundance of existing data on a product or service into a detailed list of insights in customers' own language. The process of using machine learning to identify consumer insights is as follows:

1. Identify data sources and extract content: Identify the data sources to mine and extract relevant content from them. Then prepare the data for analysis, which involves splitting the UGC into individual sentences and other data-cleaning tasks.

2. Train the algorithm: Train word embeddings and apply a convolutional neural network (CNN) to separate informative sentences from non-informative ones. Informative sentences are those that contain important consumer insights or customer wants and needs.

3. Run the machine: The machine clusters sentence embeddings and selects sentences from different clusters to produce a final database of statements.

4. Final output from machine: The machine outputs a list of approximately 2,000 informative sentences that are diverse in insights. 

5. Analysis by trained professional: A trained professional analyst reviews the sentences and identifies a unique set of insights.
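Steps 1 through 4 above can be sketched in a few lines of Python. This is only an illustrative toy, not the method from the study: the keyword filter stands in for the trained CNN, and word-overlap grouping stands in for clustering sentence embeddings. All names here (NEED_CUES, is_informative, select_diverse, run_pipeline) are hypothetical.

```python
import re

def split_sentences(text):
    """Step 1: split one piece of UGC into sentences (naive punctuation split)."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]

def words(sentence):
    """Lowercased set of words in a sentence."""
    return set(re.findall(r"[a-z']+", sentence.lower()))

# ASSUMPTION: a hand-picked cue list standing in for the trained CNN of step 2.
NEED_CUES = {"wish", "need", "want", "hard", "difficult", "easy"}

def is_informative(sentence):
    """Step 2 stand-in: keep sentences that mention a want, need, or problem."""
    return bool(words(sentence) & NEED_CUES)

def jaccard(a, b):
    """Word-overlap similarity between two word sets (0.0 to 1.0)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def select_diverse(sentences, threshold=0.5):
    """Step 3 stand-in: keep one sentence per group of near-duplicates.

    A sentence is kept only if it is dissimilar to every sentence kept so far.
    """
    kept = []
    for s in sentences:
        if all(jaccard(words(s), words(k)) < threshold for k in kept):
            kept.append(s)
    return kept

def run_pipeline(documents):
    """Steps 1-4: split, filter, and return a diverse list of sentences."""
    sentences = [s for doc in documents for s in split_sentences(doc)]
    informative = [s for s in sentences if is_informative(s)]
    return select_diverse(informative)

reviews = [
    "I wish the brush head was smaller. Arrived on Tuesday.",
    "I wish the brush head was a bit smaller! The battery is hard to replace.",
]
for sentence in run_pipeline(reviews):
    print(sentence)
```

In this toy run, the delivery remark is filtered out as non-informative and the two near-identical brush-head comments collapse into one, leaving a short, varied list for the analyst in step 5.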

This machine learning approach combines the strengths of computer science with the advantages of human analysis. Humans are needed to train the machine at the outset and also to analyze the machine's data output. Having learned from human training, the machine can process half a million pieces of data or more.

Mining Big Data For Insights

With machine learning, one is not limited to a single data source. The machine can process many different types of data, all of which can be incorporated into the analysis within one project.

Types of User-Generated Content Data Sources

  • Product reviews on e-commerce sites
  • Product review sites
  • Online discussion forums
  • Social media
  • Call center data
  • Open-ended survey data

Incorporating multiple data sources can lead to an even more diverse and/or informative set of insights as each data source provides a slightly different perspective.

User-Generated Content Data Requirements

With any type of data source, one must ensure that it adheres to certain guidelines for the machine learning algorithm to perform successfully. The content should be:

1. High Quality: 2,000 sentences (or more) of data

2. Substantive Submissions: At least 10 words per entry

3. Text-Based Data: The content is not in pictures, charts, or other types of graphics

4. Rich, Informative Content: Data in which desired attributes, wants & needs, problems, opinions and solutions are mentioned
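The two numeric requirements above (at least 2,000 sentences overall, at least 10 words per entry) lend themselves to a quick automated check before a project starts. The sketch below is one possible way to do it, assuming each entry is a plain-text string; the function and field names are hypothetical, and requirements 3 and 4 still call for human judgment.

```python
import re

MIN_SENTENCES = 2000        # requirement 1: 2,000 sentences or more
MIN_WORDS_PER_ENTRY = 10    # requirement 2: at least 10 words per entry

def count_sentences(text):
    """Count sentences using a naive split on end-of-sentence punctuation."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return len([p for p in parts if p])

def check_corpus(entries, min_sentences=MIN_SENTENCES,
                 min_words=MIN_WORDS_PER_ENTRY):
    """Drop entries that are too short, then report whether enough data remains."""
    substantive = [e for e in entries if len(e.split()) >= min_words]
    total_sentences = sum(count_sentences(e) for e in substantive)
    return {
        "entries_kept": len(substantive),
        "entries_dropped": len(entries) - len(substantive),
        "total_sentences": total_sentences,
        "enough_data": total_sentences >= min_sentences,
    }

sample = [
    "Too short.",
    "The handle is slippery when wet. I wish it had a rubber grip.",
]
print(check_corpus(sample))
```

A real data set would be passed in the same way as `sample`; a result with `enough_data` set to False suggests adding another data source before running the analysis.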

Advantages of Machine Learning

There are many advantages to using machine learning as a research tool. Some advantages include:

  • User-generated content is virtually free
  • Machine learning can draw on comments from thousands of people
  • Content contains insights that are freely volunteered at moments of truth
  • Machine-based analysis can overcome human bias
  • Ability to identify infrequently mentioned and unique insights

Machine Learning vs. Traditional Qualitative Research

Companies often conduct research to understand key product attributes, attitudes, opinions, features and solutions to gain a deep appreciation for the customer journey and the associated touchpoints. These insights help companies to drive successful innovation whether designing new products or improving the customer experience. However, research in these areas is often limited to the highest priority or highest risk areas since traditional qualitative research can require significant time, expense and effort to complete. With machine learning, one can conduct qualitative research (using data that already exists) in fewer steps, saving significant time, money and research effort.


Additional Resources

Have questions about machine learning? 
Read our Machine Learning FAQs post or contact us today to get answers!
