INNOVATION
Issue 43: Fall 2025

Using AI to catch AI: research puts large language models to work detecting fake e-commerce reviews

360 Degrees

[Animated image: a magnifying glass moves over a collage of review comments to reveal either checkmarks or Xs.]

Looking to get the lowdown on a product you might buy, or wondering what to order at a local restaurant? Maybe you’re searching for a new show to watch on your preferred streaming service. Whatever it is, chances are you’ll read some online reviews to help inform your decision.

The problem? Not all online reviews can be considered truthful, accurate or even human.

New research from Toronto Metropolitan University (TMU) uses AI-powered large language models to detect whether a review can be believed, or whether it is a computer-generated “fake.”

The ratings, reviews and suggestions we read online are often the product of a recommender system. Powered by algorithms that analyze reams of user data to predict our preferences, recommender systems help digital businesses introduce people to products or content that might otherwise go overlooked.

However, these systems are vulnerable to “shilling attacks,” where fake or misleading information is deliberately posted with malicious intent. The goal can be to promote one product with a flood of positive reviews, or diminish another by burying it in negative feedback.
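
To see the mechanics in miniature, consider the sketch below: a burst of injected five-star ratings shifts the average score that a naive recommender would rank items by. The data and function are invented for illustration and are not drawn from the TMU study.

```python
# Minimal illustration (not from the TMU study): how a "push" shilling
# attack shifts an item's average rating in a naive recommender.

def average_rating(ratings):
    """Return the mean rating, the signal a simple recommender ranks by."""
    return sum(ratings) / len(ratings)

# 40 genuine ratings centred around 3 stars (invented data).
genuine = [3, 4, 2, 3, 3, 4, 2, 3] * 5

# An attacker injects 25 fake five-star ratings in a short burst.
injected = [5] * 25

print(f"Before attack: {average_rating(genuine):.2f}")             # 3.00
print(f"After attack:  {average_rating(genuine + injected):.2f}")  # 3.77
```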

E-commerce is a fast-growing, multi-billion-dollar segment of Canada’s economy. According to Statistics Canada, year-on-year sales rose seven per cent to $67.7 billion in 2023, extending a “sustained shift” towards digital commerce. In such an environment, protecting the integrity of recommender systems is an essential part of maintaining customer trust in online retailers and service providers. 

Building safeguards against “silent infiltration”

“Shilling attacks are designed to silently infiltrate recommender systems, manipulating outcomes without raising any red flags,” explained project supervisor Rasha Kashef, a professor of electrical, computer, and biomedical engineering. “Our research aims to build intelligent safeguards that can detect these covert e-commerce manipulations before they distort user trust and platform integrity.”

Attackers often use generative AI technologies such as ChatGPT to quickly create huge quantities of fake reviews. Working with professor Kashef, PhD student and lead researcher Dina Nawara developed a novel framework that uses AI-powered large language models to distinguish human-written reviews from those generated by a computer.
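
The framework's exact architecture is not detailed here, but in general terms this kind of detection can be framed as binary text classification with a fine-tuned transformer. The sketch below uses the Hugging Face transformers library with a hypothetical checkpoint name; it illustrates the general approach, not the TMU system itself.

```python
# Hedged sketch of LLM-based fake-review detection with Hugging Face
# transformers. "my-org/review-authenticity-model" is a hypothetical
# fine-tuned checkpoint, not the TMU framework.
from transformers import pipeline

detector = pipeline(
    "text-classification",
    model="my-org/review-authenticity-model",  # hypothetical checkpoint
)

review = "This product exceeded my expectations in every conceivable way."
result = detector(review)[0]
print(f"label={result['label']}, confidence={result['score']:.3f}")
```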

“It’s like building a shield, a defence mechanism,” Dina Nawara said of the framework. “We want to give people the chance to make an informed decision whenever they are online.”

The research began with more than 20 million real-world reviews from popular e-commerce sites such as Amazon, TripAdvisor and Yelp. This massive dataset was used to identify the linguistic and writing styles of human-written reviews. 

Based on the characteristics of that human-written content, the researchers used generative AI to create a set of highly realistic fake reviews that mimicked the originals. Finally, these new reviews were mixed with the authentic ones, creating a dataset ideal for testing and training their detection methods.
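
A minimal sketch of that mixing step, with invented reviews and an assumed 80/20 train/test split, might look like this:

```python
import random

# Hedged sketch: mix genuine reviews (label 0) with AI-generated fakes
# (label 1) into one shuffled, labelled dataset for training and testing.
# The record structure and 80/20 split are illustrative assumptions.
def build_dataset(real_reviews, generated_reviews, train_frac=0.8, seed=42):
    records = ([{"text": t, "label": 0} for t in real_reviews]
               + [{"text": t, "label": 1} for t in generated_reviews])
    random.Random(seed).shuffle(records)
    cut = int(len(records) * train_frac)
    return records[:cut], records[cut:]  # (training set, test set)

train_set, test_set = build_dataset(
    ["Arrived a day late but works fine.", "Decent value for the price."],
    ["An absolutely flawless product that transformed my daily routine."],
)
```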

The results were impressive: the framework’s two most effective models were both able to accurately identify more than 97 per cent of fake, computer-generated reviews.
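
In evaluation terms, "identifying more than 97 per cent of fakes" corresponds to recall on the computer-generated class. The snippet below, with invented labels, shows how such a figure would be computed:

```python
from sklearn.metrics import recall_score

# Hedged sketch: the reported "per cent of fakes identified" is the
# recall of the fake class (label 1). These labels are invented.
y_true = [1, 1, 1, 1, 0, 0, 0, 1, 0, 1]  # ground truth (1 = fake)
y_pred = [1, 1, 1, 0, 0, 0, 0, 1, 0, 1]  # model predictions

print(f"Fake-review recall: {recall_score(y_true, y_pred, pos_label=1):.2f}")
```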

A better defence against shilling attacks

Professor Kashef attributes the framework’s success to its multi-modal approach, explaining that most fake review detection methods rely solely on surface-level text features or statistical anomalies. In comparison, the TMU framework provides a more robust defence against fake content by combining linguistic elements, metadata indicators and time-aware patterns.

By examining the timestamp and metadata attached to reviews, for instance, the framework can identify whether an attacker has loaded a large amount of fake content into a recommender system within a brief time period, or from the same IP address. It also checks whether a reviewer is a verified purchaser.
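
A simplified version of those checks might look like the sketch below; the thresholds, field names and flag wording are illustrative assumptions, not the framework's actual rules.

```python
from collections import Counter
from datetime import timedelta

# Hedged sketch of metadata/temporal red flags; thresholds are invented.
# Each review is assumed to be a dict with "timestamp" (datetime),
# "ip" (str) and "verified_purchase" (bool) fields.
def metadata_red_flags(reviews, burst_window=timedelta(hours=1), burst_size=20):
    flags = []
    times = sorted(r["timestamp"] for r in reviews)
    # Burst check: many reviews landing within one short window.
    for i in range(len(times) - burst_size + 1):
        if times[i + burst_size - 1] - times[i] <= burst_window:
            flags.append("rating burst within a short time window")
            break
    # Same-IP check: one address responsible for many reviews.
    ip_counts = Counter(r["ip"] for r in reviews)
    if ip_counts and ip_counts.most_common(1)[0][1] > 10:
        flags.append("many reviews from a single IP address")
    # Verified-purchaser check.
    unverified = sum(1 for r in reviews if not r.get("verified_purchase"))
    if unverified > len(reviews) / 2:
        flags.append("majority of reviewers are not verified purchasers")
    return flags
```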

As for the actual language used in reviews, a major red flag for fakes is flawless spelling and grammar.

“Is it perfect, or is it human?” Dina Nawara asked. “Humans are not perfect when it comes to reviews. Some computer-generated reviews are very sophisticated when it comes to wording and phrases.”
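
One crude way to operationalize that “too perfect” signal is a spelling-error rate. The sketch below uses the third-party pyspellchecker package with an invented threshold; in practice this would be only one weak signal among many, since flawless text alone does not prove a review is fake.

```python
# Hedged sketch of a "too perfect" linguistic signal using the
# third-party pyspellchecker package; the threshold is invented.
from spellchecker import SpellChecker

def looks_suspiciously_flawless(text, max_error_rate=0.01):
    words = [w.strip(".,!?").lower() for w in text.split() if w.strip(".,!?")]
    if not words:
        return False
    misspelled = SpellChecker().unknown(words)
    return len(misspelled) / len(words) <= max_error_rate

# Typos read as human, so this prints False.
print(looks_suspiciously_flawless("recieved it yesterday, works fine"))
```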

Further research is needed to expand the framework, enhancing its ability to detect increasingly sophisticated fake review content. However, Professor Kashef, who has led several previous projects examining the trustworthiness of recommender systems, said the current framework represents “a significant leap forward” in defending the integrity of e-commerce platforms.

“The results highlight how AI can effectively flag fake reviews,” she said, “even those generated by increasingly sophisticated language models.”

Read the paper, “A dual-phase framework for detecting authentic and computer-generated customer reviews using large language models,” in the Decision Analytics Journal.

This research has been supported by the Natural Sciences and Engineering Research Council of Canada (NSERC).