Yelp Ratings Exploration

I’ve had an obsession with Yelp ratings since I started my data science capstone project at Flatiron School. Yelp ratings can be useful, but as a restaurant or bar gets older, its overall rating becomes less informative, because years of old reviews outweigh recent ones. An example is the Yelp score for Lombardi’s, which claims to be NYC’s oldest pizza joint.

Lombardi’s Yelp score over time and a 250-review rolling average. Click the image to read more about how I created this graph.

The black line shows Lombardi’s Yelp score over time, and the green horizontal line shows the average score as of September 2019. In September of 2006, Lombardi’s displayed rating hit 4 stars, and it hasn’t changed since. You can see from the black line that the average has been slowly trending towards 3.5 stars, but as of September 2019 it was still 0.0385 stars above the threshold for dropping to 3.5. The rolling mean tells a more complete story, having drifted into 3.5-star territory in 2015.
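To make the difference between the two lines concrete, here is a minimal sketch of an all-time average next to a rolling mean. The ratings are made up for illustration (they are not Lombardi’s reviews), and `cumulative_and_rolling` is a hypothetical helper, not the code behind the graph.

```python
from collections import deque

def cumulative_and_rolling(ratings, window=250):
    """Return (cumulative_means, rolling_means) for ratings in chronological order.

    rolling_means[i] is None until `window` reviews have been seen.
    """
    cumulative, rolling = [], []
    recent = deque()      # the most recent `window` ratings
    total = 0             # sum of every rating seen so far
    recent_total = 0      # sum of the ratings currently in `recent`
    for count, stars in enumerate(ratings, start=1):
        total += stars
        recent.append(stars)
        recent_total += stars
        if len(recent) > window:
            recent_total -= recent.popleft()
        cumulative.append(total / count)
        rolling.append(recent_total / window if len(recent) == window else None)
    return cumulative, rolling

# Made-up data: 300 five-star reviews followed by 300 three-star reviews.
ratings = [5] * 300 + [3] * 300
cumulative, rolling = cumulative_and_rolling(ratings, window=250)
print(cumulative[-1])  # 4.0: the all-time average still remembers the early reviews
print(rolling[-1])     # 3.0: the rolling mean only sees the last 250
```

The all-time average moves slowly because every old review still counts, while the rolling mean reflects only what recent customers are saying.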

How many of those 6,000 reviews are quality reviews?

To see the impact of filtering reviewers, I took a look at the Yelp reviews of Young Ethel’s, a bar I love in Brooklyn that has 19 reviews and a 5-star rating: 15 five-star reviews and 4 four-star reviews, for an average of 4.7895 stars.

Since Young Ethel’s is relatively new, we can assume that some of these reviews come from friends and family helping the bar get started.

A questionable review

A few red flags for this review: no friends, no user photo, and a low number of reviews. I also know the bar’s opening date (August 22, 2019), so this review was written within a month of opening.

Of the 19 reviewers, 3 are missing user photos and 3 have no friends. In total, 5 are missing a photo, friends, or both.

All reviews: 4.7895 stars

Reviewers with Photos: 4.75 stars (-0.0395 difference)

Reviewers with Friends: 4.75 stars (-0.0395 difference)

Reviewers with Photos and Friends: 4.7143 stars (-0.0752 difference)
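The filtering can be sketched in a few lines. The rows below are reconstructed so that the counts and averages match the figures reported above; they are not the actual review data, and the `(stars, has_photo, has_friends)` layout is an assumption made for this example.

```python
# Reconstructed rows: (stars, has_photo, has_friends).
# NOT the real reviews; chosen only so the totals match the reported numbers.
reviews = (
    [(5, True, True)] * 10     # five-star reviewers with a photo and friends
    + [(4, True, True)] * 4    # four-star reviewers with a photo and friends
    + [(5, False, True)] * 2   # no photo
    + [(5, True, False)] * 2   # no friends
    + [(5, False, False)] * 1  # neither photo nor friends
)

def average(rows):
    """Mean star rating of the given rows."""
    return sum(stars for stars, _, _ in rows) / len(rows)

baseline = average(reviews)

filters = {
    "All reviews": lambda r: True,
    "Reviewers with Photos": lambda r: r[1],
    "Reviewers with Friends": lambda r: r[2],
    "Reviewers with Photos and Friends": lambda r: r[1] and r[2],
}

for label, keep in filters.items():
    kept = [r for r in reviews if keep(r)]
    score = average(kept)
    print(f"{label}: {score:.4f} stars ({score - baseline:+.4f}), {len(kept)} reviews")
```

One thing the reconstruction makes visible: for the averages to come out as reported, every reviewer removed by these filters must have left five stars.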

An overview of an Elite reviewer’s profile. Are all 3,800 reviews honest?

By filtering out the reviewers that have no photos or no friends, we’ve lowered the displayed rating by half a star: 4.7143 falls below the 4.75 cutoff, so it rounds to 4.5 stars instead of 5. There’s one more group that I believe should be filtered out: Yelp Elite Squad members. Elite members are rewarded for being active on the site with exclusive events, so new businesses are prime candidates for fresh reviews from Elite members. Since I’m looking to filter out reviews that aren’t genuine, I’d be remiss to leave Elite reviews in the score.

Non-Elite Reviewers with Photos and Friends: 4.8 stars (+0.0105 difference)

With the Elite reviews removed as well, the score ends up slightly higher than the original 4.7895. Without the Elite members and suspect reviewers, there are 10 reviews left.
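Whether any of these shifts shows up on the page depends on rounding. Here is a minimal sketch, assuming the displayed rating is the average rounded to the nearest half star with ties rounding up. The tie rule is my assumption, and `displayed_stars` is a hypothetical helper, not part of any Yelp API.

```python
import math

def displayed_stars(average):
    """Round an average to the nearest half star; ties round up (an assumption)."""
    return math.floor(average * 2 + 0.5) / 2

# Averages taken from the filtered scores in this post.
scores = {
    "All reviews": 4.7895,
    "Reviewers with Photos": 4.75,
    "Reviewers with Friends": 4.75,
    "Reviewers with Photos and Friends": 4.7143,
    "Non-Elite Reviewers with Photos and Friends": 4.8,
}

for label, score in scores.items():
    print(f"{label}: {score} -> {displayed_stars(score)} stars displayed")
```

Under that assumption, only the Photos and Friends filter crosses a cutoff and shows 4.5 stars. The 4.75 scores sit exactly on the boundary, so what they display depends entirely on the tie rule.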

My next step would be to see what impact this has on a larger scale, but for now, it’s interesting to see how weeding out reviews can impact the overall score.