
This is one of the mini-projects for our Data Mining and Wrangling class, done in collaboration with Jasper Pangan and Cedric Corro.

Summary

Surprise is a small, quiet city in Arizona that has attracted many elderly residents and new families because of its relatively low cost of living. Over the years, however, the city has become heavily commercialized, which has driven many small business owners out of business. Current small business owners can still compete by following current trends, expanding their offerings, and improving their services. These opportunities can be identified from customer reviews on websites like Yelp, which publishes crowd-sourced reviews of local businesses. In this study, we examine customer reviews, extracted from Yelp, of local businesses in Surprise, Arizona. To identify the major topics consumers are talking about, the review corpus was cleaned, stemmed, and vectorized. Dimensionality reduction using singular value decomposition (SVD) was then applied before clustering with the 𝑘-medians technique. This resulted in 6 main topics, with food and services as the overarching themes: happy hour experience, casual dining reviews, and feedback on home, professional, beauty, and mixed services.
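For readers unfamiliar with 𝑘-medians, the clustering step can be sketched roughly as follows. This is a minimal, standard-library illustration and not the code used in the project: the function names `manhattan` and `k_medians`, the explicit `init_centers` argument, and the toy 2-D vectors are all assumptions made for the example. It assumes the cleaning, stemming, vectorization, and SVD steps have already produced one numeric vector per review.

```python
from statistics import median


def manhattan(a, b):
    """L1 distance, the metric that the component-wise median minimizes."""
    return sum(abs(x - y) for x, y in zip(a, b))


def k_medians(points, init_centers, max_iter=100):
    """Cluster `points` around medians, starting from `init_centers`.

    Both arguments are lists of equal-length numeric sequences.
    Returns (labels, centers).
    """
    centers = [list(c) for c in init_centers]
    labels = []
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest center
        # (ties go to the lowest center index).
        labels = [
            min(range(len(centers)), key=lambda j: manhattan(p, centers[j]))
            for p in points
        ]
        # Update step: each center moves to the component-wise median
        # of the points assigned to it.
        new_centers = []
        for j, old in enumerate(centers):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centers.append([median(dim) for dim in zip(*members)])
            else:
                new_centers.append(old)  # empty cluster: keep its center
        if new_centers == centers:
            break  # converged: centers stopped moving
        centers = new_centers
    return labels, centers


# Toy 2-D vectors standing in for SVD-reduced review vectors (made-up values).
reduced = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.2], [5.0, 5.1], [5.2, 4.9], [4.9, 5.0]]
labels, centers = k_medians(reduced, init_centers=[reduced[0], reduced[3]])
print(labels)
print(centers)
```

Unlike 𝑘-means, the center update uses the median instead of the mean, which makes the centers less sensitive to outlying reviews. In a real run the number of clusters would be 6, matching the topics reported above, and the initial centers would typically be chosen at random or by a seeding heuristic.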