Sunday, June 2, 2019

Applied Data Science Capstone - Report


INTRODUCTION
Is a new pizza restaurant in Vermont a viable business proposition?  If so, where should it be located?

Vermont is well known for its green mountains, ski resorts, colorful foliage, and landscapes.  It is also famous for its cheddar cheese, maple syrup, and Ben & Jerry's ice cream.  Its pizza is not as famous as that of New York or Chicago, but there are many chain and local pizza restaurants in the major cities and towns.  I was born and raised in rural Vermont, moved to the Burlington area in 1996, and have lived in several of the largest cities and towns in the state since then.

The purpose of this project is to investigate the existing restaurants and other venues in Vermont: How many are there?  Where are they?  Are there one or more general areas where businesses (specifically restaurants) seem to thrive?  Are the pizza restaurants evenly distributed?  Is there a heavily populated area where a new pizza restaurant appears likely to succeed?

The target audience for this investigation is any current restaurant owner who wants to understand more about the competition, or a prospective entrepreneur who wants to open a new pizza place.


PROJECT DATA
1. WEB SCRAPING
  - Research WIKIPEDIA and United States Census Bureau sites to understand population hubs in Vermont
  - I will use this data to narrow down the initial search to the most populated area of the state
2. GEOCODERS
  - Use this library to turn addresses into GPS coordinates that can be easily mapped with Folium
  - The resulting latitude and longitude coordinates can also be passed into FourSquare to return Venues in the areas
3. FOURSQUARE
  - Use FourSquare Venue and Location data to see what restaurants and bars already exist in the most populated areas of Vermont
  - Map the FourSquare venues to visualize the results
  - Apply machine learning and clustering techniques to group the data and find the best location for a new pizza restaurant


METHODOLOGY
1. Web scrape Vermont city and town lists with population statistics
2. Look up each city's GPS coordinates and map the data
3. Find the most densely populated areas of Vermont to focus the rest of the project analysis
4. For each city in the target area, use FourSquare to return the pizza restaurants within ~2 km of the city center, then map/overlay the results
  * NOTE : Some cities required a larger search range because the geographic city center is not close enough to the city's population center - in these cases the search radius was increased to ~6 km from the city center
5. Use FourSquare to return ALL the food and drink venues in the target cities
6. Cluster the target cities with K-Means clustering (a machine learning technique) based on venue categories
7. Map the final results - clustered cities, population metrics, and pizza restaurants in those areas


ANALYSIS 
Web scrape a list of Vermont cities and pull the data into a dataframe, then use the geocoders library to look up the GPS coordinates of each city and map the results.
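The lookup step could be sketched roughly as follows. This is not the notebook's actual code: it assumes the "geocoders library" is geopy and picks the Nominatim geocoder, which the report does not specify. The helper names `lookup_coordinates` and `make_nominatim_geocoder` are made up for illustration.

```python
def lookup_coordinates(cities, geocode, state="Vermont"):
    """Return {city: (latitude, longitude)}, or None where the lookup fails.

    `geocode` is any callable that takes an address string and returns an
    object with .latitude and .longitude (or None when nothing is found).
    """
    coordinates = {}
    for city in cities:
        location = geocode(f"{city}, {state}")
        if location is None:
            coordinates[city] = None
        else:
            coordinates[city] = (location.latitude, location.longitude)
    return coordinates


def make_nominatim_geocoder(user_agent="vt_pizza_capstone"):
    """Build a geocode callable backed by geopy's Nominatim (assumed choice)."""
    # Imported lazily so lookup_coordinates can be used without geopy installed.
    from geopy.geocoders import Nominatim

    return Nominatim(user_agent=user_agent).geocode
```

With geopy installed and network access, `lookup_coordinates(["Burlington", "Essex"], make_nominatim_geocoder())` would return one coordinate pair per city, ready to be added as columns of the dataframe and plotted with Folium.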


 



There is not much we can determine from the first map... let's try again with markers sized relative to population
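One way to size the markers is to make each marker's area proportional to population, so that the largest city does not swamp the map. The sketch below is an assumption about how this could be done, not the scaling used for the original map; `marker_radius` and its pixel limits are illustrative. The populations are the census figures quoted in this report.

```python
import math


def marker_radius(population, max_population, max_radius=30.0, min_radius=3.0):
    """Radius in pixels such that marker AREA grows linearly with population."""
    if max_population <= 0:
        raise ValueError("max_population must be positive")
    scaled = max_radius * math.sqrt(max(population, 0) / max_population)
    # Keep very small towns visible on the map.
    return max(min_radius, scaled)


populations = {
    "Burlington": 42_239,
    "Essex": 21_519,
    "South Burlington": 19_141,
    "Colchester": 17_309,
}
largest = max(populations.values())
radii = {city: marker_radius(pop, largest) for city, pop in populations.items()}
```

Each value in `radii` could then be passed as the `radius` argument of a Folium `CircleMarker`.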

Burlington and its surrounding towns account for ~25% of Vermont's residents.
https://en.wikipedia.org/wiki/Chittenden_County,_Vermont

POPULATION STATS (via United States Census Bureau 2017)
* Vermont = 624,525
* Chittenden County = 162,372
* 1st Largest City = Burlington = 42,239
* 1st Largest Town = Essex = 21,519
* 2nd Largest City = South Burlington = 19,141
* 2nd Largest Town = Colchester = 17,309
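As a quick check of the "~25%" figure, the share can be computed directly from the census numbers listed above. The variable names are illustrative only.

```python
vermont_population = 624_525
chittenden_population = 162_372

# Share of Vermont residents living in Chittenden County.
county_share = chittenden_population / vermont_population
print(f"Chittenden County share of Vermont: {county_share:.1%}")  # prints 26.0%

# The four largest municipalities listed above, as a share of the county.
top_four = 42_239 + 21_519 + 19_141 + 17_309
top_four_share = top_four / chittenden_population
print(f"Top four municipalities share of county: {top_four_share:.1%}")
```

So the county holds roughly a quarter of the state's residents, consistent with the statement above.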

https://vtdigger.org/2019/05/30/woolf-population-growth-mostly-towns-within-50-miles-burlington/
It’s a well-known fact that Vermont’s population isn’t growing.
From 2010 to 2018 the state’s population grew by only 0.1%, just a fraction of the national rate of 5.8%.  More than half of all cities and towns in Vermont had fewer residents in 2018 than in 2010.
Chittenden County was the only county in the state where every town gained population.

Although every town and city in Chittenden County grew, Burlington and Winooski grew by only 1% over the 8 years from 2010 to 2018.
The big gainers in Chittenden County were Essex, Williston, South Burlington, Shelburne and Milton.
Essex and Williston, two of the largest towns in the county, and indeed in the state, experienced double-digit growth.

Areas of population growth are ideal locations for opening a new business.
Let's use FourSquare to find all the pizza restaurants in Chittenden County
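The search for each city can be sketched as a request builder. This assumes the legacy FourSquare v2 `venues/search` endpoint with its `ll`, `radius`, `categoryId`, `limit`, and `v` parameters; the report does not say which endpoint was called. The function name, the `wide_search` flag, and the default version date are illustrative, and the category ID and credentials are left as parameters rather than guessed.

```python
from urllib.parse import urlencode

# Legacy v2 endpoint (assumption; the report does not name the endpoint).
SEARCH_ENDPOINT = "https://api.foursquare.com/v2/venues/search"
DEFAULT_RADIUS_M = 2000  # ~2 km around the city center
WIDE_RADIUS_M = 6000     # ~6 km for cities whose residents live far from the center


def build_venue_search_url(lat, lng, category_id, client_id, client_secret,
                           wide_search=False, version="20190602", limit=50):
    """Build (but do not send) a venue search request for one city."""
    params = {
        "client_id": client_id,
        "client_secret": client_secret,
        "v": version,                 # API version date, YYYYMMDD
        "ll": f"{lat},{lng}",         # city center from the geocoding step
        "radius": WIDE_RADIUS_M if wide_search else DEFAULT_RADIUS_M,
        "categoryId": category_id,    # e.g. the pizza category ID
        "limit": limit,
    }
    return f"{SEARCH_ENDPOINT}?{urlencode(params)}"
```

Keeping the URL construction separate from the HTTP call makes it easy to see which radius each city was searched with before any request is sent.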




Now let's go back to the FourSquare data and collect all the Food and Drink venues in the area so we can do some clustering analysis

Breaking the data down into 5 clusters, we see a couple of cities that are similar to Burlington (Shelburne and Winooski).
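To illustrate the idea behind the clustering, here is a small, dependency-free sketch. It is a stand-in for a library K-Means implementation, not the code used for this report: it uses a deterministic farthest-point start instead of a random initialization, and the city names and venue lists are invented for the demo. Each city is described by how often each venue category appears, and cities with similar profiles end up in the same cluster.

```python
from collections import Counter


def category_profiles(venues_by_city):
    """Turn {city: [category, ...]} into {city: [frequency of each category]}."""
    categories = sorted({c for cats in venues_by_city.values() for c in cats})
    profiles = {}
    for city, cats in venues_by_city.items():
        counts = Counter(cats)
        total = len(cats) or 1  # avoid dividing by zero for empty cities
        profiles[city] = [counts[c] / total for c in categories]
    return profiles


def _sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=50):
    """Lloyd's algorithm with farthest-point initialization; returns labels."""
    centroids = [list(points[0])]
    while len(centroids) < k:
        farthest = max(points, key=lambda p: min(_sq_dist(p, c) for c in centroids))
        centroids.append(list(farthest))
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assign every point to its nearest centroid.
        labels = [min(range(k), key=lambda j: _sq_dist(p, centroids[j])) for p in points]
        # Move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Hypothetical data: three made-up cities, two clusters.
venues_by_city = {
    "City A": ["Pizza Place", "Bar", "Bar", "Coffee Shop"],
    "City B": ["Pizza Place", "Bar", "Bar", "Coffee Shop"],
    "City C": ["Fast Food Restaurant", "Fast Food Restaurant", "Diner"],
}
profiles = category_profiles(venues_by_city)
cities = list(profiles)
labels = kmeans([profiles[city] for city in cities], k=2)
clusters = dict(zip(cities, labels))
```

In this toy example City A and City B share a cluster because their venue mixes are identical, which is the same sense in which Shelburne and Winooski are "similar to Burlington" in the 5-cluster result.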

City Cluster Map
Cluster Map with venue overlay

And now let's look at some statistics :
PizzaCount = # of pizza restaurants in the City
PopulationPerArea =  Population / Area (square miles)
PopulationPerPizza = Population / PizzaCount
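The two ratios defined above are simple divisions. A minimal sketch follows, with illustrative function names; returning infinity for a town with no pizza restaurants is my own choice, not something stated in the report. The Burlington inputs are the population and pizza count quoted elsewhere in this report.

```python
def population_per_area(population, area_sq_miles):
    """Residents per square mile (population density)."""
    if area_sq_miles <= 0:
        raise ValueError("area must be positive")
    return population / area_sq_miles


def population_per_pizza(population, pizza_count):
    """Residents per pizza restaurant; infinite when a town has none."""
    if pizza_count == 0:
        return float("inf")
    return population / pizza_count


# Burlington: 42,239 residents and 14 pizza restaurants.
burlington_per_pizza = population_per_pizza(42_239, 14)
print(round(burlington_per_pizza))  # about 3017 residents per pizza restaurant
```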

- I did a few minutes of research into the 3 Winooski pizza restaurants : Domino's Pizza, Pizza Putt, and Pizzeria Ida
  * Pizza Putt is actually closed permanently
  * Domino's Pizza (the well known chain) is delivery only
  * Pizzeria Ida is on the border with Burlington - not in the downtown Winooski area
This information is updated in the graph below
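The Winooski adjustment can be expressed as a simple filter. The record fields `permanently_closed` and `delivery_only` are hypothetical names for the findings listed above. The sketch assumes Pizzeria Ida is still counted despite sitting on the Burlington border, since the adjustment described is only for closed and delivery-only venues.

```python
winooski_pizza = [
    {"name": "Domino's Pizza", "permanently_closed": False, "delivery_only": True},
    {"name": "Pizza Putt", "permanently_closed": True, "delivery_only": False},
    {"name": "Pizzeria Ida", "permanently_closed": False, "delivery_only": False},
]

# Keep only venues that are open and offer more than delivery.
counted = [
    venue["name"]
    for venue in winooski_pizza
    if not venue["permanently_closed"] and not venue["delivery_only"]
]
adjusted_pizza_count = len(counted)
print(adjusted_pizza_count)  # prints 1
```

Dropping from 3 restaurants to 1 triples Winooski's PopulationPerPizza, which is why the adjustment matters for the comparison.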

We are looking primarily for a HIGH PopulationPerPizza and, less importantly, a HIGH PopulationPerArea (a dense population means more foot traffic)
NOTE : Population and PizzaCount were adjusted to map in the 0-7000 range for easier comparison
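The report does not say exactly how the values were adjusted, so the following is only one possible approach: multiply each series by a constant so that its largest value lands at the top of the range. The function name and the sample pizza counts are illustrative.

```python
def scale_to_range(values, upper=7000.0):
    """Rescale non-negative values so that the largest one equals `upper`."""
    peak = max(values)
    if peak <= 0:
        raise ValueError("need at least one positive value")
    return [value * upper / peak for value in values]


# Hypothetical pizza counts for three towns.
scaled_counts = scale_to_range([14, 7, 0])
print(scaled_counts)  # [7000.0, 3500.0, 0.0]
```

A constant factor keeps the ratios between towns intact, so the bars remain comparable within each series even though the units no longer match the raw counts.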




RESULTS AND DISCUSSION
What did we learn in our research about Vermont population and restaurants?
- A large portion of Vermont's population (~25%, ~160,000 people) is centered around the city of Burlington in Chittenden County
- This group of cities and towns already contains a large number of pizza restaurants, but given that population numbers are rising here while they decline elsewhere in the state, it seems like a good area to invest
- 57 pizza places exist today in the 10 towns around Burlington, Vermont
- Burlington is THE big city of Vermont - businesses thrive there, including pizza restaurants, of which it has the most in the area (14).  We could open another business in Burlington, but perhaps a better option is to find a city that is most similar to Burlington
- A couple of towns in the area are similar to Burlington in "most common venues"
         The similar towns are : Shelburne and Winooski
- The city with the highest PopulationPerArea (population density) is Winooski, but since it's only 1.5 square miles, it may be difficult to find a location
- The highest PopulationPerPizza (# of residents per pizza restaurant) was originally a close race between Colchester and Hinesburg... but after adjusting Winooski for the closed and delivery-only restaurants, Winooski is the winner in this category as well


FURTHER ANALYSIS
If time permitted, the next steps of the investigation would include :
- Comparing real estate prices, property taxes, and real estate availability in each city/town
- Investigating existing venues to ensure that they are still up and running - it seems the FourSquare data might be slightly outdated
- Comparing the types of restaurants that exist today across several categories :
   service (delivery/slices/dine-in/takeout), price, parking availability, etc.


CONCLUSION
Anyone who is interested in opening a pizza restaurant in Vermont should certainly look to the Burlington area, which has ~25% of the state population and towns that are all growing.
While they could look for real estate right in Burlington, I suspect it is the most expensive in the area, and they would also face the most competition there.  I would instead recommend looking at the surrounding towns.

Winooski and Shelburne have the most similar venues to Burlington (per our machine-learning cluster analysis).  Shelburne, however, has a low population density, while Winooski has the highest in the area.  Winooski also has the highest population per pizza restaurant after adjusting for the recently closed and delivery-only restaurants.
So, in conclusion, I would recommend opening a new pizza restaurant in Winooski.

