Analyzing My 2020 Google Location History Data | by Justin Spitzer | Dec, 2020
For the purpose of this article, I’ll be focusing only on data for 2020. The COVID-19 pandemic made 2020 an interesting year to say the least, and I’m curious what sorts of trends can be drawn from my location history.
The first thing I did was look at the data for “places,” which lists the address of each location where Google says I arrived. Not surprisingly, my home address appeared most frequently each month. I also plotted the frequency of arrivals at each of my two work locations:
As the COVID-19 pandemic took hold in the US, my travel behavior changed: I left the house far less often from March through May. In April, 92% of my trips, not counting arrivals home, were to one of my two work locations. The other 8% were grocery runs. My pre-pandemic habit was to make frequent trips for a few items at a time to a store within walking distance of my house; from March through May I instead made fewer, larger trips to lower the risk of transmission.
As more data emerged about the pandemic and health officials started to piece together the effective infection fatality rate (IFR), I began traveling more from mid to late May (while still taking the proper health precautions and avoiding large groups of people, of course).
Outside of logged events for home and work, I wanted to see what trends showed up for other repeatedly logged addresses. I generated a new dataframe with months as the index and each location as a column. Before any filtering, the dataframe gave me 329 unique locations. That sounds roughly accurate, although I would have predicted a larger number. To filter out the majority of addresses, which had only one visit, I used the following filter:
The output of this gave me a new dataframe with 25 locations, or 22 if you discount work 1, work 2, and home.
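The filter itself is not reproduced above, so here is a minimal sketch of one way such a months-by-location dataframe could be built and filtered. It assumes pandas, a hypothetical `visits` table with `timestamp` and `address` columns, made-up place labels, and an assumed threshold of at least two visits; none of these names or values come from the original analysis.

```python
import pandas as pd

# Hypothetical visit log: one row per arrival, with made-up place labels.
visits = pd.DataFrame(
    {
        "timestamp": pd.to_datetime(
            [
                "2020-01-03", "2020-01-10", "2020-01-17",
                "2020-02-02", "2020-02-09", "2020-02-20",
                "2020-03-01", "2020-03-15",
            ]
        ),
        "address": [
            "Home", "Grocery store", "Cafe",
            "Home", "Grocery store", "Trailhead",
            "Home", "Grocery store",
        ],
    }
)

# Months as the index, one column per address, each cell = number of arrivals.
monthly_counts = pd.crosstab(visits["timestamp"].dt.month, visits["address"])

# Assumed threshold: keep only addresses visited at least twice over the year.
MIN_VISITS = 2
repeat_visits = monthly_counts.loc[:, monthly_counts.sum(axis=0) >= MIN_VISITS]

print(repeat_visits)
```

Raising `MIN_VISITS` narrows the result to the most habitual destinations, which is the kind of trade-off that takes a table from hundreds of columns down to a couple dozen.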
Now let’s look at a graph showing some of the most frequently visited addresses in 2020.
Once again, it’s clear how limited my travel outside the home was in April and May. I remember heading to a different store in March to buy larger bulk quantities of groceries in preparation for the COVID-19 quarantine, so Vons and Trader Joe’s were not visited then. I’m surprised to see 5 visits in January to the hiking area I enjoy near my house.
In addition to seeing where I traveled, I wanted to dig into the coordinate data provided in the Google location data to find out how far I traveled.
To get the distance from coordinates, we can use the Python module vincenty, aptly named after Vincenty’s formulae for calculating the distance between two points on the surface of a spheroid. Fun fact: Vincenty’s methods are accurate to within half a millimeter on the Earth ellipsoid!
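To make the calculation concrete, here is a standalone sketch of Vincenty’s inverse formula on the WGS-84 ellipsoid, written with only the standard library. It is not the vincenty module’s own code, and the function and constant names (`vincenty_meters`, `vincenty_miles`, `A_AXIS`, and so on) are chosen here for illustration.

```python
import math

# WGS-84 ellipsoid parameters.
A_AXIS = 6378137.0                  # semi-major axis in meters
FLATTENING = 1 / 298.257223563
B_AXIS = (1 - FLATTENING) * A_AXIS  # semi-minor axis in meters
METERS_PER_MILE = 1609.344


def vincenty_meters(lat1, lon1, lat2, lon2, max_iter=200, tol=1e-12):
    """Geodesic distance in meters between two (lat, lon) points in degrees.

    Returns None if the iteration fails to converge (nearly antipodal points).
    """
    f = FLATTENING
    # Reduced latitudes of the two points.
    u1 = math.atan((1 - f) * math.tan(math.radians(lat1)))
    u2 = math.atan((1 - f) * math.tan(math.radians(lat2)))
    big_l = math.radians(lon2 - lon1)
    sin_u1, cos_u1 = math.sin(u1), math.cos(u1)
    sin_u2, cos_u2 = math.sin(u2), math.cos(u2)

    lam = big_l  # first guess for the longitude difference on the auxiliary sphere
    for _ in range(max_iter):
        sin_lam, cos_lam = math.sin(lam), math.cos(lam)
        sin_sigma = math.hypot(
            cos_u2 * sin_lam,
            cos_u1 * sin_u2 - sin_u1 * cos_u2 * cos_lam,
        )
        if sin_sigma == 0:
            return 0.0  # coincident points
        cos_sigma = sin_u1 * sin_u2 + cos_u1 * cos_u2 * cos_lam
        sigma = math.atan2(sin_sigma, cos_sigma)
        sin_alpha = cos_u1 * cos_u2 * sin_lam / sin_sigma
        cos_sq_alpha = 1 - sin_alpha ** 2
        # cos(2 * sigma_m) is undefined for equatorial lines; use 0 there.
        if cos_sq_alpha:
            cos_2sm = cos_sigma - 2 * sin_u1 * sin_u2 / cos_sq_alpha
        else:
            cos_2sm = 0.0
        c = f / 16 * cos_sq_alpha * (4 + f * (4 - 3 * cos_sq_alpha))
        lam_new = big_l + (1 - c) * f * sin_alpha * (
            sigma + c * sin_sigma * (cos_2sm + c * cos_sigma * (-1 + 2 * cos_2sm ** 2))
        )
        converged = abs(lam_new - lam) < tol
        lam = lam_new
        if converged:
            break
    else:
        return None  # no convergence within max_iter

    u_sq = cos_sq_alpha * (A_AXIS ** 2 - B_AXIS ** 2) / B_AXIS ** 2
    big_a = 1 + u_sq / 16384 * (4096 + u_sq * (-768 + u_sq * (320 - 175 * u_sq)))
    big_b = u_sq / 1024 * (256 + u_sq * (-128 + u_sq * (74 - 47 * u_sq)))
    delta_sigma = big_b * sin_sigma * (
        cos_2sm
        + big_b / 4 * (
            cos_sigma * (-1 + 2 * cos_2sm ** 2)
            - big_b / 6 * cos_2sm * (-3 + 4 * sin_sigma ** 2) * (-3 + 4 * cos_2sm ** 2)
        )
    )
    return B_AXIS * big_a * (sigma - delta_sigma)


def vincenty_miles(lat1, lon1, lat2, lon2):
    """Same distance converted to statute miles (None if not converged)."""
    meters = vincenty_meters(lat1, lon1, lat2, lon2)
    return None if meters is None else meters / METERS_PER_MILE


# One degree of longitude along the equator.
print(vincenty_miles(0.0, 0.0, 0.0, 1.0))
```

Summing this distance over consecutive logged coordinates gives a running total of miles traveled. The `None` return for non-converging inputs is a design choice of this sketch, so callers can decide how to handle nearly antipodal pairs.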
The distance traveled in March and April also reflects the address/location data. I mostly limited travel to work and the store, both of which are relatively short distances. April showed a total of 445.8 miles traveled. Averaged across April’s 30 days, that comes to about 14.9 miles per day for those limited activities.
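The per-day figure is just the monthly total divided by the number of days in the month. Here is a small sketch of that arithmetic, assuming a hypothetical list of dated trip segments (the individual segment values are made up so that April sums to the 445.8 miles quoted above).

```python
import calendar
from collections import defaultdict
from datetime import date


def monthly_daily_average(segments, year):
    """Map month -> (total miles, average miles per calendar day)."""
    totals = defaultdict(float)
    for day, miles in segments:
        if day.year == year:
            totals[day.month] += miles
    result = {}
    for month, total in sorted(totals.items()):
        # monthrange returns (weekday of the 1st, number of days in the month).
        days = calendar.monthrange(year, month)[1]
        result[month] = (total, total / days)
    return result


# Hypothetical segments: (date, miles traveled).
segments = [
    (date(2020, 2, 10), 58.0),
    (date(2020, 4, 1), 200.0),
    (date(2020, 4, 15), 245.8),
]

summary = monthly_daily_average(segments, 2020)
total, per_day = summary[4]
print(f"April: {total:.1f} miles, {per_day:.2f} miles/day")
```

Dividing by calendar days rather than by days with logged travel keeps months comparable, at the cost of understating how far a typical outing was.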
The July outlier involved a family reunion trip and a backpacking trip, both of which took me farther from home. For the reunion trip, I drove one way and flew back. Google’s algorithm provides estimates for travel type. Does my location data match up with that?
Flying is in there! The estimate isn’t perfect, though: the algorithm estimated 960 miles of flying, while the real flight was closer to 600 miles. I imagine this is due to the phone disconnecting from and reconnecting to location data as we took off and landed. 209 miles of walking also seems extraordinarily high. While I did a lot of hiking that month during the backpacking trip, it wasn’t anything close to 200 miles.
Google’s algorithm can even predict if the travel type is skiing, which I was doing in February of this year. We took a bus to and from the lodge to the mountain, which also shows up here. Pretty powerful:
At just over 20,000 miles logged by Google, passenger vehicle was the travel type I used most in 2020. Compared with the increase on my odometer this year, that number looks like a massive overestimate at first. However, my phone is with me every time I ride in a friend’s car as well, so each of those trips is logged too. Here is a breakdown of the travel types used, according to my Google location data.
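A breakdown like this comes down to grouping trip distances by travel type. The sketch below assumes pandas and a hypothetical `segments` table with `start`, `activity`, and `distance_m` columns; the activity labels and distances are invented for illustration and are not taken from my export.

```python
import pandas as pd

METERS_PER_MILE = 1609.344

# Hypothetical trip segments; labels and distances are made up.
segments = pd.DataFrame(
    {
        "start": pd.to_datetime(
            ["2020-02-08", "2020-02-08", "2020-07-03", "2020-07-05", "2020-07-18"]
        ),
        "activity": ["SKIING", "IN_BUS", "IN_PASSENGER_VEHICLE", "FLYING", "WALKING"],
        "distance_m": [12000.0, 8000.0, 900000.0, 1000000.0, 15000.0],
    }
)
segments["miles"] = segments["distance_m"] / METERS_PER_MILE
segments["month"] = segments["start"].dt.month

# Total miles per travel type for the year, largest first.
by_type = segments.groupby("activity")["miles"].sum().sort_values(ascending=False)

# Months as rows, travel types as columns; missing combinations become 0.
by_month_and_type = segments.pivot_table(
    index="month",
    columns="activity",
    values="miles",
    aggfunc="sum",
    fill_value=0,
)

print(by_type.round(1))
```

The month-by-type table makes it easy to spot months where one travel type dominates, such as a single flight inflating a monthly total.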
Google has a lot of data on us. If you’re like me and have location data enabled on your phone, every place you go is being logged, down to the method used to get there, even though we’re not actively searching for most of those places. Compiling this data across millions of monthly active users gives Google a lot to work with. The accuracy of ETAs and traffic delays down to the minute starts to make sense when you think about the powerful models and thousands of terabytes of data Google processes daily.
Despite the ominous nature of the information being collected on us, it is intriguing to review our own contribution to that data and draw inferences on our behavior.