How to Scrape Google Trends Data with Python (Full Code)

If you work in SEO or run a blog, you have probably come across Google Trends. Google Trends is an incredible free tool that analyzes the popularity of search queries in Google Search. The website uses graphs to compare the relative search interest of different queries over time.

When it comes to analyzing stocks, market sentiment is a key component in determining price movement. As Warren Buffett says, you should “be fearful when others are greedy and be greedy when others are fearful.”

Here is an example showing how often “should i sell stocks” was searched on Google.

As you can see, searches spiked the week of February 24, 2020, which lines up with the beginning of the drop in the S&P 500 due to COVID-19.

This shows the power of Google Trends: with the right search terms, it can serve as a leading indicator for the market.

Automating This Process

But if you’re like me, you don’t want to sit there every day plugging in search terms to see what the results are. My recommendation is to use PyTrends to automate the scraping process.

Here are some functionalities that PyTrends enables you to pull from Google Trends:

  • Interest Over Time: returns historical, indexed data for when the keyword was searched most as shown on Google Trends’ Interest Over Time section.
  • Historical Hourly Interest: returns historical, indexed, hourly data for when the keyword was searched most as shown on Google Trends’ Interest Over Time section. It sends multiple requests to Google, each retrieving one week of hourly data. It seems like this would be the only way to get historical, hourly data.
  • Interest by Region: returns data for where the keyword is most searched as shown on Google Trends’ Interest by Region section.
  • Related Topics: returns data for the topics related to a provided keyword shown on Google Trends’ Related Topics section.
  • Related Queries: returns data for the related keywords to a provided keyword shown on Google Trends’ Related Queries section.
  • Trending Searches: returns data for latest trending searches shown on Google Trends’ Trending Searches section.
  • Top Charts: returns the data for a given topic shown in Google Trends’ Top Charts section.
  • Suggestions: returns a list of additional suggested keywords that can be used to refine a trend search.
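
A few of these calls can be sketched as follows. This is only a minimal sketch: trend_snapshot is a hypothetical helper name, it assumes a TrendReq-style client is passed in, and because Google changes its endpoints from time to time, individual calls may fail on some PyTrends versions.

```python
def trend_snapshot(client, keyword):
    """Collect a few Google Trends views for one keyword.

    `client` is assumed to be a pytrends TrendReq instance (or any object
    exposing the same methods). The helper name is hypothetical.
    """
    # The payload has to be built before the payload-based calls below
    client.build_payload([keyword], timeframe='today 3-m')
    return {
        # Interest by Region, one row per country
        'by_region': client.interest_by_region(resolution='COUNTRY'),
        # related_queries() returns a dict keyed by keyword
        'related_queries': client.related_queries()[keyword],
        # suggestions() takes the keyword directly and does not use the payload
        'suggestions': client.suggestions(keyword),
    }

# Usage (needs network access):
# from pytrends.request import TrendReq
# snapshot = trend_snapshot(TrendReq(hl='en-US', tz=360), 'should i sell stocks')
# print(snapshot['by_region'].head())
```

Passing the client in, rather than creating it inside the helper, keeps one session for all three calls and makes the helper easy to try out with a stand-in object.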

Full Code to Get Started

First, install the PyTrends library:

pip install pytrends

Below is the Python code that returns the interest over time for your search query. The result comes back as a pandas DataFrame.

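from pytrends.request import TrendReq

def PyTrend(keyword):
    """Return Google Trends interest over time for one keyword as a DataFrame."""
    # hl = host language, tz = timezone offset in minutes (360 = US Central)
    pytrends = TrendReq(hl='en-US', tz=360, timeout=(10, 25), retries=2)
    # build_payload expects a list of keywords; 'today 3-m' is the last 3 months
    pytrends.build_payload([keyword], cat=0, timeframe='today 3-m', geo='', gprop='')
    df = pytrends.interest_over_time()
    # An empty DataFrame comes back when Google has no data for the keyword
    if df.empty:
        return df
    # Columns are the keyword itself and 'isPartial'; rename for convenience
    df.columns = ['relevance', 'is_partial']
    return df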

Practical Examples to Use This Code

I have been using this code to find the interest over time for phrases like “buy x stocks” or “sell x stocks”, replacing the x with the ticker of the stock I’m reviewing. It’s still in the research phase. I’m using this as an awareness indicator of what the sentiment is, not necessarily as information to trade on.
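
The phrase substitution described here could be automated along these lines. This is a sketch under assumptions: sentiment_phrases and fetch_sentiment are hypothetical helper names, fetch stands for any function that takes a phrase and returns a result (such as the PyTrend function above), and the pause between requests is an assumed courtesy delay rather than a documented Google limit.

```python
import time

def sentiment_phrases(ticker):
    """Build the "buy x stocks" / "sell x stocks" phrases for one ticker."""
    ticker = ticker.strip().upper()
    return [f"buy {ticker} stocks", f"sell {ticker} stocks"]

def fetch_sentiment(tickers, fetch, pause_seconds=5):
    """Run `fetch` on every phrase and return {ticker: {phrase: result}}.

    `fetch` is any callable taking one phrase, e.g. the PyTrend function.
    """
    results = {}
    for ticker in tickers:
        results[ticker] = {}
        for phrase in sentiment_phrases(ticker):
            results[ticker][phrase] = fetch(phrase)
            # Assumed courtesy delay so requests are not sent back to back
            time.sleep(pause_seconds)
    return results

# Usage (needs network access):
# data = fetch_sentiment(['AAPL', 'TSLA'], PyTrend)
# print(data['AAPL']['sell AAPL stocks'].tail())
```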

My next step will be to explore the most searched Google queries relating to a company or the market and correlate them with price data. With machine learning, I hope to identify search terms that correlate with the price movement of a stock and eventually turn this into a trading strategy.
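
As a first, much simpler stand-in for that idea, here is a sketch of a plain Pearson correlation between search interest and prices, with an optional lag. It is not the machine learning approach mentioned above: the function names are hypothetical, and it assumes both series are already aligned on the same dates and have the same length.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return float("nan")  # correlation is undefined for a constant series
    return cov / sqrt(var_x * var_y)

def lagged_correlation(interest, prices, lag=0):
    """Correlate interest at time t with the price at time t + lag."""
    if lag < 0:
        raise ValueError("lag must be >= 0")
    if lag:
        # Drop the last `lag` interest values and the first `lag` prices
        interest, prices = interest[:-lag], prices[lag:]
    return pearson(interest, prices)
```

With the function from earlier, the interest series could come from PyTrend('sell x stocks')['relevance'].tolist(). A positive lag asks whether search interest moves before the price does, which is the leading-indicator question; a strong correlation on its own would still not show that one causes the other.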

Stay tuned for more updates to come.
