Extracting reviews from G2 Crowd to analyze pros and cons of your competitors

This guide will help you with creating offers, outreach emails, positioning, and marketing strategies based on competitor analysis.

We will crawl reviews on G2 crowd and parse Like and Dislike sections. Then we’ll use Google Sheets formula to sort the words used in reviews by popularity and filter out everything except nouns and adjectives to see what strengths and weaknesses our competitors have.

crm reviews on g2 crowd

Top-3 products in CRM category have more than 7000 reviews combined so we know there’s enough data to work with and we know that there’s too much data to analyze manually.

Extracting likes and dislikes

We need to download user-generated texts from “What do you like best?” and “What do you dislike?” sections:

g2 crowd review example

You’ll need Netpeak Spider and following regular expressions to do so:

  1. (?<=like best?)(.*)(?=What do you dislike?) — This RegExp will fetch text located between the “like best?” and “What do you dislike?” parts.
  2. (?<=What do you dislike?)(.*)(?=Recommendations) — This one will fetch texts between “What do you dislike?” and “Recommendations”.

Set up Spider’s Custom Search to fetch necessary data:

netpeak spider crawling settings

Don’t forget to disable all other crawling parameters. If you need help with settings read my post about Downloading Prices from Competitors’ Websites or contact the support at netpeaksoftware.com.

Get the list of URLs to extract data from

There are 6 reviews hosted on every page, which means you’ll need to crawl a lot of pages. Luckily, dynamic URLs on G2 Crowd contain page numbers:

https://www.g2crowd.com/products/salesforce-crm/reviews?focus_review=249310&page=2&product_id=salesforce-crm

It means you can generate these pages without any problems, here’s how I do it:

  • Pick a product you want to get reviews for and use filters if necessary, when I want to get negative reviews I prefer to look for one, two, and three-star reviews, that’s how I know that dislikes are there for a reason, and not just because it’s a mandatory field.negative reviews g2 crowd
  • Go to the second (or any other numbered page) and copy its URL:
    https://www.g2crowd.com/products/salesforce-crm/reviews?filters%5Bcomment_answer_values%5D
    =&filters%5Bnps_score%5D%5B%5D=3&filters%5Bnps_score%5D%5B%5D=2&filters%5Bnps_score%5D%5B%5D=1&order
    =g2_default&page=2
  • Paste the URL into Google Sheets spreadsheet, make 3 copies of it:

google sheets g2 crowd URLs

  • Change page numbers to consecutive ones in 2nd and 3rd rows, highlight the URLs and drag the cursor down to replicate URL addresses in succession:

url generation / replication

  • Generate as many URLs as you need:

generated URLs

Scrape the data

Change Spider’s crawling mode to List of URLs, paste you list and start crawling.

netpeak spider crawling mode

Once crawling is complete (it takes about a minute to crawl 200 pages) go to the Search tab and click “All”, to see the search results:

netpeak spider search results

Next, switch to All Values and export your file. If you don’t switch to All Values you’ll get one result per page only, instead of 6 results per page, so don’t forget to do it.

all values g2 crowd search

Every result starts and ends with bits of HTML. Nothing to worry about, you can remove these with Find and Replace feature in Google Sheets.

find and replace

Sort the data

Open your exported file in Google Sheets, remove HTML bits with Find and Replace feature.

You will find review parts you were looking for, you’ll have 6 columns with positive feedback and 6 columns with negative feedback.

g2 crowd reviews in google sheets

Get the most used words from these reviews.

words used sheets

You’ll need this formula:

=ArrayFormula(QUERY(TRANSPOSE(SPLIT(JOIN(” “,H2:H30,I2:I30,J2:J30,K2:K30,L2:L30,M2:M30),” “)&{“”;“”}),“select Col1, count(Col2) group by Col1 order by count(Col2) desc limit 500 label Col1 ‘Word’, count(Col2) ‘used'”,0))

Where H2:H30,I2:I30,J2:J30,K2:K30,L2:L30,M2:M30 is a set of ranges you want to analyze. You can also single range such as H2:M30 but I’ve found that Google Sheets has problems with this task and large ranges.

And desc limit 500 defines how many results to display. Feel free to change it.

Analyze data

Once you have your list of the most used words you can analyze it manually and see what words people use to complain about certain product or what words they use to praise it, or if you’re overly lazy you can use one of the available POS tagging scripts to filter words by parts of speech:

POS identifier

I’ve tried these two:

https://parts-of-speech.info/

https://cogcomp.org/page/demo_view/pos

They were mediocre, but useful if you need to work with a lot of words.

Related Posts

3 comments

A Power Business Digital está Inovando no Marketing Dgital com o IBM Watson Marketing

I think everything posted made a bunch of sense.
But, consider this, suppose you were to write a awesome post title?
I am not suggesting your content is not good, however suppose you added a post title that makes people desire more?
I mean Extracting reviews from G2 Crowd to analyze pros
and cons of your competitors – guerrilla hub is a little plain. You might peek at Yahoo’s home
page and note how they create article headlines to get viewers interested.
You might add a video or a picture or two to grab people interested about everything’ve got to say.
Just my opinion, it would make your posts a little bit more
interesting.

Hey, thanks for the tips. I try not to overthink things and spit them out as fast possible before my attention shifts to something useless.
The title I used won’t draw too much attention but it describes what I’m rambling about in this post. I’ll try to find the balance.
Cheers.

Leave a reply