How To Calculate Your Word of Mouth Coefficient
This post was co-authored by Yousuf Bhaijee, Phil Carter and Michael Taylor. Originally published on Reforge.com
Yousuf was most recently VP of Growth at Eaze, and has held growth leadership roles at Disney, Zynga, CSC Generation, and ClassDojo. He is currently focused on how to drive word of mouth, analytics, and growth experimentation.
Michael was a Co-founder at Ladder, a 50-person growth marketing agency, leading the operations, data science and product teams. He is currently working on something new in the marketing attribution space.
Phil is the Director of Growth @ Quizlet and a longtime mentor at Techstars in Boulder, CO. After stints in management consulting and venture capital, he was Director of Product at Ibotta and founded a company called Empath Labs that was acquired by InspiringApps.
Quick Recap Of The Word Of Mouth Coefficient
In our last post we introduced the metric of Word of Mouth Coefficient. There were a few critical points:
Word of Mouth is critical to growth as channel saturation, competition, and platform control are all increasing.
WOM is typically hard to measure.
Because it is hard to measure, it is therefore hard to influence.
As a result, we wanted to define a way to measure WOM that met three criteria:
It’s tied to active users - a user metric every company tracks
It’s a stable metric - you can use it to forecast confidently
It can be influenced - you can break it down to its inputs and influence it with product and marketing activities
The end result was the Word of Mouth Coefficient.
In this post, we want to walk you through, step by step, how to calculate the word of mouth coefficient for your product. Once you have the output of the WOM Coefficient, we can start to look at how to influence it, which we’ll do in a third part of this series.
The Two Outputs
There are two outputs for this step-by-step.
CORRELATION ANALYSIS FOR THE WOM COEFFICIENT
One of the key assumptions for the WOM Coefficient is that active users are predictive of new word of mouth users. To test that assumption, we will do a simple correlation analysis. If the R^2 is high (close to 1) then we know that we can use the metric to forecast goals that are realistic. If it is low, then we know that we need to do some additional refinement (which we’ll talk about later).
WOM Coefficient Over Time
The second chart that we want to produce is looking at the WOM Coefficient over time. This will help us start to identify changes in the coefficient and understand what impacts it at a high level. It may help us identify things like:
Changes in the WOM Coefficient with seasonality.
Changes in the WOM Coefficient as we make changes in other channels like paid acquisition.
Changes in the WOM Coefficient with product releases.
Calculating the Word of Mouth Coefficient Light To Heavy
As with most analyses, there are multiple ways to get to the WOM coefficient, each with its pros and cons. For the WOM Coefficient, there are three primary methods:
Light Version - Using Google Analytics with some basic filters/segments.
Medium Version - Refining the definitions of Returning Users and New WOM Users using qualifying actions.
Heavy Version - Using Econometrics and multivariate regression analyses.
So how do you know which one to use? This primarily comes down to how complicated your marketing channel mix is. The more complicated your marketing channel mix, the more noisy your analysis can get, and as a result you need to use a heavier analysis in order to refine it to something predictable.
In general, as companies grow, their customer acquisition journey often gets more complicated. They start with using 1-2 major marketing tactics (e.g. Facebook advertising) and can scale up to 10+ channels or tactics.
Series A company: Figuring out 1-2 main digital marketing channels (e.g. Facebook ads, Adwords) and scaling
Series B company: Advertising on 3-6 large channels. Paid and unpaid marketing
Series C+ company: Introduce Out Of Home advertising (e.g. billboards, TV, etc.). Can reach $10M+ per month in ad spend, as well as brand marketing, etc.
Large single channel outliers: Large companies that rely on single channels or product-driven growth. Examples: Pinterest, Dropbox
In this post we will focus on series A/B startups and large single channel outlier products by doing walkthroughs of the light and medium versions of the analysis. Any company can and should start with the light version, though. It’ll tell you where you stand in terms of how ‘noisy’ your data is, and maybe give you interesting findings right away. Later in this post, we’ll ramp up the sophistication and flag scenarios where more advanced analysis might be needed.
Step by Step: How to find your Word of Mouth Coefficient - Light Version
Measuring word of mouth doesn’t have to be complex — for many businesses you can replicate our findings in 10 minutes with Google Analytics. Note that if you aren’t using Google Analytics you can do something similar in Mixpanel, Amplitude or your analytics tool of choice.
We’ll walk through 3 steps to get the word of mouth data:
Define The Denominator - Returning Users
Define The Numerator - Direct + Brand Search
Run The Analysis
We’ll then walk through an additional 3 steps to validate your WOM Coefficient:
Highlight The Data You Need
Create A Scatter Plot
Add Trend Line and Analyze R^2
To get started in Google Analytics you want to click into the Acquisition > All Traffic > Source/Medium report — we’ve found that to be the quickest, most useful page to export.
STEP 1: DEFINE THE DENOMINATOR — RETURNING USERS
For simplicity, we’re going to rely on Google Analytics’ definition of ‘Returning Users’, which is cookie-based: it relies on a unique identifier stored on a New User’s first visit. If you have GA installed on multiple domains, you’ll need to aggregate data across all of them.
Our driving hypothesis in the light version is that repeat interactions with your brand drive referral behavior – every time someone visits your website, or uses your app, is a chance for them to decide to refer you.
In practice, you might want to only count users who have completed some action, like signing up or purchasing, which we’ll cover later in this post. However, you may be surprised how predictive this simplistic approach of using GA’s definition of ‘Returning Users’ often is.
To segment your data for Returning Users, go to the Segments section and click on + Add Segment to the right of the default ‘All Users’ segment (or click this link to import).
You’ll see a red button to add a new segment, and this is where you can set the definition. Give the segment an easy to remember name in case you want to come back to it. Under Advanced > Conditions choose a Sessions Include filter on User Type contains Returning Visitor (it should auto-complete).
Click save, and you have the first part, the denominator, for your Word of Mouth Coefficient calculation! This might seem too simple, and for some businesses it will be, but bear with us and you’ll be surprised at how far this gets you.
STEP 2: DEFINE THE NUMERATOR — DIRECT & BRAND SEARCH
To get the other half of the WOM Coefficient equation, we need an approximation of ‘word of mouth’ traffic. In Google Analytics we can simply choose ‘direct’ traffic, which is traffic without a marketing source, primarily consisting of sharing on ‘dark social’.
To do so, we click to create another segment, this time filtering for Source / Medium containing “(direct) / (none)” (start typing and it should auto-complete) and only New Users (click this link to import).
Even though we’re trying to keep it simple in this Light version, we know that direct traffic isn’t the whole picture — much of what we’d define as ‘word of mouth’ results in people searching for your brand name and clicking on the organic search result.
But how do we separate ‘brand’ searches from general organic search traffic? Google has made it difficult, by removing the ability to filter by search term, but a good proxy is to filter for organic search traffic to the homepage. If they searched on Google and landed on your homepage, it’s a pretty safe bet they came in from your brand term.
To replicate, first filter for New Visitors, who came from Google / Organic, who landed on “/”, or the homepage (or you can import by clicking this link). In this case we used ‘exactly matches’ just to make sure we don’t accidentally match to any other pages (given that homepage is denoted by just a forward slash). If you’re also running paid search ads on your brand term, then you’ll want to consider filtering for these too as an additional column for word of mouth (though you’ll have to adapt our template).
Of course the exact definitions might differ depending on your brand term and how you have your search campaigns set up (also remember Bing or other search engines!). For this analysis to work best, we want to explicitly separate out any marketing we’re doing intentionally, because by definition that’s not ‘word of mouth’.
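To make the two numerator segments concrete, here is a minimal Python sketch of the same filters applied to raw session records. The dictionary keys (user_type, source_medium, landing_page) and the segment labels are hypothetical names chosen for illustration, not fields from a Google Analytics export, and you would need to extend the rules for Bing or brand paid search.

```python
def classify_session(session, homepage="/"):
    """Assign one session to a light-version segment.

    `session` is a hypothetical dict with the keys user_type,
    source_medium and landing_page, mirroring the GA dimensions
    used in the segment definitions.
    """
    source_medium = session["source_medium"].lower()
    if session["user_type"] == "Returning Visitor":
        return "returning"
    # Everything below is a New Visitor.
    if source_medium == "(direct) / (none)":
        return "wom_direct"
    # Organic search landing exactly on the homepage is the brand-search proxy.
    if source_medium == "google / organic" and session["landing_page"] == homepage:
        return "wom_brand_search"
    return "other_new"


sessions = [
    {"user_type": "New Visitor", "source_medium": "(direct) / (none)", "landing_page": "/pricing"},
    {"user_type": "New Visitor", "source_medium": "google / organic", "landing_page": "/"},
    {"user_type": "New Visitor", "source_medium": "google / organic", "landing_page": "/blog/post"},
    {"user_type": "New Visitor", "source_medium": "facebook / cpc", "landing_page": "/"},
    {"user_type": "Returning Visitor", "source_medium": "(direct) / (none)", "landing_page": "/"},
]
labels = [classify_session(s) for s in sessions]
print(labels)
```

The exact match on the homepage path plays the same role as the ‘exactly matches’ filter: an organic visit to any other page is not counted as brand search.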
If you’re doing offline advertising like billboards or TV, it’ll tend to show up as branded search and direct traffic. Unfortunately that gets us into advanced territory, where we’d need methods like Econometrics to properly separate out the impact. If that’s you, reach out to us for support and we’ll help point you in the right direction. However for now let’s continue with the ‘light’ version of the analysis – for which we’ve now got all the data we need.
If you struggled with creating the segments, clicking these links will import them to your GA.
STEP 3: RUN THE ANALYSIS — FILLING IN THE TEMPLATE
Now all we need to do is export the data, preferably with at least a year’s data (or as much as you have). You can find the export function on the top right just above where you chose the date range. Make sure you have the ‘All Users’ segment selected as well as the 3 new segments you made. Choose Google Sheets and make a copy of the WOM Coefficient Light Template so you can follow along.
Scroll down in the gsheet you exported to below the normal Source / Medium output — you’ll be ignoring that part. What you want is the daily breakdown by segment at the bottom. You have the Day Index (date), Date Range (not useful), Segment (your definitions for the different parts of the calculation) and Users (the important part!).
You just need to quickly transform this data into the right format by building a pivot table. Select the data (apart from the three total rows at the bottom) and go to Data > Pivot Table. Create it in a new sheet. Then put Day Index in the Rows section, put Segment in Columns and then Users in Values to get the data the way you need it. Totals rows aren’t needed so you can deselect them on the right hand side pivot builder.
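If you would rather script this reshaping than build the pivot table by hand, the same transformation can be sketched in Python. This is an illustration only: the segment names and numbers below are made up, and your export will use whatever names you gave your segments.

```python
from collections import defaultdict


def pivot_users(rows):
    """Pivot (day, segment, users) rows into {day: {segment: users}}."""
    table = defaultdict(dict)
    for day, segment, users in rows:
        # Sum in case the same day/segment pair appears more than once.
        table[day][segment] = table[day].get(segment, 0) + users
    return dict(table)


# Hypothetical long-format rows, one per day and segment.
rows = [
    ("2020-01-01", "All Users", 1000),
    ("2020-01-01", "Returning Users", 400),
    ("2020-01-01", "New Direct", 60),
    ("2020-01-01", "New Brand Search", 40),
    ("2020-01-02", "All Users", 1100),
    ("2020-01-02", "Returning Users", 450),
    ("2020-01-02", "New Direct", 70),
    ("2020-01-02", "New Brand Search", 35),
]
wide = pivot_users(rows)
for day in sorted(wide):
    print(day, wide[day])
```

Each day becomes one row with one value per segment, which is the layout the template expects.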
From here you can simply copy and paste values (Paste Special > Values only, or CMD + SHIFT + V on a Mac, CTRL + SHIFT + V on Windows) across the different columns into the template. For example, day index goes in column A ‘Date’, and it should fill up to column D ‘Users’. The template should now be filled out completely, with all the calculations done for you based on the data you entered.
The completed template will automatically plot your WoM Coefficient for you, and do the correlation analysis for word of mouth vs returning. Note: if you have more rows than in the template, you’ll need to drag all formulas down, as well as change the range of the pivot table. You’ll have the raw data to manipulate, but will also see the functions we used to identify anomalies.
From this sheet, you should see whether the ‘light’ version was predictive for your business. Look for a relatively stable green line (Word of Mouth Coefficient) on the chart on the left, and a high R^2 value for the trendline on the right chart (the blue dots should be mostly along the line).
If that’s what you’re seeing, then great, this simple analysis should be enough to interpret your word of mouth coefficient. If returning users is predictive of new word of mouth users, that means the best thing you can do for word of mouth is to drive repeat visitors. If the Word of Mouth Coefficient is trending up or down, then you can forecast where that leaves you versus your growth goals. Finally, if you’re seeing spikes or dips in word of mouth coefficient, those are anomalies you should investigate – what happened at those times that could explain it?
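As a rough illustration of what investigating spikes and dips could look like outside the template, the sketch below computes a daily coefficient as new word of mouth users divided by returning users (the light-version numerator and denominator) and flags days that sit far from the average. The two-standard-deviation rule is our assumption for this example, not necessarily the function the template uses, and the numbers are invented.

```python
from statistics import mean, stdev


def wom_coefficient(new_wom_users, returning_users):
    """Light-version ratio; assumes returning_users is greater than zero."""
    return new_wom_users / returning_users


def flag_anomalies(values, threshold=2.0):
    """Return indexes of values more than `threshold` sample standard
    deviations away from the mean of the series."""
    mu, sigma = mean(values), stdev(values)
    if sigma == 0:
        return []
    return [i for i, v in enumerate(values) if abs(v - mu) > threshold * sigma]


# Hypothetical daily counts: (new WOM users, returning users).
daily = [
    (100, 1000), (102, 1000), (98, 1000), (101, 1000), (99, 1000),
    (100, 1000), (103, 1000), (97, 1000), (100, 1000), (200, 1000),
]
coefficients = [wom_coefficient(new, ret) for new, ret in daily]
anomalies = flag_anomalies(coefficients)
print(anomalies)
```

A flagged day is only a prompt to look at what happened then (a launch, a press mention, a tracking change), not evidence of a cause.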
Step by Step: How to validate your Word of Mouth Coefficient
In this section we will show you how to validate the Word of Mouth Coefficient using a simple regression analysis in Google Sheets. We do this to make sure Returning and Non-WOM users predict the Word of Mouth driven users. In effect, we are reconstructing the template we shared with you from first principles. This will be a useful exercise for more advanced analysis and for companies with a more sophisticated marketing mix, because you’ll learn how to build a custom model that works for your business.
The steps below will walk you through how to create a scatterplot and calculate R^2 in Google Sheets for the WOM Coefficient.
STEP 1 - HIGHLIGHT THE DATA YOU NEED
STEP 2 - CREATE A SCATTER CHART
Select ‘Scatter Chart’ as the chart type
Pick ‘Non-WOM (Returning + Non-WOM New)’ as the x-axis
Pick New-Word of Mouth users as the Y-axis
STEP 3 - ADD TREND LINE AND CALCULATE THE R^2
Navigate to the ‘customize’ tab
Open up ‘Series’
Select “Add Trendline”
Scroll down further and select “Show R^2”
And that's it. You can now analyze how well your data predicts new WOM users. An R^2 > 0.7 is considered highly predictive. Less than 0.7 requires more work and exploration – this means you’ll need to use the medium or heavy analysis to find better definitions for returning and new word of mouth users.
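For readers who want to check the chart's numbers by hand, the linear trendline and its R^2 can be reproduced with ordinary least squares. This is a minimal sketch with invented data; the variable names are ours, and the 0.7 cut-off is simply the rule of thumb stated above.

```python
def fit_trendline(x, y):
    """Least squares fit of y = slope * x + intercept, returning R^2 too."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # R^2 = 1 - (unexplained variation / total variation in y).
    ss_res = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return slope, intercept, 1 - ss_res / ss_tot


# Hypothetical daily values: x is Non-WOM users, y is New WOM users.
non_wom_users = [1000, 1200, 1500, 1700, 2000]
new_wom_users = [105, 118, 152, 168, 203]

slope, intercept, r_squared = fit_trendline(non_wom_users, new_wom_users)
is_predictive = r_squared > 0.7
print(round(slope, 4), round(intercept, 2), round(r_squared, 3), is_predictive)
```

The slope is the fitted number of new word of mouth users per additional Non-WOM user, so it is worth checking against the coefficient you see in the time series chart.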
Calculating Word of Mouth Coefficient - Medium Version
As mentioned, sometimes the light version produces data that is not very predictive. This typically stems from a too simplistic definition of our numerator and/or denominator. This will show up as a low R^2 because you are including users who are not likely to generate Word of Mouth, mixed in with the users who are more likely to refer. To refine our analysis, we can think about how to define Returning Users or New Users more narrowly using qualifying actions. This link is a template for the medium analysis.
REFINING WHO IS CONSIDERED A RETURNING USER
Let's explore a few examples:
Visiting www.bananarepublic.com without buying anything - If a returning user goes to a store without buying anything, are they likely to generate word of mouth? And if not, should they be included as an active user? For many internet businesses, the default definition of daily active user (DAU) is someone who loads the website. Since far more users visit without buying than visit and buy, the non-buying visitors will dilute the Word of Mouth impact of the buyers. In this scenario, a more valuable definition of active user is someone who makes a purchase.
Visiting www.tiffany.com without buying anything - For other products (even in the same category) the bar could be lower. Jewelry is a great example. You might go into a store with no desire to buy and find a sale on a limited edition collection of rose gold rings. Even if you do not have the means to buy it, you might want to tell your friends about it! In this case, even window shoppers could generate word of mouth.
Visiting https://theathletic.co.uk/ and hitting a paywall - Because you need to subscribe before being able to access content, the standard definition of a User as anyone who has visited the website would likely predict word of mouth poorly. Instead we’d want to filter for only Returning Users who have an active subscription, in order to improve the accuracy of our models.
Other ways to filter your active user definition could be combinations. Continuing the jewelry example you could define an active user as someone who has done at least one of multiple qualifying actions:
Added to cart
Added to wishlist
The point is to look at additional definitions thinking about your specific product and consumer and then validate by measuring each definition’s R^2.
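To illustrate how alternative definitions change who counts as active, here is a small sketch that counts active users under several sets of qualifying actions. The event names and the three definitions are hypothetical; you would substitute the actions that matter for your product and then measure each definition's R^2 as described.

```python
def active_users(events, qualifying_actions):
    """Return the set of user IDs with at least one qualifying action."""
    return {user for user, action in events if action in qualifying_actions}


# Hypothetical event log for one day: (user ID, action).
events = [
    ("u1", "page_view"),
    ("u1", "add_to_cart"),
    ("u2", "page_view"),
    ("u3", "add_to_wishlist"),
    ("u4", "purchase"),
    ("u4", "page_view"),
]

# Candidate definitions, from least to most restrictive.
definitions = {
    "any visit": {"page_view", "add_to_cart", "add_to_wishlist", "purchase"},
    "cart or wishlist": {"add_to_cart", "add_to_wishlist"},
    "purchase only": {"purchase"},
}
counts = {
    name: len(active_users(events, actions))
    for name, actions in definitions.items()
}
print(counts)
```

Running this per day (or per week) gives one series per definition, each of which can be dropped into the same scatter plot and trendline analysis.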
REFINING WHO IS CONSIDERED A NEW USER
The definition for new users matters just as much as the definition for returning users. We found this to be the case in Edtech, when we partnered with a $1B+ company and found very different results based on how we defined active users.
For this company, which generally supports a weekly use case, the qualifying actions a user can take to be considered active include:
Starting a study session
Signing up for an account
We then came up with three specific definitions for “New WOM users” to explore. We did this to evaluate which was the most predictive - or had the highest R^2. Here are the definitions, from least restrictive to most restrictive:
Definition #1: New WOM Actives - All users who performed at least one of the qualifying actions (studying or signing up) for the first time during a specific week.
Definition #2: New WOM First-Time Visitors - Same as Definition #1, but with a technical exclusion that removed logged-out visitors who were actually just returning users. More technically: the definition for “New WOM First-Time Visitors” was all distinct new users who performed at least one of the qualifying actions (studying or signing up) for the first time during a specific week based on cookie ID, rather than a user ID (or lack of user ID). This exclusion removed potential noise from returning logged-out visitors.
Definition #3: New WOM Sign Ups - All users who signed up during a specific week.
All WOM definitions: For all three of these definitions, only users traced to ‘direct’, ‘branded search’, ‘social sharing’, or ‘social’ acquisition were counted as WOM new users.
The company pulled data for each of these different definitions for New Active Users through Organic WOM, and inserted it into this data template, focusing on Weekly Active Users (WAU) to reflect the fact that their product generally supports a weekly use case. We then ran the same analysis as the light version separately for each definition.
This yielded the following results:
Insights from this:
New WOM Sign Ups had the lowest R^2 (i.e. least predictive power). This makes sense because the product has a lot of usage from logged-out visitors who perform the qualifying action of starting a study session but don’t immediately sign up. Given that, New WOM Sign Ups is an overly restrictive definition of New WAUs from Organic WOM, because it excludes users who actively visited and used the product without signing up during that visit.
The best definition did not have the highest R^2. The ‘New WOM Actives’ definition had the highest R^2, but for reasons that are not necessarily positive. Specifically, this definition included some logged-out users who were actually returning users from prior periods - which we determined by examining their browser cookies. As a result, some of the same users were in both the numerator (New WAUs from Organic WOM) and the denominator (Returning WAUs). This artificially inflated the number of New WAUs from Organic WOM (since some of them weren’t actually new), and artificially inflated the definition’s R^2.
Ultimately, the company selected Definition #2: New WOM First-Time Visitors. This definition exhibited an acceptably high R^2 value while also making the most conceptual sense, in that it captured all new users who performed any of the qualifying actions necessary for the user to be considered “active” while excluding returning logged-out users with the same cookie. This is a perfect example of why it’s important to use both quantitative and qualitative criteria in determining which metric definitions to use for this analysis.
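The cookie-based exclusion behind Definition #2 amounts to a set difference: drop any candidate new user whose cookie ID was already seen in a prior period. The sketch below shows the idea with made-up cookie IDs; it is not the company's actual pipeline, and the function and variable names are ours.

```python
def split_new_and_returning(candidate_new_cookies, prior_period_cookies):
    """Separate first-time cookies from cookies already seen before."""
    candidates = set(candidate_new_cookies)
    prior = set(prior_period_cookies)
    truly_new = candidates - prior       # keep in the numerator
    misclassified = candidates & prior   # returning users counted as new
    return truly_new, misclassified


# Hypothetical cookie IDs.
this_week_new = ["c1", "c2", "c3", "c4", "c5"]
seen_before = ["c2", "c5", "c9"]

truly_new, misclassified = split_new_and_returning(this_week_new, seen_before)
print(sorted(truly_new), sorted(misclassified))
```

The size of the misclassified set is a quick gauge of how much double counting a looser definition lets through.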
To be thorough, we also did this exercise again using “daily” and “monthly” as the active user definition but found that weekly was the most meaningful.
When Light/Medium Word of Mouth Coefficient Isn’t Enough
As a more mature business with higher marketing spend, multiple product lines or offline marketing channels, you might find that neither the Light nor the Medium version works as is. This is because as your business’ complexity increases, there’s more ‘noise’ in the data and additional variables need to be accounted for.
Scenarios where this exists may include:
Complex marketing mixes with multiple overlapping channels
Businesses that run offline campaigns likely to drive word of mouth
Multi-national companies with varying brand awareness by country
Companies with multiple product lines or business models
Exogenous shocks on word of mouth (global pandemic, competitor behaviour)
For example, companies running less directly trackable marketing campaigns (TV, billboards, podcast ads, content, PR) require techniques to tease out how much ‘direct’ traffic was really attributable to those campaigns.
In scenarios like this, one method that has proven useful for us is Econometrics, which essentially expands the analysis to use multivariate regression. The template you used was a single-variable linear regression, and it’s possible to run multivariate regression even in Google Sheets (using the LINEST function).
Typically used by large advertisers to predict sales (and therefore attribute performance to specific marketing channels), you can make new word of mouth users the dependent variable and regress against variables other than returning users that might drive word of mouth. Examples include flagging time periods when PR campaigns were running, separating out the impact of COVID-19 or incorporating product stock levels, pricing and availability.
Practically the work involves measuring more variables (not just performance but actions taken on campaigns), enriching and cleaning the data in creative ways, and then trying various models until you find a way to improve accuracy. Let the R^2 metric guide you, though the true test of the model will be how it performs in predicting future levels of word of mouth.
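To give a feel for what the multivariate version looks like, here is a minimal sketch of ordinary least squares with two predictors: returning users and a flag for weeks when a PR campaign was running. It is written in plain Python for transparency and is not a substitute for a proper statistics package. The data is invented and deliberately noise-free so the fitted coefficients are easy to check.

```python
def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix so row operations apply to both sides at once.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("predictors are collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        known = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - known) / m[i][i]
    return x


def fit_ols(rows, y):
    """Return [intercept, b1, b2, ...] from the normal equations."""
    design = [[1.0] + list(r) for r in rows]
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Hypothetical weekly data: (returning users in thousands, PR campaign flag).
predictors = [(1.0, 0), (1.2, 0), (1.5, 1), (1.7, 0), (2.0, 1), (2.2, 1)]
new_wom_users = [120, 140, 220, 190, 270, 290]

intercept, per_thousand_returning, pr_lift = fit_ols(predictors, new_wom_users)
print(round(intercept, 3), round(per_thousand_returning, 3), round(pr_lift, 3))
```

Here the PR flag absorbs the extra new users in campaign weeks, so the returning-user coefficient is no longer inflated by them. Real data will be noisy, so expect to compare several candidate models rather than trust the first fit.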
It’s likely that you’ll find measuring word of mouth difficult, particularly as your business scales in complexity. We recommend you start with the simplistic ‘light’ model we demonstrated earlier in this post as a good first step. You might find it surprisingly predictive, and can start to take actions to increase Word of Mouth. However, if you find yourself dealing with noisy data, struggling to find good definitions for Returning and Word of Mouth Users, or don’t know where to start with Econometrics, feel free to reach out to us (Yousuf and Michael) for support.