Many businesses face a challenge when it comes to distilling large volumes of data into actionable insights. Here at 6Consulting, our Professional Services Team works hard to present clients with a full picture of their social media presence, with detailed insights into the data. Our Analysts assess conversations by analyzing a representative sample of the data. This in turn gives an understanding of the full data set and the key information needed to form strategic decisions for social media.
In our ‘How To’ section we’ve decided to talk more about sampling. So how do we do it?
Sampling
Sampling is the process of selecting a representative subset of a population (e.g. all data returned in a Radian6 Topic Profile). Simply put, sampling allows us to draw conclusions about a population by directly observing a portion (or sample) of it. From a smaller sample of fully analyzed data we can draw conclusions and provide insight about an entire set of data from Radian6, while preserving important information such as Share of Voice, trends in conversation, distribution of media types and sentiment breakdown.
Method
For example: Waterstone’s would like to understand the sentiment towards their brand in social media in relation to two of their main competitors, Blackwell’s and Foyles, over a one-month period.
However, there is too much data in Radian6 (20,500 mentions across the month) to analyze in full:
- 10,000 mentions of Waterstone’s
- 6,000 mentions of Blackwell’s
- 4,500 mentions of Foyles
A sample of 3,000 articles would provide enough data for insightful analysis. We could select 1,000 mentions for each of the companies; however, the fact that Waterstone’s has a significantly larger amount of conversation than Blackwell’s and Foyles is a valuable insight which should be preserved. Therefore a representative sample of data (i.e. one which will represent all 20,500 mentions in social media) must be taken.
First of all we must work out the Share of Voice, i.e. each company’s share of the 20,500 total mentions, rounded to the nearest percent:
- Waterstone’s = 49%
- Blackwell’s = 29%
- Foyles = 22%
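The Share of Voice figures above can be reproduced with a few lines of Python. This is only a minimal sketch: the `mentions` dictionary and the `share_of_voice` helper are illustrative names, not part of Radian6, and the counts are typed in by hand from the example.

```python
# Mention counts from the example above (entered by hand, not pulled from Radian6).
mentions = {"Waterstone's": 10_000, "Blackwell's": 6_000, "Foyles": 4_500}


def share_of_voice(counts):
    """Return each brand's share of all mentions as a whole-number percentage."""
    total = sum(counts.values())
    return {brand: round(100 * n / total) for brand, n in counts.items()}


sov = share_of_voice(mentions)
for brand, pct in sov.items():
    print(f"{brand}: {pct}%")
```

Each percentage is rounded independently, so with other data the rounded figures may not add up to exactly 100%.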
Next we work out the distribution of the 3,000 articles across the three companies. This is calculated by using the Share of Voice for each of the companies (above):
- For Waterstone’s 49% of 3,000 = 1,470
- For Blackwell’s 29% of 3,000 = 870
- For Foyles 22% of 3,000 = 660
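The allocation above could be sketched in Python as follows. The `allocate_sample` helper is a hypothetical name used for illustration. It takes the rounded Share of Voice percentages, as the worked example does. Working from the unrounded proportions instead would give slightly different limits (roughly 1,463, 878 and 659).

```python
# Rounded Share of Voice percentages from the example above.
share_pct = {"Waterstone's": 49, "Blackwell's": 29, "Foyles": 22}


def allocate_sample(shares, sample_size):
    """Split sample_size across brands in proportion to their percentage share.

    Integer arithmetic is used so the results are whole numbers of articles.
    Any fractional remainder is dropped, so in general the limits can add up
    to slightly less than sample_size.
    """
    return {brand: sample_size * pct // 100 for brand, pct in shares.items()}


limits = allocate_sample(share_pct, 3_000)
for brand, limit in limits.items():
    print(f"{brand}: analyze up to {limit} articles")
```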
These breakdowns will be used as a limit on the number of relevant articles to be analyzed for each company. The chart below illustrates how the Share of Voice would appear with a representative and non-representative sample:
- Figure 1: Representative and Non-representative Samples
Though this calculation has preserved Share of Voice, we need to ensure that the data actually reflects the patterns of conversation across the entire monitored period. If we only looked at the first 1,470 articles about Waterstone’s, in date order, from a total of 10,000, we would likely be examining conversation over only a few days rather than over the whole month.
To overcome this we select a random sample of data from the total amount of conversation to analyze. Random sampling gives each piece of data an equal chance of being selected for analysis.
The chart below displays the whole data set as a blue line. The red line represents what would be included if we only analyzed the first 1,470 mentions as they appeared in date order. The green line represents a correctly distributed set of mentions, selected through a random sample, with peaks and troughs preserved:
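To illustrate the difference, here is a small Python sketch that compares the first 1,470 mentions in date order with a random sample of the same size. The data is synthetic: each of the 10,000 mentions is reduced to a day of the month (1 to 30), assigned at random. That is an assumption made purely for illustration, not how a real export would look.

```python
import random

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Synthetic stand-in for 10,000 Waterstone's mentions, sorted into date order.
# Each mention is represented only by the day of the month it was posted.
mention_days = sorted(rng.randint(1, 30) for _ in range(10_000))

# Non-random approach: take the first 1,470 mentions in date order.
first_n = mention_days[:1_470]

# Random approach: every mention has an equal chance of being selected.
random_sample = rng.sample(mention_days, 1_470)

days_first = len(set(first_n))
days_random = len(set(random_sample))
print(f"First 1,470 in date order cover {days_first} distinct days")
print(f"Random sample of 1,470 covers {days_random} distinct days")
```

The date-ordered slice only reaches the first few days of the month, while the random sample draws from across the whole period.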
- Figure 2: Random and Non-random sample
Samples will not be representative of all the conversation over a monitored period if they are too small. It’s not appropriate to assume that 1,000 mentions of a company will be an accurate representation of all conversation about them if there are 100,000 conversations about that company in social media. This will only provide a snapshot of their social media profile.
Summary
- Sampling allows us to select and analyze a small but representative amount of data.
- Sampling correctly will preserve Share of Voice, conversation volumes, media type and sentiment breakdown.
- Random sampling is important to preserve trends across a monitored period.
We’re keen to hear your thoughts – how in-depth do you analyze your social media conversations?
The post 6Consulting's How To: Sampling appeared first on Salesforce Marketing Cloud.