What is real user monitoring (RUM) and how does it compare to synthetic monitoring?
This question has come up frequently in our recent demos. Most organizations we talk with have some experience with synthetic scanners, and they ask what advantages our unique real user monitoring approach to data quality and consent validation provides.
Given these discussions and their frequency, we thought it would be a good idea to write a blog post inspired by a similar one by our friends at Blue Triangle about real user monitoring as it relates to performance monitoring.
One thing to acknowledge as we dive in – scanners do have a couple of clear advantages compared to real user monitoring:
- Onboarding – scanners are easier to get through the procurement process. Because synthetic scanners are not placed directly on the website as RUM platforms are, the process of procurement and security review can be much easier by comparison.
- Familiarity – synthetic scanning for digital data validation is a known practice. Scanners have been in the market for almost 20 years now, and the known-known can in many cases be easier to digest than the known-unknown.
So why did we set out to build a RUM platform for digital data quality and consent validation?
Very simply, it is a proven methodology utilized by some of the world’s largest digital data compliance monitoring platforms to drive meaningful value for their clients.
Why does this approach make sense for digital data quality? First, let’s define the synthetic approach. We can’t say it any better than Blue Triangle:
“Synthetic Monitoring is user activity emulated by bots to help identify snags and trends on your website without relying on real user traffic. In other words, bots mimic the actions of the customer journey and collect data on their experience.”
Think of your data as ingredients for a recipe, and your consent settings as the dietary restrictions or preferences of your guests.
The synthetic approach is like using fake ingredients to create a sample dish and then asking your guests if it meets their dietary needs. While this might seem like a good way to ensure everyone can eat the meal, it doesn’t guarantee that the dish will taste good or be satisfying. It’s all theoretical and not practical, because you haven’t actually tested the real ingredients in a real dish.
Similarly, using emulated bot traffic to test data quality and consent settings might tell you if the data is “right” or “wrong,” but it doesn’t ensure that you are collecting the correct data that matches visitor consent in the real world–which is what your data teams and privacy regulators care about.
It’s important to test your data collection methods and settings in real-world scenarios to ensure that you are gathering good data that aligns with your visitors’ preferences and consent.
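As a rough illustration of what checking collection against real visitor consent can look like, here is a minimal Python sketch. It assumes each observed tag request has already been labeled with a consent category and that each session's granted categories are known. The `TagRequest` type, the field names, and the category labels are hypothetical and do not describe any particular platform's data model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TagRequest:
    """One outbound tag request observed in a real visitor's session (hypothetical shape)."""
    session_id: str
    vendor: str
    category: str  # consent category the tag belongs to, e.g. "analytics"


def find_consent_violations(requests, consent_by_session):
    """Return requests whose category the visitor did not consent to.

    consent_by_session maps session_id -> set of granted categories.
    A session with no recorded consent is treated as having granted nothing.
    """
    violations = []
    for req in requests:
        granted = consent_by_session.get(req.session_id, set())
        if req.category not in granted:
            violations.append(req)
    return violations


# Illustrative data only: two sessions with different consent choices.
observed = [
    TagRequest("s1", "analytics-vendor", "analytics"),
    TagRequest("s1", "ad-vendor", "advertising"),
    TagRequest("s2", "analytics-vendor", "analytics"),
]
consent = {"s1": {"analytics"}, "s2": {"analytics", "advertising"}}

violations = find_consent_violations(observed, consent)
for v in violations:
    print(f"{v.session_id}: {v.vendor} fired without '{v.category}' consent")
```

The check itself is trivial. The difference lies in the input: with real user data, every session's actual consent state is compared with what actually fired, rather than the handful of consent states a bot was scripted to select.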
The net/net is that a simulated user experience is never as informative or as accurate as the real user experience.
Without access to that real user data, it is far harder to accurately answer the question: are we collecting accurate, consented data? What you can answer is: in our simulated world, are we collecting accurate, consented data?
Unfortunately, this metaverse is not the real world. Relying on synthetic monitoring alone is simply not good enough.
What’s the alternative to synthetic scanners? Real User Monitoring.
To quickly outline why real-user monitoring is critical from a digital data perspective:
- Outliers matter
- The customer journey is chaotic and unpredictable
- Access to rich, real user data makes it easier to find and fix issues
Outliers Matter
There are many scenarios in data collection where even getting it slightly wrong can be calamitous from a business perspective. Having access to 100% of real user data assists in the identification of these outliers which might have real user impact.
For example, we had a client who used a synthetic scanner for data quality monitoring, but because of the gaps in coverage, they were inadvertently collecting email addresses for many months. They had to pay their marketing technology vendor in the mid-six figures to remove that data.
We had another client, also using a scanner, who was inadvertently collecting email addresses only in one very specific scenario. It resulted in 50 email addresses collected on a website that received billions of page views a year.
In both of those instances, access to 100% of the data in real time shortens the gap between when an issue arises and when the organization is alerted to it.
The tiniest amount of PII data collection can be detrimental to your business, impacting both the bottom line and your reputation with customers. Having your marketing technology vendor remove that data can cost you hundreds of thousands of dollars, and if the information becomes public, PII leaks can lead to embarrassing PR snafus. It’s better to catch issues like this in real time so that you can fix them promptly.
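To make the idea concrete, here is a minimal Python sketch of scanning outbound beacon URLs for email addresses. It assumes the beacons are available as plain URLs with query strings. The `collect.example.com` endpoint and the parameter names are made up for illustration, and the regular expression is a deliberately simple pattern, not a full email validator.

```python
import re
from urllib.parse import parse_qsl, urlsplit

# Simple pattern for "something@domain.tld"; good enough to flag likely emails.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def find_email_params(beacon_url):
    """Return the names of query parameters whose decoded value contains an email."""
    query = urlsplit(beacon_url).query
    # parse_qsl percent-decodes values, so "%40" becomes "@" before matching.
    return [
        name
        for name, value in parse_qsl(query, keep_blank_values=True)
        if EMAIL_RE.search(value)
    ]


# Hypothetical beacons: the second leaks an email in a "ref" parameter.
beacons = [
    "https://collect.example.com/b?page=%2Fcheckout&uid=12345",
    "https://collect.example.com/b?page=%2Fconfirm&ref=jane.doe%40example.com",
]

flagged = {url: names for url in beacons if (names := find_email_params(url))}
for url, names in flagged.items():
    print(f"possible email in parameter(s) {names}: {url}")
```

Run against every real beacon rather than a scripted sample, a check like this can surface even a rare leak, such as one that only happens in a single unusual scenario.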
Chaotic, Unpredictable Customer Journey
Synthetic scanners require manual identification of the steps a customer would need to take to ensure a specific output. If you’ve ever gone down the customer journey rabbit hole, you understand that the paths real users take are chaotic and not necessarily prescriptive.
For example, checkout typically follows a step-by-step process. Yet any cursory examination of user data will show that users can reach any of the pages in the checkout process from any page on the site. Those real user interactions will not fit into the synthetic monitoring process, impacting your ability to examine and analyze the data generated during those interactions.
If (and this is a BIG IF) the user follows the specific journey, using the exact browsers specified, from the same geo-location as the synthetic monitor, then, and only then, will your synthetic scan’s monitoring be effective.
For brands with millions of visitors across many different browser types coming from locations all over the world, building and maintaining scripts to accurately represent these users would require hundreds of thousands of hours each year. We’ve never seen an organization do this, and it’s unreasonable to expect it.
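One way to see the gap is to measure how many real sessions actually follow a scripted journey. The Python sketch below does that under simple assumptions: each session is reduced to an ordered list of page names, and a session counts as covered only if it matches a scripted path exactly. The page names and sample sessions are invented for illustration.

```python
from collections import Counter


def journey_coverage(scripted, sessions):
    """Return (share of sessions matching a scripted path, uncovered paths by frequency)."""
    scripted_set = {tuple(path) for path in scripted}
    counts = Counter(tuple(path) for path in sessions)
    total = sum(counts.values())
    covered = sum(n for path, n in counts.items() if path in scripted_set)
    # most_common() orders paths from most to least frequent.
    uncovered = [
        (list(path), n)
        for path, n in counts.most_common()
        if path not in scripted_set
    ]
    return (covered / total if total else 0.0), uncovered


# One scripted checkout journey versus four hypothetical real sessions.
scripted = [["cart", "shipping", "payment", "confirm"]]
sessions = [
    ["cart", "shipping", "payment", "confirm"],
    ["home", "payment", "confirm"],
    ["search", "shipping", "payment", "confirm"],
    ["home", "payment", "confirm"],
]

coverage, uncovered = journey_coverage(scripted, sessions)
print(f"covered by scripts: {coverage:.0%}")
for path, n in uncovered:
    print(f"{n} session(s) took an unscripted path: {' > '.join(path)}")
```

Exact-path matching is a strict definition of coverage, and a real analysis would also need to account for browser and location. Even so, the pattern is the point: every unscripted path is a set of interactions the scanner never exercised.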
Access To 100% of Real User Data
Being able to access 100% of data has provided instant value time and time again for our clients. While the value is clear in general, we have run into many specific instances where this level of access to the data has proven useful.
We recently went live with a client who had a classic digital experience, similar to many of our consulting clients over the last 15+ years. For many months, their enterprise data warehouse (EDW) consistently showed a two to four percent discrepancy compared to their digital analytics transaction data. One day, out of the blue, this number jumped to ten percent… and stayed there.
Most of us who have consulted in this industry have seen this issue with our clients. Once we notice a discrepancy, we spend hundreds of hours and thousands of their dollars investigating – hoping to find and fix the issue. This effort causes headaches from an attribution perspective and also brings added pain when reporting to senior leadership.
With Sentinel Insights’ access to all of the real-user data the website was collecting, we were able to deliver a data export within 15 minutes of our call with the client. This export listed each and every instance that included an issue with revenue reporting.
Looking at that data, we realized that (for certain transactions) the website pages were loading too quickly – the analytics platform was unable to set the purchase ID in time and that was causing their issues. Armed with this data and evidence of the problem, they were able to send that to the platform owners who fixed it within a day.
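The kind of filter behind such an export can be sketched in a few lines of Python. This is an illustration only, not the actual Sentinel Insights export: the event shape and the `event`, `purchase_id`, and `revenue` field names are assumptions.

```python
def missing_purchase_ids(events):
    """Return purchase events whose purchase ID is absent or empty."""
    return [
        e for e in events
        if e.get("event") == "purchase" and not e.get("purchase_id")
    ]


def missing_id_rate(events):
    """Share of purchase events that were recorded without a purchase ID."""
    purchases = [e for e in events if e.get("event") == "purchase"]
    if not purchases:
        return 0.0
    return len(missing_purchase_ids(purchases)) / len(purchases)


# Hypothetical captured events: nine good purchases, one with no ID, one page view.
events = [
    {"event": "purchase", "purchase_id": f"P{i}", "revenue": 20.0}
    for i in range(9)
]
events.append({"event": "purchase", "purchase_id": "", "revenue": 35.0})
events.append({"event": "page_view"})

bad = missing_purchase_ids(events)
print(f"{len(bad)} purchase(s) missing an ID ({missing_id_rate(events):.0%} of purchases)")
```

Because every real transaction is captured, the affected records themselves can be handed to the platform owners as evidence, along with whatever page and timing details were recorded with them.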
“Within two weeks of implementing the Sentinel tag on our websites, we’d found the answer to a problem with missing data that had been vexing our team for months. Sentinel Insights paid for itself almost instantly.”
That quote says it all!
We’d love to show you the power of our platform – just click Book a Demo and we’ll get something on the books.