Deterministic vs. probabilistic data

Programmatic advertising is full of complex terms and layered jargon. We’ll help translate some of the different aspects of the programmatic process for site owners.

Google’s Privacy Sandbox is flipping the script for site owners across the globe. Advertisers and publishers have both relied on third-party cookies for targeted advertising and data collection, so the deprecation of those cookies means site owners need alternatives that let them maintain revenue and keep their audience valuable to advertisers.

This doesn’t mean that targeted advertising is going away. Advertisers are already adapting to the planned cookie deprecation by turning to first-party data offered by site owners and ad networks. But not all data is created equal. Deterministic and probabilistic data both offer interesting solutions for site owners and advertisers alike. Let’s uncover the pros and cons of each type of data.

Deterministic data

Deterministic data falls under the umbrella of first-party data. It’s information that the user gives freely, whether by entering an email address or logging in. While it sounds attractive as a form of data, it’s rare for sites to collect a lot of reliable deterministic data. According to a Primis white paper, deterministic data covers less than 10% of a given site’s user base. Add the fact that deterministic data can lose accuracy over time, and it’s not exactly a solid solution for site owners on its own.

Users are also unlikely to hand over personal data to just any site. Because sharing depends on trust, larger, more trusted sites will experience business as usual, while smaller sites will suffer a kind of data drought and will have to find another data solution to offer advertisers.
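To make the coverage problem concrete, here is a minimal Python sketch. It assumes a deterministic identifier is built by normalizing and hashing an email address the user supplied; that is a common approach, but it is our assumption here, not something the white paper specifies. The function names `deterministic_id` and `coverage` and the sample visitor records are hypothetical.

```python
import hashlib


def deterministic_id(email: str) -> str:
    """Hypothetical identifier: normalize a user-supplied email, then hash it."""
    normalized = email.strip().lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def coverage(visitors: list) -> float:
    """Share of visitors who supplied an email, i.e. have deterministic data."""
    if not visitors:
        return 0.0
    known = sum(1 for visitor in visitors if visitor.get("email"))
    return known / len(visitors)


# Illustrative data only: one logged-in reader out of ten visitors.
visitors = [{"email": "Reader@Example.com "}] + [{"email": None}] * 9

for visitor in visitors:
    if visitor["email"]:
        print("deterministic id:", deterministic_id(visitor["email"]))

print(f"coverage: {coverage(visitors):.0%}")
```

The same reader always maps to the same identifier, however the email is capitalized or padded, which is what makes the match deterministic. Everyone who never logs in simply has no identifier at all.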

Probabilistic data

Probabilistic data is based on predictive patterns and algorithms created to segment users. Each user receives an identifier based on IP address or other circumstantial information. Using algorithms and machine learning, advertisers and publishers are able to create a kind of map of the user: their habits, their wants and needs, and so on. It’s third-party cookies without the tracking, but also without the pinpoint accuracy.

Probabilistic data can be seen as an educated guess about a user, one that’s typically correct. It’s built from data points: advertisers can take one known piece of data that’s shared when the user visits a site (in this case, the IP address) and build a user profile off the back of that. Because it works from the limited data that every visit provides, probabilistic data is more accessible than deterministic data. It’s also often more up to date, as it’s always updating based on overall behavior rather than relying on the user to update the data.
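The "educated guess" idea can be sketched in a few lines of Python. This is an illustration under stated assumptions, not how any particular ad network works: the identifier is derived from the IP address plus the browser's user-agent string (our stand-in for "other circumstantial information"), and the guess is just the most-viewed content category with its share of views as a rough confidence. `ProfileStore`, `probabilistic_id`, and the category names are hypothetical.

```python
import hashlib
from collections import Counter, defaultdict


def probabilistic_id(ip: str, user_agent: str) -> str:
    """Hypothetical identifier built from circumstantial signals, not a login."""
    raw = f"{ip}|{user_agent}"
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()[:16]


class ProfileStore:
    """Accumulates page-category views per identifier and guesses a segment."""

    def __init__(self):
        self._views = defaultdict(Counter)

    def record(self, ip: str, user_agent: str, category: str) -> None:
        self._views[probabilistic_id(ip, user_agent)][category] += 1

    def guess_segment(self, ip: str, user_agent: str):
        """Return (category, share of views), or None if nothing is known yet."""
        counts = self._views.get(probabilistic_id(ip, user_agent))
        if not counts:
            return None
        category, hits = counts.most_common(1)[0]
        return category, hits / sum(counts.values())


store = ProfileStore()
for category in ["gaming", "gaming", "cooking", "gaming"]:
    store.record("203.0.113.7", "ExampleBrowser/1.0", category)

print(store.guess_segment("203.0.113.7", "ExampleBrowser/1.0"))
```

Every new page view shifts the counts, so the guess updates with behavior and the user never has to supply anything. The trade-off is visible too: two people sharing one connection and browser would be merged into a single profile.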

We’ve got a wealth of blogs preparing site owners for the deprecation of third-party cookies. Not to mention our webinars and downloadable eBooks on our Resources page. If you’re interested in joining the Collective, you can apply here.