The Marketplace Owner Always Knows More
Originally published in The Times of India. This article is part 2 of a series called ‘Reality Check on Media Strategy’.
Over the last two decades, media companies have increasingly partnered with algorithmic marketplaces like Google and Meta to reach their audiences and monetize content. This collaboration has resulted in a profound information asymmetry in favor of the algorithmic platforms.
These platforms possess vast amounts of real-time, granular user data, which they utilize to optimize their services and dominate the advertising market. In contrast, media companies are left with only limited, aggregated, and often delayed insights, severely hindering their ability to compete effectively.
The terms of the transaction between algorithmic marketplaces and media companies are clear, and they place no obligation on the marketplaces to share this data. This article examines the roots of this imbalance, its impact on media companies, and the strategies they can adopt to reclaim some control over their data and operations.
Importance Of First-Party Data
In the digital advertising ecosystem, the ability to target users with precision is crucial. The most profitable form of advertising involves identifying users who are most likely to make a purchase and guiding them through the transaction process, whether on a brand’s platform, an e-commerce site, or directly within the content itself.
This process hinges on access to first-party data — information collected directly from the user by the platform they interact with.
Three demand-side first-party assets enable this monetization through performance marketing:
- The user’s identity (email or phone number)
- Enough data on user behavior, preferences, and demographics to target a specific advertisement to specific users
- Ownership of the infrastructure to facilitate a transaction
Algorithmic Marketplaces Collect Demand-Side First-Party Data
To fully grasp the extent of this imbalance, let’s take the example of Google.
Over the years, Google has invested billions of dollars in developing a suite of products and services that encourage users to remain logged into their Google accounts across multiple devices and platforms. These investments have given Google unparalleled access to first-party data, which it uses to maintain its dominance in the digital advertising space. Below are a few examples:
- Email: In 2004, Google launched Gmail with a generous 1GB of free storage, far exceeding the capacity offered by incumbents like Yahoo! and Hotmail.
- Maps: Recognizing the importance of location data for advertisers, Google launched Maps in 2005.
- Office: In 2006, Google introduced free online productivity tools like Calendar and Docs & Spreadsheets, directly competing with Microsoft Office. These tools were later consolidated under Google Drive.
- Browser: Understanding the strategic importance of controlling the gateway to the internet, Google launched the Chrome browser in 2008. Its speed, efficiency, and integration with other Google services quickly made it a popular choice, ensuring users remained connected to Google’s network.
- Mobile Operating System: The rise of smartphones presented another critical choke point to the internet. Google’s response was to develop and open-source the Android operating system, enabling various smartphone manufacturers to compete with Apple while keeping users logged into Google’s services.
- Devices: Google expanded its reach further by launching or acquiring various low-cost devices like Fitbit watches, Nest smart home devices, and Chromebooks. These devices further solidified user engagement with Google’s ecosystem, providing additional data collection opportunities.
User’s Identity
All of these products and services keep users logged in to their Google ID for longer and across more aspects of their lives, allowing Google to learn more about them.
User Targeting
These investments also tell Google about users’ interests and intent:
- Intent: Search, by definition, is the navigation infrastructure for the Internet and hence captures user intent.
- Interest: Beyond search, Google launched Google News in 2002 to capture user interests.
- Location History: Google Maps’ navigation apps, the Android operating system, and the Chrome browser collect users’ location history.
- Shopping History: In 2020, Amazon stopped listing purchased products in its order receipt emails. Some speculated that this was to stop Google and other email-scraping services from building first-party data from these receipts. Additionally, Google launched its payments app in India and offered generous cashback so that transactions would happen through it.
- People Graph: Gmail, Google Contacts, Google Calendar, along with Android’s Contacts, SMS, and Phone call infrastructure should help Google build a people graph.
Google’s Investments To Improve Its Transaction Infrastructure
Additionally, Google has made investments to facilitate transactions:
- Payment: Investments in Google Pay and Wallet allow Google to make shopping seamless.
- Google Play: On devices that use Android, users buy digital products (apps, movies, music, books, etc.) via Google Play. Google even charges a 15%-30% commission on subscriptions sold in apps on Android-powered phones. In some countries, it mandates the use of Google Play’s payments infrastructure for in-app purchases.
- Listing: Google Search provides lists of products and flights when users search with a shopping intent.
Meta’s Situation
While much of the discussion has focused on Google, Meta’s situation is also worth examining. Meta’s strength lies in its ownership of key platforms — Facebook, Instagram, and WhatsApp — where it controls both the audience and the advertising environment.
However, this concentration also exposes Meta to vulnerabilities. For example, Meta does not control the devices that users rely on to access its platforms. This reliance on external hardware has made Meta susceptible to changes imposed by operating system providers like Apple and Google.
A notable example of this occurred in 2021 when Apple introduced the ‘Ask App Not to Track’ feature on iOS, which significantly curtailed Meta’s ability to collect first-party data. The impact was immediate and severe, costing Meta an estimated $10 billion in revenue in 2022.
To mitigate such vulnerabilities, Meta has made moves to diversify its data collection methods, such as acquiring Oculus (for Virtual Reality) and developing Meta Ray-Ban glasses.
The advent of Large Language Models (LLMs) has given Meta a once-in-a-decade opportunity. In 2007, when Google perceived a threat from Apple’s iOS (a mobile OS) in a domain outside its core business (search), it responded by open-sourcing Android, which became the de facto mobile phone OS. In 2022, Meta found itself in a similar position vis-à-vis Google and OpenAI, both of which operate closed-source LLMs in their respective core businesses. Meta has open-sourced its Llama LLM and will most likely enter the hosted LLM space, which could help it collect insightful data about users.
For example, below is the profile that ChatGPT has built about me.
Algorithmic Marketplaces Collect Supply-Side First-Party Data
Algorithmic marketplaces also gain an advantage by collecting supply-side data. Many platforms engage in web scraping, using bots to extract content and data from websites across the Internet. While some adhere to ethical guidelines like respecting robots.txt files, others bypass these restrictions, further widening the data gap.
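To make the robots.txt point concrete, here is a minimal sketch of how a well-behaved crawler might check a publisher's rules before fetching a page. It uses Python's standard `urllib.robotparser`. The robots.txt contents, the bot names, and the `news.example.com` URLs are all hypothetical and chosen only for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for an imaginary publisher (an assumption for
# illustration): one named bot is kept out of /premium/, every other bot
# is kept out of the whole site.
ROBOTS_TXT = """\
User-agent: ExampleBot
Disallow: /premium/

User-agent: *
Disallow: /
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text that has already been downloaded."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def may_fetch(parser: RobotFileParser, user_agent: str, url: str) -> bool:
    """Return True only if the publisher's rules allow this bot to fetch the URL."""
    return parser.can_fetch(user_agent, url)


parser = build_parser(ROBOTS_TXT)
print(may_fetch(parser, "ExampleBot", "https://news.example.com/politics/story"))
print(may_fetch(parser, "ExampleBot", "https://news.example.com/premium/story"))
print(may_fetch(parser, "OtherBot", "https://news.example.com/politics/story"))
```

The check is purely voluntary: nothing in the protocol stops a scraper from skipping it, which is why robots.txt alone does not close the data gap.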
Just as regulatory bodies have unhindered access to granular data to ensure compliance and transparency, Google can access the details of any digital product that uses Google Analytics (GA), which it provides as a free service.
Google underwrites the cost of running such a vast Big Data operation because the data serves as fundamental raw material for its digital advertising business. Similarly, businesses proactively maintain an up-to-date Google Business Profile on Google Maps for a better search experience.
Information Asymmetry vis-à-vis Media Products
It is in this context that the transaction between a media company and algorithmic platforms takes place.
The media company produces content that engages the audience, and the algorithmic platform owns the audience, drives traffic to the content, and monetizes the audience. The media company gets cash and high-level analytics in return.
Identity Attribution
Most media products have abysmally low login rates on their websites and apps. Then how is this advertising transaction happening?
Turns out, while the user might be logged out on media products, the user is most likely logged in to their Google ID on the browser (ideally Chrome), or an Android phone, or a Google-powered device.
This allows Google to become the identity broker between an advertisement on a media platform and a brand’s website. Hence, media companies do not know which of their users are clicking on advertisements, or which advertisements they click.
Granular versus Blended Data
Even when algorithmic platforms allow media companies to download data, media companies receive only a fraction of it, often in aggregated and delayed formats.
While Google has access to vast amounts of data on search trends and user behavior, the insights offered by Google Trends are constrained. The tool focuses primarily on the specific keywords entered, not providing context or related terms that might offer a broader understanding of user interests. Additionally, Google Trends doesn’t offer an absolute measure of user interest. Instead, it normalizes data against the peak interest point for a given keyword and time frame. This normalization can distort the true scale of trends, potentially misleading those who rely on it for detailed analysis.
Here’s the response when you search for the topic General Bipin Rawat.
Now, let’s compare with the traffic demand of Yogi Adityanath. How will you even compare these two numbers?
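The effect of peak normalization can be shown with a small sketch. The numbers below are invented, and the function models only the "scale to the peak within the time frame" step described above. It is not Google's actual pipeline, which involves further processing.

```python
def normalize_to_peak(series):
    """Scale a series so that its highest point becomes 100."""
    peak = max(series)
    if peak == 0:
        return [0 for _ in series]
    return [round(100 * value / peak) for value in series]


# Hypothetical absolute search counts over three days (assumed numbers).
# topic_b is searched 100 times more often than topic_a on every day.
topic_a = [2_000, 10_000, 4_000]
topic_b = [200_000, 1_000_000, 400_000]

print(normalize_to_peak(topic_a))
print(normalize_to_peak(topic_b))
```

Both calls produce the same curve, `[20, 100, 40]`, even though one topic has a hundred times the search volume of the other. Once each series is scaled to its own peak, the absolute scale needed for a like-for-like comparison is gone.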
Similarly, platforms like Facebook and Amazon keep a tight grip on their granular data, leaving media companies with a fragmented and incomplete understanding of their audience’s needs and behaviors.
Why This Matters To Media Companies
This information asymmetry severely limits media companies’ ability to understand their audience, personalize content, and optimize their monetization strategies.
Personalization
The lack of access to real-time, granular data prevents media companies from effectively personalizing content and user experiences. Data exports from platforms like Google Analytics via BigQuery are often delayed by 24 hours, a significant disadvantage in the fast-paced news environment where trends can change within minutes.
Even when data is available in real-time, it is often aggregated and anonymized, making it difficult to identify individual user preferences and tailor content accordingly. This limitation hinders media companies’ ability to create engaging, relevant experiences that foster loyalty and drive repeat visits.
Roadblock To Revenue Maximization
Information asymmetry also impacts media companies’ ability to maximize revenue.
While algorithmic marketplaces offer various price differentiation strategies, such as tiered memberships or sponsored content, most have transitioned to algorithm-driven approaches like bid-up pricing, surge pricing, and ‘purchasing power parity’ pricing.
However, media companies are often unable to implement these advanced pricing strategies due to the lack of granular data. They are forced to rely on broader segmentation and less precise targeting, missing out on opportunities to capture the full value of their audience.
For example, limitations in merging Google Analytics data with Google Ad Manager (GAM) data make it difficult for media companies to understand which content generates the most revenue. Additionally, restrictions on custom attributes in ad requests limit their ability to track revenue at the individual user level.
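As a rough illustration of the analysis that this merge would enable, the sketch below joins page-level traffic with page-level ad revenue and ranks articles by what they earn. It assumes both exports can be reduced to rows sharing a common `page_path` key. That shared key, the field names, and the sample figures are all hypothetical and do not reflect the actual export schema of either product.

```python
from collections import defaultdict


def revenue_per_article(analytics_rows, ad_rows):
    """Join traffic and ad revenue on page_path; rank articles by revenue."""
    pageviews = defaultdict(int)
    for row in analytics_rows:
        pageviews[row["page_path"]] += row["pageviews"]

    revenue = defaultdict(float)
    for row in ad_rows:
        revenue[row["page_path"]] += row["revenue"]

    report = []
    for path in pageviews.keys() | revenue.keys():
        views = pageviews.get(path, 0)
        earned = revenue.get(path, 0.0)
        # Revenue per thousand pageviews; undefined when traffic is missing.
        rpm = earned * 1000 / views if views else None
        report.append({"page_path": path, "pageviews": views,
                       "revenue": earned, "rpm": rpm})

    # Highest-earning articles first; ties broken by path for stable output.
    report.sort(key=lambda r: (-r["revenue"], r["page_path"]))
    return report


# Hypothetical sample rows (assumed shapes, not real export formats).
analytics_rows = [
    {"page_path": "/politics/story", "pageviews": 4000},
    {"page_path": "/sports/story", "pageviews": 1500},
]
ad_rows = [
    {"page_path": "/politics/story", "revenue": 12.5},
    {"page_path": "/politics/story", "revenue": 7.5},
    {"page_path": "/sports/story", "revenue": 3.0},
]

for line in revenue_per_article(analytics_rows, ad_rows):
    print(line)
```

The join itself is simple. The difficulty described above lies in getting both datasets at the same granularity with a reliable shared key in the first place.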
Google’s privacy policies, while intended to protect users, can also limit the ability of digital media companies to develop advanced ad targeting strategies. This is in contrast to Google’s own advertising platform, Google Ads (formerly AdWords), which provides advertisers with extensive targeting options based on user behavior, creating an uneven playing field for other digital media firms.
Leveling the Playing Field
To address the challenges posed by information asymmetry, media companies must adopt proactive strategies to reclaim control over their data and improve their competitive position. Here are some key approaches:
- Invest in Alternative Web Analytics Tools: Media companies should explore data analytics tools, like Matomo (previously Piwik), Plausible, or Mixpanel, that offer more granular and real-time insights into user behavior. By diversifying their data sources, they can reduce their dependence on algorithmic marketplaces.
- Improve Login Value Proposition: Establish owned platforms and channels, and build in enough utility that audiences find it worthwhile to log in and stay logged in.
- Build Features That Generate First-Party Data: For example, CRED in India ships free products with an excellent user experience that let users pay credit card bills, send and receive money, manage their car’s vitals, plan travel, etc. It also provides reward points to incentivize usage.
By taking these actions, media companies can regain control over their data, enhance their understanding of their audience, and develop more effective strategies for personalization and revenue maximization.
—
Want to republish it? This post was released under CC BY-ND. You can republish it as is with the following credit and backlinks: ‘Originally published by Ritvvij Parrikh on The Times of India.’ The author retains the copyright and any other ancillary rights to the post.