Personalization algorithms are Editorial Products because they have Editorial Judgment.
Historically, newspapers were published to serve the needs of the lowest common denominator of their readership. For example, The Times of India was written for the professional middle-class reader.
Once news moved online, publishers started creating verticals or sections, each aimed at a separate audience. For example, a business section for professionals, a Tamil website for a Tamil audience, travel for food enthusiasts, lifestyle content for casual readers, etc. Even without Artificial Intelligence, this was, and still is, an easy way to target advertisements.
With Social media came the era of personalization, where each user gets what they individually want.
Finally, with Generative AI we are entering a phase where each user gets a piece of content custom rewritten for their specific needs.
Why it matters:
- On-platform personalization is a multiplier and will bump up Pages Per Session or Time Per Session.
- If connected with Push Notifications and Newsletters, it can help grow Sessions Per User.
- Most modern Social platforms with advanced personalization systems have a clear deal with audiences: when users have X minutes available, they give them to the platform, and the algorithm gives them content worth their time. This trains the user to check in to the platform directly and repeatedly, thereby increasing DAU/MAU.
- User experience: Most people do not come to Escape Products to study one topic in depth. Hence, typically, Personalization feeds perform better than Related Items.
- Save editorial effort: For any one individual, Artificial Intelligence models will not necessarily have better Editorial Judgment than human editors. But at scale, the machine will outperform the human because the algorithm can give each user a more relevant homepage. With personalization, you can change the homepage 500 times a day, adapting it to location, time of day, platform, and user.
- Programmatic control: As a product, you have the option to redirect specific parts of the traffic to branded content or revenue-maximizing sections (high eCPM sections).
It is extremely costly to run personalization in production at scale, especially for a Commoditized Business like news.
- Most old-school websites serve static webpages from CDNs. In contrast, personalization mandates serving content directly from inference servers, which is vastly more expensive.
- Training collaborative filtering models at a regular frequency requires costly GPU servers.
- Warehousing clickstream data at scale can also get expensive.
Hence, it might be advantageous to deploy in-house servers.
- Agency: We should communicate to audiences when, where and how Artificial Intelligence is used. Allow users to unfollow items they don’t want.
- On news websites, users' content preferences change based on time of day, platform, day of the week, etc. For such situations, you might want to add a re-ranking layer on top of personalization to handle business requirements.
- User experience: The algorithm should expand a user's interests so they find more relevant content and, within an interest, make the user go deep. Additionally, personalization works because it eliminates choice and gives users what they want.
- Avoid surprises: Give editors the control to decide what should not come on the homepage. This can take the form of blocklisted stories or rules like recency weights, reportage weights, etc.
- Based on the type of product, your model's nuances will vary. For example, Netflix has high Shelf Life content. In contrast, news websites have extremely short Shelf Life content. Hence, you might need to age out stories quickly and purge old clickstream data.
- Similarly, in news, user interests change sharply based on current affairs.
- On news websites, you need to handle the cold start problem: new stories have no clickstream data yet.
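
The re-ranking, blocklist, and recency ideas in the list above can be sketched as a thin layer that runs after the personalization model. This is a minimal illustration, not a production design: the `Story` fields, the `rerank` function, and the six-hour half-life are all hypothetical choices made for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class Story:
    story_id: str
    model_score: float        # relevance score from the personalization model
    published_at: datetime


def rerank(stories, now, blocklist=frozenset(), half_life_hours=6.0):
    """Drop blocklisted stories, then sort by model score decayed by age."""
    ranked = []
    for story in stories:
        if story.story_id in blocklist:
            continue  # editorial override: never show this story
        age_hours = max((now - story.published_at).total_seconds() / 3600.0, 0.0)
        decay = 0.5 ** (age_hours / half_life_hours)  # halves every half-life
        ranked.append((story.model_score * decay, story))
    ranked.sort(key=lambda pair: pair[0], reverse=True)
    return [story for _, story in ranked]


now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
stories = [
    Story("old-hit", 0.9, now - timedelta(hours=12)),  # decays to 0.9 * 0.25
    Story("fresh", 0.5, now),                          # no decay yet
    Story("blocked", 0.99, now),                       # removed by editors
]
print([s.story_id for s in rerank(stories, now, blocklist={"blocked"})])
```

A short half-life suits short Shelf Life content such as news; a catalogue with long Shelf Life content would likely use a much longer one, or no decay at all.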
Product Management Perspective:
- System health: If not built well, then the algorithm will maximize CTR instead of user loyalty and revenue. Additionally, a click doesn’t mean the user has consumed the story. Hence, you might want to model it on time spent on the content.
- Model health: You ideally want your models to self-learn without biases. If you train a model to provide Related Items, then it increasingly narrows the user's choices. If you train a model on clickstream data that isn't position de-biased, then stories that were placed at the top will keep repeating. A biased model can grow CTR initially, but the gain eventually decays.
- There are two methods: Collaborative Filtering and Content Filtering. Based on your use case, you might have to implement one or the other, or both. For example, if you are personalizing product recommendations for B2C pharma drugs, then you might not want to use Collaborative Filtering. That said, do check the value of implementing both. For example, Ekstra Bladet had similar CTR between collaborative and content filtering.
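
Two of the points above, counting time spent rather than raw clicks and de-biasing by position, can be combined in a small sketch using inverse propensity weighting. This is one common approach, not necessarily what any particular publisher uses. The function name, the 15-second dwell threshold, and the propensity values are assumptions for illustration; in practice the propensities (how likely a slot is to be seen at all) would have to be estimated, for example from a randomized slice of traffic.

```python
from collections import defaultdict


def debiased_engagement(log, propensities, min_dwell_seconds=15.0):
    """Per-story engagement rate with each engaged view weighted by 1/propensity.

    log: iterable of (story_id, position, dwell_seconds) impressions.
    propensities: position -> assumed probability that the slot was seen.
    """
    shown = defaultdict(int)
    weighted = defaultdict(float)
    for story_id, position, dwell_seconds in log:
        shown[story_id] += 1
        if dwell_seconds >= min_dwell_seconds:  # a bare click does not count
            weighted[story_id] += 1.0 / propensities[position]
    return {story_id: weighted[story_id] / shown[story_id] for story_id in shown}


# Hypothetical numbers: slot 3 is assumed to be seen a quarter as often as slot 1.
propensities = {1: 1.0, 3: 0.25}
log = [
    ("A", 1, 30), ("A", 1, 40), ("A", 1, 0), ("A", 1, 3),
    ("B", 3, 60), ("B", 3, 0), ("B", 3, 0), ("B", 3, 0),
]
print(debiased_engagement(log, propensities))
```

On raw counts, story A (two engaged reads out of four) looks better than story B (one out of four). Once B's low slot is accounted for, B scores higher, which is the kind of correction that stops top-slot stories from repeating.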