Last year TOI turned up the notch on data. Here’s how | Ritvvij Parrikh

Last year TOI turned up the notch on data. Here’s how

Published May 10, 2022
Updated Dec 18, 2022

This post was originally published at

A year ago, we embarked on a project to increase the quantity and quality of data across

Well-written content

Good editorial products have novelty, evidence, speed to market, and user experience.

Use data in stories

A common component connecting all four attributes is data.

  • Insights from data can provide novelty.
  • Data, along with reporting and quotes, can give evidence.
  • With the right technology in place, we could quickly churn out the data as context for live news events.
  • Finally, data visualization allows for packaging data into fun and engaging experiences for the audience.

Flashback: The Times of India website was not without data before this. Back in 2017, we had built many interactive dashboards for elections, pollution, etc., across the website under the name DataHub. Back then, other newsrooms had just started experimenting with such experiences, and these DataHub pages were consistently the top pages on election days.

Limited audience interest now: The desk also loved using DataHub. It gave them a quick way to find an insight, take a screengrab, and put it in a story. But over time, innovation on DataHub stopped, and with it traffic on these pages dropped. Our audiences did not engage with interfaces that required them to slice and dice data. Additionally, these pages lacked SEO authority because they were primarily client-side rendered.

From the outset, we were clear that the change had to be newsroom-wide. We did not want to build small side projects. Hence, we eliminated a few options.

  • We could not hire 3-4 expert data journalists and leave them to operate in a corner.
  • We could not mass-hire as there just isn’t enough supply.
  • We could not mass-train quickly because numeracy is a complicated skill.
  • We could not mass-build because this isn’t a data engineering (volume and velocity) problem, but an insight-as-a-service (variety and discovery) problem.

Where should we begin? From a product perspective, data journalism isn’t a monolith. At a high level, there are four forms:

  1. Data-as-a-service: Scrape, clean, and warehouse the datasets that editorial would require and make them available when needed.
  2. Charts-as-a-service: Mass-create charts for existing desk teams as pre-content to plug into everyday stories.
  3. Insights-as-a-service: Pair journalists and data science programmers to auto mass-generate insights from data.
  4. Visual Stories: Hire a specialist team of data journalists and visual storytellers to produce a few stunning stories a week or month.

Data-as-a-service has low immediate audience-impact because editorial still needs to plan to work on those stories.

In comparison, charts-as-a-service is better from a product perspective because it not only provides editorial with ready pre-content but can also be exposed to audiences.

There is a strong attraction in the journalism community towards visual storytelling because it creatively explains complex issues. However, it takes significant effort and skill to produce these stories.

Finally, insights-as-a-service is a very powerful offering. It provides editorial not only with pre-content but also with signals and notifications. It can also be extended to auto-generate text and stories for audiences. However, implementing auto-analysis with data science or AI/ML and then generating engaging stories is much harder.

Implementing option 1: Data-as-a-service

Our first design decision was that we would formally warehouse any data we would scrape or download in a Civic Data Warehouse. Warehousing ensures that all data connect with each other and get collected there for future story opportunities.

There are many data-as-a-service suppliers in the market. Hence, we scraped the data ourselves only when we couldn’t purchase it.

Evaluating option 4: Visual Stories

We did not have any data to show whether our audiences would like interactive stories. Instead of trying data visualization alone, we expanded the goal to mix-media stories that could use data, audio, video, and photos, along with text.

Test launch 1: Hence, in March 2021, at the peak of the farmers’ protest, we created a simple mix-media interactive called Talking Photos to showcase the protest sites. Three photographers took 1,500 pictures across three protest sites over 15 days. They recorded real-time audio at each shoot. We edited and ‘storified’ the photos and cut four hours of audio down to the 12 minutes finally used.

CMS troubles: Once the story was ready, we realized that there was no meaningful way to inject interactive stories in our page-based CMS. Many publishers host their interactive stories on subdomains to navigate around this problem. Hosting interactive stories on a subdomain was not the right product decision for us as it would take our audiences away from the platform.

Test launch 2: By August 2021, we figured out how to make interactives work with our CMS and launched two new stories (Story 1, Story 2) in TOI Plus, our subscription product. These stories came just in time for India’s Independence Day news cycle.

Initial success: Both stories appeared in the top 10 conversion-worthy stories of 2021. It was clear that option 4, Visual Stories, would work for TOI Plus, our subscription product.

We decided to double down on such mix-media stories. Hence, this project was shifted to the technically savvy Denmark team that builds and runs our industrial-scale in-house CMS. In the coming weeks and months, we’ll launch new immersive formats in TOI Plus.

Building option 2: Charts-as-a-service

The goal of charts-as-a-service was to mass-produce charts and make them available to the newsroom as pre-content so that many more on the desk can use them to substantiate their everyday stories.


Reduce the need for design skills. By now, we had already built the NewscardCMS. We mass-generated and inserted newscards across section pages as discovery elements for our stories.

Why Datawrapper: Earlier, we were using Highcharts. Instead of reinventing the wheel, we invested in Datawrapper. Datawrapper is built by journalists for journalists and handles the finer nuances that come up in charting. It also allowed the desk to duplicate a chart mass-produced by an algorithm and annotate it for a story. Finally, the exports from Datawrapper work seamlessly for print newspapers. Each Datawrapper chart created also gets stored as a Newscard.

Numeracy. The one thing that no technology can replace is basic numeracy within the newsroom. It involves journalists asking the right questions of the data and blending the answers with the story they are trying to convey. Luckily, the team already had it.

Algorithms: With the raw material in place, our algorithms mass-created charts for various use cases: pollution, elections, Coronavirus, and the economy. For example, in anticipation of the UP Elections, the system pre-generated thousands of charts for online and print teams.
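To illustrate what mass-creating charts from warehoused data could look like, here is a minimal sketch. It is not TOI's actual system: the `ChartSpec` class, the `TEMPLATES` list, the `generate_chart_specs` function, and the sample records are all hypothetical, and the numbers are made up. The sketch only produces chart descriptions; rendering them in a charting tool is left out.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class ChartSpec:
    """Hypothetical description of one chart to be rendered later."""
    chart_id: str
    title: str
    chart_type: str
    rows: tuple  # (label, value) pairs


# Hypothetical templates: each pairs a metric with a chart type and a title pattern.
TEMPLATES = [
    {"metric": "turnout_pct", "chart_type": "bar", "title": "Voter turnout in {region} (%)"},
    {"metric": "seats_won", "chart_type": "bar", "title": "Seats won in {region}"},
]

# Made-up sample data, for illustration only.
SAMPLE_RECORDS = [
    {"region": "Region A", "label": "2017", "turnout_pct": 61.0, "seats_won": 10},
    {"region": "Region A", "label": "2022", "turnout_pct": 63.5, "seats_won": 12},
    {"region": "Region B", "label": "2022", "turnout_pct": 58.2},
]


def generate_chart_specs(records, templates=TEMPLATES):
    """Build one ChartSpec per (region, template) pair that has data."""
    regions = sorted({r["region"] for r in records})
    specs = []
    for region, tpl in product(regions, templates):
        rows = tuple(
            (r["label"], r[tpl["metric"]])
            for r in records
            if r["region"] == region and tpl["metric"] in r
        )
        if not rows:
            continue  # skip combinations with no data
        specs.append(
            ChartSpec(
                chart_id=f"{region.lower().replace(' ', '-')}-{tpl['metric']}",
                title=tpl["title"].format(region=region),
                chart_type=tpl["chart_type"],
                rows=rows,
            )
        )
    return specs


specs = generate_chart_specs(SAMPLE_RECORDS)
for spec in specs:
    print(spec.chart_id, "|", spec.title)
```

The point of the design is that the number of charts grows with regions times templates, so adding one template yields a new chart for every region without extra desk effort.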

Use #1: Improve everyday stories

The desk would then find the correct chart relevant to their story and then plug it in, thereby increasing data usage in everyday stories and LiveBlogs.

Use #2: New Habit-Based Products

Meanwhile, we started reusing these charts in Start Your Day With, our property for growing daily active users, discussed in an earlier blog post.

Use #3: Replacing DataHub with Factsheets

By now, the Lego effect had started kicking in.

  • Newscards and Datawrapper Charts became individual modular pieces of content.
  • A collection of Newscards and Datawrapper Charts became Bundles.

So when we started designing the Factsheet template as a replacement for DataHub, we doubled down on the Lego effect. Factsheets are a collection of Bundles.
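The nesting described here (pieces make up Bundles, and Bundles make up Factsheets) can be sketched as a small data model. This is an illustrative assumption, not the actual CMS schema: the class names mirror the terms in the post, but every field name, the `all_pieces` helper, and the sample content are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Newscard:
    """Hypothetical modular card pointing to a story or fact."""
    card_id: str
    headline: str


@dataclass
class DatawrapperChart:
    """Hypothetical reference to a chart created in Datawrapper."""
    chart_id: str
    title: str


# A modular piece of content is either kind of block.
Piece = Union[Newscard, DatawrapperChart]


@dataclass
class Bundle:
    """A named collection of modular pieces."""
    name: str
    pieces: List[Piece] = field(default_factory=list)


@dataclass
class Factsheet:
    """A collection of Bundles on one topic."""
    topic: str
    bundles: List[Bundle] = field(default_factory=list)

    def all_pieces(self) -> List[Piece]:
        """Flatten every piece across bundles, preserving editorial order."""
        return [piece for bundle in self.bundles for piece in bundle.pieces]


# Made-up example content, for illustration only.
trend = Bundle(
    name="Air quality trend",
    pieces=[
        DatawrapperChart("aqi-trend", "Daily AQI this season"),
        Newscard("card-1", "What the AQI categories mean"),
    ],
)
explainers = Bundle(
    name="Explainers",
    pieces=[Newscard("card-2", "How to read a pollution chart")],
)
factsheet = Factsheet(topic="Pollution", bundles=[trend, explainers])

print(len(factsheet.all_pieces()))
```

Because a Bundle only holds references to pieces, the same chart or card could appear in several Bundles, which is what makes the pieces reusable across templates.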

Editorial interfaces should be composable.

Content-responsive design: Much like with a newspaper, editors should be able to pick and choose newscards and combine them into different experiences that are relevant to the news cycle, planned or unplanned. This “composable” design approach gives editors immense flexibility while keeping the overall user experience consistent.

Scaling modular architecture

Use #4: Why now. So What.

Bridge between Why Now and So What

So what: Most content in a paper of record is about reporting events. But often, these stories and LiveBlog updates do not elaborate on the larger context in which the event is placed.

Why now: Conversely, evergreen content like deep explainers, timelines, and data dives can explain the context well but does not necessarily get resurfaced whenever a new related event occurs.

From a news product perspective, the audience journey often moves from fresh news events to opinions and explainers, and vice versa.

Content Journey between fast-moving, low shelf-life content (LiveBlog) and slow-moving, high shelf-life context (Factsheet).

Hence, we deeply integrated Factsheets into our LiveBlogs as tabs.

Case Study: You can see how we use each of these elements in redoing our Covid-19 coverage.


The increase in general availability of charts helped the desk use charts in LiveBlogs, articles, and deep data dives in TOI Plus.

The Factsheets also saw a jump in usage compared to DataHub. For example, the pollution factsheets went live around Diwali day (early November). From then to January 20, pollution pages have seen a jump of 111.4% in users and 56% in page views compared to the same period last year.
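For readers who want to interpret these percentages, the arithmetic can be sketched as below. The function names are hypothetical, and the input of 1,000 users is an illustrative placeholder, not a reported figure; only the 111.4% and 56% values come from the post.

```python
def percent_change(previous: float, current: float) -> float:
    """Year-over-year change, expressed as a percentage of the previous value."""
    if previous == 0:
        raise ValueError("previous value must be non-zero")
    return (current - previous) / previous * 100.0


def growth_multiple(pct_increase: float) -> float:
    """Convert a percentage increase into a multiple of the previous value."""
    return 1.0 + pct_increase / 100.0


# A 111.4% jump means slightly more than double the previous period.
print(round(growth_multiple(111.4), 3))
# A 56% jump means a bit over one and a half times the previous period.
print(round(growth_multiple(56.0), 2))
# Illustrative placeholder: 1,000 users growing to 2,114 users.
print(round(percent_change(1000, 2114), 1))
```

Since users grew faster than page views, the figures also imply that page views per user were lower than in the same period a year earlier.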