How should we value news used by AI? A checklist for publishers

Publishers worldwide are concerned about the impact of AI systems like OpenAI’s large language models on their news content. They question how they can license their news for use in these AI systems and ensure fair compensation. This situation mirrors past issues with social media companies profiting from news content while providing minimal compensation to publishers.

Even if OpenAI (or another large company) strikes deals with the big outlets in each market, it will still need to make arrangements with other outlets. That’s because generative AI predicts text well only when it also works as a super-powered search engine in real time, and prediction is effective only with a large enough input sample.

Nevertheless, even if an AI company decides to make deals with multiple sources, there’s bad news. OpenAI and other firms are in startup mode and booking losses, and they will likely use this as an excuse to pay publishers less. There is also a lack of transparency: publishers don’t know which of their content large language model developers have already hoovered up, so what was used to train the models remains a complete mystery.

Why could journalistic content be valuable for AI companies?

  • Reliable content: Journalists will be happy to know that accurate and unique content is still valuable.
  • Unique content: As well as being reliable, the information put into large language models should be unique. This is potentially disadvantageous for smaller outlets that don’t have a lot of original content. After years of being starved for funds, many smaller outlets rely on wire service stories. Strip that out and there is not much left to sell to generative AI companies.
  • Contextualized information: Successful AI will be able to search for information and also provide context to that information.
  • Images and video: Publishers with original images and video will be more valuable if their work can be authenticated and its provenance made known.
  • Historical archives: Archives are used to train models, so the publishers that hold them, mostly the larger and older ones, can sell access to them.
  • Current content with access to journalists: Current content is also valuable when it comes with journalists who can answer questions about the reporting process and how they got the news.
  • Writing quality: Writing quality still matters, but it might not matter as much later. This depends, in part, on what happens with prompt optimization. Just as news outlets began optimizing for search, so too they may start optimizing for AI prompts.
  • Non-English languages such as Spanish and Portuguese: They are more niche and difficult to substitute. If a story in Spanish or Japanese is about an event in that country, it is much more likely to have been produced for an outlet in that country. Publishers in countries where fewer people speak their language will have a harder time as there is less need for their work.

Another approach is for news organizations to create their own large language models.

 

