Research

By hosting the results of our efforts and our current conclusions, we provide building blocks for future research around our central question of how news credibility is assessed.

Below you will find reports, papers, and datasets sponsored by the Coalition and made publicly available.

Questions about ongoing research and data under development can be directed to hello [at] credibilitycoalition [dot] org.

Reports

Bridging the Divide Between Local News and Online Platforms

The Credibility Coalition and the Nieman Foundation for Journalism at Harvard. Bridging the Divide Between Local News and Online Platforms. The Schema-Online Local News discussion series, 2021.

Download the report

As news increasingly moves online, search rankings, online traffic metrics, and website referrals matter a great deal to the bottom lines of many local news operations. Getting credit for their reporting matters to these local outlets for more than one reason: ranking higher in search results can help justify the resources they invest in newsgathering and reinforce their image as an authoritative, trusted source for readers.

When platforms fail to surface their content, it becomes yet another setback for local news outlets already struggling with cutbacks, shrinking subscription bases, and increasing pressure to produce more content with ever-dwindling resources. A vibrant local news landscape plays a vital role in keeping the public informed on issues that directly affect their lives. If local outlets can’t reach readers online, the whole news ecosystem suffers.

In early 2021, the Credibility Coalition, in partnership with the Nieman Foundation for Journalism at Harvard, convened a series of conversations aimed both at establishing a common vocabulary for greater understanding between local news outlets and online platforms and at examining the way local news is surfaced online.

The project, which received financial support from the Google News Initiative, is part of a wider effort to empower journalists to better understand — and platforms to better execute — the promotion of quality local journalism on news delivery platforms.

The Schema-Online Local News discussion group consisted of journalists from a variety of outlets (with majority representation from local news operations) and representatives of several major tech platforms. Over the course of three meetings, the group examined ways that journalists might better adapt local content to communicate effectively with platform technologies. At the same time, they looked at ways platforms might improve their systems for identifying local news and tweak their algorithms to place local stories more prominently in front of readers.

Bridging the Divide Between Local News and Online Platforms, by the Credibility Coalition and the Nieman Foundation for Journalism at Harvard, captures some of the insights, ideas, and suggestions that came out of these discussions.

An Introduction to Schemas for Journalists

The Credibility Coalition, the Nieman Foundation for Journalism at Harvard, and Samantha Sunne. An Introduction to Schemas for Journalists. The Schema-Online Local News discussion series, 2021.

Download the report

This document is meant to be used as a curriculum that empowers journalists to better understand — and platforms to better execute — the discovery and ranking of quality local journalism on news delivery platforms.

Journalists from local news outlets and representatives from technology platforms participated in one introductory training module and three subsequent discussion sessions. Participants worked together in small groups to brainstorm ideas, processes, or mechanisms they thought platforms should adopt to better surface local news. At the end of the discussion series, these ideas were refined and submitted to the platforms for consideration.

In order to facilitate these conversations, both journalists and platform representatives needed to be able to understand and use a common vocabulary around news algorithms. This vocabulary, and in particular the way it represents how online information is organized and processed, is known as a “schema.” Schemas are unfamiliar to many journalists, but they nonetheless have a huge impact on their work and how it is distributed.
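
As a concrete illustration of what such markup can look like, the short sketch below (written in Python purely for illustration) builds Schema.org NewsArticle markup as JSON-LD; the headline, author, publisher, and date are invented placeholders rather than details of any real article.

    import json

    # Invented placeholder values; a real page would use the article's own metadata.
    article = {
        "@context": "https://schema.org",
        "@type": "NewsArticle",
        "headline": "City Council Approves New Transit Plan",
        "datePublished": "2021-05-01",
        "author": {"@type": "Person", "name": "Jane Reporter"},
        "publisher": {"@type": "NewsMediaOrganization", "name": "Example Local Gazette"},
    }

    # Publishers typically embed this JSON-LD in the page inside a
    # <script type="application/ld+json"> tag so that platforms can read it.
    print(json.dumps(article, indent=2))

Structured data of this kind is one of the ways a page tells search and news products that it is a news article, who wrote and published it, and when.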

What This Curriculum Covers

This document provides a hands-on introduction to practical applications of schemas for online news, including:

  • Structured data
  • Rich snippets
  • Schemas for online news articles
  • Applying schema markup and Schema.org markup
  • New initiatives to mark up news content
  • Signals, indicators, and ranking factors
  • Labels
  • Additional resources for further exploring how local news is distributed online

An Introduction to Schemas for Journalists was developed as part of the Schema-Online Local News discussion series, convened over several months in 2021 by the Credibility Coalition and the Nieman Foundation for Journalism at Harvard, with financial support from the Google News Initiative. The purpose of the discussion series was to establish a common vocabulary for greater understanding between local news outlets and online platforms by convening regular, structured conversations between journalists and representatives of technology platforms.

Acknowledgements

We would like to acknowledge the assistance and insights of Mark Stencel, Joel Luther, Beck Levy, Scott Yates, Momen Bhuiyan, and Kate Harloe.

Papers

Amy Zhang, Aditya Ranganathan, Sarah Emlen Metz, Scott Appling, Connie Moon Sehat, Norman Gilmore, Nick B. Adams, Emmanuel Vincent, Jennifer 8. Lee, Martin Robbins, Ed Bice, Sandro Hawke, and David Karger. A Structured Response to Misinformation: Defining and Annotating Credibility Indicators in News Articles. The Web Conference, April 2018.

The proliferation of misinformation in online news and its amplification by platforms are a growing concern, leading to numerous efforts to improve the detection of and response to misinformation. Given the variety of approaches, collective agreement on the indicators that signify credible content could allow for greater collaboration and data-sharing across initiatives. In this paper, we present an initial set of indicators for article credibility defined by a diverse coalition of experts. These indicators originate from both within an article’s text as well as from external sources or article metadata. As a proof-of-concept, we present a dataset of 40 articles of varying credibility annotated with our indicators by 6 trained annotators using specialized platforms. We discuss future steps including expanding annotation, broadening the set of indicators, and considering their use by platforms and the public, towards the development of interoperable standards for content credibility.
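
To make the idea of structured credibility annotations more concrete, the sketch below shows one way a single annotated article might be represented; the field names, indicator labels, and values are illustrative assumptions, not the paper’s published indicator set or data format.

    # Hypothetical record for one article reviewed by one trained annotator.
    # Field names and values are illustrative only.
    annotation = {
        "article_url": "https://example.com/news/story",
        "annotator_id": "annotator_03",
        "content_indicators": {   # signals drawn from the article text itself
            "title_representativeness": "representative",
            "clickbait_title": False,
        },
        "context_indicators": {   # signals drawn from external sources or metadata
            "number_of_sources_cited": 4,
            "originality": "original reporting",
        },
    }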

Download the paper (PDF).

Dataset

View data.

Acknowledgements

This paper would not be possible without the valuable support and feedback of members of the Credibility Coalition, who have joined weekly calls and daily Slack chats to generously contribute their time, effort, and thinking to this project. In addition to the authors of this paper, this includes Nate Angell, Robyn Caplan, Renee DiResta, James P. Fairbanks, Dan Froomkin, Dhruv Ghulati, Vinny Green, Natalie Gyenes, Cameron Hickey, Stuart Myles, Aviv Ovadya, Karim Ratib, Evan Sandhaus, Heather Staines, Robert Stojnic, Sara-Jayne Terp, Jon Udell, Rick Weiss, and Dan Whaley.

We are also grateful for feedback and support from the attendees of our in-person meetings, including Jordan Adler, Erica Anderson, Dan Brickley, Mike Caulfield, Miles Campbell, Jeff Chang, Jason Chuang, Nic Dias, Mark Graham, Eric Kansa, Burt Herman, Mandy Jenkins, Olivia Ma, Sunil Paul, Aubrie Johnson, Sana Saleem, Wafaa Heikal, Tessa Lyons-Laing, Patricia Martin, Alice Marwick, Andrew Mullaney, Merrilee Proffitt, Zara Rahman, Paul Resnick, Prashant Prakashbhai Shiralkar, Joel Schlosser, Ivan Sigal, Dario Taraborelli, Tom Trewinnard, Paul Walsh, Rebecca Weiss, and Cong Yu. A special thanks to Sally Lehrmann and Subramaniam Vincent from the Trust Project for shared thinking and support.

We owe thanks to those who have housed conversations and workshops and offered critical feedback, including First Draft and the Shorenstein Center on Media, Politics and Public Policy at Harvard University; the Brown Institute for Media Innovation at Columbia University; and Northwestern University. Thanks to conferences and events that have hosted workshops or presentations with us, including W3C TPAC, the Mozilla Festival, MisinfoCon, Newsgeist, the Knight Commission on Trust, Media and Democracy, and the Computation and Journalism Symposium.

Md Momen Bhuiyan, Amy Zhang, Connie Moon Sehat, and Tanushree Mitra. Investigating ‘Who’ in the Crowdsourcing of News Credibility. Computation+Journalism Symposium, March 2020.

Concerns about the spread of misinformation online via news articles have led to the development of many tools and processes involving human annotation of their credibility. However, much is still unknown about how different people judge news credibility, or about the quality and reliability of credibility ratings from populations of varying expertise. In this work, we consider credibility ratings from two “crowd” populations: 1) students within journalism or media programs, and 2) crowd workers on Upwork, and compare them with the ratings of two sets of experts, journalists and climate scientists, on a set of 50 climate-science articles. We find that both crowd groups’ credibility ratings correlate more closely with those of the journalism experts than with those of the science experts, with 10 to 15 raters needed to achieve convergence. We also find that raters’ gender and political leaning affect their ratings. Across article genres (news, opinion, analysis) and source leanings (left, center, right), crowd ratings were most similar to the experts’ for opinion articles and for strongly left-leaning sources.
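
As a rough sketch of the kind of convergence analysis described above (not the authors’ actual code; the rating scale, data shapes, and values are invented stand-ins), one could average the ratings of progressively larger random samples of crowd raters and correlate them with the mean expert ratings.

    import numpy as np
    from scipy.stats import pearsonr

    rng = np.random.default_rng(0)

    # Toy stand-in data: rows are raters, columns are the 50 articles.
    # Real ratings would come from the study's annotation platform.
    crowd = rng.integers(1, 8, size=(30, 50)).astype(float)   # assumed 1-7 credibility scale
    experts = rng.integers(1, 8, size=(5, 50)).mean(axis=0)   # mean expert rating per article

    # Average k randomly chosen crowd raters and correlate with the expert means;
    # convergence is the point where adding raters stops improving the correlation.
    for k in range(1, 21):
        chosen = rng.choice(crowd.shape[0], size=k, replace=False)
        r, _ = pearsonr(crowd[chosen].mean(axis=0), experts)
        print(f"{k:2d} raters: r = {r:.2f}")

With random placeholder data the correlations are meaningless; the point is only to show the shape of the analysis.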

Download the paper (PDF).

Dataset

View data.

Acknowledgements

This paper would not be possible without the valuable support of the Credibility Coalition, with special thanks to Caio Almeida, An Xiao Mina, Jennifer 8. Lee, Rick Weiss, Kara Laney, and especially Dwight Knell. Bhuiyan and Mitra were partly supported through National Science Foundation grant # IIS-1755547.

Md Momen Bhuiyan, Amy X. Zhang, Connie Moon Sehat, and Tanushree Mitra. Investigating Differences in Crowdsourced News Credibility Assessment: Raters, Tasks, and Expert Criteria. Proceedings of the ACM on Human-Computer Interaction, Article 93, October 2020.

Misinformation about critical issues such as climate change and vaccine safety is oftentimes amplified on online social and search platforms. The crowdsourcing of content credibility assessment by laypeople has been proposed as one strategy to combat misinformation by attempting to replicate the assessments of experts at scale. In this work, we investigate news credibility assessments by crowds versus experts to understand when and how ratings between them differ. We gather a dataset of over 4,000 credibility assessments taken from 2 crowd groups—journalism students and Upwork workers—as well as 2 expert groups—journalists and scientists—on a varied set of 50 news articles related to climate science, a topic with widespread disconnect between public opinion and expert consensus. Examining the ratings, we find differences in performance due to the makeup of the crowd, such as rater demographics and political leaning, as well as the scope of the tasks that the crowd is assigned to rate, such as the genre of the article and partisanship of the publication. Finally, we find differences between expert assessments due to differing expert criteria that journalism versus science experts use—differences that may contribute to crowd discrepancies, but that also suggest a way to reduce the gap by designing crowd tasks tailored to specific expert criteria. From these findings, we outline future research directions to better design crowd processes that are tailored to specific crowds and types of content.

Download the paper (PDF).

Dataset

View data.

Acknowledgements

This paper would not be possible without the valuable support of the Credibility Coalition, with special thanks to Caio Almeida, An Xiao Mina, Jennifer 8. Lee, Rick Weiss, Kara Laney, and especially Dwight Knell. Bhuiyan and Mitra were partly supported through National Science Foundation grant # IIS-1755547.