The V0 Draft is now closed to public comment so that the Steering Committee can integrate comments and gather input from the Diverse Voices process. V1 will be released for public comment in early 2020. To discuss topics in the V0 Draft and stay apprised of updates, please visit the ABOUT ML public forum, which is always open.

About the Project

As machine learning (ML) technologies become increasingly pervasive in high-stakes contexts like criminal justice and banking, transparency has emerged as a top priority for policymakers, technical communities, and populations affected by their use. One way to increase transparency is via documentation. Machine learning systems (models and datasets) need to come with thorough documentation, including but not limited to: how they were designed and for what purposes; where their data came from and why that data was chosen; how they were trained, tested, and corrected; and what purposes they are not suitable for. Creating that documentation requires new kinds of processes among AI research teams and tech companies.

ABOUT ML (Annotation and Benchmarking on Understanding and Transparency of Machine Learning Lifecycles) is a multi-year, multi-stakeholder initiative led by PAI that aims to bring together a diverse range of perspectives to develop, test, and promulgate documentation best practices for transparency in machine learning by synthesizing learnings from academic research and existing practices. This is an ongoing, iterative process designed to co-evolve with the rapidly advancing research and practice of AI technology. Read more about the motivation for this project here.


See real-world examples of deployed ML documentation, which can focus on datasets, models, or ML systems. Provide your feedback on these examples as part of ABOUT ML’s public comment process.

OpenAI, GPT-2


Share More Examples


To help guide ABOUT ML, let the Steering Committee know what you think of these examples. Which questions are useful? Which questions are these examples missing? Is the format of any of these examples particularly effective? Submit a comment below.

About this Version

The Version 0 draft of this document is based on existing research. It provides an overview of where reflection about transparency needs to be incorporated into the design process, as well as deep dives into documentation proposals for datasets and models, currently the best-researched areas of documentation for transparency. Future drafts will incorporate feedback from the public and the Steering Committee and may include deep dives on other enablers of ML transparency, such as team and institutional settings, ML system-level considerations, test suites, and feedback loops. See Process and Timeline for more details.

Process and Timeline

PAI is launching the ABOUT ML iterative, multi-stakeholder process with this initial v0 draft in order to provide a starting place for community response. The goal is to update this draft in future releases and move toward best practices by going through the following phases:

  • Understand the latest research
  • Combine research theory and results of current practice into testable pilots
  • Run pilot tests with PAI Partners and organizations
  • Collect data from pilot tests for transparency practices
  • Iterate on pilots with the latest research and practice
  • When a sufficient body of evidence supports a certain practice, elevate it to a best practice
  • Promulgate effective transparency practices

PAI recognizes that this effort can only succeed with input from as broad a set of stakeholders as possible, and will seek input not only from our 90+ Partners but also from academia, civil society organizations, companies designing and deploying ML technology, and the general public. We welcome your participation.

The process is modeled on the iterative, ongoing processes used to design internet standards (such as those of the W3C, IETF, and WHATWG) and includes a public forum for discussion and a place to submit proposed changes. We welcome you to join the public discussion and to submit proposed changes as many times as you’d like.

Public comments will be collected and evaluated in batches by the ABOUT ML Steering Committee, a group of roughly 30 experts, researchers, and practitioners recruited from a diverse set of PAI Partner organizations. The Steering Committee will guide the process of updating ABOUT ML drafts based on the public comments submitted and on new developments in research and practice. It will vote to approve new releases by “rough consensus,” a method commonly used by other multi-stakeholder working groups, and will convene one to three times a year, depending on the volume of proposed changes and the pace of change in research and practice.

To prevent contributors from commenting on an out-of-date draft, the document will be closed to proposed changes for approximately two to four weeks prior to each Steering Committee meeting. Each closure will be announced here with at least two weeks’ notice. The first Steering Committee meeting will be on September 28th, 2019, so the v0 draft will be closed for comment from September 14th until the end of the update process.

To ensure that diverse perspectives, especially those from communities historically excluded from technology decision-making, contribute to any ABOUT ML recommendations, PAI is engaging with our Partner the Tech Policy Lab at the University of Washington to conduct Diverse Voices panels. This method was designed to gather feedback from stakeholders whose perspectives might not otherwise be consulted and to ensure that those perspectives are reflected in the released text. For any ABOUT ML release that goes through the Diverse Voices process, the panel feedback will therefore be the last edits incorporated before the release.

Each round of Diverse Voices panels will pause public comment on the document for several months, although the public forum will remain open for discussion during that time. Public comment on the document itself will reopen with the release of the next draft. The first round of Diverse Voices panels for ABOUT ML will be held between October and December 2019; draft v0 will therefore be closed to public comment from September 14th, 2019 until the release of draft v1 reopens comment in early 2020. Additional panels will be convened approximately once a year, especially for milestone releases as the ABOUT ML project progresses through the phases outlined above.

ABOUT ML Process

ABOUT ML Timeline

Steering Committee

The ABOUT ML Steering Committee comprises around 30 experts, researchers, and practitioners recruited from a diverse set of PAI Partner organizations. The Steering Committee guides the process of updating ABOUT ML drafts based on the public comments submitted and on new developments in research and practice. It votes to approve new releases by “rough consensus,” a method commonly used by other multi-stakeholder working groups, and convenes one to three times a year, depending on the volume of proposed changes and the pace of change in research and practice.

To include as many diverse perspectives as possible, PAI limits participation in the Steering Committee to two people per organization, with one vote per organization. PAI will periodically reopen applications to the Steering Committee and recruit more members as needed.

Current Steering Committee members:

Norberto Andrade

Privacy and Public Policy Manager (Facebook)

Amir Banifatemi

General Manager, Innovation & Growth (XPRIZE)

Rachel Bellamy

Principal Researcher & Manager, Human-AI Collaboration (IBM)

Umang Bhatt

Student Fellow (Leverhulme Centre for the Future of Intelligence)

Rumman Chowdhury

Managing Director (Accenture AI)

Jacomo Corbo

Chief Scientist (QuantumBlack)

Daniel First

Associate / Data Scientist (McKinsey / QuantumBlack)

Ben Garfinkel

Research Fellow (Future of Humanity Institute)

Jeremy Gillula

Tech Projects Director 

Brenda Leong

Senior Counsel and Director of Strategy (Future of Privacy Forum)

Tyler Liechty

Data Engineer (DeepMind)

Momin M. Malik

Data Scientist (Berkman Klein Center)

Lassana Magassa

Graduate Research Associate (Tech Policy Lab/University of Washington)

Meg Mitchell

Researcher, ML Fairness, Ethical AI (Google)

Amanda Navarro

Managing Director (PolicyLink)

Deborah Raji

Tech Fellow (AI Now)

Nicole Rigillo

(Berggruen Institute)

Andrew Selbst

Postdoctoral Scholar (Data & Society)

Ramya Sethuraman

Product Manager (Facebook)

Reshama Shaikh

Board Member, NY Co-Organizer 

Moninder Singh

Research Staff Member 

Amber Sinha

Senior Programme Manager (Centre for Internet and Society)

Michael Spranger

Senior Research Scientist, AI Collaboration Office (Sony)

Andrew Strait

Researcher, Ethics and Society Team (DeepMind)

Gabriel Straub

Head of Data Science and Architecture (BBC)

Michael Veale

Assistant Professor (UCL)

Hanna Wallach

Senior Principal Researcher (Microsoft)

Adrian Weller

Senior Research Fellow (Leverhulme Centre for the Future of Intelligence)

Abigail Wen

Managing Counsel, Office of the CTO (Intel)

Alexander Wong

Co-Director, Vision and Image Processing (VIP) Research Group (University of Waterloo)

Jennifer Wortman Vaughan

Principal Researcher (Microsoft)

Andrew Zaldivar

Senior Developer Advocate (Google)