The Dataset Nutrition Label

Driving healthy data use through increased transparency

An image of DNP's dataset nutrition label maker

A Nutrition Label for Datasets

Similar to food nutrition labels, Dataset Nutrition Labels provide transparency into the contents of a dataset to drive “healthier” use.

The Dataset Nutrition Label is a free, public-facing, voluntarily disclosed dataset standard. We believe that increased visibility into dataset provenance, quality, and intended use drives better data practices and helps mitigate bias in AI systems.

A screenshot showing how to use the dataset nutrition label maker

How It Works

3 easy steps to get started

You’ll need a few things to get started: a user account, information about a dataset, and team member emails to enable collaboration.

A screenshot showing how to create a dataset nutrition label

1. Create your first label

Follow this link to set up your account and create your first Label.

As you follow prompts in the Label Maker, you can always save your progress and log back in to continue.

A screenshot showing how to invite collaborators to a label

2. Add collaborators

We have found that teams often build Labels together.

You can share Label drafts with collaborators, who will be able to edit the Label you have created.

A screenshot showing how to submit a label for review

3. Click “Submit” when you’re done

Submit your completed Label for review.

When you have finished the Label Maker process, you can submit the Label to the Data Nutrition Project team for review. While your Label is under review, you will receive a watermarked “Draft” of your Label.

An image of a nutrition label

The Story

A “nutrition label” for datasets.
The Data Nutrition Project aims to create a standard label for interrogating datasets.

Our belief is that transparency into dataset health can lead to better decisions, which will in turn lead to better AI.

Founded in 2018 through the Harvard-MIT Assembly Fellowship, the Data Nutrition Project takes inspiration from nutritional labels on food, aiming to build labels that highlight the key ingredients in a dataset, such as metadata and demographic representation, as well as unique or anomalous features regarding distributions, missing data, and comparisons to other “ground truth” datasets.

Now in its third-generation design, the current Dataset Nutrition Label provides information about a dataset, including its intended use and other known uses; the process of cleaning, managing, and curating the data; ethical and/or technical reviews; the inclusion of subpopulations in the dataset; and a series of potential risks or limitations in the dataset.
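
For teams gathering this information before opening the Label Maker, the categories above can be sketched as a simple checklist structure. This is only an illustration: the class name, field names, and helper function below are hypothetical and do not reflect the Data Nutrition Project's actual schema or tooling.

```python
from dataclasses import dataclass, field, fields
from typing import List


@dataclass
class LabelSketch:
    """Hypothetical container mirroring the categories a Label covers.

    Field names are illustrative assumptions, not an official schema.
    """
    intended_use: str = ""
    other_known_uses: List[str] = field(default_factory=list)
    cleaning_and_curation: str = ""
    reviews: List[str] = field(default_factory=list)  # ethical and/or technical
    subpopulations: List[str] = field(default_factory=list)
    risks_and_limitations: List[str] = field(default_factory=list)


def missing_sections(label: LabelSketch) -> List[str]:
    """Return the names of categories that are still empty."""
    return [f.name for f in fields(label) if not getattr(label, f.name)]


# Placeholder values only; a real draft would describe an actual dataset.
draft = LabelSketch(
    intended_use="Example intended use",
    subpopulations=["Example subpopulation"],
)
print(missing_sections(draft))
```

A checklist like this only helps collect notes ahead of time; the Label itself is still created and submitted through the web-based Label Maker.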

An image of a dataset nutrition label

Third Generation Dataset Nutrition Label (2022)

Take a Closer Look

See examples of Dataset Nutrition Labels

Data Nutrition Project's logo

Measuring Massive Multitask Language Understanding

This is a massive multitask test consisting of multiple-choice questions from various branches of knowledge. The test spans subjects in the humanities…

View label

ASL Citizen: A Community-Sourced Dataset for Advancing Isolated Sign Language Recognition

ASL Citizen is the first crowdsourced Isolated Sign Language Recognition (ISLR) dataset, collected with consent and containing 83,399 videos…

View label

2024 ISIC Challenge SLICE-3D Dataset

The 2024 ISIC Challenge SLICE-3D (“Skin Lesion Image Crops Extracted from 3D Total Body Photography”) dataset was created for the 2024 ISIC…

View label

Replication Data for: Capturing Bonding, Bridging, and Linking Social Capital through Publicly…

A growing body of research has illuminated the powerful role played by social capital in influencing disaster and resilience outcomes. Popular…

View label

Don’t know where to begin?

Check out the tutorial above to see a step-by-step explanation of how to start your Label using our web-based Label Maker tool. If you have more questions, feel free to reach out!