Where and how can I publish my dataset?
There are thousands of data repositories, so it’s important to find the right one for your dataset. Please contact a Research Data Service team member if you’d like help selecting a repository or depositing your data.
What to look for in a repository
- Does it follow the FAIR principles?
- (F)indable: Data are assigned a globally unique and persistent identifier, described with rich metadata, and registered or indexed in a searchable resource
- (A)ccessible: (Meta)data are retrievable by their identifier using a standardized communications protocol, with metadata available even if data are no longer available
- (I)nteroperable: Metadata use a formal, accessible, shared, and broadly applicable language/vocabulary for knowledge representation.
- (R)eusable: (Meta)data are released with a clear and accessible data usage license, are associated with detailed provenance, and meet domain-relevant community standards
- Does it adhere to your funder’s requirements?
- For example, the NIH has issued a data management and sharing policy along with guidance on selecting a repository.
- Does it allow you to provide the metadata that will help your data be easily found?
- There are various metadata standards tailored to particular research fields, so you’ll want to be sure the repository can meet your metadata needs.
- Does it have the functionality that your dataset needs?
- Can it handle the size of your dataset?
- Can it handle the structure that you want for your dataset?
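The checklist above can be turned into a quick, repeatable comparison across candidate repositories. The following is a minimal sketch only: `RepositoryProfile`, `unmet_criteria`, the field names, and the example values are all hypothetical and do not describe any real repository or tool. You would fill in the answers yourself from each repository's documentation.

```python
from dataclasses import dataclass


@dataclass
class RepositoryProfile:
    """Hypothetical record of your own answers to the checklist questions."""
    name: str
    assigns_persistent_id: bool      # Findable: e.g. a DOI per dataset
    metadata_outlives_data: bool     # Accessible: metadata stay available
    uses_shared_vocabulary: bool     # Interoperable: standard metadata vocabulary
    has_clear_license: bool          # Reusable: explicit data usage license
    meets_funder_requirements: bool
    supports_needed_metadata: bool
    supports_needed_structure: bool
    max_dataset_size_gb: float       # largest dataset the repository accepts


def unmet_criteria(repo: RepositoryProfile, dataset_size_gb: float) -> list[str]:
    """Return the checklist items this repository does not satisfy."""
    checks = [
        ("Findable: assigns a persistent identifier", repo.assigns_persistent_id),
        ("Accessible: metadata remain available", repo.metadata_outlives_data),
        ("Interoperable: uses a shared vocabulary", repo.uses_shared_vocabulary),
        ("Reusable: has a clear usage license", repo.has_clear_license),
        ("Funder: meets funder requirements", repo.meets_funder_requirements),
        ("Metadata: supports the metadata you need", repo.supports_needed_metadata),
        ("Structure: supports your dataset structure", repo.supports_needed_structure),
        ("Size: can hold the dataset", dataset_size_gb <= repo.max_dataset_size_gb),
    ]
    return [label for label, passed in checks if not passed]


# Illustrative values only; not taken from any real repository.
example_repo = RepositoryProfile(
    name="Example Repository",
    assigns_persistent_id=True,
    metadata_outlives_data=True,
    uses_shared_vocabulary=True,
    has_clear_license=True,
    meets_funder_requirements=True,
    supports_needed_metadata=True,
    supports_needed_structure=True,
    max_dataset_size_gb=50.0,
)

for gap in unmet_criteria(example_repo, dataset_size_gb=120.0):
    print("Unmet:", gap)
```

A repository with an empty list of unmet criteria is a reasonable candidate; a non-empty list points to the questions worth raising with a Research Data Service team member.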
Registries/Lists of Data Repositories
- (index of 2,000 data repositories across all disciplines)
Repository options hosted by Princeton
Princeton Research Data Repository
Princeton has an institutional repository, called Princeton Data Commons, for archiving and publicly disseminating digital research data generated by members of the Princeton community.
More information about the Princeton Research Data Repository is on a dedicated page.
Other data publishing options at Princeton
- (institutional subscription)
General, cross-disciplinary data repositories that might be right for your data
Zenodo welcomes research outputs from all fields of research. It accepts any file format as well as both positive and negative results, and promotes peer-reviewed, openly accessible research.
Dryad hosts research data underlying scientific and medical publications. Most data in the repository are associated with peer-reviewed journal articles, but data associated with dissertations and books are also accepted.
OSF is a free and open source project management tool that provides support through the entire project lifecycle, including pre-registration, collaboration, and storage and publication of data.
Some examples of funder-specific repositories
The National Institute of Mental Health Data Archive (NDA) makes available human subjects data collected from hundreds of research projects across many scientific domains. In addition to NIMH, other institutes use NDA as well, including NIAAA.
A brand new general data repository for all NIH-funded researchers, developed in partnership with Figshare
NCEI is a consolidation of the former National Oceanographic Data Center (NODC), National Climatic Data Center (NCDC), and National Geophysical Data Center (NGDC).
Some examples of discipline-specific repositories
Environmental Sciences
Data Observation Network for Earth (DataONE) supports environmental science through a distributed framework and sustainable cyberinfrastructure that provide open, persistent, robust, and secure access to well-described and easily discovered Earth observational data.
Life Sciences
VertNet is an NSF-funded collaborative project that makes biodiversity data free and available on the web. VertNet is a tool designed to help people discover, capture, and publish biodiversity data.
A free and open platform for sharing MRI, MEG, EEG, iEEG, and ECoG data
Social & Behavioral Sciences
ICPSR maintains a data archive of more than 250,000 files of research in the social and behavioral sciences. It hosts 21 specialized collections of data in education, aging, criminal justice, substance abuse, terrorism, and other fields.
Computer Science
Code Ocean is a research collaboration platform that covers the entire lifecycle from the beginning of a project through publication. With direct access to cloud computing and reproducibility best practices built in, no extra software or hardware is needed.
Humanities
Commons Open Repository Exchange is a repository that allows users to preserve their research and increase its reach by sharing it across disciplinary, institutional, and geographic boundaries.