Version: 6.0.0

Step-by-step guide to organize and submit SPARC datasets with SODA for SPARC

Prepare and submit SPARC datasets with SODA

The typical process for submitting your SPARC dataset consists of organizing your data according to the SPARC Data Structure (SDS), adding metadata files, uploading everything to the Pennsieve data platform, where more metadata needs to be added, and finally sharing the dataset with the SPARC Curation Team, who will review it for compliance. Once the Curation Team approves it, you will have to share your dataset as an embargoed dataset, and it will become accessible to all members of the SPARC Consortium through Pennsieve. Once the embargo period is over (one year after initial upload or after publication of related manuscript(s), whichever comes first), you will have to publish your dataset, and it will then become publicly accessible through the SPARC Data Portal.

We describe below the suggested workflow for preparing and submitting your SPARC datasets with SODA. All these steps are mandatory (unless marked otherwise) if you wish to satisfy the SPARC requirements.

A. Preliminary Steps

These steps only need to be completed once.

Download and install SODA
All SPARC datasets must be uploaded to the Pennsieve data platform. Get access to Pennsieve as well as the SPARC Consortium organization on Pennsieve by filling out this form. We also suggest requesting access to the SPARC Airtable sheet through the same form, as it will come in handy when you prepare your SPARC metadata files.
Download and install the Pennsieve agent required to upload files through SODA
Watch our quick video to familiarize yourself with the user interface of SODA (note: optional but recommended)
Read about the SPARC requirements for organizing and sharing datasets to familiarize yourself with the process (note: optional but recommended)

B. Prepare Dataset on Pennsieve

The SPARC guidelines require each dataset to have specific metadata on Pennsieve. We recommend starting with this so that everything is set on Pennsieve when you are ready to upload your data and metadata files (Step D). This metadata can be easily added to Pennsieve through SODA.

Connect your Pennsieve account with SODA. This is only required the first time you use SODA.
Create a new Pennsieve dataset
Make the PI of the SPARC award the owner of the dataset
If others need to contribute to your dataset, give other members/teams access to it
Add a subtitle
Add a description
Upload a banner image
Assign a license
Add/edit tags

C. Prepare SPARC Metadata Files

The SPARC guidelines require each dataset to have specific metadata files, as described by the SPARC Data Standards (SDS). These metadata files can be conveniently prepared through SODA.

Prepare protocol on protocols.io following the instructions provided here. This is not supported through SODA since protocols.io already provides an intuitive interface for preparing the protocol.
Prepare the submission file
Prepare the dataset description file
Prepare the README file
If your study includes subjects, prepare the subjects file
If your study includes samples, prepare the samples file
If your study includes a computational model, prepare the code metadata files with help from the O2S2PARC team (email support@osparc.io)
If you are publishing a new version of a dataset, prepare the CHANGES file
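
As a rough illustration of the checklist above, the sketch below derives which metadata files apply to a dataset from a few yes/no facts about the study. The function name, its parameters, and the labels it returns are hypothetical: they simply mirror the items in the list and are not part of SODA, which guides you through these files in its own interface.

```python
def required_metadata_files(has_subjects=False, has_samples=False,
                            has_computational_model=False, is_new_version=False):
    """Return labels of the Step C metadata files that apply to a study.

    Hypothetical helper: the labels mirror the checklist, not real file names.
    """
    # These three are listed without any condition in the checklist.
    files = ["submission", "dataset description", "README"]
    # The remaining files depend on what the study contains.
    if has_subjects:
        files.append("subjects")
    if has_samples:
        files.append("samples")
    if has_computational_model:
        files.append("code metadata")
    if is_new_version:
        files.append("CHANGES")
    return files


if __name__ == "__main__":
    # A study with subjects and samples, published for the first time.
    print(required_metadata_files(has_subjects=True, has_samples=True))
```

The protocol is left out on purpose, since it is prepared on protocols.io rather than as a metadata file in the dataset.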

D. Organize Dataset According to the SPARC Data Structure

All SPARC datasets must be organized according to the structure described by the SPARC Data Standards (SDS). Briefly, all data must be organized into one of the following six high-level folders: primary, source, derivative, code, protocol, and docs. Each of these folders must have a manifest metadata file that summarizes the content of the folder. Additionally, all the metadata files created during Step C must be located at the top level of the dataset, alongside the high-level folders. SODA provides an intuitive interface for organizing your dataset according to the SDS and uploading it to Pennsieve with automatically generated manifest files.

Specify the data files and metadata files to be included in your dataset and generate the dataset directly on Pennsieve
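
If your data already sits in a local folder, a quick pre-check of its top-level folder names can show what needs to be moved before organizing it in SODA. The sketch below is only an illustration under that assumption: `check_top_level` and `SDS_FOLDERS` are hypothetical names, the check looks at folder names only, and it does not replace the organization and manifest generation that SODA performs.

```python
import tempfile
from pathlib import Path

# The six high-level folder names listed in the paragraph above.
SDS_FOLDERS = {"primary", "source", "derivative", "code", "protocol", "docs"}


def check_top_level(dataset_dir):
    """Return (unexpected, present) top-level folder names of a local dataset.

    unexpected: folders whose names are not among the six high-level folders.
    present:    high-level folders that exist in the dataset.
    """
    root = Path(dataset_dir)
    # Only directories are inspected; top-level files (for example the
    # metadata files from Step C) are ignored by this check.
    folders = {p.name for p in root.iterdir() if p.is_dir()}
    unexpected = sorted(folders - SDS_FOLDERS)
    present = sorted(folders & SDS_FOLDERS)
    return unexpected, present


if __name__ == "__main__":
    # Build a throwaway dataset folder to demonstrate the check.
    with tempfile.TemporaryDirectory() as tmp:
        for name in ("primary", "docs", "misc"):
            (Path(tmp) / name).mkdir()
        unexpected, present = check_top_level(tmp)
        print("Unexpected folders:", unexpected)
        print("High-level folders present:", present)
```

Any folder reported as unexpected would need to be moved into one of the six high-level folders when you organize the dataset.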

E. Submit Dataset to the Curation Team for Review

Once all the previous steps have been completed, it is time to share your dataset with the SPARC Data Curation Team for review.

Share with the Curation Team

F. Post-curation steps

These steps must be completed ONLY after your dataset is approved by the Curation Team.
