Our review process for published datasets already includes an initial technical quality check of the dataset submitted for publication. This check is based on well-defined formal criteria and is performed by the team responsible for the open data repository DaRUS at the University of Stuttgart.
A second important step will be an automated review of data and software. We have developed a review tool to automate certain steps of the review and to help us organize the process. With this step, we will guarantee the OS and platform independence of software and data containers.
Finally, with the overlay journal we want to implement a scientific review process that is handled by the topic editors as a typical single-blind review. Open peer review, where the identity of the reviewers is known, will be used if the reviewers agree. The review is conducted by domain experts and aims to ensure the high scientific quality of the submission. Among other things, it includes a critical discussion of the potential for reuse of the data, the documentation of datasets and metadata, and the scientific originality of the submission.
Reviewers are required to verify the metadata of the dataset to ensure it conforms to the JoDaKISS mandatory baseline, preferably using EasyReview to comment on existing fields and suggest missing fields. Although the DaRUS dataset review at SimTech already encompasses metadata, it may be necessary for repositories outside of DaRUS to define a set of mandatory fields to be checked. Please note that these requirements may change:
Reviewers are required to check the validity of the files provided with the dataset. This may include light code review, format checks, and, if necessary, re-running the provided scripts. The goal is to discourage proprietary formats and encourage the use of open formats. For example, if results are provided in XLSX format, the reviewer should suggest CSV/TSV as an alternative. Key issues to review within the data include:
Reviewers are asked to provide comments and a general textual review of the dataset and its scientific content. This process is similar to traditional peer review and is highly dependent on the scientific context. Existing guidelines may be adapted as necessary.
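As a rough illustration of the file format check described above, the following Python sketch flags files in proprietary formats and proposes an open alternative. Only the XLSX to CSV/TSV suggestion comes from the guidelines; the other entries in the mapping and the function name `suggest_open_formats` are assumptions made for this example and are not part of EasyReview or the JoDaKISS requirements.

```python
from pathlib import PurePath

# Hypothetical mapping from proprietary file extensions to open alternatives.
# Only ".xlsx" -> "CSV/TSV" is taken from the guidelines; the other entries
# are illustrative assumptions.
OPEN_ALTERNATIVES = {
    ".xlsx": "CSV/TSV",
    ".xls": "CSV/TSV",
    ".docx": "Markdown or plain text",
}


def suggest_open_formats(filenames):
    """Return a {filename: suggested open format} dict for flagged files."""
    suggestions = {}
    for name in filenames:
        # Compare extensions case-insensitively, so "DATA.XLS" is flagged too.
        ext = PurePath(name).suffix.lower()
        if ext in OPEN_ALTERNATIVES:
            suggestions[name] = OPEN_ALTERNATIVES[ext]
    return suggestions


if __name__ == "__main__":
    files = ["results.xlsx", "README.md", "run_simulation.py"]
    for name, alternative in suggest_open_formats(files).items():
        print(f"{name}: consider providing {alternative} instead")
```

A check like this only inspects file names, so it can support but not replace the reviewer's own judgment about whether a format is appropriate for the data at hand.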