Reviewers

Peer review process

Our review process for published datasets already includes an initial technical quality check of each dataset submitted for publication. This check is based on well-defined formal criteria and is carried out by the FoKUS team, which is responsible for the open data repository DaRUS.

In addition, with the overlay journal we would like to implement a scientific review process that is handled by the topic editors as a typical single-blind review. The review is performed by domain experts and is intended to ensure the high scientific quality of the submission.

Reviewer guidelines

1. Metadata review

Reviewers are required to verify the dataset metadata to ensure it meets the JoDaKISS mandatory baseline, preferably utilizing EasyReview to provide comments on existing fields and suggest any missing ones. Although the DaRUS dataset review at SimTech already encompasses metadata, it may be necessary for repositories outside DaRUS to define a set of mandatory fields to be checked. Please note that these requirements are subject to change:

Persistent identifier (e.g., DOI)
Author information (e.g., name, ORCID)
Contact information (persistent contact)
Licence (present and appropriate for the dataset)
Description (instructions for reproduction and navigation)
References (e.g., related publications, websites)
Parameterization and variables
Model description (e.g., equations, references to existing models)
Software used (e.g., links to GitHub, GitLab, or software packages)
For scripts:

Programming language and version
Dependencies (e.g., links to packages)

For compiled software:

Programming language and version
Specification of the target system (compute, storage, and memory requirements), if the software does not run on standard infrastructure
Compiler and settings
Dependencies (e.g., links to packages)
GitHub/GitLab repository and issue tracker (if available)
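
The general part of the checklist above lends itself to a simple automated pre-check before the manual review. The following is a minimal sketch in Python, assuming the dataset metadata has been exported as a dictionary. The field names in MANDATORY_FIELDS and the example record are hypothetical and do not correspond to the actual JoDaKISS, DaRUS, or EasyReview schema.

```python
# Hypothetical field names mirroring the checklist above; the real
# repository schema will most likely use different keys.
MANDATORY_FIELDS = [
    "persistent_identifier",
    "authors",
    "contact",
    "licence",
    "description",
    "references",
    "parameters",
    "model_description",
    "software",
]


def is_empty(value) -> bool:
    """Treat None, blank strings, and empty containers as missing."""
    if value is None:
        return True
    if isinstance(value, str):
        return not value.strip()
    if isinstance(value, (list, tuple, set, dict)):
        return len(value) == 0
    return False


def missing_fields(metadata: dict) -> list[str]:
    """Return the mandatory fields that are absent or empty, in checklist order."""
    return [name for name in MANDATORY_FIELDS if is_empty(metadata.get(name))]


if __name__ == "__main__":
    # Placeholder record for illustration only; not a real dataset.
    record = {
        "persistent_identifier": "doi:PLACEHOLDER",
        "authors": [{"name": "A. Author"}],
        "licence": "  ",
        "description": "How to reproduce the results.",
    }
    print(missing_fields(record))
    # ['contact', 'licence', 'references', 'parameters', 'model_description', 'software']
```

Such a check only confirms that a field is filled in. Whether the licence is appropriate or the description is sufficient for reproduction remains a judgement for the reviewer.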

2. Technical file review

Reviewers are required to check the validity of the files supplied within the dataset. This may involve light code reviews, format checks, and, if applicable, re-running the provided scripts. The aim is to discourage proprietary formats and promote the use of open formats. For instance, if results are supplied in XLSX format, the reviewer should suggest CSV/TSV as an alternative. Key points to check within the data include:

Use of open formats.
Clear filenames (avoid cryptic names such as "x.npy").
Meaningful directory structure (avoid having numerous files in a single root directory).
Download convenience: If there is a large number of small, similar files, pack them together into meaningful compressed archives.
Available description of dataset/code structure (readme, description that explains the structure of the repo, what the code does, and what files do what).
For software/code: Well-formatted and readable code.
Examples of reusing the code (optional, for custom libraries/modules) or the data.
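
Some of the points above can be pre-screened automatically before reading the files in detail. The following is a minimal sketch in Python, assuming the dataset has been downloaded and unpacked into a local directory. The suffix table, the thresholds, and the accepted README names are illustrative assumptions, not an official JoDaKISS list.

```python
from pathlib import Path

# Assumed mapping of proprietary formats to open alternatives (illustrative only).
PROPRIETARY_SUFFIXES = {".xlsx": "CSV/TSV", ".xls": "CSV/TSV"}
MIN_STEM_LENGTH = 3    # assumed threshold: shorter names such as "x.npy" count as cryptic
MAX_ROOT_FILES = 20    # assumed threshold for a cluttered root directory
README_NAMES = {"readme", "readme.md", "readme.txt"}  # assumed naming conventions


def review_files(root: Path) -> list[str]:
    """Collect findings about formats, filenames, and directory structure."""
    findings = []
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        rel = path.relative_to(root).as_posix()
        alternative = PROPRIETARY_SUFFIXES.get(path.suffix.lower())
        if alternative:
            findings.append(f"{rel}: proprietary format, suggest {alternative}")
        if len(path.stem) < MIN_STEM_LENGTH:
            findings.append(f"{rel}: cryptic filename")

    root_files = [p for p in root.iterdir() if p.is_file()]
    if len(root_files) > MAX_ROOT_FILES:
        findings.append(
            f"{len(root_files)} files in the root directory, consider subdirectories or archives"
        )
    if not any(p.name.lower() in README_NAMES for p in root_files):
        findings.append("no README found in the root directory")
    return findings
```

Calling review_files(Path("path/to/dataset")) returns a list of plain-text findings that a reviewer could paste into the review comments. Code readability and the quality of the structural description still need to be assessed by hand.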

3. Content review

Reviewers are required to provide comments and a general textual review of the dataset and its scientific content. This process is similar to traditional peer review and highly depends on the scientific context. Existing guidelines may be adapted as necessary for this purpose.

Originality and Uniqueness: Does the added value described in the value section really make a substantial contribution to science? Can the software/data really fulfill this claim?
For Data:
- Source of the data: where does it come from? How was it produced?
- Reliability of the data: Which assumptions, biases, or heuristics might have affected the data? Have any checks or tests been performed on the data?
- Completeness of the data: Is the dataset complete or just a subset of a larger set? (For very large datasets, it is sometimes not feasible to process all of the data.) What information is available for the records in the dataset?
For Software:
- Does the repository provide installation instructions, and does following them actually allow you to run the code?
- Does the code run (after installation) without any significant errors?
- Does the code actually do what is claimed in the paper or in the accompanying examples (as far as this can be verified)?
- Depending on the paper (e.g., when benchmarking is important or the computations take a long time): Are the specifications of the system used for the computations given?
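
The check for whether the code runs without significant errors can be partly scripted. The following is a minimal sketch in Python, assuming the submission contains a Python script that can be started without arguments. The function name run_script and the default timeout are hypothetical, and submitted code should only be executed in an isolated environment.

```python
import subprocess
import sys
from pathlib import Path


def run_script(script: Path, timeout_s: float = 600.0) -> dict:
    """Run a Python script in a fresh interpreter and report how it exited."""
    try:
        result = subprocess.run(
            [sys.executable, str(script)],
            capture_output=True,
            text=True,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        # The script exceeded the (assumed) time budget.
        return {"ok": False, "returncode": None, "stderr_tail": "timed out"}
    return {
        "ok": result.returncode == 0,
        "returncode": result.returncode,
        # Keep only the end of stderr, which usually holds the traceback.
        "stderr_tail": result.stderr[-2000:],
    }
```

A zero exit code only shows that the script terminated cleanly. Whether the results match the claims in the paper still has to be judged by the reviewer.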