PDBe-KB Guidelines for Collaborating

Introduction

Protein Data Bank in Europe - Knowledge Base (PDBe-KB) is a community-driven resource managed by the PDBe team, collating functional annotations and predictions for structure data in the PDB archive. PDBe-KB is a collaborative effort between PDBe and a diverse group of bioinformatics resources and research teams. This document outlines the framework for contributing to the PDBe-KB collaborative resource and the services PDBe-KB agrees to provide.

Terms of collaboration

PDBe-KB

  1. The infrastructure for data deposition and retrieval will be maintained by PDBe-KB
  2. Data exchange format schema(s) will be maintained by PDBe-KB
  3. The schema will evolve in consultation with collaborating partners
  4. PDBe-KB will provide programmatic access to expose contributed annotations
  5. PDBe-KB will link back to the original collaborating partner's resource, attributing credit for their contributions
  6. PDBe-KB will maintain an open-access library of reusable data visualisation components

Collaborating partners

  1. The data contributed to PDBe-KB by collaborating partners will be free from any restrictions on distribution and re-use
  2. The partners are responsible for the quality of the data they contribute
  3. Protocols for data generation must be published in peer-reviewed publications
  4. In the case of predicted/calculated annotations, the contributing partner commits to depositing data at least once a year, e.g. to provide annotations for newer PDB entries and/or to update existing annotations when the underlying algorithms change significantly.
    1. Manually curated annotations may be exempt from this condition on a case-by-case basis
    2. Depositors can change/update/delete their entries at any time

General Data Protection Regulation (GDPR) notice

Collaborating partners must agree to the PDBe-KB GDPR notice before registering a data deposition account. The privacy notice is available here

Technical appendix (data collection)

Most data, and in particular residue-level annotations, will be deposited using the PDBe-KB deposition system. The deposition system consists of the following components:

PDBe-KB JSON specification

A common JSON schema is defined to capture residue-level annotations, and is hosted at https://gitlab.ebi.ac.uk/pdbe-kb/funpdbe/funpdbe-schema. Contributing partners have to comply with the JSON specification in order to deposit data via the FunPDBe deposition system.
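
As a rough illustration of what a residue-level annotation record can look like, the sketch below builds one in Python and serialises it to JSON. The key names (`data_resource`, `pdb_id`, `chains`, `residues`, `site_data`, `sites`), the resource name and the PDB identifier are assumptions chosen for illustration. The schema repository above remains the authoritative definition of the required fields and their types.

```python
import json


def build_annotation(resource, pdb_id, chain_label, residues, site_label):
    """Build a minimal, illustrative residue-level annotation record.

    All key names are assumptions for illustration; check them against
    the published schema before depositing anything.
    """
    return {
        "data_resource": resource,
        "pdb_id": pdb_id.lower(),
        "chains": [
            {
                "chain_label": chain_label,
                "residues": [
                    {
                        "residue_number": number,
                        "residue_name": name,
                        # Each residue points at the site it belongs to.
                        "site_data": [{"site_id_ref": 1}],
                    }
                    for number, name in residues
                ],
            }
        ],
        "sites": [{"site_id": 1, "label": site_label}],
    }


record = build_annotation(
    resource="example_resource",  # hypothetical resource name
    pdb_id="1ABC",                # placeholder PDB identifier
    chain_label="A",
    residues=[(45, "HIS"), (102, "ASP")],
    site_label="predicted binding site",
)
print(json.dumps(record, indent=2))
```

Keeping the annotations grouped by chain and then by residue means each residue-level statement carries its own structural context, which is the level at which the text above says annotations are captured.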

Validator tool

A validator tool (Python 3.x) is provided for contributors to parse input JSON files, validate them against the FunPDBe schema, and perform all the data checks the local pipeline would perform. The tool is available for download at: https://gitlab.ebi.ac.uk/pdbe-kb/funpdbe/funpdbe-validator
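
The sequence of checks described above (parse the file, verify its structure, then verify the data) can be sketched with the Python standard library alone. This is not the FunPDBe validator and does not reproduce its checks. The `precheck` function, the list of required keys and the key names are hypothetical, and the sketch is meant only as a quick sanity check before running the official tool.

```python
import json

# Hypothetical required top-level keys; the real list lives in the schema.
REQUIRED_KEYS = ("data_resource", "pdb_id", "chains")


def precheck(json_text, expected_resource):
    """Return a list of problems found in one annotation document."""
    # Step 1: the input must parse as JSON at all.
    try:
        data = json.loads(json_text)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(data, dict):
        return ["top-level value must be a JSON object"]

    # Step 2: structural checks on the top-level keys.
    problems = [f"missing key: {key}" for key in REQUIRED_KEYS if key not in data]
    resource = data.get("data_resource")
    if resource is not None and resource != expected_resource:
        problems.append("data_resource does not match the expected resource")

    # Step 3: data checks on every residue entry.
    chains = data.get("chains", [])
    if not isinstance(chains, list):
        return problems + ["chains must be a list"]
    for chain in chains:
        residues = chain.get("residues", []) if isinstance(chain, dict) else []
        if not isinstance(residues, list):
            problems.append("residues must be a list")
            continue
        for residue in residues:
            number = residue.get("residue_number") if isinstance(residue, dict) else None
            # bool is a subclass of int, so exclude it explicitly.
            if not isinstance(number, int) or isinstance(number, bool):
                problems.append(f"bad residue_number: {number!r}")
    return problems
```

An empty list means the document passed these illustrative checks; it says nothing about whether the official validator would accept it.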

Deposition

All PDBe-KB partners are provisioned with a private FTP area provided by PDBe-KB, where their JSON files containing the annotations can be transferred. The PDBe-KB validation and processing pipeline picks up these files and integrates the data into the PDBe-KB graph database.
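
A transfer to the private FTP area could be scripted along the following lines. The host name, credentials and directory layout are placeholders, not documented values; PDBe-KB supplies the actual connection details to each partner. The helper takes an already-connected `ftplib.FTP` session so that the transfer logic stays separate from the login step.

```python
from ftplib import FTP
from pathlib import Path


def upload_json_files(ftp, directory):
    """Upload every *.json file in `directory` over a connected FTP session.

    `ftp` is a logged-in ftplib.FTP (or FTP_TLS) object.
    Returns the uploaded file names in sorted order.
    """
    uploaded = []
    for path in sorted(Path(directory).glob("*.json")):
        with path.open("rb") as handle:
            # STOR writes the file under the same name on the server.
            ftp.storbinary(f"STOR {path.name}", handle)
        uploaded.append(path.name)
    return uploaded


def deposit(host, user, password, directory):
    """Log in and upload; host and credentials are supplied by the caller."""
    with FTP(host) as ftp:
        ftp.login(user=user, passwd=password)
        return upload_json_files(ftp, directory)
```

A call such as `deposit("ftp.example.org", "partner_user", "secret", "annotations/")` would upload the files in `annotations/`; all four arguments here are placeholders.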