Commit f8cca1e5 authored by Sreenath Sasidharan Nair, committed by GitHub

Merge pull request #15 from 3D-Beacons/development

Development
parents ca364b99 6b9c1269
Pipeline #233924 passed with stage
in 10 seconds
# Welcome to the 3D-Beacons Registry
## Background
![Image](https://raw.githubusercontent.com/3D-Beacons/3d-beacons-documentation/main/assets/3d-beacons-summary.png)
3D-Beacons is an open collaboration between providers of macromolecular structure models. The goal of this collaboration
is to provide model coordinates and meta-information from all the contributing data resources in a standardized data
format and on a unified platform.
![Image](https://www.ebi.ac.uk/pdbe/pdbe-kb/3dbeacons/assets/img/overview.png)
**Schematic overview of the 3D-Beacons infrastructure**
3D-Beacons consists of a Registry, a Hub and Beacons that host Clients. The Registry is used by
the [3D-Beacons Hub](https://github.com/3D-Beacons/3d-beacons-hub-api) to look up which API endpoints are supported by
the various Beacons. The Beacons provide data according to the 3D-Beacons data
specifications ([Current version: 0.3.1](https://app.swaggerhub.com/apis/3dbeacons/3D-Beacons/0.3.1)). The Hub collates
the data from the Beacons and exposes it via Hub API endpoints.
### Current 3D-Beacons
- [FoldX](http://foldxsuite.crg.eu/)
- [AlphaFold](https://alphafold.ebi.ac.uk)
- [AlphaFill](https://alphafill.eu/)
- [Genome3D](http://genome3d.eu/)
- [Protein Data Bank in Europe](https://pdbe.org)
- [Protein Data Bank in Europe - Knowledge Base](https://pdbe-kb.org)
......@@ -18,28 +27,129 @@
- [SWISS-MODEL](https://swissmodel.expasy.org/)
## About the 3D-Beacons Registry
The 3D-Beacons Registry records meta-information about all the contributing partner resources, and lists the API
endpoints that they support. In other words, the registry shows which API
endpoints provide what data from which data resource.
The Registry is implemented as a JSON object that complies with the schema specification, which is also included in this
repository.
Both files are available in the `resources` folder of the repository. To add or change a registry entry, make the relevant
changes in `resources/registry.json`, which must comply with the schema defined in `resources/schema.json`.
There is also an installable Python package in this repository which provides utilities like schema validation. Please
follow the installation section below for installing the Python package.
## Linking a new data resource to 3D-Beacons Network
Data providers who are interested in making their macromolecule structures available through the 3D-Beacons Network
should take the following steps:
1. Contact the 3D-Beacons consortium
2. Review the [API specifications](https://app.swaggerhub.com/apis/3dbeacons/3D-Beacons) for sharing metadata
3. Implement API endpoints or set up an instance of
the [3D-Beacons Client](https://github.com/3D-Beacons/3d-beacons-client)
4. Review the `resources/registry.json` file in this repository
5. Update the `resources/registry.json` file to include information on your data resource and your API endpoint URLs
6. Create a pull request for the `development` branch with your updated `resources/registry.json` file
### 1. Contact the 3D-Beacons consortium
3D-Beacons is an open consortium, and we welcome new data providers who would like to make their experimentally
determined or theoretical macromolecule structures available through the 3D-Beacons Network.
To ensure that the network provides access to relevant data, we require new prospective data providers to contact us
before linking their data to 3D-Beacons. Please send an email to Sameer Velankar (sameer@ebi.ac.uk) or Christine
Orengo (c.orengo@ucl.ac.uk) to initiate discussions.
### 2. Review the [API specifications](https://app.swaggerhub.com/apis/3dbeacons/3D-Beacons) for sharing metadata
The 3D-Beacons Network provides access to metadata regarding macromolecule structures in a unified format. This means
that every data provider has to expose information in the same data format. We define the accepted data schemas in the
[3D-Beacons API specification](https://app.swaggerhub.com/apis/3dbeacons/3D-Beacons) on SwaggerHub.
Please review this specification, and identify the schemas that fit the data you would like to make accessible via
3D-Beacons. For example, if you want to make your structures discoverable based on a UniProt identifier, then the
endpoints with `/uniprot/{qualifier}.json` are relevant for you.
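As a rough illustration of how such a templated path resolves to a concrete request URL, the sketch below fills the `{qualifier}` placeholder with a UniProt accession. The base URL, the helper name `build_uniprot_url` and the accession `P38398` are assumptions chosen for illustration; the specification on SwaggerHub remains the reference for the exact paths.

```python
from urllib.parse import quote, urljoin


def build_uniprot_url(base_url: str, qualifier: str) -> str:
    """Fill the `{qualifier}` placeholder of a `uniprot/{qualifier}.json` path."""
    # Percent-encode the qualifier so unexpected characters cannot alter the path.
    path = "uniprot/{qualifier}.json".format(qualifier=quote(qualifier, safe=""))
    # Ensure the base ends with a slash so urljoin appends instead of replacing.
    return urljoin(base_url.rstrip("/") + "/", path)


# Hypothetical base URL and example accession, not taken from the registry.
print(build_uniprot_url("https://example.org/api", "P38398"))
```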
### 3. Implement API endpoints or set up an instance of the [3D-Beacons Client](https://github.com/3D-Beacons/3d-beacons-client)
After reviewing the API specifications and deciding what data you will make available, and which data schema you will
use, the next step is to either implement the selected API endpoints in a REST API, or to take advantage of
the [3D-Beacons Client](https://github.com/3D-Beacons/3d-beacons-client), which can be installed locally and includes a
pre-packaged and ready-to-use implementation of certain API endpoints. For more information on this, please visit the [3D-Beacons Client](https://github.com/3D-Beacons/3d-beacons-client) repository.
### 4. Review the `resources/registry.json` file in this repository
Once your metadata is exposed via API endpoints that comply with the [3D-Beacons API specification](https://app.swaggerhub.com/apis/3dbeacons/3D-Beacons), you should review the `resources/registry.json` file in this repository. This file contains all the information needed by the [3D-Beacons Hub API](https://github.com/3D-Beacons/3d-beacons-hub-api) for linking your API endpoints to the 3D-Beacons Network.
The registry has two main data blocks: `providers` and `services`.
The `providers` block contains information that describes your data resource. We use this information to let users know where to look for the original sources of data.
An example item in the `providers` list would look like this:
```
{
  "providerId": "alphafold",
  "providerName": "AlphaFold Protein Structure Database",
  "providerDescription": "AlphaFold is an AI system developed by DeepMind that predicts a protein’s 3D structure from its amino acid sequence. It regularly achieves accuracy competitive with experiment.",
  "providerUrl": "https://alphafold.ebi.ac.uk/",
  "baseServiceUrl": "https://alphafold.ebi.ac.uk/api/",
  "devBaseServiceUrl": "https://dev.alphafold.ebi.ac.uk/api/",
  "providerLogo": "https://alphafold.ebi.ac.uk/assets/img/dm-logo.png"
}
```
The `services` block contains information about which API endpoints are implemented by which data provider.
An example item in the `services` list would look like this:
```
{
  "serviceType": "summary",
  "provider": "alphafold",
  "accessPoint": "uniprot/summary/"
}
```
Together, the `providers` and `services` data blocks tell the 3D-Beacons Hub API that, in the example above, AlphaFold DB provides access to its data by implementing the `summary` API endpoint. The endpoint is served at `https://alphafold.ebi.ac.uk/api/uniprot/summary/`, which is the provider's `baseServiceUrl` followed by the service's `accessPoint`.
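The way the two blocks combine can be sketched in a few lines of Python. The function name `resolve_endpoint` and the `use_dev` switch are hypothetical, and the joining logic is an assumption about how a consumer might build the URL, not a description of the Hub's actual code.

```python
# Values copied from the example registry items above.
provider = {
    "providerId": "alphafold",
    "baseServiceUrl": "https://alphafold.ebi.ac.uk/api/",
    "devBaseServiceUrl": "https://dev.alphafold.ebi.ac.uk/api/",
}
service = {
    "serviceType": "summary",
    "provider": "alphafold",
    "accessPoint": "uniprot/summary/",
}


def resolve_endpoint(provider: dict, service: dict, use_dev: bool = False) -> str:
    """Join a provider's base URL with a service's access point."""
    if service["provider"] != provider["providerId"]:
        raise ValueError("service does not belong to this provider")
    base_key = "devBaseServiceUrl" if use_dev else "baseServiceUrl"
    # Normalise slashes so the two parts are separated by exactly one.
    return provider[base_key].rstrip("/") + "/" + service["accessPoint"].lstrip("/")


print(resolve_endpoint(provider, service))
# prints https://alphafold.ebi.ac.uk/api/uniprot/summary/
```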
### 5. Update the `resources/registry.json` file
The next step is to fork this repository ([https://github.com/3D-Beacons/3d-beacons-registry](https://github.com/3D-Beacons/3d-beacons-registry)) and edit
the `resources/registry.json` file by adding a new item to the `providers` list and listing all the API endpoints you implemented in the `services` list.
**NOTE**: If you don't have a production setup at this point, set `baseServiceUrl` to the same value as `devBaseServiceUrl`.
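A minimal sketch of this update step, assuming the registry has been loaded into a Python dictionary with `providers` and `services` lists. The helper `add_provider` and every value in the sample entry are placeholders, and the key list is taken from the example `providers` item rather than from the full schema.

```python
import copy

# Keys that appear in the example `providers` entry shown earlier.
PROVIDER_KEYS = (
    "providerId", "providerName", "providerDescription", "providerUrl",
    "baseServiceUrl", "devBaseServiceUrl", "providerLogo",
)


def add_provider(registry: dict, provider: dict, services: list) -> dict:
    """Return a copy of the registry with one provider and its services added."""
    provider = dict(provider)
    # No production setup yet: reuse the development URL, as the note suggests.
    provider.setdefault("baseServiceUrl", provider.get("devBaseServiceUrl"))
    missing = [key for key in PROVIDER_KEYS if not provider.get(key)]
    if missing:
        raise ValueError(f"missing provider fields: {missing}")
    updated = copy.deepcopy(registry)
    updated["providers"].append(provider)
    updated["services"].extend(services)
    return updated


# Hypothetical provider; every value below is a placeholder.
registry = {"providers": [], "services": []}
new_provider = {
    "providerId": "example-db",
    "providerName": "Example Structure Database",
    "providerDescription": "Placeholder description of the resource.",
    "providerUrl": "https://example.org/",
    "devBaseServiceUrl": "https://dev.example.org/api/",
    "providerLogo": "https://example.org/logo.png",
}
new_services = [
    {"serviceType": "summary", "provider": "example-db", "accessPoint": "uniprot/summary/"}
]
updated = add_provider(registry, new_provider, new_services)
print(updated["providers"][0]["baseServiceUrl"])
```

The function returns a modified copy instead of editing in place, so the original dictionary can still be compared against the result before writing it back to `resources/registry.json`.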
### 6. Create a pull request for the `development` branch
Finally, please create a pull request so that we can merge your version of the `resources/registry.json` file to our `development` branch. We will then test the updated file, and also test all the API endpoints you specified in the `services` list of the `resources/registry.json` file.
As part of testing the API endpoints, we will perform stress testing of all the API endpoints you provide. We will also validate the data format against the [3D-Beacons API specification](https://app.swaggerhub.com/apis/3dbeacons/3D-Beacons), and test whether the [3D-Beacons Hub API](https://github.com/3D-Beacons/3d-beacons-hub-api) can concatenate data.
Once testing is complete, we merge the updates into the `master` branch, at which point your data resource becomes officially linked to the 3D-Beacons Network.
## Installation
### Prerequisites
The following software is required for the utilities to run properly:
Python 3.7+
**Note**
Because [Python 2.7 support ended on January 1, 2020](https://pythonclock.org/), new projects should consider supporting
Python 3 only, which is simpler than trying to support both. As a result, support for Python 2.7 in this project has
been dropped.
### Set up the environment
Set up a Python virtual environment and install the required packages.
```
$ python3 -m venv venv
$ source venv/bin/activate
......@@ -52,23 +162,28 @@ Now install the project dependencies.
```
Install the package
```
(venv) $ pip install .
```
## Usage
The installed package can be used to validate `registry.json` against the defined schema. Both files are available
in the `resources` folder.
```
(venv) $ beacons_bio_3d validate_schema --schema_json resources/schema.json --registry_json resources/registry.json
```
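Before running the command, a quick cross-reference check can catch a common mistake: a `services` item that names a provider missing from `providers`. The sketch below uses only the standard library; the function `find_unknown_providers` is hypothetical and is not part of the installed package, and it does not replace schema validation.

```python
import json


def find_unknown_providers(registry: dict) -> list:
    """Return the services whose `provider` has no matching `providerId`."""
    known = {p["providerId"] for p in registry.get("providers", [])}
    return [s for s in registry.get("services", []) if s.get("provider") not in known]


# Inline sample for illustration; for the real file, pass the result of
# json.load() on resources/registry.json instead.
sample = json.loads("""
{
  "providers": [{"providerId": "alphafold"}],
  "services": [
    {"serviceType": "summary", "provider": "alphafold", "accessPoint": "uniprot/summary/"},
    {"serviceType": "summary", "provider": "unknown-db", "accessPoint": "uniprot/summary/"}
  ]
}
""")
for service in find_unknown_providers(sample):
    print("unknown provider:", service["provider"])
```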
## Contributors
- Sreenath Nair - _Initial work_ - [sreenathnair](https://github.com/sreenathnair)
- Mihaly Varadi - _Initial work_ - [mvaradi](https://github.com/mvaradi)
See also the list of [contributors](https://github.com/3D-Beacons/3d-beacons-registry/contributors) who participated in
this project.
### How to contribute
This repository is open to contributions. Please fork the repository and send pull requests.
......@@ -6,6 +6,7 @@
"providerDescription": "SWISS-MODEL is a fully automated protein structure homology-modelling server, accessible via the ExPASy web server, or from the program DeepView (Swiss Pdb-Viewer). The purpose of this server is to make protein modelling accessible to all life science researchers worldwide.",
"providerUrl": "https://swissmodel.expasy.org/",
"baseServiceUrl": "https://swissmodel.expasy.org/3d-beacons/",
"devBaseServiceUrl": "https://swissmodel.expasy.org/3d-beacons/",
"providerLogo": "https://swissmodel.expasy.org"
},
{
......@@ -14,6 +15,7 @@
"providerDescription": "Genome3D provides consensus structural annotations and 3D models for sequences from model organisms, including human.",
"providerUrl": "http://genome3d.eu/",
"baseServiceUrl": "http://genome3d.eu/beacons/",
"devBaseServiceUrl": "http://genome3d.eu/beacons/",
"providerLogo": "http://genome3d.eu/"
},
{
......@@ -22,6 +24,7 @@
"providerDescription": "PDBe is the European resource for the collection, organisation and dissemination of data on biological macromolecular structures.",
"providerUrl": "https://www.ebi.ac.uk/pdbe/",
"baseServiceUrl": "https://www.ebi.ac.uk/pdbe/aggregated-api/beacons/",
"devBaseServiceUrl": "https://wwwdev.ebi.ac.uk/pdbe/aggregated-api/beacons/",
"providerLogo": "https://www.ebi.ac.uk/pdbe/static/images/logos/PDBe/logo.png"
},
{
......@@ -30,6 +33,7 @@
"providerDescription": "The Protein Ensemble Database (PED) is an open access database for the deposition of structural ensembles, mainly intrinsically disordered proteins (IDPs).",
"providerUrl": "https://proteinensemble.org/",
"baseServiceUrl": "https://proteinensemble.org/api/3d-beacons/",
"devBaseServiceUrl": "https://proteinensemble.org/api/3d-beacons/",
"providerLogo": "https://proteinensemble.org/assets/PED_logo.svg"
},
{
......@@ -38,6 +42,7 @@
"providerDescription": "AlphaFold is an AI system developed by DeepMind that predicts a protein’s 3D structure from its amino acid sequence. It regularly achieves accuracy competitive with experiment.",
"providerUrl": "https://alphafold.ebi.ac.uk/",
"baseServiceUrl": "https://alphafold.ebi.ac.uk/api/",
"devBaseServiceUrl": "https://dev.alphafold.ebi.ac.uk/api/",
"providerLogo": "https://alphafold.ebi.ac.uk/assets/img/dm-logo.png"
},
{
......@@ -46,6 +51,7 @@
"providerDescription": "Small angle scattering (SAS) of X-ray and neutrons provides structural information on biological macromolecules in solution at a resolution of 1-2 nm. SASBDB is a fully searchable curated repository of freely accessible and downloadable experimental data, which are deposited together with the relevant experimental conditions, sample details, derived models and their fits to the data.",
"providerUrl": "https://www.sasbdb.org/",
"baseServiceUrl": "https://www.sasbdb.org/rest-api/",
"devBaseServiceUrl": "https://www.sasbdb.org/rest-api/",
"providerLogo": "https://www.sasbdb.org/media/sasbdb-logo.png"
},
{
......@@ -54,6 +60,7 @@
"providerDescription": "AlphaFill is a databank of AlphaFold models enriched with ligands, cofactors, and ions that are transplanted from homologous experimental structure models.",
"providerUrl": "https://alphafill.eu/",
"baseServiceUrl": "https://alphafill.eu/v1/aff/",
"devBaseServiceUrl": "https://alphafill.eu/v1/aff/",
"providerLogo": "https://alphafill.eu/images/alphafill-logo.svg"
}
],
......
......@@ -23,6 +23,7 @@
"providerDescription",
"providerUrl",
"baseServiceUrl",
"devBaseServiceUrl",
"providerLogo"
],
"properties": {
......@@ -72,6 +73,16 @@
],
"default": ""
},
"devBaseServiceUrl": {
"$id": "#/properties/providers/properties/devBaseServiceUrl",
"type": "string",
"title": "Base service url for development environment",
"description": "A base development url for the service of each provider",
"examples": [
"https://wwwdev.ebi.ac.uk/pdbe/beacons/"
],
"default": ""
},
"providerLogo": {
"$id": "#/properties/providers/properties/providerLogo",
"type": "string",
......