Commit 6ed04886, authored by Károly Erdős, committed by GitHub

Merge pull request #13 from elixir-europe/dev

Merge changes from DEV to MASTER
parents 2efc47fb 4096b336
Pipeline #149747 passed with stage
in 2 minutes and 21 seconds
FROM node:carbon
FROM node:12.22.0-buster
# Create app directory
WORKDIR /usr/src/app
# Install app dependencies
# A wildcard is used to ensure both package.json AND package-lock.json are copied
# where available (npm@5+)
COPY package*.json ./
COPY ./start.sh /
RUN npm install
# Bundle app source
COPY . .
EXPOSE 3000
CMD [ "npm", "start" ]
\ No newline at end of file
ENTRYPOINT ["/start.sh"]
\ No newline at end of file
......@@ -2,11 +2,9 @@
[![Build Status](https://travis-ci.org/EMBL-EBI-SUBS/json-schema-validator.svg?branch=master)](https://travis-ci.org/EMBL-EBI-SUBS/json-schema-validator) [![Codacy Badge](https://api.codacy.com/project/badge/Grade/7fbabc981e294249a9a0967965418058)](https://www.codacy.com/app/fpenim/json-schema-validator?utm_source=github.com&utm_medium=referral&utm_content=EMBL-EBI-SUBS/json-schema-validator&utm_campaign=Badge_Grade)
[![tested with jest](https://img.shields.io/badge/tested_with-jest-99424f.svg)](https://github.com/facebook/jest)
This repository contains a [JSON Schema](http://json-schema.org/) validator for the EMBL-EBI Submissions Project. This validator runs as a standalone node server that receives validation requests and returns its results.
This repository contains a deployable and/or executable [JSON Schema](http://json-schema.org/) validator service. This validator can run as a standalone node server or as a command-line application that receives validation requests and returns its results.
This service uses [Elixir's JSON schema validator library](https://github.com/elixir-europe/json-schema-validator).
The validation is done using the [AJV](https://github.com/epoberezkin/ajv) library version ^6.0.0 that fully supports the JSON Schema **draft-07**.
The validation is done using the [AJV](https://github.com/epoberezkin/ajv) library version ^7.0.0 that supports the JSON Schema draft-06/07/2019-09.
## Contents
- [Getting Started](README.md#getting-started)
......@@ -19,7 +17,7 @@ The validation is done using the [AJV](https://github.com/epoberezkin/ajv) libra
- [Executing](README.md#executing)
- [Executing with Docker](README.md#executing-with-docker)
- [Executing the validator by a command line](README.md#executing-with-the-provided-cli-script)
- [Development](README.md#development)
......@@ -59,8 +57,8 @@ npm -v
#### Project
Clone project and install dependencies:
```
git clone https://github.com/EMBL-EBI-SUBS/json-schema-validator.git
cd json-schema-validator
git clone https://github.com/elixir-europe/bio-validator.git
cd bio-validator
npm install
```
......@@ -93,15 +91,34 @@ node src/server --pidPath=/pid/file/path/server.pid
```
Note: This is the **file path** and not just the directory it will be written to.
### Executing with Docker
1. Build docker image:
```
docker build -t subs/json-schema-validator .
```
2. Run docker image:
```
docker run -p 3020:3020 -d subs/json-schema-validator
### Executing with the provided CLI script
A `validator-cli.js` script is provided in the repository's root folder for users who want to run the validation from the command line without setting up a running server.
Type `node ./validator-cli.js --help` to see how to use this script:
```js
node ./validator-cli.js --help
Bio-validator CLI (Command Line Interface)
usage: node ./validator-cli.js [--schema=path/to/schema] [--json=path/to/json]
Options:
--help Show help [boolean]
--version Show version number [boolean]
-s, --schema path to the schema file [required]
-j, --json path to the json file to validate [required]
Examples:
node ./validator-cli.js Validates 'valid.json' with
--json=valid.json 'test_schema.json'.
--schema=test_schema.json
```
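The option handling shown in the help text can be sketched in plain Node.js. This is an illustrative sketch only: `parseCliArgs` is a hypothetical helper, not the code in `validator-cli.js` (the dependency list in this commit suggests the real script relies on `yargs`). It accepts the `--schema=...` and `--json=...` forms plus the `-s`/`-j` aliases followed by a value, and rejects calls where a required option is missing.

```javascript
// Hypothetical helper (not part of the repository): extracts the two
// required options from an argv-style array.
function parseCliArgs(argv) {
  // Short aliases listed in the help text.
  const aliases = new Map([["-s", "schema"], ["-j", "json"]]);
  const options = {};

  for (let i = 0; i < argv.length; i++) {
    const arg = argv[i];
    const longForm = /^--([^=]+)=(.*)$/.exec(arg);
    if (longForm) {
      // --name=value
      options[longForm[1]] = longForm[2];
    } else if (aliases.has(arg) && i + 1 < argv.length) {
      // -s value / -j value: consume the next argument as the value
      options[aliases.get(arg)] = argv[++i];
    }
  }

  const missing = ["schema", "json"].filter((name) => !options[name]);
  if (missing.length > 0) {
    throw new Error("Missing required option(s): " + missing.join(", "));
  }
  return options;
}

console.log(parseCliArgs(["--json=valid.json", "--schema=test_schema.json"]));
```

Failing fast on a missing option mirrors the `[required]` markers in the help output; everything else about the real script's behaviour is left open here.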
### Development
For development purposes, [nodemon](https://nodemon.io/) is useful. It reloads the application every time a file is saved.
```
......@@ -201,8 +218,176 @@ HTTP status code `400`
## Custom keywords
The AJV library supports the implementation of custom JSON Schema keywords to address validation scenarios that go beyond what JSON Schema can handle.
The list of implemented custom keywords can be found in the documentation of
[Elixir's JSON Schema Validator library](https://github.com/elixir-europe/json-schema-validator/blob/master/README.md#custom-keywords).
Currently, four custom keywords are implemented in this repository: `graph_restriction`, `isChildTermOf`, `isValidTerm` and `isValidTaxonomy`.
To add a new custom keyword, pass it to the validator when it is instantiated:
```js
// get all the custom extensions
const { CustomKeyword, isChildTermOf, isValidTerm, isValidTaxonomy } = require("./keywords");
// register the new keyword alongside the existing ones
const validator = new BioValidator([
  new CustomKeyword(param1, param2),
  new isChildTermOf(null, "https://www.ebi.ac.uk/ols/api/search?q="),
  new isValidTerm(null, "https://www.ebi.ac.uk/ols/api/search?q="),
  new isValidTaxonomy(null)
]);
// or use only the new custom keyword (an instance, since the validator calls configure() on it)
const customOnlyValidator = new BioValidator([new CustomKeyword(param1, param2)]);
```
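The `BioValidator` source in this commit calls `configure(ajvInstance)` on every keyword object it receives and keeps the returned instance. A new keyword class could therefore look roughly like the sketch below. The `IsUpperCase` class and its keyword name are hypothetical, and the `fakeAjv` object is only a stand-in so the example runs without installing `ajv`; the definition object passed to `addKeyword` follows the shape Ajv v7 uses for user-defined keywords.

```javascript
// Hypothetical keyword: passes when a string is entirely upper case.
class IsUpperCase {
  constructor(keywordName = "isUpperCase") {
    this.keywordName = keywordName;
  }

  // Called once per keyword by the validator; must return the Ajv instance.
  configure(ajv) {
    ajv.addKeyword({
      keyword: this.keywordName,
      type: "string",
      async: true, // the validate function returns a promise
      validate: async (schemaValue, data) =>
        schemaValue === false || data === data.toUpperCase(),
    });
    return ajv;
  }
}

// Stand-in for an Ajv instance (assumption: only addKeyword is needed here).
const fakeAjv = {
  keywords: [],
  addKeyword(definition) {
    this.keywords.push(definition);
    return this;
  },
};

const configured = new IsUpperCase().configure(fakeAjv);
configured.keywords[0]
  .validate(true, "ABC")
  .then((passed) => console.log("ABC passes:", passed));
```

With a real Ajv instance, an instance of such a class would be passed in the array given to `new BioValidator([...])`.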
### graph_restriction
This custom keyword *evaluates if an ontology term is a child of another*. It is applied to a string (CURIE) and **passes validation if the term is a child of the term defined in the schema**.
The keyword requires one or more **parent terms** *(classes)* and **ontology ids** *(ontologies)*, both of which should exist in [OLS - Ontology Lookup Service](https://www.ebi.ac.uk/ols).
This keyword works by making an asynchronous call to the [OLS API](https://www.ebi.ac.uk/ols/api/), whose response indicates whether a given term is a child of another.
Because this is an async validation step, any schema that uses it must set the flag `"$async": true` at its root.
#### Usage
Schema:
```js
{
"$schema": "http://json-schema.org/draft-07/schema#",
"$id": "http://schema.dev.data.humancellatlas.org/module/ontology/5.3.0/organ_ontology",
"$async": true,
"properties": {
"ontology": {
"description": "A term from the ontology [UBERON](https://www.ebi.ac.uk/ols/ontologies/uberon) for an organ or a cellular bodily fluid such as blood or lymph.",
"type": "string",
"graph_restriction": {
"ontologies" : ["obo:hcao", "obo:uberon"],
"classes": ["UBERON:0000062","UBERON:0000179"],
"relations": ["rdfs:subClassOf"],
"direct": false,
"include_self": false
}
}
}
}
```
JSON object:
```js
{
"ontology": "UBERON:0000955"
}
```
### isChildTermOf
This custom keyword also *evaluates if an ontology term is a child of another* and is a simplified version of the `graph_restriction` keyword. It is applied to a string (URL) and **passes validation if the term is a child of the term defined in the schema**.
The keyword requires the **parent term** and the **ontology id**, both of which should exist in [OLS - Ontology Lookup Service](https://www.ebi.ac.uk/ols).
This keyword works by making an asynchronous call to the [OLS API](https://www.ebi.ac.uk/ols/api/), whose response indicates whether a given term is a child of another.
Because this is an async validation step, any schema that uses it must set the flag `"$async": true` at its root.
#### Usage
Schema:
```js
{
"$schema": "http://json-schema.org/draft-07/schema#",
"$async": true,
"properties": {
"term": {
"type": "string",
"format": "uri",
"isChildTermOf": {
"parentTerm": "http://purl.obolibrary.org/obo/PATO_0000047",
"ontologyId": "pato"
}
}
}
}
```
JSON object:
```js
{
"term": "http://purl.obolibrary.org/obo/PATO_0000383"
}
```
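The parent/child check described for `isChildTermOf` can be outlined independently of the OLS API. The sketch below is a hedged illustration, not the library's implementation: `isChildTerm` and the injected `fetchAncestors` lookup are hypothetical names, and the in-memory demo data is hand-written to mirror the example above rather than fetched from OLS.

```javascript
// Hypothetical check: resolves to true when parentTerm is among the term's ancestors.
// fetchAncestors(ontologyId, termUri) is an injected async lookup (assumption),
// standing in for whatever remote call the real keyword makes.
async function isChildTerm(termUri, { parentTerm, ontologyId }, fetchAncestors) {
  const ancestors = await fetchAncestors(ontologyId, termUri);
  return ancestors.includes(parentTerm);
}

// In-memory stand-in for the remote lookup, keyed by "ontologyId|termUri".
const demoAncestors = new Map([
  [
    "pato|http://purl.obolibrary.org/obo/PATO_0000383",
    ["http://purl.obolibrary.org/obo/PATO_0000047"],
  ],
]);
const demoLookup = async (ontologyId, termUri) =>
  demoAncestors.get(`${ontologyId}|${termUri}`) || [];

isChildTerm(
  "http://purl.obolibrary.org/obo/PATO_0000383",
  { parentTerm: "http://purl.obolibrary.org/obo/PATO_0000047", ontologyId: "pato" },
  demoLookup
).then((result) => console.log("is child term:", result));
```

Injecting the lookup keeps the decision logic testable without network access; because the function returns a promise, a schema using such a keyword needs `"$async": true`.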
### isValidTerm
This custom keyword *evaluates if a given ontology term URL exists in OLS* ([Ontology Lookup Service](https://www.ebi.ac.uk/ols)). It is applied to a string (URL) and **passes validation if the term exists in OLS**. It can be applied to any string defined in the schema.
This keyword works by making an asynchronous call to the [OLS API](https://www.ebi.ac.uk/ols/api/), whose response indicates whether the term exists in OLS.
Because this is an async validation step, any schema that uses it must set the flag `"$async": true` at its root.
#### Usage
Schema:
```js
{
"$schema": "http://json-schema.org/draft-07/schema#",
"$async": true,
"properties": {
"url": {
"type": "string",
"format": "uri",
"isValidTerm": true
}
}
}
```
JSON object:
```js
{
"url": "http://purl.obolibrary.org/obo/PATO_0000383"
}
```
### isValidTaxonomy
This custom keyword evaluates if a given *taxonomy* exists in ENA's Taxonomy Browser. It is applied to a string (URL) and **passes validation if the taxonomy exists in ENA**. It can be applied to any string defined in the schema.
This keyword works by making an asynchronous call to the [ENA API](https://www.ebi.ac.uk/ena/taxonomy/rest/any-name/<REPLACE_ME_WITH_TAXONOMY_TERM>), whose response indicates whether the term exists.
Because this is an async validation step, any schema that uses it must set the flag `"$async": true` at its root.
#### Usage
Schema:
```js
{
"$schema": "http://json-schema.org/draft-07/schema#",
"title": "Is valid taxonomy expression.",
"$async": true,
"properties": {
"value": {
"type": "string",
"minLength": 1,
"isValidTaxonomy": true
}
}
}
```
JSON object:
```js
{
"metagenomic source" : [ {
"value" : "wastewater metagenome"
} ]
}
```
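The ENA lookup described for `isValidTaxonomy` appends the taxonomy term to the `any-name` endpoint. The sketch below shows one way such a request URL might be built. `buildTaxonomyLookupUrl` is a hypothetical helper, and percent-encoding the term with `encodeURIComponent` is an assumption about how terms containing spaces should be handled, not a statement about what the keyword does internally.

```javascript
// Endpoint taken from the README text above.
const ENA_ANY_NAME_URL = "https://www.ebi.ac.uk/ena/taxonomy/rest/any-name/";

// Hypothetical helper: builds the lookup URL for a taxonomy term.
function buildTaxonomyLookupUrl(taxonomyTerm) {
  const term = String(taxonomyTerm).trim();
  if (term.length === 0) {
    throw new Error("taxonomy term must not be empty");
  }
  // Encode spaces and other reserved characters (assumption).
  return ENA_ANY_NAME_URL + encodeURIComponent(term);
}

console.log(buildTaxonomyLookupUrl("wastewater metagenome"));
// prints "https://www.ebi.ac.uk/ena/taxonomy/rest/any-name/wastewater%20metagenome"
```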
## Running in Docker
A Dockerized version of BioValidator is available on [quay.io](https://quay.io/repository/ebi-ait/biovalidator).
These images can be used to run the validator without cloning this repository.
Pull the Docker image from [quay.io](https://quay.io/repository/ebi-ait/biovalidator):
```shell
docker pull quay.io/ebi-ait/biovalidator:1.0.0
```
Run in server mode:
```shell
docker run -p 3020:3020 -d quay.io/ebi-ait/biovalidator:1.0.0 --server
```
Run in one-time mode:
```shell
docker run quay.io/ebi-ait/biovalidator:1.0.0 --schema /path/to/schema.json --json /path/to/json.json
```
## License
For more details about licensing see the [LICENSE](LICENSE.md).
{
"text": "cDNA",
"ontology": "EFO:0008481"
}
\ No newline at end of file
{
"text": "normal",
"ontology": "PATO:0000461"
}
\ No newline at end of file
{
"text": "glioblastoma",
"ontology": "MONDO:0018177"
}
\ No newline at end of file
{
"$id": "http://subs/graphRestriction-schema.json",
"$schema": "http://json-schema.org/draft-07/schema#",
"description": "A term that may be associated with a disease-related ontology term",
"$async": true,
"additionalProperties": false,
"required": [
"text"
],
"title": "disease_ontology",
"properties": {
"describedBy": {
"pattern" : "^(http|https)://schema.(.*?)humancellatlas.org/module/ontology/(([0-9]{1,}.[0-9]{1,}.[0-9]{1,})|([a-zA-Z]*?))/disease_ontology",
"type": "string"
},
"schema_version": {
"description": "Version number in major.minor.patch format.",
"type": "string",
"pattern": "^[0-9]{1,}.[0-9]{1,}.[0-9]{1,}$",
"example": "4.6.1"
},
"text": {
"description": "The text for the term as the user provides it.",
"type": "string"
},
"ontology": {
"description": "An optional ontology reference in format where prefix_ indicates which ontology",
"type": "string",
"graph_restriction": {
"ontologies" : ["obo:mondo", "obo:efo"],
"classes": ["MONDO:0000001","PATO:0000461"],
"relations": ["rdfs:subClassOf"],
"direct": false,
"include_self": true
}
},
"ontology_label": {
"description": "The preferred label for the ontology term referred to in the ontology field. This may differ from the user-supplied value in the text field",
"type": "string"
}
},
"type": "object"
}
\ No newline at end of file
{
"name": "json-schema-validator",
"version": "1.8.0",
"name": "biovalidator",
"version": "1.9.0",
"description": "A nodejs JSON schema validator service.",
"main": "src/server.js",
"repository": "https://github.com/EMBL-EBI-SUBS/json-schema-validator.git",
"repository": "https://github.com/elixir-europe/biovalidator.git",
"scripts": {
"start": "node src/server.js",
"test": "jest",
......@@ -22,12 +22,15 @@
"author": "EMBL-EBI-SUBS, fpenim, ke4, haseeb-gh",
"license": "Apache-2.0",
"dependencies": {
"elixir-jsonschema-validator": "^2.0.0",
"ajv": "~7.2.1",
"ajv-formats": "^1.5.1",
"express": "^4.17.1",
"npid": "^0.4.0",
"request": "^2.88.2",
"request-promise": "^4.2.4",
"winston": "^3.2.1",
"winston-daily-rotate-file": "^3.10.0"
"winston-daily-rotate-file": "^3.10.0",
"yargs": "^16.2.0"
},
"devDependencies": {
"bufferutil": "^4.0.2",
......
/**
* Created by rolando on 08/08/2018.
*/
const Promise = require('bluebird');
const path = require("path");
const fs = require('fs');
const Ajv = require("ajv").default;
const addFormats = require("ajv-formats");
const request = require("request-promise");
const AppError = require("./model/application-error");
const devMode = 0;
console.debug = devMode ? console.debug : () => { };
/**
*
* Wraps the generic validator, outputs errors in custom format.
*
*/
class BioValidator {
constructor(customKeywordValidators, baseSchemaPath){
this.validatorCache = {};
this.cachedSchemas = {};
// Set baseSchemaPath before building Ajv: generateLoadSchemaRefFn() reads it while constructAjv() runs.
this.baseSchemaPath = baseSchemaPath;
this.customKeywordValidators = customKeywordValidators;
this.ajvInstance = this.constructAjv(customKeywordValidators);
}
validate(inputSchema, inputObject) {
inputSchema["$async"] = true;
return new Promise((resolve, reject) => {
const compiledSchemaPromise = this.getValidationFunction(inputSchema);
compiledSchemaPromise.then((validate) => {
Promise.resolve(validate(inputObject))
.then((data) => {
if (validate.errors) {
resolve(validate.errors);
} else {
resolve([]);
}
}
).catch((err) => {
if (!(err instanceof Ajv.ValidationError)) {
console.error("An error occurred while running the validation.");
reject(new AppError("An error occurred while running the validation."));
} else {
console.debug("debug", this.ajvInstance.errorsText(err.errors, {dataVar: inputObject.alias}));
resolve(err.errors);
}
});
}).catch((err) => {
console.error("Async schema compilation encountered an error.");
console.error(err.stack);
reject(err);
});
});
}
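validateWithRemoteSchema(schemaUri, document) {
// Fetch (or reuse) the schema, then validate with this class's validate(schema, object);
// no validateSingleSchema method is defined on this class.
return this.getSchema(schemaUri)
.then(schema => this.validate(schema, document));
}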
getSchema(schemaUri) {
if(! this.cachedSchemas[schemaUri]) {
return new Promise((resolve, reject) => {
BioValidator.fetchSchema(schemaUri)
.then(schema => {
this.cachedSchemas[schemaUri] = schema;
resolve(schema);
})
.catch(err => {
reject(err);
})
});
} else {
return Promise.resolve(this.cachedSchemas[schemaUri]);
}
}
static fetchSchema(schemaUrl) {
return request({
method: "GET",
url: schemaUrl,
json: true,
});
}
getValidationFunction(inputSchema) {
const schemaId = inputSchema['$id'];
if(this.validatorCache[schemaId]) {
return Promise.resolve(this.validatorCache[schemaId]);
} else {
const compiledSchemaPromise = this.ajvInstance.compileAsync(inputSchema);
if(schemaId) {
this.validatorCache[schemaId] = compiledSchemaPromise;
}
return Promise.resolve(compiledSchemaPromise);
}
}
constructAjv(customKeywordValidators) {
const ajvInstance = new Ajv({allErrors: true, strict:false, loadSchema: this.generateLoadSchemaRefFn()});
addFormats(ajvInstance);
BioValidator._addCustomKeywordValidators(ajvInstance, customKeywordValidators);
return ajvInstance
}
static _addCustomKeywordValidators(ajvInstance, customKeywordValidators) {
customKeywordValidators.forEach(customKeywordValidator => {
ajvInstance = customKeywordValidator.configure(ajvInstance);
});
return ajvInstance;
}
generateLoadSchemaRefFn() {
const cachedSchemas = this.cachedSchemas;
const baseSchemaPath = this.baseSchemaPath;
const loadSchemaRefFn = (uri) => {
if(cachedSchemas[uri]) {
return Promise.resolve(cachedSchemas[uri]);
} else {
if (baseSchemaPath) {
let ref = path.join(baseSchemaPath, uri);
console.log('loading ref ' + ref);
let jsonSchema = fs.readFileSync(ref);
let loadedSchema = JSON.parse(jsonSchema);
loadedSchema["$async"] = true;
cachedSchemas[uri] = loadedSchema;
return Promise.resolve(loadedSchema);
}
else {
return new Promise((resolve, reject) => {
request({
method: "GET",
url: uri,
json: true
}).then(resp => {
const loadedSchema = resp;
loadedSchema["$async"] = true;
cachedSchemas[uri] = loadedSchema;
resolve(loadedSchema);
}).catch(err => {
reject(err);
});
});
}
}
};
return loadSchemaRefFn;
}
}
module.exports = BioValidator;
const runValidation = require("../validator");
const logger = require("../winston");
const fs = require("fs");
const {log_error, log_info } = require("../utils/logger");
class BioValidatorCLI {
constructor(pathToSchema, pathToJson) {
this.pathToSchema = pathToSchema
this.pathToJson = pathToJson
}
read_schema(pathToSchema) {
if (fs.existsSync(pathToSchema)) {
let schemaStr = fs.readFileSync(pathToSchema, 'utf-8')
return JSON.parse(schemaStr)
} else {
log_error("File '" + pathToSchema + "' does not exist!");
process.exit(1);
}
}
read_json(pathToJson) {
if (fs.existsSync(pathToJson)) {
let jsonStr = fs.readFileSync(pathToJson, 'utf-8')
return JSON.parse(jsonStr)
} else {
log_error("File '" + pathToJson + "' does not exist!");
process.exit(1);
}
}
validate() {
this.inputSchema = this.read_schema(this.pathToSchema)
this.jsonToValidate = this.read_json(this.pathToJson)
if (this.inputSchema && this.jsonToValidate) {
runValidation(this.inputSchema, this.jsonToValidate).then((output) => {
logger.log("silly", "Sent validation results.");
this.process_output(output);
}).catch((error) => {
console.error("console error: " + error);
logger.log("error", error);
});
} else {
let appError = "Something is missing, both schema and object are required to execute validation.";
log_error(appError);
}
}
process_output(output) {
if (output.length === 0) {
log_info("No validation errors reported.");
} else {
log_error("The validation process has found the following error(s):\n")
log_error(this.error_report(output));
}
console.log("Validation finished.");
}
error_report(jsonErrors) {
let errorOutput = "";
jsonErrors.forEach( (errorObject) => {
const dataPath = errorObject.dataPath;
const errors = errorObject.errors;
let errorStr = "";
errors.forEach( (error) => {
errorStr = errorStr.concat("\n\t", error);
})
errorOutput = errorOutput.concat(dataPath + errorStr + "\n");
})
return errorOutput;
}
}
module.exports = BioValidatorCLI;