Commit f691792b authored by carlosribas

Sequence search API documentation (#530)

parent a46094bd
......@@ -17,6 +17,10 @@ The new version runs on a [cloud infrastructure](https://www.embassycloud.org),
Please be aware that results now show **species-specific identifiers** which include NCBI taxonomy ids (for example [URS00000478B7_9606](/rna/URS00000478B7_9606), human 7SL RNA) while the old search results included only the unique RNA sequence identifiers (for example [URS00000478B7](/rna/URS00000478B7), SRP RNA from 5 species). This can increase the total number of results (in this example, the old search showed only 1 entry but the new one shows 5). This change enables the user to clearly see which species the sequence results are coming from.
### API documentation <a style="cursor: pointer" id="sequence-search-api" ng-click="scrollTo('sequence-search-api')" name="sequence-search-api" class="text-muted smaller"><i class="fa fa-link"></i></a>
See the [API documentation](/sequence-search/api) to learn how to start an asynchronous search job using our REST-based API.
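
Because jobs are asynchronous, a client typically submits a query and then polls until the job finishes. The sketch below shows only the polling step and is not the official client. The `wait_for_job` helper, the `get_status` callable and the status strings are hypothetical placeholders; the real endpoints and status values are listed in the API documentation. The only figure taken from that documentation is the limit of 50 requests per minute from the same IP address, which is used to space out the polls.

```python
import time
from typing import Callable, Iterable

# Limit stated in the API documentation: 50 requests per minute per IP address.
MAX_REQUESTS_PER_MINUTE = 50
MIN_INTERVAL = 60.0 / MAX_REQUESTS_PER_MINUTE  # seconds between two requests


def wait_for_job(
    get_status: Callable[[], str],
    finished_states: Iterable[str] = ("success", "error"),  # hypothetical values
    interval: float = MIN_INTERVAL,
    timeout: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Call get_status() until it returns a finished state, then return that state.

    get_status is any function that asks the API for the job status and
    returns it as a string (for example a thin wrapper around an HTTP GET).
    Raises TimeoutError if the job has not finished within `timeout` seconds
    of accumulated waiting.
    """
    finished = set(finished_states)
    waited = 0.0
    while True:
        status = get_status()
        if status in finished:
            return status
        if waited + interval > timeout:
            raise TimeoutError("job did not finish within %.0f seconds" % timeout)
        sleep(interval)  # stay under the request limit between polls
        waited += interval


if __name__ == "__main__":
    # Stand-in for real API calls: the job reports "started" twice, then "success".
    fake_statuses = iter(["started", "started", "success"])
    print(wait_for_job(lambda: next(fake_statuses), sleep=lambda seconds: None))
```

Passing `get_status` as a callable keeps the sketch independent of any particular HTTP library, so the same loop works with curl wrappers, `urllib` or `requests`.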
### Exact sequence matches <a style="cursor: pointer" id="exact-matches" ng-click="scrollTo('exact-matches')" name="exact-matches" class="text-muted smaller"><i class="fa fa-link"></i></a>
Whenever a sequence is entered in the search input box, the query is compared with all RNAcentral sequences and if there is an **exact match**, the links to the entries matching the query are displayed in a green box. This is very quick because only identical matches are considered. To see all similar sequences, just click Submit.
......@@ -27,6 +31,10 @@ Whenever a sequence is entered in the search input box, the query is compared wi
In addition to nhmmer searches against RNAcentral, every query is automatically compared with the [Rfam](https://rfam.org) library of RNA families. The searches are done using the [Infernal](http://eddylab.org/infernal) cmscan program coupled with a [post-processing step](https://github.com/nawrockie/cmsearch_tblout_deoverlap). The post-processing removes any hits that overlap Rfam families from the same clan (a clan is a set of homologous families, for example LSU_rRNA_archaea, LSU_rRNA_bacteria and LSU_rRNA_eukarya). This is a unique feature that is not available on the [Rfam website](https://rfam.org) or the [EBI cmscan service](https://www.ebi.ac.uk/Tools/rna/infernal_cmscan/), both of which report all matching families, including the redundant overlapping hits from the same clan.
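
To illustrate the idea of removing overlapping hits from the same clan, here is a minimal sketch. It is not the linked post-processing script: the `Hit` fields, the clan label and the scores are made up for the example, and it assumes that the highest-scoring hit in each group of overlapping same-clan hits is the one to keep, whereas the real script may rank hits differently.

```python
from typing import List, NamedTuple, Optional


class Hit(NamedTuple):
    family: str          # Rfam family name
    clan: Optional[str]  # clan label, or None if the family is in no clan
    start: int           # first matched position in the query (inclusive)
    end: int             # last matched position in the query (inclusive)
    score: float         # bit score (assumed ranking criterion)


def overlaps(a: Hit, b: Hit) -> bool:
    """Return True if the two hits share at least one query position."""
    return a.start <= b.end and b.start <= a.end


def remove_clan_overlaps(hits: List[Hit]) -> List[Hit]:
    """Keep the best-scoring hit among overlapping hits from the same clan."""
    kept: List[Hit] = []
    for hit in sorted(hits, key=lambda h: h.score, reverse=True):
        redundant = any(
            hit.clan is not None and hit.clan == other.clan and overlaps(hit, other)
            for other in kept
        )
        if not redundant:
            kept.append(hit)
    return kept


if __name__ == "__main__":
    # Illustrative values only; "LSU_clan" is a placeholder, not a real clan id.
    hits = [
        Hit("LSU_rRNA_archaea", "LSU_clan", 1, 2900, 1500.0),
        Hit("LSU_rRNA_bacteria", "LSU_clan", 1, 2900, 2100.0),
        Hit("LSU_rRNA_eukarya", "LSU_clan", 5, 2890, 900.0),
        Hit("family_without_clan", None, 100, 250, 80.0),
    ]
    for hit in remove_clan_overlaps(hits):
        print(hit.family, hit.score)
```

In this example only LSU_rRNA_bacteria survives from the three clan members, while the hit from a family without a clan is kept even though it overlaps the others.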
### Secondary structure <a style="cursor: pointer" id="r2dt" ng-click="scrollTo('r2dt')" name="r2dt" class="text-muted smaller"><i class="fa fa-link"></i></a>
The RNAcentral sequence similarity search also generates secondary structure (2D) diagrams using the [R2DT](https://github.com/RNAcentral/R2DT) software that visualises RNA structure using standard layouts or templates. Learn more about this new feature in the [R2DT preprint](https://www.biorxiv.org/content/10.1101/2020.09.10.290924v1).
### Number of similar sequences <a style="cursor: pointer" id="number" ng-click="scrollTo('number')" name="number" class="text-muted smaller"><i class="fa fa-link"></i></a>
Although the number of similar sequences can reach tens of thousands, for performance reasons, only the top 1000 results will be shown in each search.
......
......@@ -11,7 +11,8 @@
<li><a href="{% url 'linking-to-rnacentral' %}">Linking to RNAcentral</a></li>
<strong>Data Access</strong>
<li><a href="{% url 'help-text-search' %}">Text search</a></li>
<li><a href="{% url 'help-sequence-search' %}">Sequence search <span class="label label-success">New<span></a></li>
<li><a href="{% url 'help-sequence-search' %}">Sequence search</a></li>
<li><a href="{% url 'sequence-search-api' %}">Sequence search API <span class="label label-success">New</span></a></li>
<li><a href="{% url 'api-docs' %}">API documentation</a></li>
<li><a href="{% url 'help-public-database' %}">Public Postgres database</a></li>
<strong>Functional annotations</strong>
......
{% extends "portal/base.html" %}
{% load staticfiles %}
{% block title %}
Sequence search API
{% endblock %}
{% block content %}
<div class="row">
<div class="col-12">
<h1><i class="fa fa-book"></i> Sequence search API</h1>
</div>
</div>
<div class="row">
<div class="col-12">
<h2>Overview</h2>
<p>
The RNAcentral sequence similarity search API allows you to perform most of the same actions through
your own tools as you can through the web interface. Any tool capable of making HTTP requests can
communicate with this API, for example <a href="https://curl.se/" target="_blank">curl</a>.
</p>
</div>
</div>
<div class="row">
<div class="col-12">
<h2>API Throttling</h2>
<p>
Requests from the same IP address are limited to 50 per minute.
</p>
</div>
</div>
<div class="row">
<div class="col-12">
<h2>Example script</h2>
<p>
An example Python script that runs a search for each record in a FASTA file is provided below.
It uses the Biopython package and requires Python >= 3.5.
</p>
<p>
You can use this script to search for sequences in the RNAcentral database or in one of the
<a href="/expert-databases">Expert Databases</a>.
</p>
<p>
The results are written to a new directory. For each sequence, a JSON file containing the metadata and the
search results is created and named after the description of the sequence.
Here is an example:
</p>
<p>
<code>
results<br />
|-- URS00001F7A66_9031 Gallus gallus (chicken) gga-let-7g-5p.json<br />
|-- ...
</code>
</p>
<table class="table table-borderless">
<thead>
<tr></tr>
</thead>
<tbody>
<tr><th><script src="https://gist.github.com/carlosribas/b2f4095df29a44116d5d0555d708b357.js"></script></th></tr>
</tbody>
</table>
</div>
</div>
<div class="row">
<div class="col-md-12">
<h2>Swagger</h2>
<div class="panel panel-default">
<div class="panel-heading">
<p class="panel-title">Explore the API through Swagger</p>
</div>
<div class="panel-body">
<div class="embed-responsive embed-responsive-16by9">
<iframe class="embed-responsive-item" src="https://search.rnacentral.org/api/doc" frameborder="0" allowfullscreen></iframe>
</div>
</div>
</div>
</div>
</div>
{% endblock %}
......@@ -11,6 +11,7 @@ See the License for the specific language governing permissions and
limitations under the License.
"""
from django.conf.urls import url
from django.views.generic.base import TemplateView
from django.urls import reverse_lazy
from django.views.generic import RedirectView
......@@ -47,6 +48,9 @@ urlpatterns = [
# help page
url(r'^help/?$', RedirectView.as_view(url=reverse_lazy('help-sequence-search'), permanent=False)),
# API documentation
url(r'^api/?$', TemplateView.as_view(template_name='api.html'), name='sequence-search-api'),
# user interface - embeddable react component
url(r'^$', sequence_search, name='sequence-search'),
]