Commits · 20d5e0fc88322711323824673e1e87c9d008494f · ensembl-gh-mirror / ensembl

This project is mirrored from https://github.com/Ensembl/ensembl.git.

Jan 08, 2020

RFAMParser: relax selection criteria on analysis.logic_name · 20d5e0fc

Marek Szuba authored 5 years ago

Due to changes in the structure of the production database, since
release 98 the value of analysis.logic_name corresponding to non-coding
RNA can be either 'ncrna' (which is what we used before) or
'ncrna_species_name'. Change the SQL query used to map RFAM IDs to
Ensembl stable IDs so that it can correctly handle species using the
latter syntax, i.e. human, mouse and zebrafish.

Issue: ENSINT-402

20d5e0fc
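
The relaxed selection described above can be sketched as follows. This is a minimal illustration in Python with an in-memory SQLite database, not the query the parser actually uses: the cut-down `analysis` table, the sample logic names (including `ncrna_homo_sapiens`) and the function name `ncrna_logic_names` are assumptions made for the example.

```python
import sqlite3

# Hypothetical, cut-down 'analysis' table; only logic_name matters here.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE analysis (analysis_id INTEGER PRIMARY KEY, logic_name TEXT)"
)
conn.executemany(
    "INSERT INTO analysis (logic_name) VALUES (?)",
    [("ncrna",), ("ncrna_homo_sapiens",), ("ncrnaX",), ("ensembl",)],
)

# '_' is a single-character wildcard in LIKE, so it is escaped here
# in order to match a literal underscore after 'ncrna'.
RELAXED_QUERY = r"""
    SELECT logic_name
      FROM analysis
     WHERE logic_name = 'ncrna'
        OR logic_name LIKE 'ncrna\_%' ESCAPE '\'
     ORDER BY logic_name
"""


def ncrna_logic_names(connection):
    """Return logic names matching either 'ncrna' or 'ncrna_<something>'."""
    return [row[0] for row in connection.execute(RELAXED_QUERY)]


print(ncrna_logic_names(conn))
```

Escaping the underscore keeps unrelated names such as the made-up `ncrnaX` out of the result, which a bare `LIKE 'ncrna%'` would let through.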

Jan 02, 2020
- [annual copyright updater]. Bugfix: do not touch symbolic links · 866a6ba6
  Matthieu Muffato authored 5 years ago
  
  866a6ba6
Sep 26, 2019
- Enable VGNC xrefs for callithrix_jacchus and papio_anubis · f1829335
  Marek Szuba authored 5 years ago
  
  f1829335
Jul 25, 2019

ChecksumParser: add a comment about read-only input paths · 0f02729d

Marek Szuba authored 5 years ago

See ENSCORESW-3197. Have to think about the correct way of specifying
where to put that output file though, especially given the parser
doesn't actually delete it after it is done with it.

0f02729d

ChecksumParser: check if we have opened the temporary file for writing · 2c00b430
Marek Szuba authored 5 years ago

2c00b430

ChecksumParser: increment checksum_xref_id BEFORE use · 41aa26b1

Marek Szuba authored 5 years ago

The initial value of the variable $counter is set to the highest
checksum_xref.checksum_xref_id found if the table in question is not
empty, or 0 if it is. This causes problems if $counter is only
incremented after each use in the input-file loop:
 - for a non-empty table the parser would attempt to re-use an existing
   value of checksum_xref_id for the first entry read from the input
   file. checksum_xref_id is the primary key of checksum_xref so its
   values have to be unique, therefore "LOAD DATA" silently discards the
   offending row;
 - for an empty table we lose one input row as well but it is the SECOND
   rather than the first one. Reason: 0 is not a valid value for
   auto_increment fields in MySQL, resulting in the first row being
   inserted with the first allowed ID value of 1 - which brings us back
   to the previous scenario when "LOAD DATA" attempts to insert the
   second row.

Incrementing $counter before use ought to address both forms of the
problem.

41aa26b1
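
The two failure modes above can be reproduced with a small simulation. This is a sketch in Python under a simplified model of the behaviour the message describes (duplicate keys are silently dropped, and an ID of 0 is replaced by the next auto-increment value); `load_rows` and its parameters are hypothetical names, not part of ChecksumParser.

```python
def load_rows(existing, rows, increment_first):
    """Simulate loading rows into a table keyed by checksum_xref_id.

    Simplified model of the behaviour described in the commit message:
    a duplicate key is silently discarded, and an ID of 0 is replaced
    by the next auto-increment value.
    """
    table = dict(existing)
    # Highest existing ID, or 0 if the table is empty.
    counter = max(table, default=0)
    for row in rows:
        if increment_first:
            counter += 1
        row_id = counter
        if row_id == 0:
            # 0 is not usable as an auto-increment value: assign the next one.
            row_id = max(table, default=0) + 1
        if row_id not in table:
            table[row_id] = row
        # Otherwise the row is dropped, as with a duplicate primary key.
        if not increment_first:
            counter += 1
    return table


# Increment after use: the second row is lost on an empty table,
# the first row on a non-empty one.
print(load_rows({}, ["a", "b", "c"], increment_first=False))
print(load_rows({1: "x", 2: "y"}, ["a", "b"], increment_first=False))

# Increment before use: every row is kept in both cases.
print(load_rows({}, ["a", "b", "c"], increment_first=True))
print(load_rows({1: "x", 2: "y"}, ["a", "b"], increment_first=True))
```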

Jul 18, 2019
- schema_patcher.pl: Update location of ontology schema · 2d7a9e8e
  Marek Szuba authored 5 years ago
  
  2d7a9e8e
- Remove misc_scripts/jira/* · b2845fe5
  Marek Szuba authored 5 years ago

  Not used any more + at least partly out of date.

  b2845fe5
Jun 17, 2019

The ontology schema patches have been moved from... · 85446eb8

Thomas Maurel authored 5 years ago

The ontology schema patches have been moved from ensembl/misc_scripts/ontology/sql to ols-ensembl-loader/sql

85446eb8

Jun 11, 2019
- Enable VGNC xrefs for felis_catus, macaca_mulatta and microcebus_murinus · aa4155d0
  Marek Szuba authored 5 years ago
  
  aa4155d0
May 14, 2019
- ENSCORESW-3147 : correctly capture all required fields from file · acb922b0
  Magali Ruffier authored 5 years ago
  
  acb922b0
Mar 18, 2019

stable_id_lookup: extract RNAProduct stable IDs from core databases · 502dcfff

Marek Szuba authored 6 years ago

Uses the same type of SQL SELECT queries as Translation, which makes
sense given how similar they are.

Tested on test-genome-DBs/homo_sapiens/core, works without errors.

Aborts upon encountering a core database missing the 'rnaproduct' table
but that is in my humble opinion very much desired behaviour, as it could
indicate incomplete application of schema patches in the release this
will be included in.

502dcfff
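
The "abort on a missing table" behaviour can be illustrated as below. This is only a sketch in Python with SQLite standing in for a core database; the function name and the single-column query are assumptions and do not reproduce the script's actual SELECT statements.

```python
import sqlite3


def fetch_rnaproduct_stable_ids(connection):
    """Return all RNAProduct stable IDs from a core-like database.

    There is deliberately no check for the table's existence: if
    'rnaproduct' is missing, the query raises and the run aborts,
    flagging a database whose schema patches were not fully applied.
    """
    cursor = connection.execute(
        "SELECT stable_id FROM rnaproduct ORDER BY stable_id"
    )
    return [row[0] for row in cursor]


# A patched database: the table exists, so the IDs are returned.
patched = sqlite3.connect(":memory:")
patched.execute("CREATE TABLE rnaproduct (stable_id TEXT)")
patched.execute("INSERT INTO rnaproduct VALUES ('EXAMPLE_ID_1')")  # made-up ID
print(fetch_rnaproduct_stable_ids(patched))

# An unpatched database: the same call fails instead of skipping silently.
unpatched = sqlite3.connect(":memory:")
try:
    fetch_rnaproduct_stable_ids(unpatched)
except sqlite3.OperationalError as error:
    print(f"aborting: {error}")
```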

Feb 21, 2019
- Removed ontology related sql definition · fce09935
  Marc Chakiachvili authored 6 years ago
  
  fce09935
Feb 13, 2019
- extract correct dbtype and version across naming conventions · fbecac9c
  Magali Ruffier authored 6 years ago
  
  fbecac9c
Feb 12, 2019
- update ontology schema to 97 · 627a90fa
  Tiago Grego authored 6 years ago
  
  627a90fa
- update test dbs to match fixed patch · 9d9c94bd
  Tiago Grego authored 6 years ago
  
  9d9c94bd
Jan 02, 2019
- Yearly copyright update · a8c451eb
  Tiago Grego authored 6 years ago
  
  a8c451eb
Dec 19, 2018
- Added relation index creation in patch · 5f3d6bf5
  Marc Chakiachvili authored 6 years ago
  
  5f3d6bf5
- Added closure index creation in patch · 091d7606
  Marc Chakiachvili authored 6 years ago
  
  091d7606
- Updated typo in patch · b7afb98c
  Marc Chakiachvili authored 6 years ago
  
  b7afb98c
- Added new patch to update current CHARSET and COLLATE for related tables. · 3b7850f2
  Marc Chakiachvili authored 6 years ago
  
  3b7850f2
- Updated SQL tables script with all updates · 610a69fc
  Marc Chakiachvili authored 6 years ago
  
  610a69fc
Dec 17, 2018
- Updated ensembl_ontology schema scripts · 9722993e
  Marc Chakiachvili authored 6 years ago
  
  9722993e
Dec 07, 2018
- production schema file has been renamed · 1615dab6
  Magali Ruffier authored 6 years ago
  
  1615dab6
Dec 06, 2018
- ENSCORESW-2967 : update schema to 96 · 6399c37b
  Magali Ruffier authored 6 years ago
  
  6399c37b
Oct 25, 2018
- Code review from Tiago · f993d60a
  Wojtek Bazant authored 6 years ago
  
  f993d60a
Oct 18, 2018
- Use references instead of copying · e327bc83
  Wojtek Bazant authored 6 years ago

  It made recognising incorrect entries needlessly slow

  e327bc83
Oct 15, 2018

Fix bug: use return instead of next · 4ab71f7e

Wojtek Bazant authored 6 years ago

'return' goes back one frame up the stack, whereas 'next' goes back to
the closest frame on the stack that supports the operation (which is
close enough in RefSeqGPFFParser alone).
It works unless I subclass create_xrefs, and then my Hive workers die:

Lost control. Check your Runnable for loose 'next' statements that are
not part of a loop  WORKER_ERROR

4ab71f7e

C. elegans specific parsing of RefSeq_dna file · 7d6346f7

Wojtek Bazant authored 6 years ago

- New xref: to a WormBase CDS feature
- Modify WormbaseCElegansRefSeqGPFFParser to serve both kinds of files
- extract a utility method from RefSeqGPFFParser
- xref_config.ini stanza for wormbase_cds
- tests for new functionality

7d6346f7

C. elegans references use WormBase mapping to INSDC protein ids · d66449b6

Wojtek Bazant authored 6 years ago

- maintain naming convention: WormBase specific stuff says Wormbase at the front
- rewrite WormBaseDirectParser
- WormBaseDirectParser populates protein_ids
- superclass method to make dependent protein_ids as parent
- tap into UniProtParser
  + also skip EMBL scaffold ids (we can't reliably assign them)
- tap into RefSeqGPFFParser
  + extract a method
- tests for new stuff
  + add %args to parametrise test_parser

Benefits for RefSeqGPFFParser:
RefSeq proteins have coordinates as part of their identity, so we
can't reliably sequence match them: we would also pick up all paralogs.
This change fixes this spurious mapping.
Benefits for UniProtParser:
The above does not apply here: UniProt entries are not tied to
coordinates, so all paralogs map to the same entry. However, we can
handle versioning and updates a bit better: if WormBase updates an entry
and a protein id changes but UniProt doesn't reflect this yet, with the
change we will still pick up the UniProt entry although we can't
sequence match any more.

d66449b6

Oct 01, 2018

Remove artificial dependency on XML::Simple · ddb71bb1

Marek Szuba authored 6 years ago

The only part of the xref-mapping pipeline that depended on the
long-deprecated module XML::Simple was TAIROntologyParser - which did
not actually *use* that module for anything. Get rid of the useless
import, thus making it unnecessary for XML::Simple to be mentioned in
the cpanfile.

ddb71bb1

Sep 25, 2018

create_release_tasks.pl: distinguish between submitter and assignee · 98f8ba25

Marek Szuba authored 6 years ago

Useful when the person running the script is not in fact the RelCo for
the next release, as has already been the case before. Saves one having
to manually reassign all the newly created tickets to the actual RelCo.
Conversely, if it is the same person, just omit the new argument and the
RelCo user name will be used to connect to JIRA.

Note that the function validating user names is only applied to RelCo
ones. This is intentional: JIRA itself will complain if the submitter is
not authorised to create ENSCORESW tickets.

98f8ba25
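
The submitter/assignee split can be sketched as follows. This is a Python illustration of the described logic only: the option names `--relco` and `--submitter`, the function names and the validation rule are all hypothetical and are not taken from create_release_tasks.pl.

```python
import argparse


def validate_user_name(name):
    """Placeholder check; the real script's validation is not shown here."""
    if not name or any(char.isspace() for char in name):
        raise ValueError(f"invalid user name: {name!r}")


def parse_args(argv):
    parser = argparse.ArgumentParser()
    # Hypothetical option names, chosen for this example only.
    parser.add_argument("--relco", required=True, help="assignee of new tickets")
    parser.add_argument("--submitter", help="account used to connect to JIRA")
    args = parser.parse_args(argv)

    # Only the RelCo name is validated locally; an unauthorised
    # submitter is left for JIRA itself to reject.
    validate_user_name(args.relco)

    # If no submitter is given, the RelCo submits the tickets too.
    if args.submitter is None:
        args.submitter = args.relco
    return args


same_person = parse_args(["--relco", "alice"])
print(same_person.submitter, same_person.relco)

different_people = parse_args(["--relco", "alice", "--submitter", "bob"])
print(different_people.submitter, different_people.relco)
```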

Sep 24, 2018

create_release_tasks.pl: explicitly dereference arguments to 'keys' · 295985e9

Marek Szuba authored 6 years ago

Being able to pass a scalar to e.g. keys was an experimental feature
that was added in Perl 5.14 and has since been declared a failure, thus
causing the script to fail with an "Experimental keys on scalar is now
forbidden" error.

295985e9

Updated the ontology DB schema to version 95 · 19f740cc
Marek Szuba authored 6 years ago

19f740cc

Sep 11, 2018
- ENSCORESW-2850 : update usage with all options · f5c96618
  Magali Ruffier authored 6 years ago
  
  f5c96618
Sep 07, 2018
- tidy up badly initialised values · 8293dad6
  Magali Ruffier authored 6 years ago
  
  8293dad6
Sep 05, 2018
- ENSCORESW-2850 : optimised for single species run · 805432c6
  Magali Ruffier authored 6 years ago
  
  805432c6
- ENSCORESW-2853 : match Ensembl species casing · ffacd1b1
  Magali Ruffier authored 6 years ago
  
  ffacd1b1
- ENSCORESW-2805 : tidy up dependent sources · 77e32653
  Magali Ruffier authored 6 years ago
  
  77e32653
- ENSCORESW-2805 : add default priority description · 9bbb4715
  Magali Ruffier authored 6 years ago
  
  9bbb4715