Feature/alias chromosome

Marek Szuba requested to merge feature/alias_chromosome into master

Created by: nwillhoft

Description

Draft pull request adding a 'chromosome' alias feature to the coordinate system object. See the JIRA ticket for more details: https://www.ebi.ac.uk/panda/jira/browse/ENSCORESW-3021

Use case

Ensembl databases created in the last few years may not contain a coordinate system named 'chromosome', owing to a change in how sequence assemblies are formatted. One example is the Atlantic Salmon database, which contains only a single coordinate system, named 'primary_assembly'. If a user requests a chromosome slice object and the queried database has no coordinate system explicitly named 'chromosome', this new feature uses top-level and karyotype-based attributes in the database to identify the appropriate coordinate system and adds a 'chromosome' alias to it.
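
The selection rule described above can be illustrated with a small standalone sketch. It does not use the Ensembl API or reproduce the code in this merge request: the subroutine name find_chromosome_alias_target and the hash-based records are hypothetical stand-ins for database rows, and the rule itself (alias the single coordinate system whose top-level regions carry a karyotype rank) is an assumption about how the attributes might be combined.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of the Ensembl API: given coordinate system
# names and seq region records, pick the coordinate system that should
# receive the 'chromosome' alias.
sub find_chromosome_alias_target {
    my ($coord_system_names, $seq_regions) = @_;

    # An explicitly named 'chromosome' coordinate system needs no alias.
    return 'chromosome' if grep { $_ eq 'chromosome' } @$coord_system_names;

    # Count, per coordinate system, the regions that are both top-level
    # and part of the karyotype (assumed rule for this sketch).
    my %candidates;
    for my $region (@$seq_regions) {
        next unless $region->{toplevel} && defined $region->{karyotype_rank};
        $candidates{ $region->{coord_system} }++;
    }

    # Only alias when exactly one coordinate system qualifies.
    my @names = sort keys %candidates;
    return @names == 1 ? $names[0] : undef;
}

# Made-up records resembling a database with only 'primary_assembly'.
my @coord_systems = ('primary_assembly');
my @seq_regions   = (
    { name => '1', coord_system => 'primary_assembly', toplevel => 1, karyotype_rank => 1 },
    { name => '2', coord_system => 'primary_assembly', toplevel => 1, karyotype_rank => 2 },
    { name => 'unplaced_scaffold_1', coord_system => 'primary_assembly', toplevel => 1 },
);

my $target = find_chromosome_alias_target(\@coord_systems, \@seq_regions);
print "Alias 'chromosome' -> ", (defined $target ? $target : 'none'), "\n";
```

Returning undef when zero or several coordinate systems qualify keeps the sketch conservative: an ambiguous database gets no alias rather than a guessed one.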

Registry details used in testing:

use strict;
use warnings;
use Bio::EnsEMBL::Registry;

my $registry = 'Bio::EnsEMBL::Registry';
$registry->load_registry_from_db(
    -host    => 'mysql-ens-mirror-1.ebi.ac.uk',
    -user    => 'anonymous',
    -verbose => '0',
    -port    => 4240,
    -no_sql_schema_version_check => 1,
    -db_version => '101',
);

Atlantic Salmon use case details:

my $species = 'clupea_harengus';
my $group = 'core';

Slice adaptor / Slice example used:

my $slice_adaptor = $registry->get_adaptor( $species, $group, 'Slice' );
my $slice = $slice_adaptor->fetch_by_region('chromosome', '1');

# query slice for information about itself:
my $start = $slice->start();
my $end = $slice->end();

print "Name: ", $slice->coord_system()->name(), " ",
      $slice->seq_region_name, ", ",
      "alias: ", $slice->coord_system()->alias(), ", ",
      "start: $start, ",
      "end: $end.\n";

Benefits

Where an Ensembl Core species database does not have a coordinate system named 'chromosome', this code creates an alias to the appropriate coordinate system when a user requests a chromosome slice object.

Possible Drawbacks

This is only a draft working version. So far, the functionality has only been added to the SliceAdaptor::fetch_by_region() subroutine.

Testing

Wrote a simple Perl script that accesses the Atlantic Salmon database and creates a chromosome slice object. Checked the output against the expected values using print statements.

Edited by Stefano Giorgetti