From 987b78576644bb42c8c58bf3354d99ca1228a92b Mon Sep 17 00:00:00 2001
From: Magali Ruffier <mr6@ebi.ac.uk>
Date: Thu, 29 Sep 2016 16:52:49 +0100
Subject: [PATCH] file not used any more

---
 docs/ensembl_changes_spec.txt | 955 ----------------------------------
 1 file changed, 955 deletions(-)
 delete mode 100644 docs/ensembl_changes_spec.txt

diff --git a/docs/ensembl_changes_spec.txt b/docs/ensembl_changes_spec.txt
deleted file mode 100644
index 0670adfca6..0000000000
--- a/docs/ensembl_changes_spec.txt
+++ /dev/null
@@ -1,955 +0,0 @@
-ENSEMBL - API Change Specification
-==================================
-
-CONTENTS
---------
-
-Introduction
-Goals
-Schema Modifications
-  Proposed New/Modified Tables
-    seq_region
-    coord_system
-    seq_region_annotation
-    dna
-    assembly
-    gene
-    transcript
-    translation
-    all feature tables
-    meta_coord
-    misc_feature
-    misc_set
-    misc_feature_misc_set
-    misc_attrib
-  Removed Tables
-    contig
-    clone
-    chromosome
-Meta Information
-API Changes
-  Slice
-  Tile
-  SliceAdaptor
-  RawContig
-  RawContigAdaptor
-  Clone
-  CloneAdaptor
-  Chromosome
-  ChromosomeAdaptor
-  Root
-  Storable Base Class
-  Features
-    transform
-    transfer
-    move
-    project
-  StickyExon
-  AssemblyMapper
-  FeatureAdaptors
-  CoordSystemAdaptor
-New Features
-  Assembly Exceptions
-  Haplotypes
-  Pseudo Autosomal Regions
-  Multiple Assemblies
-Other Considerations
-  Loci  
-
-
-INTRODUCTION
-------------
-
-This document describes the changes that are being made to the EnsEMBL core
-schema and Perl/Java/C APIs.
-
-GOALS
------
--A cleaner, more intuitive API
--A more general schema able to better capture divergent assembly types
--More flexibility with regard to assembly-related data such as haplotypes,
- PARs, WGS assemblies etc.
-
-SCHEMA MODIFICATIONS
---------------------
-
-Proposed New/Modified Tables:
------------------------------
-
-  seq_region
-  ----------
-  The seq_region table is a generic replacement for the clone, contig, 
-  and chromosome tables.  Additionally supercontigs which were formerly in the
-  assembly table are also present in this table.  The name column can contain
-  chromosome names, clone accessions, supercontig names or anything that is
-  appropriate for the seq_region it describes.  The coord_system_id is a 
-  foreign key to the new coordinate system and is used to distinguish 
-  between the divergent types of sequence regions in the table.
-
-  seq_region_id    int
-  name             varchar
-  coord_system_id  int         references coord_system table
-  length           int
-
-
-  coord_system
-  ------------
-  The coordinate system table lists the available coordinate systems in the
-  database.  The attrib column is a MySQL set and is used to denote the default
-  version of each named coordinate system. E.g. there may be two 'chromosome'
-  coordinate systems and the default may be version 'NCBI34'.  The
-  'sequence_level' and 'top_level' attribs denote, respectively, the coordinate
-  system from which sequence is retrieved and the coordinate system which has
-  the largest assembled pieces.  The top_level coordinate system will usually
-  be 'chromosome' but for some shrapnel assemblies this may be something like
-  'supercontig' or 'clone'.
-
-  There may be multiple top_level coordinate systems provided that they share
-  the same name (but different versions) and provided one of them is the
-  default.  There may only be a single sequence_level coordinate system.
-
-  Note that the version in the coordinate system can be viewed as applying
-  to every seq_region of a given coordinate system.  It is analogous to
-  a CVS tag, not a CVS version.  E.g. the version 'NCBI33' would apply to
-  every chromosome seq_region so it is a valid version.  A clone accession of
-  '8' would not be a valid version because it only describes a particular
-  seq_region of the coordinate system - not all of them.
-
-  coord_system_id   int
-  name              varchar
-  version           varchar
-  attrib            set ('top_level', 'default_version', 'sequence_level')          
-
-
-  seq_region_annotation
-  ---------------------
-  This table allows for extra arbitrary information to be attached to 
-  seq_regions. For example the htg_phase was formerly part of the clone table
-  but now is stored in this table.
-  
-  seq_region_id   int
-  attrib_type_id  smallint          references attrib_type table
-  value           varchar
-
-
-  dna
-  ---
-  Formerly the contig table referenced the dna table.  Now the dna table 
-  references the seq_region table.  Every seq_region which has a coordinate
-  system with the 'sequence_level' attrib should be referenced by an entry in
-  the dna table.
-
-  seq_region_id  int 
-  sequence       varchar
-
-
-  assembly
-  --------
-  The assembly table has been made more generic.  Columns that previously
-  were named chr_* and contig_* have been renamed asm_* and cmp_* (assembled
-  and component) respectively.   The superctg_name column has been removed.
-  Supercontigs are now defined in the seq_region table.
-
-  The makeup of all seq_regions from smaller seq_regions can be described in
-  this table.  The relationships which are explicitly defined must be listed 
-  in the meta table.  For example, the clone <-> contig mapping used to be
-  defined in the contig table with an embl_offset column. This information is
-  now found in this table instead. 
-
-  asm_seq_region_id  int
-  asm_start          int
-  asm_end            int
-  cmp_seq_region_id  int
-  cmp_start          int
-  cmp_end            int
-  ori                tinyint
-
-  gene
-  ----
-  For faster retrieval and retrieval independently of transcripts and 
-  exons, genes have a seq_region_id, seq_region_start and seq_region_end
-  which define the span of their transcripts.
-
-  The transcript_count column has been removed as it was never used.
-  
-  gene_id             int
-  type                varchar
-  analysis_id         int
-  seq_region_id       int
-  seq_region_start    int
-  seq_region_end      int
-  seq_region_strand   tinyint
-  display_xref_id     int
-
-
-  transcript
-  ----------
-  For faster retrieval and retrieval independently of genes and exons
-  transcripts also have a seq_region_id, seq_region_start and 
-  seq_region_end. The translation_id has been removed; translations will point 
-  to transcripts instead (and pseudogenes will have no translation).  
- 
-  The exon_count column has been removed as it was never used.
-
-  transcript_id      int
-  gene_id            int 
-  seq_region_id      int
-  seq_region_start   int
-  seq_region_end     int
-  seq_region_strand  tinyint
-  display_xref_id    int
-
-  
-  translation
-  -----------
-  Translations now reference transcripts rather than transcripts referencing
-  a single (or no) translation.  This allows for more elegant handling of 
-  pseudogenes (where there is no translation) and also can be used to supply
-  multiple translations for a single transcript (e.g. polycistronic genes).
-
-  translation_id   int
-  transcript_id    int
-  start_exon_id    int
-  end_exon_id      int
-  seq_start        int
-  seq_end          int
-
-
-  all feature tables
-  ------------------
-  All feature tables would now have seq_region_id, seq_region_start, 
-  seq_region_end, seq_region_strand instead of contig_id, contig_start,
-  contig_end.  This includes the repeat_feature, simple_feature, 
-  dna_align_feature, protein_align_feature, exon, marker_feature,
-  karyotype and qtl_feature tables.
-
-  meta_coord
-  ----------
-  The meta coord table defines what coordinate systems are used to store each
-  type of feature.  A given type of feature may be stored in multiple
-  coordinate systems, but these will not be retrieved by the API unless there
-  is an entry in the meta_coord table.
-
-  table_name       varchar
-  coord_system_id  int
-
-
-  misc_feature
-  ------------
-  This is a renaming of the mapfrag table. The renaming reflects the fact that
-  this table can be used to store any type of feature.
-
-  misc_feature_id
-  seq_region_id
-  seq_region_start
-  seq_region_end
-  seq_region_strand
-
-  misc_set
-  --------
-  This table was formerly named mapset.  It defines 'sets' that can be used
-  to group misc_features together.
-
-  misc_set_id  smallint
-  code         varchar
-  name         varchar
-  description  text
-  max_length   int
-
-  misc_feature_misc_set
-  ---------------------
-  This is a link table defining the many-to-many relationship between the 
-  misc_set and misc_feature tables.
-
-  misc_feature_id   int
-  misc_set_id       smallint
-
-
-  misc_attrib
-  -----------
-  This table was formerly named mapfrag_annotation.  It contains arbitrary
-  annotations of misc_features and links to the same attrib_type table that
-  the seq_region_attrib table uses.
-
-  misc_feature_id  int
-  attrib_type_id   smallint
-  value            varchar
-
-
-  
-Removed Tables
---------------
-
-  contig
-  ------
-  Contigs are no longer needed.  They are stored as entries in the seq_region
-  table with type 'contig'.  The embl_offset and clone_id will not be
-  necessary as their relationship to clones can be described by the 
-  assembly table.
-
-  clone
-  -----
-  Clones are no longer needed.  Clones are stored as entries in the seq_region 
-  table with coord_system 'clone'.  The modified timestamp will be discarded 
-  as it is no longer maintained anyway.  The embl_acc, version, and 
-  embl_version columns are redundant and will also be discarded.  Versions
-  are simply appended onto the end of the name with a delimiting '.'. 
-
-  Any additional information that needs to be present (such as htg_phase) can 
-  be added to the seq_region_attrib table.
-
-  chromosome
-  ----------
-  This table is no longer needed.  Chromosomes can be stored in the 
-  seq_region table with a 'chromosome' coord_system.
-
-
-
-META INFORMATION
-----------------
-
-Considerably more meta information is stored in the core
-databases in order for the general approach to be maintained.  
-This information is stored in the new coord_system table and in the 
-meta and meta_coord tables.
-
-Meta information includes the following:
-
-  * The coordinate system that features of a given type are stored in.  This
-    information is stored in the meta_coord table and is used when constructing
-    queries for a particular feature table.
-
-  * The top-level coordinate system. For human
-    this would be 'chromosome'.  For briggsae this may be something like
-    'scaffold' or 'super contig'.  This information would be used to construct
-    the web display and would possibly be the default coordinate system when 
-    a coordinate system is unspecified by a user. This is stored as a flag
-    in the coord_system table.
-
-  * The default version of each coordinate system.  This is stored as a flag
-    in the coord_system table.
-
-  * The coord_system where sequence is stored.  This will be stored as a
-    flag in the coord_system table.  Initially it will only be possible
-    to have a single coord_system in which sequence is stored.  This 
-    may be extended in the future to allow sequence to be stored for multiple
-    coord_systems.
-
-  * The relationships between coordinate systems that are explicitly defined
-    in the assembly table.  The new API is capable of 2 step (implicit) mapping
-    between coordinate systems, but these relationships can be determined
-    through the direct relationship information.
- 
-    For example the clone, chromosome and nt_contig coordinate systems may all
-    be constructed from the contig coordinate system:
-      contig -> clone
-      contig -> chromosome
-      contig -> nt_contig
-    Or there may be a more hierarchical approach:
-      contig    -> clone
-      clone     -> nt_contig
-      nt_contig -> chromosome
-    This information is stored in the meta table under the key 
-   'assembly.mapping' with the following format (versions are optional):
-    assembled_coord_system_name[:version]|component_coord_system_name[:version]
-
-    For example the meta table for human might contain the following entries:
-    mysql> select * from meta where meta_key = 'assembly.mapping';
-     +---------+------------------+--------------------------+
-     | meta_id | meta_key         | meta_value               |
-     +---------+------------------+--------------------------+
-     |      43 | assembly.mapping | chromosome:NCBI33|contig |
-     |      44 | assembly.mapping | clone|contig             |
-     |      45 | assembly.mapping | supercontig|contig       |
-     +---------+------------------+--------------------------+
-
-   * The names of the allowable coordinate systems.  This would allow for 
-     quick validation of API requests and provide a list that could be used
-     by the website for coordinate system selection.  This information will be
-     stored in the coord_system table.
-
-   * The coordinate system(s) that each feature type is stored in. This is
-     stored in the meta_coord table.
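-
-As a rough illustration of the 'assembly.mapping' format described above, the
-following plain-Perl sketch splits one meta_value into its assembled and
-component parts.  The parse_assembly_mapping() helper is hypothetical and is
-not part of the EnsEMBL API; it only assumes the
-'assembled[:version]|component[:version]' layout given in this section.
-
-  ```perl
-  use strict;
-  use warnings;
-
-  # Hypothetical helper (not an EnsEMBL API method): split one
-  # 'assembly.mapping' meta_value into assembled and component parts.
-  sub parse_assembly_mapping {
-    my ($meta_value) = @_;
-
-    my ($asm, $cmp) = split /\|/, $meta_value, 2;
-    die "Malformed assembly.mapping value: '$meta_value'\n"
-      unless defined($asm) && defined($cmp) && length($asm) && length($cmp);
-
-    # The version is optional, so it may come back undefined.
-    my ($asm_name, $asm_version) = split /:/, $asm, 2;
-    my ($cmp_name, $cmp_version) = split /:/, $cmp, 2;
-
-    return {
-      assembled => { name => $asm_name, version => $asm_version },
-      component => { name => $cmp_name, version => $cmp_version },
-    };
-  }
-
-  my $mapping = parse_assembly_mapping('chromosome:NCBI33|contig');
-  printf "%s (%s) is assembled from %s\n",
-    $mapping->{assembled}{name},
-    $mapping->{assembled}{version} // 'default version',
-    $mapping->{component}{name};
-  ```
-
-A parser of this kind would let the API validate the meta table entries once
-at start-up rather than each time a mapper is requested.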
-
-
-API CHANGES
------------
-
-Slice
------
-  Slice methods chr_start, chr_end, chr_name will be renamed start, end, 
-  seq_region_name.  For backwards compatibility the old methods are 
-  chained to the new methods with deprecated warnings. 
-
-  A new slice method 'coord_system' will be added and will return a
-  Bio::EnsEMBL::CoordSystem object.
-
-  Slices will represent a region on a seq_region as opposed to a region on a
-  chromosome.  Slices will be immutable (i.e. their attributes will not be
-  changeable).  A new slice will have to be created if the attributes are to
-  be changed.
-
-  The following attributes will therefore define a unique slice:
-  coord_system    (e.g. object with name and version)
-  seq_region_name (e.g. 'X' or 'AL035554.1')
-  start           (e.g. 1000000 or 1)
-  end             (e.g. 2000001 or 800)
-  strand          (e.g. 1 or -1)
-
-  The name method will return the above values joined by a ':' delimiter, and
-  will not be settable:
-  e.g.  'chromosome:NCBI33:X:1000000:2000001:1' or 'clone::AL035554.1:1:800:-1'
-  This value can be used as a hashvalue that uniquely defines a slice.
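-
-  The naming scheme above can be sketched in plain Perl.  This is only an
-  illustration: the slice is modelled as a plain hash and the slice_name()
-  helper and its hash keys are hypothetical, not part of the API.
-
-  ```perl
-  use strict;
-  use warnings;
-
-  # Hypothetical stand-in for a slice: a plain hash holding the five
-  # attributes listed above.  In the API this would be a Slice object.
-  sub slice_name {
-    my ($slice) = @_;
-
-    return join ':',
-      $slice->{coord_system_name},
-      $slice->{coord_system_version} // '',   # the version may be absent
-      @{$slice}{qw(seq_region_name start end strand)};
-  }
-
-  my $clone_slice = {
-    coord_system_name => 'clone',
-    seq_region_name   => 'AL035554.1',
-    start             => 1,
-    end               => 800,
-    strand            => -1,
-  };
-
-  # Because the name is unique per slice it can key a cache of results.
-  my %cache;
-  $cache{ slice_name($clone_slice) } = 'features fetched for this slice';
-
-  print slice_name($clone_slice), "\n";   # clone::AL035554.1:1:800:-1
-  ```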
-
-  The concept of an 'empty' slice will no longer exist.
-
-  The get_tiling_path method will be deprecated in favour of a more general
-  method project().  Whereas get_tiling_path() implies a relationship between 
-  an assembly and the coordinate system which makes up the assembly, the
-  project method will allow conversion across any two coordinate systems.
-  It will take a coord_system string as an argument and rather 
-  than returning a list of Tile objects it will return a listref of triplets 
-  containing a start int, an end int, and a 'to' slice object.  The following 
-  is an example of how this method would be used ($clone is a reference to a 
-  slice object in the clone coordinate system):
-
-    my $clone_path = $slice->project('clone');
-
-    foreach my $segment (@$clone_path) {
-      my ($start, $end, $clone) = @$segment;
-      print $slice->seq_region_name, ':', $start, '-', $end , ' -> ',
-            $clone->seq_region_name, ':', $clone->start, '-', $clone->end,
-            ':', $clone->strand, "\n";
-    }
-
-    An optional second argument to project() will be the coordinate
-    system version.  E.g.:
-     my $ncbi34_path = $slice->project('chromosome', 'NCBI34');
-
-
-Tile
-----
-  The tile object will no longer be necessary.  However for backwards
-  compatibility it will remain in the system for some time before being phased
-  out along with the get_tiling_path method.
-
-
-SliceAdaptor
-------------
-  The Slice adaptor must provide a method to fetch a slice via its coordinate
-  system, seq_region_name, start, end, and strand.  
-  The old, commonly used method fetch_by_chr_start_end has been altered to 
-  simply chain to this new method (with a warning) as do most other 
-  SliceAdaptor methods.
-
-  Another method which is necessary with the disappearance of the Clone,
-  RawContig and Chromosome adaptors is one which allows for all slices
-  of a certain type to be retrieved.  For example it is often necessary to 
-  retrieve all chromosomes, or clones for a species.  This method is simply
-  named fetch_all.  The old fetch_all methods on the ChromosomeAdaptor, 
-  RawContigAdaptor, CloneAdaptor, etc. chain to the new method for backwards 
-  compatibility.
-
-  Method Names and Signatures
-  ---------------------------
-    Slice fetch_by_region(coord_system, name)
-    Slice fetch_by_region(coord_system, name, start)
-    Slice fetch_by_region(coord_system, name, start, end)
-    Slice fetch_by_region(coord_system, name, start, end, strand)
-    Slice fetch_by_region(coord_system, name, start, end, strand, version)
-    listref of Slices fetch_all(coord_system)
-    listref of Slices fetch_all(coord_system, version)
-  
-RawContig
----------
-  The RawContig object is no longer necessary with the new system.  RawContigs
-  are replaced by Slices with coord_system = 'contig'. In the interests of 
-  backwards compatibility the RawContig class will still be present for
-  some time as a minimal implementation inheriting from the Slice class.
-
-
-RawContigAdaptor
-----------------
-  The RawContigAdaptor is no longer necessary.  The RawContigAdaptor is 
-  replaced by the SliceAdaptor.  For backwards compatibility a minimal 
-  implementation of the RawContigAdaptor will remain which inherits from the 
-  SliceAdaptor.
-
-Clone
------
-  The Clone object is no longer necessary in the new system.  Clones are 
-  replaced by Slices with coord_system = 'clone'. For backwards compatibility
-  a minimal implementation will remain which inherits from the Slice object.
-
-CloneAdaptor
-------------
-  The CloneAdaptor object is no longer necessary in the new system.  The
-  CloneAdaptor is replaced by the SliceAdaptor.  For backwards compatibility
-  a minimal implementation will remain which inherits from the SliceAdaptor.
-
-Chromosome
-----------
-  The Chromosome object is no longer necessary in the new system.  The
-  Chromosome is replaced by Slices with coord system 'chromosome' (or
-  whatever the top level seq_region type is for that species).  For backwards
-  compatibility a minimal implementation will remain which inherits from the
-  Slice object.  
-
-  Statistical information (e.g. known genes, genes, snps) that 
-  was on chromosomes may be stored in the seq_region_attrib table or
-  in some sort of density table.
-
-ChromosomeAdaptor
------------------
-  The ChromosomeAdaptor object is no longer necessary in the new system. The 
-  ChromosomeAdaptor is replaced by the SliceAdaptor.  For backwards 
-  compatibility a minimal implementation which inherits from the SliceAdaptor
-  will remain.
-
-
-Root
-----
-  Every class in the current EnsEMBL perl API inherits directly or indirectly
-  from Bio::EnsEMBL::Root.  This inheritance is almost exclusively for the
-  following three methods:
-    throw
-    warn
-    _rearrange
-
-  Nothing is gained by implementing this relationship as inheritance, and there
-  are several disadvantages:
-    (1) Everything must inherit from this class to use those 3 object methods.
-    This can result in patterns of multiple inheritance which are generally
-    considered to be a bad thing.
-
-    (2) It is not possible to use the throw, warn or rearrange method within
-    the constructor until the object is blessed.  Blessing the object first
-    and then calling rearrange to extract named arguments is slower because
-    the blessed hash needs to be expanded as more keys are added and several
-    key access/value assignments may need to be performed.
-
-    (3) Objects become larger and object construction becomes slightly slower
-    because constructors traverse an additional level of inheritance.
-
-  A better approach, which we have used, is to make the methods static and 
-  create a static utility class that exports the methods.  The warn method has
-  been renamed warning so as not to conflict with the builtin perl function
-  warn and the _rearrange method has been renamed rearrange.
-
-  The following is an example of the old style Root inheritance and the new
-  style static utility methods:
-
-  #
-  # OLD STYLE
-  #
-  package Old;
-
-  use Bio::EnsEMBL::Root;
-
-  @ISA = qw(Bio::EnsEMBL::Root);
-
-  sub new {
-    my $caller = shift;
-    my $class = ref($caller) || $caller;
-
-    my $self = $class->SUPER::new(@_);
-    
-    my ($start, $end) = $self->_rearrange(['START', 'END'], @_);
-
-    if(!defined($start) || !defined($end)) {
-      $self->throw('-START and -END arguments are required');
-    }
-
-    $self->{'start'} = $start;
-    $self->{'end'}   = $end;
-
-    return $self;
-  }
-
-  #
-  # NEW STYLE
-  #
-  package New;
-
-  use Bio::EnsEMBL::Utils::Exception qw(throw warning);
-  use Bio::EnsEMBL::Utils::Argument  qw(rearrange);
-
-  sub new {
-    my $caller = shift;
-
-    my $class = ref($caller) || $caller;
-
-    my ($start, $end) = rearrange(['START', 'END'], @_);
-
-    if(!defined($start) || !defined($end)) {
-      throw('-START and -END arguments are required');
-    }
-
-    return bless {'start' => $start, 'end' => $end}, $class;
-  }
-
-  The calls to $self->_rearrange, $self->warn and $self->throw have been
-  replaced by calls to the static rearrange(), warning() and throw() methods
-  inside the core API.  However, for backwards compatibility the existing
-  inheritance from Bio::EnsEMBL::Root will remain in many cases (and be removed
-  at a later date).
-
-
-Storable Base Class
--------------------
-  Almost all business objects in the EnsEMBL system are storable in the db
-  and those which are always require 2 methods: dbID() and adaptor().  These
-  methods have been moved to a Storable base class which most of the 
-  business objects now inherit from.  This module has an additional method
-  is_stored() which takes a database argument and returns true if the object
-  appears to already have been stored in the provided database.
-
-Features
---------
-  All features should inherit from a base class that implements common feature
-  functionality.  Formerly this role was filled by the bloated SeqFeature class
-  which inherits from Bio::SeqFeature and Bio::SeqFeatureI etc.
-  This class has been replaced by a smaller, less complicated 
-  implementation named Feature.  To make classes more polymorphic in general,
-  the gene, and transcript objects should now also inherit from the Feature
-  class.  This class implements the following core methods common to all 
-  features:
-
-    start
-    end
-    strand
-    slice (formerly named contig/entire_seq/etc.)
-    transform
-    transfer
-    project
-    analysis
-    
-  The feature class inherits from the Storable base class and thereby inherits
-  the following methods:
-
-    adaptor
-    dbID
-    is_stored
-
-  The signature and behaviour of the transform method has been changed.  The
-  existing method works differently depending on the arguments passed as 
-  described below.
-
-    OLD transform(no arguments)
-    -----------------------
-      Transforms from slice coordinates to contig coordinates.  The feature
-      is changed in place and returned.  If the feature already is in contig
-      coordinates an exception is thrown.  The feature may be split into two
-      features in which case both features are returned (not sure if one of
-      them is transformed in place).  Some features are not permitted to be
-      split in two in which case an exception is thrown? (not sure) if it is
-      to be split across contigs.
-
-    OLD transform(slice)
-    ----------------
-      If the feature is already in slice coordinates and the slice is on the
-      same chromosome the feature's coordinates are simply shifted.  If the
-      feature is already in slice coordinates but on a different chromosome
-      an exception is thrown.
-      If the feature is in contig coordinates and the slice is not empty then
-      it is transformed onto the new slice (or an exception is thrown if the 
-      transform would cause the feature to end up on a different chromosome
-      than the slice).  If the feature is in contig coordinates and the
-      slice is an empty slice the feature is transformed into chromosomal
-      coordinates and placed on a newly created slice of the entire chromosome.
-
-    The new transformation has only a single valid signature and splits its 
-    responsibilities with the new transfer method.  The transfer 
-    method transfers a feature onto another slice, whereas the transform
-    method simply converts coordinate systems. Transform does 
-    NOT transform features in place but rather returns the newly 
-    transformed feature as a new object:
-    
-    transform(coord_system, [version])
-    -----------------------
-      Takes a single string specifying the new coord system. If the coord
-      system is not valid an exception is thrown. If the coord system is the
-      same coord system as the feature is currently in, a new feature that is
-      a copy of the old one is still returned.  This also retrieves
-      a slice which is the entire span of the region of the coordinate system
-      that this feature is being transformed to.  For example transforming
-      an exon in contig coordinates to chromosomal coordinates will place a 
-      copied exon on a slice of an entire chromosome.  If a feature spans a 
-      boundary in the coordinate system, undef is returned by the method 
-      instead.
-
-    transfer(slice)
-    ----------------
-      Shifts a feature from one slice to another.  If the new slice is in the
-      same coordinate system but different seq_region_name (e.g. both 
-      chromosomal but different chromosomes) an exception is thrown.  
-      If the new slice is in a different coordinate system then the 
-      transform method is internally called first.  If the feature would be 
-      split across a boundary undef is returned instead.  After the transform 
-      there follows a potential move, if the slice does not cover the full 
-      seq_region. If there is no transform call necessary, the feature is 
-      copied and then moved.
-
-    move( start, end, strand )
-    --------------------------
-      In place change of the coordinates of the feature. It will stay on the 
-      same slice.
-
-    project(coord_system, [version])
-    -----------------
-      This method is analogous to the project method on Bio::EnsEMBL::Slice.
-      It 'projects' a feature onto another coordinate system and returns the
-      results formatted as a listref of [$start, $end, $feature] triplets.
-      The $features returned are copies of the feature on which the method was
-      called, but with coordinates in the coordinate system that was projected
-      to.  If the feature maps entirely to a gap then an empty list ref [] will
-      be returned.  If the feature is mapped to multiple locations a listref
-      containing split features will be returned.
-
-StickyExon
-----------
-  The sticky exon object is not present in the new system.  It does not
-  make sense to define features in a coordinate system where they are simply 
-  not present.  Exons are calculated in chromosomal coordinates and they
-  will generally be retrieved in the same coordinate system.  It will
-  of course be possible to still retrieve exons in contig coordinates but only
-  if they are fully defined on the contigs of interest.
-  The split coordinates can be obtained through a call to the project
-  method.
-
-
-AssemblyMapper
---------------
-  The assembly mapper and assembly mapper adaptor classes have become more
-  general and sophisticated.  Not only is it possible to map between two
-  coordinate systems whose relationship is explicitly defined in the assembly
-  table, but it is also possible to perform implicit, 2-step mapping
-  using 'coordinate system chaining'.
-
-  For example if no explicit relationship is defined between the supercontig
-  and clone coordinate systems but relationships between the clone and contig
-  and the supercontig and contig coordinate systems are present the mapper has
-  the faculty to perform the mapping between the clone and supercontig systems.
-  In this case the contig coordinate system is used as an intermediary:
-  
-  NTContig <-> Contig <-> Clone
-
-  In the above example the assembly mapper adaptor internally does the 
-  following:
-  
-  (1) Create a mapper object between the NTContig and Contig region
-  (2) Create a mapper object between the Contig and Clone region
-  (3) Create and return a third mapper constructed from the sets of mappings
-  generated by the intermediate mappers.
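-
-  The chaining step can be sketched in plain Perl.  This is a simplified
-  illustration rather than the AssemblyMapper implementation: it assumes
-  forward orientation only, models each mapping as a row shaped like an
-  assembly table entry, and uses a hypothetical chain_mappings() helper.  The
-  region names 'NT_1', 'AL1' and 'ctg1' are made up for the example.
-
-  ```perl
-  use strict;
-  use warnings;
-  use List::Util qw(max min);
-
-  # Each row mimics one assembly table entry, forward orientation only:
-  #   [asm_name, asm_start, asm_end, cmp_name, cmp_start, cmp_end]
-  # Hypothetical helper: chain two row sets through their shared component.
-  sub chain_mappings {
-    my ($first, $second) = @_;
-    my @chained;
-
-    for my $row1 (@$first) {
-      my ($asm1, $asm1_start, undef, $cmp1, $cmp1_start, $cmp1_end) = @$row1;
-
-      for my $row2 (@$second) {
-        my ($asm2, $asm2_start, undef, $cmp2, $cmp2_start, $cmp2_end) = @$row2;
-        next unless $cmp1 eq $cmp2;
-
-        # Stretch of the intermediate region covered by both rows.
-        my $start = max($cmp1_start, $cmp2_start);
-        my $end   = min($cmp1_end,   $cmp2_end);
-        next if $start > $end;
-
-        push @chained, [
-          $asm1, $asm1_start + $start - $cmp1_start, $asm1_start + $end - $cmp1_start,
-          $asm2, $asm2_start + $start - $cmp2_start, $asm2_start + $end - $cmp2_start,
-        ];
-      }
-    }
-    return \@chained;
-  }
-
-  my $nt_to_contig    = [ ['NT_1', 1,   1000, 'ctg1', 1,   1000] ];
-  my $clone_to_contig = [ ['AL1',  501, 1300, 'ctg1', 201, 1000] ];
-
-  for my $pair (@{ chain_mappings($nt_to_contig, $clone_to_contig) }) {
-    printf "%s:%d-%d <-> %s:%d-%d\n", @$pair;   # NT_1:201-1000 <-> AL1:501-1300
-  }
-  ```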
-
-  
-FeatureAdaptors
----------------
-  Most FeatureAdaptors inherit from the BaseFeature adaptor.  As a 
-  minimum feature adaptors provide fetch_all_by_Slice and fetch_by_dbID 
-  methods. The fetch by slice method provides the same return types and 
-  requires the same arguments as before, but required some internal changes.
-
-  The simplified algorithm for fetching features via a slice is:
-
-  (1) Check which coord system is requested, i.e. which one the slice is in.
-  (2) Check which coord system features are in
-  (3) Obtain mapper between coord systems
-  (4) Retrieve features in their native coord system.
-  (5) Remap features to the requested coord system using the mapper
-  (6) Return the features
-
-  The method fetch_all_by_RawContig is obsolete (it is equivalent to
-  fetching by a slice of a contig) but has been left in as an alias for the
-  fetch all by slice method for backwards compatibility.
-
-  When performing a non-locational fetch (e.g. by dbID) features are still
-  returned in the coordinate system that they are calculated in.  This is to
-  ensure that the feature can always be retrieved in this manner of fetching
-  and so that features which are not in the database can be distinguished from
-  features which are simply not in the requested coordinate system.  When a 
-  single feature which is not in the database is requested via a non-locational
-  fetch undef is returned instead.  If multiple features are requested but none
-  are present in the database a reference to an empty list is returned.  If the
-  features are required in a specific coordinate system the transfer, project 
-  or transform method can always be used.
-
-
-CoordSystemAdaptor
-------------------
-  A CoordSystemAdaptor provides access to the information in the 
-  coord_system, meta and meta_coord tables.  This adaptor provides 
-  Bio::EnsEMBL::CoordSystem objects.
-
-
-NEW FEATURES
-------------
-
-Assembly Exceptions (Symbolic Sequence Links)
----------------------------------------------
-
-  It is sometimes desirable to have multiple regions refer to the same 
-  sequence.
-
-  In much the same way a symlinked file acts as a pointer to a real file, 
-  a symlinked region can point to another region of sequence.
-
-  This can be described in the database through the addition of a table which 
-  has a structure that mirrors that of the assembly table. The assembly table
-  does not define the structure underlying this seq_region, and it does not
-  have sequence of its own.  By means of the assembly_exception table this 
-  seq_region points to another seq_region where the underlying sequence is 
-  defined:
-
-      assembly_exception
-      ------------------   
-      seq_region_id        int
-      seq_region_start     int
-      seq_region_end       int
-      exc_type             enum('HAP', 'PAR')
-      exc_seq_region_id    int
-      exc_seq_region_start int
-      exc_seq_region_end   int
-      ori                  int  (may not be needed, may implicitly be 1)
-
-   When fetching features and sequence from a slice that overlaps a symlinked
-   region, the features and sequence from the symlinked region are returned.  
-   This may be implemented by altering fetch by slice calls and adding a 
-   SliceAdaptor method which splits a slice into non-symlinked components.  
-   The following algorithm would apply to sequence and feature fetches:
-      (1) Split the slice into non-symlinked component slices
-      (2) Recursively call the method with the component slices
-      (3) Adjust the start and end of the returned features and place them
-          back on the original slice (or splice the sequence together if this
-          is a sequence fetch)
-      (4) Return the features or sequence
-   
-   Consider a slice which overlaps regions (A), (B), and (C) on chromosome Y:
-
-             ===============  (chrX)
-               ^^^^^^^^^^^
-     ========   symlink     =========  (chrY)
-      (A)          (B)         (C)
-
-   Regions (A) and (C) are described by the assembly table, but region (B)
-   is described in the assembly_exception table and points to a region of
-   chromosome X.  When features or sequence are retrieved the slice is split
-   into 3 component slices which have no symlinks:  regions (A) and (C) are
-   slices on chromosome Y but region (B) is made into a slice on chromosome X.
-   The features are fetched from the individual slices, their coordinates are
-   adjusted by the appropriate offset, and they are placed back on the
-   original slice before being returned.
-  
-   
-    
-Haplotypes (and the MHC region)
--------------------------------
-  There are several requirements related to haplotypes:
-    - Must be able to determine which haplotypes overlap a slice
-    - Must be able to run genebuild/raw computes over the haplotypes
-    - Must be able to retrieve a slice on a haplotype and its flanking
-      regions (i.e. the regions of the default assembly bordering the 
-      haplotype).
-    - It may be desirable to interpolate features from the default sequence
-      onto the haplotype
-
-   Proposal:
-    The haplotype will be present as a full length 'chromosome' in the 
-    seq_region table (or other appropriate coordinate system) but only the 
-    region which differs from the default assembly will be described in 
-    the assembly table.  The regions which are identical will be described 
-    by the assembly_exception table. 
-  
-    It is possible to retrieve a slice on a haplotype just as any other slice
-    is retrieved from the SliceAdaptor.  For example: 
-    $slice = $slice_adaptor->fetch_by_region('chromosome', '6_DR52');
-    
-    A slice created on a haplotype will have coordinates relative to the
-    start of the chromosome NOT relative to the start of the haplotype
-    region. For all intents and purposes a haplotype slice will behave as
-    a normal slice.
-
-    For example, the assembly table could define the composition of the 
-    divergent region of chromosome 6_DR52 (C), but leave the remainder of the
-    chromosomal composition undefined.  The remainder of the 
-    chromosome composition would be accounted for by 2 rows in the 
-    assembly_exception table which describe the synonymous regions in terms 
-    of chromosome 6:
- 
-       ==============  6       ==============  6
-            ^                         ^
-       _____|________          _______|______  
-                   C ==========               6_DR52
-
-
-
-
-
-Pseudo Autosomal Regions (PARs)
--------------------------------
-  There are several requirements related to PARs:
-    - The same sequence and features must be present on a region of
-      both chromosome X and chromosome Y
-    - The region and features should be returned when retrieving features
-      from either chromosome.
-    - It must still be possible to retrieve one of the features via its 
-      identifier
-    - It must still be possible to transform features in the region from 
-      chromosomal coordinates to contig coords and vice-versa.
-    - The genebuild should run over the region, but only once.
-
-  Proposal:
-    Use the assembly_exception table in a similar fashion as it is used
-    for the haplotypes described above. Chromosome X can be the 'default' 
-    chromosome for the PAR and Chromosome Y can be described by the assembly 
-    table except in the PAR.  The PAR on chromosome Y can be defined by the 
-    assembly_exception table and refer to the corresponding sequence on 
-    chromosome X.  The same algorithm as used for haplotypes can then be used 
-    when retrieving sequence or features from slices which overlap this 
-    exception on chromosome Y.
-
-    The following diagram illustrates how chromosome X and chromosome Y could
-    be defined:
-
-    ========================================== X
-           ^                 ^ 
-          _|_            ____|____ 
-    ======   ============         ============= Y
-
-
-Multiple Assemblies
--------------------
-
-In theory it is possible to load multiple assemblies into the same database.
-For example, two versions of the chromosome coordinate system,
-chromosome:NCBI33 and chromosome:NCBI34, could be loaded into the database.
-Because two step mapping is possible and both coordinate systems share a
-mapping with the contig coordinate system, it is possible to pull
-across annotation from one assembly to the other. The following example 
-illustrates the transfer of genes from chromosome X on the NCBI33 assembly
-to the NCBI34 assembly:
-
-  $slice = $slice_adaptor->fetch_by_region('chromosome', 'X', undef,
-                                           undef,undef, 'NCBI33');
-  @genes = @{$gene_adaptor->fetch_all_by_Slice($slice)};
-
-  foreach my $gene (@genes) {
-    my $ncbi34_gene = $gene->transform('chromosome', 'NCBI34');
-    #...
-  }
-
-
-OTHER CONSIDERATIONS
---------------------
-
-Loci
-----
-
-Similar genes which are defined across haplotypes need
-to be somehow linked into loci.  The intent is that 
-a user would be able to see that a gene has a counterpart
-on an equivalent haplotypic sequence. 
-
-This is work in progress. The current opinion among us is to implement
-it via a relationship table that specifies which genes on default
-haplotypes are considered equivalent to other genes on
-(overlapping) haplotypes.
-  
-- 
GitLab