ensembl-otter merge requests
https://gitlab.ebi.ac.uk/ensembl-gh-mirror/ensembl-otter/-/merge_requests
2020-10-09T18:32:35Z

https://gitlab.ebi.ac.uk/ensembl-gh-mirror/ensembl-otter/-/merge_requests/127
Module to fetch data from a compara database
2020-10-09T18:32:35Z
Marek Szuba
*Created by: thibauthourlier*

https://gitlab.ebi.ac.uk/ensembl-gh-mirror/ensembl-otter/-/merge_requests/126
Do not check pipeline table if primary_assembly
2020-10-09T18:30:56Z
Marek Szuba
*Created by: thibauthourlier*
This is needed for herring and all new species

https://gitlab.ebi.ac.uk/ensembl-gh-mirror/ensembl-otter/-/merge_requests/118
Make attributes persistent
2021-02-25T10:18:58Z
Marek Szuba
*Created by: s-mm*
This PR combines changes required to resolve ENSCORESW-3411 and ENSCORESW-3390.
1. ENSCORESW-3411
The author changed for any transcript in an edited locus for the following reasons:
- Missing attributes
- Difference in the number of occurrences of the same attribute
- Difference in letter case in the transcript's vega hashkey
- Trailing whitespace in the vega hashkey
- Difference in the status of biotypes
Similar reasons were observed for author changes at gene level.
2. ENSCORESW-3390
When a gene/transcript is edited, only a few attributes are added to the gene_attrib/transcript_attrib tables. Attributes such as `'vega_name', 'TAGENE_transcript', 'MANE_Select', 'ccds_transcript', 'miRNA', 'ncRNA', 'Frameshift'` are not added. Whenever a locus is edited, the code checks whether these attributes are present. Because they are missing from the gene_attrib/transcript_attrib tables, the code assumes the gene/transcript has been edited even when it has not.
`'vega_name', 'TAGENE_transcript', 'MANE_Select', 'ccds_transcript', 'miRNA', 'ncRNA', 'Frameshift'` are set for transcripts. `'vega_name'` is set for genes.
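
The false "edited" signal described above can be illustrated with a small Python sketch. This is not the otter code (which is Perl); `looks_edited` and the dictionaries are hypothetical, and only the attribute codes are taken from the description. It shows, under those assumptions, why a presence check fails when the stored record never received the attributes, and why persisting them removes the mismatch.

```python
# Attribute codes quoted in the PR description; all other names are hypothetical.
TRANSCRIPT_CODES = (
    "vega_name", "TAGENE_transcript", "MANE_Select",
    "ccds_transcript", "miRNA", "ncRNA", "Frameshift",
)


def looks_edited(stored, incoming, codes=TRANSCRIPT_CODES):
    """Return True if any checked attribute differs between the two records."""
    return any(stored.get(code) != incoming.get(code) for code in codes)


# Hypothetical attribute values for one untouched transcript.
incoming = {"vega_name": "ABC-001", "MANE_Select": "1"}

stored_without_attribs = {}           # attributes were never written to the table
stored_with_attribs = dict(incoming)  # attributes were persisted

print(looks_edited(stored_without_attribs, incoming))  # True: reported as edited
print(looks_edited(stored_with_attribs, incoming))     # False: no spurious change
```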
Note:
- The `upstream_ATG`, `parent_exon_key` and `parent_sid` attributes have not been handled, owing to a lack of testing.
- Author changes have been noticed for transcripts due to a change in status for a few biotypes (miRNA, ncRNA). Author changes have been noticed for genes because the code sets the `'name'` attribute for these genes. This has been noticed for genes that were introduced into the DB as part of the NoMerge process.
- Random author changes have been noticed for ENSE and ENSP. This is due to a change in vega_hashkey.
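
As a rough illustration of the comparison pitfalls listed under ENSCORESW-3411 (attribute counts, letter case and trailing whitespace in the hashkey), here is a minimal Python sketch. The function names and the normalization step are assumptions made for illustration; the description does not say how the PR itself resolves these cases.

```python
from collections import Counter


def attrib_lists_differ(old, new):
    """Compare two lists of (code, value) pairs as multisets.

    Order is ignored, but the number of occurrences of each pair matters.
    """
    return Counter(old) != Counter(new)


def hashkeys_differ(old_key, new_key, normalize=False):
    """Compare two hashkey strings, optionally ignoring case and trailing whitespace."""
    if normalize:
        old_key = old_key.rstrip().lower()
        new_key = new_key.rstrip().lower()
    return old_key != new_key


# The same attribute stored once versus twice counts as a difference.
print(attrib_lists_differ([("remark", "x")], [("remark", "x"), ("remark", "x")]))  # True

# A strict comparison flags case and trailing-space differences; a normalized one does not.
print(hashkeys_differ("Chr1-ABC ", "chr1-abc"))                  # True
print(hashkeys_differ("Chr1-ABC ", "chr1-abc", normalize=True))  # False
```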