Make attributes persistent
Created by: s-mm
This PR combines changes required to resolve ENSCORESW-3411 and ENSCORESW-3390.
- ENSCORESW-3411: Author changed for any transcript in an edited locus for the following reasons:
  - Missing attributes
  - Difference in number of the same attribute
  - Difference in letter case for vega hashkey for transcript
  - Trailing white space in vega hashkey
  - Difference in status of biotypes
Similar reasons were observed for author changes at gene level.
- ENSCORESW-3390: When a gene/transcript is edited, only a few attributes are added to the gene_attrib/transcript_attrib table. Attributes such as 'vega_name', 'TAGENE_transcript', 'MANE_Select', 'ccds_transcript', 'miRNA', 'ncRNA' and 'Frameshift' are not added. Whenever a locus is edited, the code checks whether these attributes are present. Because they are not added to the gene_attrib/transcript_attrib table, the code assumes the gene/transcript has been edited even when it has not.

  'vega_name', 'TAGENE_transcript', 'MANE_Select', 'ccds_transcript', 'miRNA', 'ncRNA' and 'Frameshift' are set for transcripts. 'vega_name' is set for genes.
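
The false "edited" result described for ENSCORESW-3390 can be illustrated with a minimal Python sketch. It is not the code in this PR: the function names `save` and `looks_edited`, the base set containing only 'name', and the attribute values are all hypothetical. It assumes attributes are compared as multisets of (code, value) pairs.

```python
from collections import Counter

# Attribute codes listed in this PR description for transcripts.
NEWLY_PERSISTED_TRANSCRIPT = {
    "vega_name", "TAGENE_transcript", "MANE_Select",
    "ccds_transcript", "miRNA", "ncRNA", "Frameshift",
}


def save(attribs, persisted_codes):
    """Hypothetical save step: keep only attributes whose code is persisted."""
    return [(code, value) for code, value in attribs if code in persisted_codes]


def looks_edited(stored, incoming):
    """Hypothetical check: any difference in the attribute multiset counts as an edit."""
    return Counter(stored) != Counter(incoming)


# Placeholder values; real attribute values are not given in the description.
incoming = [("name", "example-201"), ("MANE_Select", "placeholder")]

# Assumed old behaviour: 'MANE_Select' is dropped on save, so an untouched
# transcript no longer matches what was stored.
before = save(incoming, {"name"})
# Assumed new behaviour: the listed codes are written as well.
after = save(incoming, {"name"} | NEWLY_PERSISTED_TRANSCRIPT)

print(looks_edited(before, incoming))  # True
print(looks_edited(after, incoming))   # False
```

Using a multiset rather than a set also covers the "difference in number of the same attribute" case, since duplicate (code, value) pairs are counted.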
Note:
- The upstream_ATG, parent_exon_key and parent_sid attributes have not been handled because they have not been tested.
- Author changes have been noticed for transcripts due to a change in status for a few biotypes (miRNA, ncRNA). Author changes have been noticed for genes because the code sets the 'name' attribute for these genes. This has been noticed for genes that were introduced into the DB as part of the NoMerge process.
- Random author changes have been noticed for ENSE and ENSP. This is due to a change in vega_hashkey.
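
The letter-case and trailing-white-space differences in the hash key can be illustrated with a small Python sketch. The helper names and the key strings are hypothetical, and whether case folding is the right policy for vega_hashkey is an assumption, not something stated in this PR.

```python
def normalise_hashkey(key: str) -> str:
    """Strip surrounding white space and fold case (an assumed policy)."""
    return key.strip().lower()


def hashkeys_match(a: str, b: str) -> bool:
    """Compare two hash keys after normalisation."""
    return normalise_hashkey(a) == normalise_hashkey(b)


# Made-up keys that differ only in case and a trailing space.
raw_a = "Example-Key:1:2 "
raw_b = "example-key:1:2"

print(raw_a == raw_b)                # False
print(hashkeys_match(raw_a, raw_b))  # True
```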