Commit f00dc2bc authored by Ian Longden

updated info for priority xrefs

parent a4dcb98e
@@ -52,6 +52,12 @@ RGD (rat only)
Refseq_dna
----------
Refseq_dna is now a priority xref source for human, so in addition to the NCBI file it
also uses a local file, generated from the CCDS data, which DIRECTLY links RefSeqs to the Ensembl
transcripts. If a RefSeq is not in this file then the sequence data from the NCBI is mapped
via exonerate in the normal manner.
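The lookup order described above (direct CCDS-derived link first, sequence
mapping otherwise) can be sketched as follows. This is only an illustration:
the function names, the in-memory dictionary and the accessions are
hypothetical, and the exonerate step is a stand-in callable rather than the
real alignment run.

```python
from typing import Callable, Dict, List, Tuple


def map_refseq(
    refseq_id: str,
    direct_links: Dict[str, List[str]],
    sequence_mapper: Callable[[str], List[str]],
) -> Tuple[List[str], str]:
    """Return (transcript ids, method) for one RefSeq accession.

    direct_links stands for the local CCDS-derived file (hypothetical layout:
    RefSeq accession -> list of transcript ids). sequence_mapper stands for
    the normal exonerate-based mapping of the NCBI sequence data.
    """
    if refseq_id in direct_links:
        # The accession is in the local file, so the DIRECT link is used.
        return direct_links[refseq_id], "DIRECT"
    # Not in the local file: fall back to sequence mapping.
    return sequence_mapper(refseq_id), "SEQUENCE_MATCH"


# Made-up placeholder identifiers, for illustration only.
example_links = {"NM_EXAMPLE_1": ["ENST_EXAMPLE_1"]}


def fake_sequence_mapper(refseq_id: str) -> List[str]:
    # Placeholder for the exonerate run; always maps to one fake transcript.
    return ["ENST_FROM_ALIGNMENT"]


direct_result = map_refseq("NM_EXAMPLE_1", example_links, fake_sequence_mapper)
fallback_result = map_refseq("NM_EXAMPLE_2", example_links, fake_sequence_mapper)
```

Here direct_result comes from the local file and fallback_result from the
stand-in mapper.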
More generally, the files come in two types: those for specific species, e.g.
ftp://ftp.ncbi.nih.gov/genomes/Gallus_gallus/RNA/rna.gbk.gz
@@ -253,19 +259,37 @@ entries will also be matched.
HUGO
----
HUGO data uses priorities to allocate each identifier to one Ensembl id.
The priorities are:
1) Via Havana
2) Via CCDS
3) Via Refseq
4) Via Uniprot
5) Via EntrezGene
1) DIRECT relationships are made by transferring the manually annotated ones from
Havana to Ensembl.
LOCAL:HUGO/HUGO_TO_ENSG
2) DIRECT relationships are made by transferring the ones from CCDS to Ensembl.
LOCAL:HUGO/CCDS_TO_HUGO
3, 4 and 5)
The Human Genome Organisation xrefs are obtained using the following URL:
http://www.gene.ucl.ac.uk/cgi-bin/nomenclature/gdlw.pl?title=Genew+output+data
&col=gd_hgnc_id&col=gd_app_sym&col=gd_app_name&col=gd_prev_sym&col=gd_aliases
&col=md_prot_id&col=gd_pub_refseq_ids&status=Approved&status=Approved+Non-Human
&status_opt=3&=on&where=&order_by=gd_hgnc_id&limit=&format=text&submit=submit
&.cgifields=&.cgifields=status&.cgifields=chr
&col=md_prot_id&col=gd_pub_refseq_ids&col=md_eg_id&status=Approved
&status=Approved+Non-Human&status_opt=3&=on&where=&order_by=gd_hgnc_id&limit=
&format=text&submit=submit&.cgifields=&.cgifields=status&.cgifields=chr
Which is a script that produces a list of HUGO identifiers with the uniprot and
refseq entries they are linked to.
This is a script that produces a list of HUGO identifiers with the Uniprot,
Refseq and EntrezGene entries they are linked to.
The files have references to uniprot and refseq entries and so the GO entries are
set to be dependent xref on these.
The files have references to Uniprot, Refseq and EntrezGene entries and so the
HUGO entries are set to be dependent xrefs on these.
NOTE: due to the length of its name, the file is stored under the name of its checksum.
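The priority scheme for HUGO can be illustrated with a minimal sketch. It
assumes candidate links arrive as (hugo id, ensembl id, source) tuples and
that a lower number wins. The tuple layout, the source keys and the
identifiers are hypothetical and are not taken from the xref code itself.

```python
from typing import Dict, Iterable, Tuple

# Priority order as listed in the HUGO section (1 is the highest priority).
PRIORITY = {
    "havana": 1,
    "ccds": 2,
    "refseq": 3,
    "uniprot": 4,
    "entrezgene": 5,
}


def allocate(
    candidates: Iterable[Tuple[str, str, str]]
) -> Dict[str, Tuple[str, str]]:
    """Pick one Ensembl id per HUGO identifier using the source priority.

    candidates yields (hugo_id, ensembl_id, source) tuples. On a tie in
    priority the candidate seen first is kept (an assumption of this sketch).
    """
    best: Dict[str, Tuple[int, str, str]] = {}
    for hugo_id, ensembl_id, source in candidates:
        rank = PRIORITY[source]
        current = best.get(hugo_id)
        if current is None or rank < current[0]:
            best[hugo_id] = (rank, ensembl_id, source)
    # Drop the numeric rank from the result.
    return {h: (e, s) for h, (_, e, s) in best.items()}


# Made-up placeholder identifiers, for illustration only.
example = [
    ("HUGO_A", "ENSG_X", "refseq"),
    ("HUGO_A", "ENSG_Y", "havana"),
    ("HUGO_B", "ENSG_Z", "entrezgene"),
]
allocation = allocate(example)
```

In this example HUGO_A is allocated to ENSG_Y because the Havana link
outranks the Refseq one, while HUGO_B keeps its only candidate.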