Commit 269d5c19 authored by Magali Ruffier

added new test database with complete 72 data

allows testing on patches and similar
parent 25e2ec54
Showing changes with 4011 additions and 0 deletions
1 4679 633682
2 4679 633696
3 4790 633688
4 4790 633702
5 4795 633683
6 4795 633697
7 4998 633686
8 4998 633700
9 5212 633689
10 5212 633703
1 IS_REPRESENTATIVE
3 IS_REPRESENTATIVE
5 IS_REPRESENTATIVE
7 IS_REPRESENTATIVE
9 IS_REPRESENTATIVE
4679
4790
4795
4998
5212
8405 2009-05-14 15:43:42 ensembl_havana_gene NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL
8406 2009-06-01 09:01:22 xrefexoneratedna NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL
8407 2009-05-14 15:43:42 havana NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL
8408 2007-09-07 12:01:22 est2genome_human_havana NULL 10-Nov-10 (105) NULL NULL NULL NULL ori_analysis => Est2genome_human_raw, mode => single ClusterDepthFilter NULL EST_Human similarity
8409 2007-09-07 12:01:22 vertrna_havana NULL 10-Nov-10 (105) NULL NULL NULL NULL ori_analysis => vertrna_raw, mode => single, no_filter => 9606 ClusterDepthFilter NULL vertebrate_mRNA similarity
8410 2010-09-30 09:17:16 human_cdna2genome NULL NULL NULL NULL NULL /nfs/users/nfs_j/jhv/bin/exonerate.hacked.cdna2genome NULL Exonerate2Genes NULL NULL NULL
8411 0000-00-00 00:00:00 human_cdna NULL NULL NULL exonerate 0.9.0 /usr/local/ensembl/bin/exonerate-0.9.0 NULL Exonerate2Genes NULL Exonerate similarity
8412 2007-09-07 12:01:22 uniprot_sw_havana NULL 2010_11 NULL NULL NULL NULL percentid_cutoff => 40, ori_analysis => Uniprot_raw, hit_db => Swissprot, mode => single DepthFilter NULL SwissProt NULL
8413 2009-03-11 17:25:55 human_protein NULL refseq_40,uniprot_2010_07 NULL NULL NULL NULL NULL BestTargetted NULL NULL NULL
8414 2007-09-07 12:01:22 uniprot_tr_havana NULL 2010_11 NULL NULL NULL NULL percentid_cutoff => 40, ori_analysis => Uniprot_raw, hit_db => TrEMBL, mode => single DepthFilter NULL TrEMBL NULL
8415 2009-06-01 09:01:22 xrefexonerateprotein NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL
8416 2012-09-03 17:48:15 xrefchecksum NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL
8417 2013-04-02 14:50:52 pfam Pfam NULL /data/blastdb/Ensembl/interpro_scan/Pfam-A.hmm /software/ensembl/bin/hmmer3/hmmscan NULL /software/ensembl/bin/hmmer3/hmmscan --acc --noali --cut_ga --cpu 1 ProteinAnnotation/Hmmpfam NULL Pfam domain
8418 2013-04-02 14:50:52 superfamily Superfamily NULL /data/blastdb/Ensembl/interpro_scan/superfamily.tab /software/ensembl/bin/superfamily.pl NULL /software/ensembl/bin/superfamily.pl -t /tmp -m /data/blastdb/Ensembl/interpro_scan/superfamily.hmm -d /data/blastdb/Ensembl/interpro_scan/superfamily.tab -a /data/blastdb/Ensembl/interpro_scan/superfamily.acc -p /software/ensembl/bin/hmmpfam -s /software/ensembl/bin/ 1e-05 -r y ProteinAnnotation/Superfamily NULL Superfamily domain
8419 2013-04-02 14:50:52 smart Smart NULL /data/blastdb/Ensembl/interpro_scan/smart.HMMs hmmpfam NULL hmmpfam -E 0.01 -A 100 -Z 350000 --acc --cpu 1 ProteinAnnotation/Hmmpfam NULL Smart domain
8420 2013-04-02 14:50:51 seg low_complexity NULL NULL seg NULL seg NULL ProteinAnnotation/Seg NULL Seg annotation
8421 2013-04-02 14:50:52 pirsf PIRSF NULL /data/blastdb/Ensembl/interpro_scan/pirsf.dat /software/ensembl/bin/pirsf.pl NULL /software/ensembl/bin/pirsf.pl -pirsf /data/blastdb/Ensembl/interpro_scan/pirsf.dat -sfhmm /data/blastdb/Ensembl/interpro_scan/sf_hmm.bin -subsf /data/blastdb/Ensembl/interpro_scan/sf_hmm_sub -sfseq /data/blastdb/Ensembl/interpro_scan/sf.seq -sftb /data/blastdb/Ensembl/interpro_scan/sf.tb -hmmer /software/ensembl/bin/hmmpfam -blast /software/ensembl/bin/blastall ProteinAnnotation/PIRSF NULL PIRSF domain
8422 2013-04-02 14:50:52 pfscan Prosite_profiles NULL /data/blastdb/Ensembl/interpro_scan/prosite.profiles pfscan NULL pfscan NULL ProteinAnnotation/PrositeProfile NULL Profile domain
8423 2013-04-02 14:50:52 signalp signal_peptide NULL NULL signalp NULL /software/worm/signalp/signalp NULL ProteinAnnotation/Signalp NULL Signalp annotation
8424 2010-09-23 14:40:18 ccds NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL
8425 2009-05-14 15:43:42 ensembl_havana_transcript NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL
8426 2013-04-02 14:50:51 ensembl NULL NULL NULL NULL NULL NULL NULL GeneBuilder NULL NULL NULL
8427 2013-04-02 14:50:52 prints Prints NULL /data/blastdb/Ensembl/interpro_scan/prints.pval /software//ensembl/bin/FingerPRINTScan NULL /software/ensembl/bin/FingerPRINTScan -e 0.0001 -d 10 -E 257043 84355444 -fjR -a -o 15 ProteinAnnotation/Prints NULL Prints domain
8428 2007-09-07 12:01:22 est2genome_mouse_havana NULL 10-Nov-10 (105) NULL NULL NULL NULL ori_analysis => Est2genome_mouse_raw, mode => single ClusterDepthFilter NULL EST_Mouse similarity
8429 2013-04-02 14:50:52 ncoils coiled_coil NULL /usr/local/ensembl/data/coils ncoils NULL ncoils NULL ProteinAnnotation/Coil NULL ncoils annotation
8430 2009-05-12 18:26:57 ncrna NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL
8431 2009-05-20 12:21:05 rfamblast Rfam NULL /lustre/scratch1/ensembl/sw4/ncRNA/BLAST/high_copy.fasta wublastn NULL wublastn lowcopy => /lustre/scratch1/ensembl/sw4/ncRNA/BLAST/low_copy.fasta Bio::EnsEMBL::Analysis::RunnableDB::BlastRfam NULL ensembl gene
8432 2009-03-27 11:33:33 human_est human_ests NULL /lustre/work1/ensembl/jb16/NCBI37/ests/est_chunks NULL NULL exonerate-0.9.0 NULL Exonerate2Genes NULL NULL NULL
8433 2012-03-02 14:01:07 proj_ncrna NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL
8434 2012-03-01 13:57:43 projected_transcript NULL NULL NULL NULL NULL NULL NULL ProjectedTranscriptEvidence NULL NULL NULL
8435 2012-03-02 14:01:07 proj_havana NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL
8436 2013-04-30 11:33:07 lrg_import NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL
8437 2011-02-15 17:06:42 firstef firstef NULL NULL firstef NULL NULL -repeatmasked FirstEF NULL firstef exon
8438 2011-02-15 17:06:42 eponine Eponine NULL NULL eponine-scan 1 /vol/software/linux-x86_64/jdk1.6.0_01/bin/java -epojar => /usr/local/ensembl/lib/eponine-scan.jar, -threshold => 0.999 EponineTSS NULL Eponine TSS
8439 2011-02-15 17:06:42 cpg cpg NULL NULL cpg NULL cpg NULL CPG NULL cpg cpg_island
8440 2011-02-15 17:06:42 genscan HumanIso.smat NULL HumanIso.smat genscan 1.0 genscan NULL Genscan NULL genscan prediction
8441 2013-07-25 14:40:18 shortnoncodingdensity NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL
8442 2013-07-25 14:40:18 codingdensity NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL
8443 2013-07-25 14:40:19 percentagerepeat NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL
8444 2013-07-25 14:43:46 percentgc NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL
8445 2013-07-25 14:43:46 pseudogenedensity NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL
8446 2013-07-25 14:43:46 longnoncodingdensity NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL NULL
8447 2011-02-23 14:35:10 marker NULL NULL /lustre/scratch103/ensembl/amonida/builds/homo_sapiens/GRCh37_jan11_e62/marker_features/dumped_markers.out e-PCR 1.2.0 NULL -M=>150,-W=>7,-NMIN=>0, -NMAX=>2 EPCR NULL e-PCR sts
8448 2013-07-29 09:24:09 xrefcoordinatemapping NULL NULL NULL xref_mapper.pl NULL NULL weights(coding,ensembl)=2.00,3.00;transcript_score_threshold=0.75 CoordinateMapper.pm NULL NULL NULL
41 2009-03-12 16:36:37 fantom_gis_pet_raw NULL NULL NULL NULL NULL NULL NULL ExonerateTags NULL NULL NULL
42 2009-03-12 16:36:37 fantom_gsc_pet_raw NULL NULL NULL NULL NULL NULL NULL ExonerateTags NULL NULL NULL
2 2011-02-15 17:06:42 repeatmask repbase 3.2.5 repbase RepeatMasker 3.2.5 /nfs/ensembl/genebuild/human_repeatmasker/RepeatMasker/RepeatMasker -nolow -species homo -s RepeatMasker NULL RepeatMasker repeat
6 2011-02-15 17:06:42 trf NULL NULL NULL trf NULL trf NULL TRF NULL trf tandem_repeat
7 2011-02-15 17:06:42 dust Dust NULL NULL dust 1 tcdust NULL Dust NULL NULL NULL
8405 Annotation for this gene includes both automatic annotation from Ensembl and <a rel="external" href="http://vega.sanger.ac.uk/index.html">Havana</a> manual curation, see <a href="http://www.ensembl.org/info/docs/genebuild/genome_annotation.html" class="cp-external">article</a>. Ensembl/Havana merge 1 {'multi_name' => 'Ensembl genes, or Merged Ensembl and Havana genes','colour_key' => '[biotype]','caption' => 'Genes (Merged Ensembl/Havana) (GENCODE)','name' => 'Merged Ensembl and Havana genes (GENCODE)','label_key' => '[biotype]','default' => {'MultiTop' => 'gene_label','contigviewbottom' => 'transcript_label','MultiBottom' => 'collapsed_label','contigviewtop' => 'gene_label','cytoview' => 'gene_label','alignsliceviewbottom' => 'as_collapsed_label'},'key' => 'ensembl'}
8406 Sequences from various databases are matched to Ensembl transcripts using <a rel="external" href="http://www.biomedcentral.com/1471-2105/6/31">Exonerate</a>. These are external references, or 'Xrefs'. DNA match 0 NULL
8407 Manual annotation (determined on a case-by-case basis) from the <a rel="external" href="http://www.sanger.ac.uk/HGP/havana/havana.shtml">Havana</a> project. Havana 1 {'multi_name' => 'Ensembl genes, or Merged Ensembl and Havana genes','colour_key' => '[biotype]','caption' => 'Genes (Merged Ensembl/Havana) (GENCODE)','name' => 'Merged Ensembl and Havana genes (GENCODE)','label_key' => '[biotype]','default' => {'MultiTop' => 'gene_label','contigviewbottom' => 'transcript_label','MultiBottom' => 'collapsed_label','contigviewtop' => 'gene_label','cytoview' => 'gene_label','alignsliceviewbottom' => 'as_collapsed_label'},'key' => 'ensembl'}
8408 Alignment of human ESTs (expressed sequence tags) to the genome using the program <a rel="external" href="http://emboss.sourceforge.net/apps/cvs/emboss/apps/est2genome.html">Est2genome</a>. ESTs are from <a rel="external" href="http://www.ncbi.nlm.nih.gov/dbEST/">dbEST</a> Human EST (EST2genome) 0 {'type' => 'est'}
8409 Positions of vertebrate mRNAs along the genome. mRNAs are from the <a rel="external" href="http://www.ebi.ac.uk/embl/">European Nucleotide Archive</a> database. Initial alignments are performed using TBLASTN of Genscan-predicted peptides against the European Nucleotide Archive mRNAs. Vertebrate cDNAs (ENA) 0 {'type' => 'cdna','default' => {'contigviewbottom' => 'stack'}}
8410 Homo Sapiens cDNAs from <a rel="external" href="http://www.ncbi.nlm.nih.gov/RefSeq/">NCBI RefSeq</a> and <a rel="external" href="http://www.ebi.ac.uk/embl/">EMBL</a> are aligned to the genome using <a rel="external" href="http://www.biomedcentral.com/1471-2105/6/31">Exonerate cdna2genome model</a>. Human cDNAs (cdna2genome) 0 {'type' => 'cdna'}
8411 Human cDNAs from <a rel="external" href="http://www.ncbi.nlm.nih.gov/RefSeq/">NCBI RefSeq</a> and <a rel="external" href="http://www.ebi.ac.uk/embl/">ENA</a> are aligned to the genome using <a rel="external" href="http://www.biomedcentral.com/1471-2105/6/31">Exonerate</a>. Human cDNAs 0 {'type' => 'cdna'}
8412 Proteins from the <a rel="external" href="http://uniprot.org">UniProtKB Swiss-Prot</a> database, aligned to the genome by Havana. UniProt proteins 0 NULL
8413 Human protein sequences from <a rel="external" href="http://uniprot.org">UniProtKB</a> and <a rel="external" href="http://www.ncbi.nlm.nih.gov/RefSeq/">NCBI RefSeq</a> are aligned to the genome using <a rel="external" href="http://genome.cshlp.org/cgi/content/abstract/14/5/988?maxtoshow=&amp;HITS=10&amp;hits=10&amp;RESULTFORMAT=&amp;andorexactfulltext=and&amp;searchid=1&amp;FIRSTINDEX=0&amp;sortspec=relevance&amp;volume=14&amp;firstpage=988&amp;resourcetype=HWCIT">GeneWise</a> or <a rel="external" href="http://www.biomedcentral.com/1471-2105/6/31">Exonerate</a>. Human proteins 1 NULL
8414 Proteins from the <a rel="external" href="http://uniprot.org">UniProtKB TrEMBL</a> database, aligned to the genome by Havana. TrEMBL proteins 0 NULL
8415 match Protein 0 NULL
8416 Xref mapping based on checksum equivalency Xref checksum 0 NULL
8417 Protein domains and motifs in the <a rel="external" href="http://nar.oxfordjournals.org/cgi/content/abstract/32/suppl_1/D138?maxtoshow=&amp;HITS=10&amp;hits=10&amp;RESULTFORMAT=1&amp;author1=Bateman&amp;andorexacttitle=and&amp;andorexacttitleabs=and&amp;andorexactfulltext=and&amp;searchid=1&amp;FIRSTINDEX=0&amp;sortspec=relevance&amp;resourcetype=HWCIT">Pfam</a> database. Pfam domain 1 {'type' => 'domain'}
8418 Protein domains and motifs in the <a rel="external" href="http://www.sciencedirect.com/science?_ob=ArticleURL&amp;_udi=B6WK7-457CXWM-3D&amp;_user=776054&amp;_coverDate=11%2F02%2F2001&amp;_rdoc=17&amp;_fmt=high&amp;_orig=browse&amp;_srch=doc-info(%23toc%236899%232001%23996869995%23286382%23FLA%23display%23Volume)&amp;_cdi=6899&amp;_sort=d&amp;_docanchor=&amp;_ct=17&amp;_acct=C000042238&amp;_version=1&amp;_urlVersion=0&amp;_userid=776054&amp;md5=a921e84cd71c59f75644aa28f3224b58">SUPERFAMILY</a> database. Superfamily domains 1 {'type' => 'domain'}
8419 Protein domains and motifs in the <a rel="external" href="http://nar.oxfordjournals.org/cgi/content/full/34/suppl_1/D257?maxtoshow=&amp;HITS=10&amp;hits=10&amp;RESULTFORMAT=1&amp;author1=letunic&amp;andorexacttitle=and&amp;andorexacttitleabs=and&amp;andorexactfulltext=and&amp;searchid=1&amp;FIRSTINDEX=0&amp;sortspec=relevance&amp;fdate=1/1/2006&amp;tdate=12/31/2006&amp;resourcetype=HWCIT">SMART</a> database. SMART domains 1 {'type' => 'domain'}
8420 Identification of peptide low complexity sequences by <a rel="external" href="http://www.sciencedirect.com/science?_ob=ArticleURL&amp;_udi=B6TFV-44PXMF3-45&amp;_user=776054&amp;_coverDate=06%2F30%2F1993&amp;_rdoc=6&amp;_fmt=high&amp;_orig=browse&amp;_srch=doc-info(%23toc%235236%231993%23999829997%23279143%23FLP%23display%23Volume)&amp;_cdi=5236&amp;_sort=d&amp;_docanchor=&amp;_ct=13&amp;_acct=C000042238&amp;_version=1&amp;_urlVersion=0&amp;_userid=776054&amp;md5=ac6f98882f2c6626643118367fb28cad">Seg</a>. Low complexity (Seg) 1 NULL
8421 Protein domains and motifs from the <a rel="external" href="http://pir.georgetown.edu/pirwww/index.shtml">PIR (Protein Information Resource)</a> Superfamily database. PIRSF domain 1 {'type' => 'domain'}
8422 Protein domains and motifs from the <a rel="external" href="http://www.ebi.ac.uk/ppsearch/">PROSITE</a> profiles database are aligned to the genome. PROSITE profiles 1 {'type' => 'domain'}
8423 Prediction of signal peptide cleavage sites by <a rel="external" href="http://www.sciencedirect.com/science?_ob=ArticleURL&amp;_udi=B6WK7-4CKBS0M-3&amp;_user=776054&amp;_coverDate=07%2F16%2F2004&amp;_alid=772330061&amp;_rdoc=1&amp;_fmt=high&amp;_orig=search&amp;_cdi=6899&amp;_sort=d&amp;_docanchor=&amp;view=c&amp;_ct=1&amp;_acct=C000042238&amp;_version=1&amp;_urlVersion=0&amp;_userid=776054&amp;md5=9f42be939814b7711268fd414604c9dd">SignalP</a>. Cleavage site (Signalp) 1 NULL
8424 Protein coding sequences agreed upon by the Consensus Coding Sequence project, or <a href="http://www.ensembl.org/info/docs/genebuild/ccds.html" class="cp-external">CCDS</a>. CCDS set 1 {'dna_align_feature' => {'do_not_display' => '1'},'type' => 'cdna','default' => {'contigviewbottom' => 'normal'}}
8425 Transcript where the Ensembl genebuild transcript and the <a rel="external" href="http://vega.sanger.ac.uk/index.html">Vega</a> manual annotation have the same sequence, for every base pair. See <a href="http://www.ensembl.org/info/docs/genebuild/genome_annotation.html" class="cp-external">article</a>. Ensembl/Havana merge 1 {'multi_name' => 'Ensembl genes, or Merged Ensembl and Havana genes','colour_key' => '[biotype]','caption' => 'Genes (Merged Ensembl/Havana) (GENCODE)','name' => 'Merged Ensembl and Havana genes (GENCODE)','label_key' => '[biotype]','default' => {'MultiTop' => 'gene_label','contigviewbottom' => 'transcript_label','MultiBottom' => 'collapsed_label','contigviewtop' => 'gene_label','cytoview' => 'gene_label','alignsliceviewbottom' => 'as_collapsed_label'},'key' => 'ensembl'}
8426 Annotation produced by the Ensembl <a href="http://www.ensembl.org/info/docs/genebuild/genome_annotation.html" class="cp-external">genebuild</a>. Ensembl 1 {'multi_name' => 'Ensembl genes, or Merged Ensembl and Havana genes','colour_key' => '[biotype]','caption' => 'Genes (Merged Ensembl/Havana) (GENCODE)','name' => 'Merged Ensembl and Havana genes (GENCODE)','label_key' => '[biotype]','default' => {'MultiTop' => 'gene_label','contigviewbottom' => 'transcript_label','MultiBottom' => 'collapsed_label','contigviewtop' => 'gene_label','cytoview' => 'gene_label','alignsliceviewbottom' => 'as_collapsed_label'},'key' => 'ensembl'}
8427 Protein fingerprints (groups of conserved motifs) are aligned to the genome. These motifs come from the <a rel="external" href="http://nar.oxfordjournals.org/cgi/content/abstract/31/1/400?maxtoshow=&amp;HITS=10&amp;hits=10&amp;RESULTFORMAT=1&amp;author1=Attwood&amp;andorexacttitle=and&amp;andorexacttitleabs=and&amp;andorexactfulltext=and&amp;searchid=1&amp;FIRSTINDEX=0&amp;sortspec=relevance&amp;resourcetype=HWCIT">PRINTS</a> database. Prints domain 1 {'type' => 'domain'}
8428 Alignment of mouse ESTs (expressed sequence tags) to the genome using the program <a rel="external" href="http://emboss.sourceforge.net/apps/cvs/emboss/apps/est2genome.html">Est2genome</a>. ESTs are from <a rel="external" href="http://www.ncbi.nlm.nih.gov/dbEST/">dbEST</a> Mouse EST (EST2genome) 0 {'type' => 'est'}
8429 Prediction of coiled-coil regions in proteins is by <a rel="external" href="http://www.sciencemag.org/cgi/reprint/252/5009/1162">Ncoils</a>. Coiled-coils (Ncoils) 1 NULL
8430 Non-coding RNAs (ncRNAs) predicted using sequences from <a href="http://rfam.sanger.ac.uk">RFAM</a> and <a href="http://microrna.sanger.ac.uk/sequences/index.shtml">miRBase</a>. See <a href="http://www.ensembl.org/info/docs/genebuild/ncrna.html" class="cp-external">article</a>. ncRNAs 1 {'multi_name' => 'Ensembl genes, or Merged Ensembl and Havana genes','colour_key' => '[biotype]','caption' => 'Genes (Merged Ensembl/Havana) (GENCODE)','name' => 'Merged Ensembl and Havana genes (GENCODE)','label_key' => '[biotype]','default' => {'MultiTop' => 'gene_label','contigviewbottom' => 'transcript_label','MultiBottom' => 'collapsed_label','contigviewtop' => 'gene_label','cytoview' => 'gene_label','alignsliceviewbottom' => 'as_collapsed_label'},'key' => 'ensembl'}
8431 <a href="http://www.ensembl.org/info/docs/genebuild/ncrna.html" class="cp-external">Positions</a> of ncRNAs (non-coding RNAs) from the <a rel="external" href="http://rfam.sanger.ac.uk/">Rfam </a> database are shown. Initial BLASTN hits of genomic sequence to RFAM ncRNAs are clustered and filtered by E value. These hits are supporting evidence for ncRNA genes. RFAM ncRNAs 0 NULL
8432 Homo sapiens 'Expressed Sequence Tags' (ESTs) from <a rel="external" href="http://www.ncbi.nlm.nih.gov/dbEST/">dbEST</a> are aligned to the genome using <a rel="external" href="http://www.biomedcentral.com/1471-2105/6/31">Exonerate</a>. Human ESTs 0 {'type' => 'est'}
8433 Non-coding RNAs (ncRNAs) predicted using sequences from <a href="http://rfam.sanger.ac.uk">RFAM</a> and <a href="http://microrna.sanger.ac.uk/sequences/index.shtml">miRBase</a>. See <a href="http://www.ensembl.org/info/docs/genebuild/ncrna.html" class="cp-external">article</a>. These were projected to the <a rel="external" href="http://www.ncbi.nlm.nih.gov/projects/genome/assembly/grc/info/definitions.shtml">alternate locus</a> via a mapping from the <a rel="external" href="http://www.ncbi.nlm.nih.gov/projects/genome/assembly/grc/info/definitions.shtml">primary assembly</a>. Projected ncRNA 1 {'multi_name' => 'Ensembl genes, or Merged Ensembl and Havana genes','colour_key' => '[biotype]','caption' => 'Genes (Merged Ensembl/Havana) (GENCODE)','name' => 'Merged Ensembl and Havana genes (GENCODE)','label_key' => '[biotype]','default' => {'MultiTop' => 'gene_label','contigviewbottom' => 'transcript_label','MultiBottom' => 'collapsed_label','contigviewtop' => 'gene_label','cytoview' => 'gene_label','alignsliceviewbottom' => 'as_collapsed_label'},'key' => 'ensembl'}
8434 Transcript that was projected from the primary assembly, aligned to the alternate locus version as supporting evidence. Projected transcript 0 {'type' => 'cdna'}
8435 Manual annotation (determined on a case-by-case basis) from the <a rel="external" href="http://www.sanger.ac.uk/HGP/havana/havana.shtml">Havana</a> project, projected to the <a rel="external" href="http://www.ncbi.nlm.nih.gov/projects/genome/assembly/grc/info/definitions.shtml">alternate locus</a> via a mapping from the <a rel="external" href="http://www.ncbi.nlm.nih.gov/projects/genome/assembly/grc/info/definitions.shtml">primary assembly</a>. Projected Havana 1 {'multi_name' => 'Ensembl genes, or Merged Ensembl and Havana genes','colour_key' => '[biotype]','caption' => 'Genes (Merged Ensembl/Havana) (GENCODE)','name' => 'Merged Ensembl and Havana genes (GENCODE)','label_key' => '[biotype]','default' => {'MultiTop' => 'gene_label','contigviewbottom' => 'transcript_label','MultiBottom' => 'collapsed_label','contigviewtop' => 'gene_label','cytoview' => 'gene_label','alignsliceviewbottom' => 'as_collapsed_label'},'key' => 'ensembl'}
8436 Data from LRG database LRG 0 {'multi_name' => 'LRG genes','colour_key' => 'rna_[status]','caption' => 'LRG gene','name' => 'LRG Genes','label_key' => '[text_label] [display_label]','default' => {'MultiTop' => 'gene_label','contigviewbottom' => 'transcript_label','MultiBottom' => 'collapsed_label','contigviewtop' => 'gene_label','cytoview' => 'gene_label','alignsliceviewbottom' => 'as_collapsed_label'}}
8437 First Exon Finder (<a rel="external" href="http://www.nature.com/ng/journal/v29/n4/full/ng780.html">First EF</a>) predicts positions of the first exons of transcripts, both coding and non-coding, using the sequence to identify features such as CpG islands and promoter regions. First EF 1 NULL
8438 Transcription start sites predicted by <a rel="external" href="http://www.sanger.ac.uk/resources/software/eponine/">Eponine-TSS</a>. TSS (Eponine) 1 NULL
8439 CpG islands are regions of nucleic acid sequence containing a high number of adjacent cytosine guanine pairs (along one strand). Usually unmethylated, they are associated with promoters and regulatory regions. They are determined from the genomic sequence using a program written by G. Miklem, similar to <a rel="external" href="http://emboss.sourceforge.net/apps/cvs/emboss/apps/newcpgreport.html">newcpgreport</a> in the EMBOSS package. CpG islands 1 NULL
8440 Ab initio prediction of protein coding genes by <a rel="external" href="http://www.sciencedirect.com/science?_ob=ArticleURL&amp;_udi=B6WK7-45VGF7T-9&amp;_user=776054&amp;_rdoc=1&amp;_fmt=&amp;_orig=search&amp;_sort=d&amp;view=c&amp;_version=1&amp;_urlVersion=0&amp;_userid=776054&amp;md5=aa15a5f8122912c172ddb9dd15b237dc">Genscan</a>. The splice site models used are described in more detail in C. Burge, Modelling dependencies in pre-mRNA splicing signals. 1998 In Salzberg, S., Searls, D. and Kasif, S., eds. Computational Methods in Molecular Biology, Elsevier Science, Amsterdam, 127-163. Genscan predictions 1 NULL
8441 Short non-coding gene density as calculated by <a rel="external" href="http://cvs.sanger.ac.uk/cgi-bin/viewvc.cgi/ensembl/modules/Bio/EnsEMBL/Production/Pipeline/Production/NonCodingDensity.pm?root=ensembl&view=markup">ShortNonCodingDensity.pm</a>. Short non-coding genes (density) 1 NULL
8442 Coding gene density as calculated by <a rel="external" href="http://cvs.sanger.ac.uk/cgi-bin/viewvc.cgi/ensembl/modules/Bio/EnsEMBL/Pipeline/Production/CodingDensity.pm?root=ensembl&view=markup">gene_density_calc.pl</a>. Coding genes (density) 1 NULL
8443 Percentage of repetitive elements for top level sequences (such as chromosomes, scaffolds, etc.) Repeats (percent) 1 NULL
8444 Percentage of G/C bases in the sequence. GC content 1 NULL
8445 Pseudogene density as calculated by <a rel="external" href="http://cvs.sanger.ac.uk/cgi-bin/viewvc.cgi/ensembl/modules/Bio/EnsEMBL/Pipeline/Production/PseudogeneDensity.pm?root=ensembl&view=markup">PseudogeneDensity</a>. Pseudogenes (density) 1 NULL
8446 Long non-coding gene density as calculated by <a rel="external" href="http://cvs.sanger.ac.uk/cgi-bin/viewvc.cgi/ensembl/modules/Bio/EnsEMBL/Production/Pipeline/Production/NonCodingDensity.pm?root=ensembl&view=markup">LongNonCodingDensity.pm</a>. Long non-coding genes (density) 1 NULL
8447 Markers, or sequence tagged sites (STS), from <a rel="external" href="http://www.ncbi.nlm.nih.gov/sites/entrez?db=unists">UniSTS</a> are aligned to the genome using <a rel="external" href="http://genome.cshlp.org/cgi/content/full/7/5/541">Electronic PCR (e-PCR)</a>. Marker 1 NULL
27515 22566 112323682 112380141 2001 58460 1
27515 23120 112568920 112586593 101 17774 1
27515 27454 112380142 112568919 2001 190778 1
27515 27609 95830544 171055067 1 75224524 1
27516 24555 345529 562251 2001 218723 1
27516 24823 281385 312288 1 30904 -1
27516 24829 144822 182592 1 37771 1
27516 26177 312289 345528 101 33340 1
27516 26178 217465 231384 1999 15918 1
27516 26443 182593 217464 1 34872 -1
27516 26483 60001 94821 1 34821 1
27516 27605 60001 94821 1 34821 1
27516 27641 144822 231384 1 86563 1
27516 27688 281385 1047557 1 766173 1
27605 26483 1 34821 1 34821 1
27609 22566 16493139 16549598 2001 58460 1
27609 23120 16738377 16756050 101 17774 1
27609 27454 16549599 16738376 2001 190778 1
27641 24829 1 37771 1 37771 1
27641 26178 72644 86563 1999 15918 1
27641 26443 37772 72643 1 34872 -1
27688 24555 64145 280867 2001 218723 1
27688 24823 1 30904 1 30904 -1
27688 26177 30905 64144 101 33340 1
1000759268 22566 112323682 112380141 2001 58460 1
1000759268 23120 112568920 112586593 101 17774 1
1000759268 1000759269 112323682 112586593 1 262912 1
1000759268 1000759337 112380142 112568919 2001 190778 1
1000759269 22566 1 56460 2001 58460 1
1000759269 23120 245239 262912 101 17774 1
1000759269 1000759337 56461 245238 2001 190778 1
1001161223 22328 1 33740 56493 90232 -1
154 1000759268 112323682 112586593 PATCH_FIX 27515 112323682 112586593 1
1 27507 10001 2649520 PAR 27516 60001 2699520 1
2 27507 59034050 59373566 PAR 27516 154931044 155270560 1
1 embl_acc European Nucleotide Archive (was EMBL) accession NULL
2 status Status NULL
3 synonym Synonym NULL
4 name Name Alternative/long name
5 type Type of feature NULL
6 toplevel Top Level Top Level Non-Redundant Sequence Region
7 GeneCount Gene Count Total Number of Genes
8 KnownGeneCount Known Gene Count Total Number of Known Genes
9 PseudoGeneCount PseudoGene Count Total Number of PseudoGenes
10 SNPCount SNP Count Total Number of SNPs
11 codon_table Codon Table Alternate codon table
12 _selenocysteine Selenocysteine NULL
13 bacend bacend NULL
14 htg htg High Throughput phase attribute
15 miRNA Micro RNA Coordinates of the mature miRNA
16 non_ref Non Reference Non Reference Sequence Region
17 sanger_project Sanger Project name NULL
18 clone_name Clone name NULL
19 fish FISH location NULL
21 org Sequencing centre NULL
22 method Method NULL
23 superctg Super contig id NULL
24 inner_start Max start value NULL
25 inner_end Min end value NULL
26 state Current state of clone NULL
27 organisation Organisation sequencing clone NULL
28 seq_len Accession length NULL
29 fp_size FP size NULL
30 BACend_flag BAC end flags NULL
31 fpc_clone_id fpc clone NULL
32 KnwnPCCount protein_coding_KNOWN Number of Known Protein Coding
33 NovPCCount protein_coding_NOVEL Number of Novel Protein Coding
34 NovPTCount processed_transcript_NOVEL Number of Novel Processed Transcripts
35 PutPTCount processed_transcript_PUTATIVE Number of Putative Processed Transcripts
36 PredPCCount protein_coding_PREDICTED Number of Predicted Protein Coding
37 IGGeneCount IG_gene Number of IG Genes
38 IGPsGenCount IG_pseudogene Number of IG Pseudogenes
39 TotPsCount total_pseudogene Total Number of Pseudogenes
42 KnwnPCProgCount protein_coding_in_progress_KNOWN Number of Known Protein Coding in progress
43 NovPCProgCount protein_coding_in_progress_NOVEL Number of Novel Protein Coding in progress
44 AnnotSeqLength Annotated sequence length Annotated Sequence
45 TotCloneNum Total number of clones Total Number of Clones
46 NumAnnotClone Fully annotated clones Number of Fully Annotated Clones
47 ack Acknowledgement Acknowledgement for manual annotation
48 htg_phase High throughput phase High throughput genomic sequencing phase
49 description Description A general descriptive text attribute
50 chromosome Chromosome Chromosomal location for supercontigs that are not assembled
51 nonsense Nonsense Mutation Strain specific nonesense mutation
52 author Author Group resonsible for Vega annotation
53 author_email Author email address Author email address
54 remark Remark Annotation remark
55 transcr_class Transcript class Transcript class
56 KnwnPTCount processed_transcript_KNOWN Number of Known Processed Transcripts
57 ccds CCDS CCDS identifier
58 CCDS_PublicNote CCDS Public Note Public Note for CCDS identifier, provided by http://www.ncbi.nlm.nih.gov/CCDS
59 Frameshift Frameshift Frameshift modelled as intron
60 PTCount processed_transcript Number of Processed Transcripts
61 PredPTCount processed_transcript_PREDICTED Number of Predicted Processed Transcripts
62 ncRNA Structure RNA secondary structure line
63 skip_clone skip clone Skip clone in align_by_clone_identity.pl NULL
64 coding_cnt Protein coding gene count Number of protein coding Genes
65 GeneNo_novCod novel protein_coding Gene Count Number of novel protein_coding Genes
66 GeneNo_rRNA rRNA Gene Count Number of rRNA Genes
67 pseudogene_cnt Pseudogene count Number of pseudogenes
68 GeneNo_snRNA snRNA Gene Count Number of snRNA Genes
69 GeneNo_snoRNA snoRNA Gene Count Number of snoRNA Genes
70 GeneNo_miRNA miRNA Gene Count Number of miRNA Genes
71 GeneNo_mscRNA misc_RNA Gene Count Number of misc_RNA Genes
72 GeneNo_scRNA scRNA Gene Count Number of scRNA Genes
73 GeneNo_MTrRNA Mt_rRNA Gene Count Number of Mt_rRNA Genes
74 GeneNo_MTtRNA Mt_tRNA Gene Count Number of Mt_tRNA Genes
75 GeneNo_RNA_pseu RNA_pseudogene Gene Count Number of RNA_pseudogene Genes
76 GeneNo_tRNA tRNA Gene Count Number of tRNA Genes
77 GeneNo_rettran retrotransposed Gene Count Number of retrotransposed Genes
78 GeneNo_snlRNA snlRNA Gene Count Number of snlRNA Genes
79 GeneNo_proc_tr processed_transcript Gene Count Number of processed transcript Genes
80 supercontig SuperContig name NULL
81 well_name Well plate name NULL
82 bacterial Bacterial NULL
83 NovelCDSCount Novel CDS Count NULL
84 NovelTransCount Novel Transcript Count NULL
85 PutTransCount Putative Transcript Count NULL
86 PredTransCount Predicted Transcript Count NULL
87 UnclassPsCount Unclass Ps count NULL
88 KnwnprogCount Known prog Count NULL
89 NovCDSprogCount Novel CDS prog count NULL
90 bacend_well_nam BACend well name NULL
91 alt_well_name Alt well name NULL
92 TranscriptEdge Transcript Edge NULL
93 alt_embl_acc Alt European Nucleotide Archive (was EMBL) acc NULL
94 alt_org Alt org NULL
95 intl_clone_name International Clone Name NULL
96 embl_version European Nucleotide Archive (was EMBL) Version NULL
97 chr Chromosome Name Chromosome Name Contained in the Assembly
98 equiv_asm Equivalent EnsEMBL assembly For full chromosomes made from NCBI AGPs
99 GeneNo_ncRNA ncRNA Gene Count Number of ncRNA Genes
100 GeneNo_Ig Ig Gene Count Number of Ig Genes
109 HitSimilarity hit similarity percentage id to parent transcripts
110 HitCoverage hit coverage coverage of parent transcripts
111 PropNonGap proportion non gap proportion non gap
112 NumStops number of stops NULL
113 GapExons gap exons number of gap exons
114 SourceTran source transcript source transcript
115 EndNotFound end not found end not found
116 StartNotFound start not found start not found
117 Frameshift Fra Frameshift modelled as intron NULL
118 ensembl_name Ensembl name Name of equivalent Ensembl chromosome
119 NoAnnotation NoAnnotation Clones without manual annotation
120 hap_contig Haplotype contig Contig present on a haplotype
121 annotated Clone Annotation Status NULL
122 keyword Clone Keyword NULL
123 hidden_remark Hidden Remark NULL
124 mRNA_start_NF mRNA start not found NULL
125 mRNA_end_NF mRNA end not found NULL
126 cds_start_NF CDS start not found NULL
127 cds_end_NF CDS end not found NULL
128 write_access Write access for Sequence Set 1 for writable , 0 for read-only
129 hidden Hidden Sequence Set NULL
130 vega_name Vega name Vega seq_region.name
131 vega_export_mod Export mode E (External), I (Internal) etc
132 vega_release Vega release Vega release number
133 atag_CLE Clone_left_end Clone_lef_end feature marked in GAP database
134 atag_CRE Clone_right_end Clone_right_end feature marked in GAP database
135 atag_Misc Misc miscellaneous feature marked in GAP database
136 atag_Unsure Unsure region of uncertain DNA sequence marked in GAP database
137 MultAssem Multiple Assembled seq region Part of Seq Region is part of more than one assembly
140 wgs WGS contig WGS contig integrated into the map
141 bac AGP clones tiling path of clones
142 GeneGC Gene GC Percentage GC content for this gene
143 TotAssemblyLeng Finished sequence length Length of the assembly not counting sequence gaps
144 amino_acid_sub Amino acid substitution Some translations have been manually curated for amino acid substitiutions. For example a stop codon may be changed to an amino acid in order to prevent premature truncation, or one amino acid can be substituted for another.
145 _rna_edit rna_edit RNA edit
146 kill_reason Kill Reason Reason why a transcript has been killed
147 strip_UTR Strip UTR Transcript needs bad UTR removing
148 TotAssLength Finished sequence length Finished Sequence
149 PsCount pseudogene Number of Pseudogenes
152 TotPTCount total_processed_transcript Total Number of Processed Transcripts
153 TotPCCount total_protein_coding Total Number of Protein Coding
154 NovNcCount novel_non_coding Number of Novel Non Coding
155 KnwnPolyPsCount known_polymorphic Number of Known Polymorphic Pseudogenes
156 PolyPsCount polymorphic_pseudogene Number of Polymorphic Pseudogenes
157 TotIGGeneCount total_IG_gene Total Number of IG Genes
158 ProcPsCount proc_pseudogene Number of Processed Pseudogenes
159 UnPsCount unproc_pseudogene Number of Unprocessed Pseudogenes
160 TPsCount transcribed_pseudogene Number of Transcribed Pseudogenes
161 TECCount TEC Number of TEC Genes
162 KnwnIGGeneCount IG_gene_KNOWN Number of Known IG Genes
163 KnwnIGPsGeCount IG_pseudogene_KNOWN Number of Known IG Pseudogenes
164 IsoPoint Isoelectric point Pepstats attributes
165 Charge Charge Pepstats attributes
166 MolecularWeight Molecular weight Pepstats attributes
167 NumResidues Number of residues Pepstats attributes
168 AvgResWeight Ave. residue weight Pepstats attributes
170 initial_met Initial methionine Set first amino acid to methionine
171 NonGapHCov NonGapHCov NULL
172 otter_support otter support Evidence ID that was used as supporting feature for building a gene in Vega
173 enst_link enst link Code to link a OTTT with an ENST when they both share the CDS of ENST
174 upstream_ATG upstream ATG Alternative ATG found upstream of the defined as start ATG for the transcript
175 TPPsCount transcribed_processed_pseudogene Number of Transcribed Processed Pseudogenes
176 TUPsCount transcribed_unprocessed_pseudogene Number of Transcribed Unprocessed Pseudogenes
177 UniPsCount unitary_pseudogene Number of Unitary Pseudogenes
178 KnwnTECCount TEC_KNOWN Number of Known TEC genes
179 TotTECGeneCount TEC_all Total number of TEC genes
180 TUyPsCount transcribed_unitary_pseudogene Number of Transcribed Unitary Pseudogenes
181 PolyCount polymorphic Number of Polymorphic Genes
182 KnwnPolyCount polymorphic Number of Known Polymorphic Genes
183 KnwnTRCount TR_gene_known Number of Known TR Genes
184 TRGeneCount TR_gene Number of TR Genes
185 TRPsCount TR_pseudo Number of TR Pseudogenes
186 tp_ott_support otter protein transcript support Evidence ID that was used as supporting feature for building a gene in Vega
187 td_ott_support otter dna transcript support Evidence ID that was used as supporting feature for building a gene in Vega
188 ep_ott_support otter protein exon support Evidence ID that was used as supporting feature for building a gene in Vega
189 ed_ott_support otter dna exon support Evidence ID that was used as supporting feature for building a gene in Vega
190 GeneNo_lincRNA lincRNA Gene Count Number of lincRNA Genes
191 StopGained SNP causes stop codon to be gained This transcript has a variant that causes a stop codon to be gained in at least 10 percent of a HapMap population
192 StopLost SNP causes stop codon to be lost This transcript has a variant that causes a stop codon to be lost in at least 10 percent of a HapMap population
193 GeneNo_class_I_ class_I_RNA Gene Count Number of class_I_RNA Genes
194 GeneNo_SRP_RNA SRP_RNA Gene Count Number of SRP_RNA Genes
195 GeneNo_class_II class_II_RNA Gene Count Number of class_II_RNA Genes
196 GeneNo_P_RNA RNase_P_RNA Gene Count Number of RNase_P_RNA Genes
197 GeneNo_RNase_MR RNase_MRP_RNA Gene Count Number of RNase_MRP_RNA Genes
198 lost_frameshift lost_frameshift Frameshift on the query sequence is lost in the target sequence
199 AltThreePrime Alternate three prime end The position of other possible three prime ends for the transcript
216 GeneInLRG Gene in LRG This gene is contained within an LRG region
217 GeneOverlapLRG Gene overlaps LRG This gene is partially overlapped by a LRG region (start or end outside LRG)
218 readthrough_tra readthrough transcript Havana readthrough transcripts
300 CNE Constitutive exon An exon that is always included in the mature mRNA, even in different mRNA isoforms
301 CE Cassette exon One exon is spliced out of the primary transcript together with its flanking introns
302 IR Intron retention A sequence is spliced out as an intron or remains in the mature mRNA transcript
303 MXE Mutually exclusive exons In the simpliest case, one or two consecutive exons are retained but not both
304 A3SS Alternative 3' sites Two or more splice sites are recognized at the 5' end of an exon. An alternative 3' splice junction (acceptor site) is used, changing the 5' boundary of the downstream exon
305 A5SS Alternative 5' sites Two or more splice sites are recognized at the 3' end of an exon. An alternative 5' splice junction (donor site) is used, changing the 3' boundary of the upstream exon
306 AFE Alternative first exon The second exons of each variant have identical boundaries, but the first exons do not overlap
307 ALE Alternative last exon Penultimate exons of each splice variant have identical boundaries, but the last exons do not overlap
308 II Intron isoform Alternative donor or acceptor splice sites lead to truncation or extension of introns, respectively
309 EI Exon isoform Alternative donor or acceptor splice sites leads to truncation or extension of exons, respectively
310 AI Alternative initiation Alternative choice of promoters
311 AT Alternative termination Alternative choice of polyadenylation sites
312 patch_fix Assembly Patch Fix Assembly patch that will, in the next assembly release, replace the corresponding sequence found in the current assembly
313 patch_novel Assembly Patch Novel Assembly patch that will, in the next assembly release, be retained as an alternate non-reference sequence in a similar way to haplotypes
314 LRG Locus Reference Genomic Locus Reference Genomic sequence
315 NoEvidence Evidence for transcript removed Supporting evidence for this projected transcript has been removed
316 circular_seq Circular sequence Circular chromosome or plasmid molecule
317 external_db External database External database to which seq_region name may be linked
318 split_tscript split_tscript split_tscript
319 Threep Three prime end Alternate three prime end
320 gene_cluster Gene cluster Havana annotated gene cluster
328 _rib_frameshift Ribosomal Frameshift Position and magnitude of frameshift
345 vega_ref_chrom Vega reference chromosome Haplotypes reference a regular chromosome (indicated in the value of the attribute)
346 PutPCCount protein_coding_PUTATIVE Number of Putative Protein Coding
347 proj_alt_seq Projection altered sequence Projected sequence differs from original
348 hav_gene_type Havana gene biotype Gene biotype assigned by Havana
349 GeneNo_asense antisense Gene Count Number of antisense Genes
350 GeneNo_sense_in sense_intronic Gene Count Number of sense_intronic Genes
351 GeneNo_amb_orf ambiguous_orf Gene Count Number of ambiguous_orf Genes
352 GeneNo_ret_int retained_intron Gene Count Number of retained_intron Genes
353 noncoding_cnt Non coding gene count Number of non coding genes
354 GeneNo_ncrna_h ncrna_host Gene Count Number of ncrna_host Genes
355 GeneNo_sens_ov sense_overlapping Gene Count Number of sense_overlapping Genes
356 GeneNo_3prime 3prime_overlapping Gene Count Number of 3prime_overlapping Genes
357 GeneNo_tmRNA tmRNA Gene Count Number of tmRNA Genes
358 PHIbase_mutant PHI-base mutant PHI-base phenotype of the mutants
359 GeneNo_ribozyme ribozyme Gene Count Number of ribozyme Genes
360 ncrna_host ncrna_host Havana ncrna_host gene
361 peptide-class Peptide classification The classification of the gene or transcript based on alignment to NR (values: TE WH NH)
362 working-set Working Gene Set High-confidence set of genes, composed of evidence-based genes and non-overlapping protein-coding ab initio gene models
363 filtered-set Filtered Gene Set v1 Working genes that are screened for TE content and orthology with sorghum and rice.
364 super-set Super Working Gene Set Set of all working gene set loci from both Builds 4a and 5a
365 projected4a2 Projected by alignment Temporary (Monday, August 23, 2010)
366 merged Merged species NULL
367 karyotype_rank Rank in the karyotype For a given seq_region, if it is part of the species karyotype, will indicate its rank
368 noncoding_acnt Alternate non coding gene count Number of non coding genes on alternate sequences
369 coding_acnt Alternate protein coding gene count Number of protein coding genes on alternate sequences
370 pseudogene_acnt Alternate pseudogene count Number of pseudogenes on alternate sequences
371 clone_end Clone end Side of the contig on which a vector lies (enum:RIGHT, LEFT).
372 contig_scaffold Contig Scaffold Scaffold that contains mutually ordered contigs.
373 current_version Current Accession Version Identifies the most recent version of an accession.
374 seq_status Sequence Status Sequence status.
375 clone_vector Vector sequence A clone-end vector associated with a contig (enum:SP6, T7).
376 creation_date Creation date Creation date of annotation
377 update_date Update date Last update date of annotation
378 seq_date Sequence date Sequence date
379 has_stop_codon Contains stop codon Translation attribute
380 havana_cv Havana CV term Controlled vocabulary terms from Havana
381 TlPPsCount translated_processed_pseudogene Number of Translated Processed Pseudogenes
382 NoTransRefError No translations due to reference error This gene is believed to include protein coding transcripts, but no transcript has a translation due to a reference assembly error making specifying the translation impossible.
383 parent_exon_key parent_exon_key The exon key to identify a projected transcript's parent transcript.
386 parent_sid parent_sid The parent stable ID to identify a projected transcript's parent transcript. For internal statistics use only since this method does not work in all cases.
387 snoncoding_acnt Alternate short non coding gene count Number of short non coding genes on alternate sequences
388 lnoncoding_acnt Alternate long non coding gene count Number of long non coding genes on alternate sequences
389 snoncoding_cnt Short non coding gene count Number of short non coding genes
390 lnoncoding_cnt Long non coding gene count Number of long non coding genes
391 TlUPsCount translated_unprocessed_pseudogene Number of Translated Unprocessed Pseudogenes
1 1 contig NULL 4 default_version,sequence_level
2 1 chromosome GRCh37 1 default_version
3 1 supercontig GRCh37 2 default_version
4 1 clone NULL 3 default_version
27 1 chromosome NCBI36 5
101 1 chromosome NCBI35 6
1001 1 chromosome NCBI34 7
1003 1 lrg NULL 8 default_version
16758804 97 27515 111795178 112935944 4
16758805 97 27516 1 1035137 3
16758806 96 27515 111795178 112935944 1
16758807 98 27515 111795178 112935944 8.20553
16758808 98 27516 1 1035137 25.5257
16758809 100 27516 1 1035137 1
16758810 101 27515 111795178 112935944 4
16758811 101 27516 1 1035137 1
16758812 99 27515 111795178 112935944 39.6
16758813 99 27516 1 1035137 51.6
96 8441 0 150 sum
97 8442 0 150 sum
98 8443 0 150 ratio
99 8444 0 150 ratio
100 8445 0 150 sum
101 8446 0 150 sum
241886 U_71322 SHC013 1 GGCCTGCCTGCCCGCTCCCGAAAAACCCAGAAAAAG
1161749 U_142006 SHC019 1 AAGCATATACAATTATGTTGTATATTTTAGAATC
697898 U_287123 SHC016 1 ATTACAGGATTTTGATCAAATCTTCCCCTTCTATCC
1366611 U_221680 SHC019 1 TGCCATTACATGTGAGATGTGCTGTGTTTAAAAATGA
1221441 241886 1 27515 614299 614315 -1 41 19 35 1 17M R
1221442 241886 1 27515 615532 615548 -1 41 2 18 1 17M L
1221443 1161749 1 27515 7594781 7594796 -1 42 18 33 1 16M R
1221444 1161749 1 27515 7884188 7884203 -1 42 1 16 1 16M L
1221445 697898 1 27515 9042706 9042723 1 42 1 18 1 18M L
1221446 697898 1 27515 9045216 9045233 1 42 19 36 1 18M R
1221447 1366611 3 27515 9078184 9078201 1 42 20 37 1 18M R
1221448 1366611 3 27515 9154857 9154875 1 42 1 19 1 19M L