Commit b22587d9 authored by Alessandro Vullo

New circular chromosome test DB based on bacillus_thuringiensis_core_19_72_1 from EG

parent 29afdeac
1 2000-09-08 16:10:03 RepeatMask repbase 001212 RepeatMasker 1 RepeatMasker 1 RepeatMasker repeat
2 2000-09-22 10:59:05 Genscan HumanIso.smat 1 HumanIso.smat genscan 1.0 Genscan 1 genscan prediction
5 2000-09-28 17:33:27 Swall swall 1 swall wublastp 1 -hitdist=40, -cpus=1 BlastGenscanPep wublastp similarity
7 2000-10-09 15:30:14 Vertrna embl_vertrna 1 embl_vertrna wutblastn 1 -hitdist=40, -cpus=1 BlastGenscanDNA 1 wutblastn similarity
4 2000-12-18 15:12:30 tRNAscan tRNA 1 tRNAscan-SE 1.11 tRNAscan_SE 1 tRNAscan-SE tRNA
8 2000-12-18 18:28:03 CpG cpg 1 cpg 1 CPG 1 cpg cpg_island
9 2001-01-15 17:00:00 Unigene unigene.seq 1 unigene.seq wutblastn 1 -hitdist=40, -cpus=1 BlastGenscanDNA 1 wutblastn similarity
10 2001-01-15 17:00:00 QTL 1 1 QtlPlacer 1 experimental marker
12 2001-03-28 16:42:32 dbEST dbEST 1 dbEST wutblastn 1 -hitdist=40, -cpus=1 BlastGenscanDNA 1 wutblastn similarity
6 2001-09-04 17:21:19 Eponine NULL NULL NULL eponine-scan 1 /usr/opt/java131/bin/java -epojar => /usr/local/ensembl/lib/eponine-scan.jar, -threshold => 0.999 VC_EponineTSS 1 Eponine TSS
54 2002-08-12 14:52:26 human_swall_protein 1 TGE_gw 1 TargettedGeneWise TGE_gw gene
53 2002-08-12 14:52:26 human_refseq_protein 1 TGE_gw 1 TargettedGeneWise TGE_gw gene
1280 2002-08-12 14:52:26 other_swall_protein 1 similarity_genewise 1 FPC_BlastMiniGenewise similarity_genewise gene
1281 2002-08-12 14:52:26 combined-protein_cdna 1 combined_gw_e2g 1 Combine_Genewises_and_E2Gs combined_gw_e2g gene
1282 2002-08-12 14:52:26 ensembl 1 ensembl 1 Gene_Builder ensembl gene
1290 2002-08-12 14:52:26 refseq_cdna human_mRNA 1 exonerate_e2g 1 FilterESTs_and_E2G exonerate gene
1291 2002-08-12 14:52:26 embl_vertrna human_mRNA 1 exonerate_e2g 1 FilterESTs_and_E2G exonerate gene
49 0000-00-00 00:00:00 Pfam Pfam 1 /data/blastdb/Ensembl/Pfam_ls;/data/blastdb/Ensembl/Pfam_fs /nfs/farm/Worms/bin/hmmpfam 1 /nfs/farm/Worms/bin/hmmpfam Pfam 1 PFAM domain
40 0000-00-00 00:00:00 prints prints NULL /acari/analysis/iprscan/data/prints.pval /acari/analysis/iprscan/bin/OSF1/FingerPRINTScan NULL /acari/analysis/iprscan/bin/OSF1/FingerPRINTScan NULL Prints NULL PRINTS domain
41 0000-00-00 00:00:00 pfscan pfscan NULL /acari/analysis/iprscan/data/prosite_prerelease.prf /acari/analysis/iprscan/bin/OSF1/pfscan NULL /acari/analysis/iprscan/bin/OSF1/pfscan NULL Profile NULL PROFILE domain
43 0000-00-00 00:00:00 Signalp signal_peptide NULL NULL /usr/local/ensembl/bin/signalp NULL /usr/local/ensembl/bin/signalp NULL Signalp NULL Signalp annot
46 0000-00-00 00:00:00 tmhmm transmembrane NULL NULL /acari/work1/mongin/src/pipeline4anopheles/scripts/protein_pipeline/run_tmhmm NULL /acari/work1/mongin/src/pipeline4anopheles/scripts/protein_pipeline/run_tmhmm NULL Tmhmm NULL Tmhmm annot
45 0000-00-00 00:00:00 ncoils coiled_coil NULL NULL /usr/local/ensembl/bin/ncoils NULL /usr/local/ensembl/bin/ncoils NULL ncoils NULL ncoils annot
44 0000-00-00 00:00:00 Seg low_complexity NULL NULL /usr/local/ensembl/bin/seg NULL /usr/local/ensembl/bin/seg NULL Seg NULL Seg annot
42 0000-00-00 00:00:00 scanprosite prosite NULL /acari/analysis/iprscan/data/prosite.patterns /acari/analysis/iprscan/bin/scanregexpf.pl /acari/analysis/iprscan/data/confirm.pat /acari/analysis/iprscan/bin/scanregexpf.pl Prosite NULL PROSITE domain
71 0000-00-00 00:00:00 Superfamily Superfamily 1 /data/blastdb/Ensembl/sam/superfamily /acari/work1/mongin/superfamily/superfamily.pl 1 /acari/work1/mongin/superfamily/superfamily.pl Superfamily 1 Superfamily annot
119 2006-03-14 15:48:36 XrefExonerateDNA \N \N \N \N \N \N \N \N \N \N \N
1292 0000-00-00 00:00:00 SNPDensity \N \N \N \N \N \N \N \N \N \N \N
1293 0000-00-00 00:00:00 RepeatCoverage \N \N \N \N \N \N \N \N \N \N \N
1500 0000-00-00 00:00:00 miRanda \N \N \N \N \N \N \N \N \N \N \N
1501 0000-00-00 00:00:00 cisred \N \N \N \N \N \N \N \N \N \N \N
1502 0000-00-00 00:00:00 cisred_search \N \N \N \N \N \N \N \N \N \N \N
1503 0000-00-00 00:00:00 DitagAlign \N \N \N \N \N \N \N \N \N \N \N
21 2013-07-16 13:03:11 ena ena \N \N \N \N \N \N \N \N ena gene
22 2013-07-16 13:03:12 gl_xref gl_xref \N \N \N \N \N \N \N \N gl_xref \N
23 2013-07-16 13:03:12 gl_name_xref gl_name_xref \N \N \N \N \N \N \N \N gl_name_xref \N
24 2013-07-16 13:03:13 gene3d gene3d \N \N \N \N \N \N \N \N gene3d domain
25 2013-07-16 13:03:12 pfam pfam \N \N \N \N \N \N \N \N pfam domain
26 2013-07-16 13:03:13 superfamily superfamily \N \N \N \N \N \N \N \N superfamily domain
27 2013-07-16 13:03:13 smart smart \N \N \N \N \N \N \N \N smart domain
28 2013-07-16 13:04:16 ena_rna ena_rna \N \N \N \N \N \N \N \N ena_rna gene
29 2013-07-16 13:03:18 pirsf pirsf \N \N \N \N \N \N \N \N pirsf domain
30 2013-07-16 13:03:18 scanprosite scanprosite \N \N \N \N \N \N \N \N scanprosite domain
31 2013-07-16 13:03:16 hmmpanther hmmpanther \N \N \N \N \N \N \N \N hmmpanther domain
32 2013-07-16 13:03:13 pfscan pfscan \N \N \N \N \N \N \N \N pfscan domain
33 2013-07-16 13:03:16 tigrfam tigrfam \N \N \N \N \N \N \N \N tigrfam domain
34 2013-07-16 13:03:16 hamap hamap \N \N \N \N \N \N \N \N hamap domain
35 2013-07-16 13:03:24 prints prints \N \N \N \N \N \N \N \N prints domain
36 2013-07-16 13:03:54 blastprodom blastprodom \N \N \N \N \N \N \N \N blastprodom domain
37 2013-07-16 13:04:18 misc_feature misc_feature \N \N \N \N \N \N \N \N misc_feature feature
38 2013-07-16 13:05:11 gene gene \N \N \N \N \N \N \N \N gene feature
1282 Genes were annotated by the Ensembl automatic analysis pipeline using either a GeneWise model from a human/vertebrate protein, a set of aligned human cDNAs followed by GenomeWise for ORF prediction or from Genscan exons supported by protein, cDNA and EST evidence. GeneWise models are further combined with available aligned cDNAs to annotate UTRs. ensembl transcript 1 web data 1
1503 new description new label 0 web data 2
1504 updated description \N 0 web data 3
21 Protein coding genes annotated in ENA ENA protein coding genes 1 {'colour_key' => '[biotype]','caption' => 'ENA Genes','label_key' => '[biotype]','name' => 'Protein coding genes annotated in ENA','default' => {'MultiTop' => 'gene_label','contigviewbottom' => 'transcript_label','MultiBottom' => 'collapsed_label','contigviewtop' => 'gene_label','alignsliceviewbottom' => 'as_collapsed_label','cytoview' => 'gene_label'},'multi_caption' => 'ENA Genes','key' => 'ena_genes'}
22 Cross-references attached by GenomeLoader GenomeLoader cross-references 1 \N
23 Cross-references attached by GenomeLoader to provide names GenomeLoader name cross-references 1 \N
24 Gene3D analysis as of interpro_scan.pl Gene3D 1 {'type' => 'domain'}
25 Protein domains and motifs in the <a rel="external" href="http://nar.oxfordjournals.org/cgi/content/abstract/32/suppl_1/D138?maxtoshow=&amp;HITS=10&amp;hits=10&amp;RESULTFORMAT=1&amp;author1=Bateman&amp;andorexacttitle=and&amp;andorexacttitleabs=and&amp;andorexactfulltext=and&amp;searchid=1&amp;FIRSTINDEX=0&amp;sortspec=relevance&amp;resourcetype=HWCIT">Pfam</a> database. Pfam 1 {'type' => 'domain'}
26 Protein domains and motifs in the <a rel="external" href="http://www.sciencedirect.com/science?_ob=ArticleURL&amp;_udi=B6WK7-457CXWM-3D&amp;_user=776054&amp;_coverDate=11%2F02%2F2001&amp;_rdoc=17&amp;_fmt=high&amp;_orig=browse&amp;_srch=doc-info(%23toc%236899%232001%23996869995%23286382%23FLA%23display%23Volume)&amp;_cdi=6899&amp;_sort=d&amp;_docanchor=&amp;_ct=17&amp;_acct=C000042238&amp;_version=1&amp;_urlVersion=0&amp;_userid=776054&amp;md5=a921e84cd71c59f75644aa28f3224b58">SUPERFAMILY</a> database. Superfamily 1 {'type' => 'domain'}
27 Protein domains and motifs in the <a rel="external" href="http://nar.oxfordjournals.org/cgi/content/full/34/suppl_1/D257?maxtoshow=&amp;HITS=10&amp;hits=10&amp;RESULTFORMAT=1&amp;author1=letunic&amp;andorexacttitle=and&amp;andorexacttitleabs=and&amp;andorexactfulltext=and&amp;searchid=1&amp;FIRSTINDEX=0&amp;sortspec=relevance&amp;fdate=1/1/2006&amp;tdate=12/31/2006&amp;resourcetype=HWCIT">SMART</a> database. SMART 1 {'type' => 'domain'}
28 ncRNA genes imported from ENA ncRNA 1 {'colour_key' => '[biotype]','caption' => 'ENA Genes','label_key' => '[biotype]','name' => 'Non-coding RNA genes annotated in ENA','default' => {'MultiTop' => 'gene_label','contigviewbottom' => 'transcript_label','MultiBottom' => 'collapsed_label','contigviewtop' => 'gene_label','alignsliceviewbottom' => 'as_collapsed_label','cytoview' => 'gene_label'},'multi_caption' => 'ENA Genes','key' => 'ena_genes'}
29 Protein domains and motifs from the <a rel="external" href="http://pir.georgetown.edu/pirwww/index.shtml">PIR (Protein Information Resource)</a> Superfamily database. PIRSF 1 {'type' => 'domain'}
30 Protein domains and motifs from the <a rel="external" href="http://www.ebi.ac.uk/ppsearch/">PROSITE</a> profiles database are aligned to the genome. PROSITE patterns 1 {'type' => 'domain'}
31 HMM-Panther families Panther 1 {'type' => 'domain'}
32 Protein domains and motifs from the <a rel="external" href="http://www.ebi.ac.uk/ppsearch/">PROSITE</a> profiles database are aligned to the genome. PROSITE profiles 1 {'type' => 'domain'}
33 Protein domains and motifs in the <a rel="external" href="http://nar.oxfordjournals.org/cgi/content/full/31/1/371?maxtoshow=&amp;HITS=10&amp;hits=10&amp;RESULTFORMAT=1&amp;author1=Haft&amp;andorexacttitle=and&amp;andorexacttitleabs=and&amp;andorexactfulltext=and&amp;searchid=1&amp;FIRSTINDEX=0&amp;sortspec=relevance&amp;fdate=1/1/2003&amp;tdate=12/31/2003&amp;resourcetype=HWCIT">TIGRFAM</a> database. TIGRFAM 1 {'type' => 'domain'}
34 HAMAP is a system, based on manual protein annotation, that identifies and semi-automatically annotates proteins that are part of well-conserved families or subfamilies: the HAMAP families. HAMAP is based on manually created family rules and is applied to bacterial, archaeal and plastid-encoded proteins HAMAP 1 {'type' => 'domain'}
35 Protein fingerprints (groups of conserved motifs) are aligned to the genome. These motifs come from the <a rel="external" href="http://nar.oxfordjournals.org/cgi/content/abstract/31/1/400?maxtoshow=&amp;HITS=10&amp;hits=10&amp;RESULTFORMAT=1&amp;author1=Attwood&amp;andorexacttitle=and&amp;andorexacttitleabs=and&amp;andorexactfulltext=and&amp;searchid=1&amp;FIRSTINDEX=0&amp;sortspec=relevance&amp;resourcetype=HWCIT">PRINTS</a> database. Prints 1 {'type' => 'domain'}
36 NCBI-BlastP search against ProDom families ProDom 1 {'type' => 'domain'}
37 misc_feature feature annotated in ENA ENA features 1 {'multi_name' => 'Protein-coding Gene (MIPS)','caption' => 'Protein-coding Gene (MIPS)','label_key' => '[text_label] [display_label]','name' => 'Protein-coding Gene (MIPS)','default' => {'MultiTop' => 'gene_label','contigviewbottom' => 'transcript_label','MultiBottom' => 'collapsed_label','contigviewtop' => 'gene_label','alignsliceviewbottom' => 'as_collapsed_label','cytoview' => 'gene_label'},'key' => 'ensembl'}
38 gene feature annotated in ENA ENA features 1 {'multi_name' => 'Protein-coding Gene (MIPS)','caption' => 'Protein-coding Gene (MIPS)','label_key' => '[text_label] [display_label]','name' => 'Protein-coding Gene (MIPS)','default' => {'MultiTop' => 'gene_label','contigviewbottom' => 'transcript_label','MultiBottom' => 'collapsed_label','contigviewtop' => 'gene_label','alignsliceviewbottom' => 'as_collapsed_label','cytoview' => 'gene_label'},'key' => 'ensembl'}
469283 317101 30263615 30426490 101 162976 1
469283 317332 30811187 30929959 101 118873 1
469283 317348 31086734 31205532 101 118899 1
469283 317523 30451401 30562994 101 111694 1
469283 318770 62800643 62842997 1 42355 1
469283 318770 00000001 00042355 42356 84710 1
469283 319047 30562995 30708200 101 145306 1
469283 319456 30993469 31086733 101 93365 1
469283 339816 30426491 30451400 101 25010 1
469283 345105 30708201 30726576 1 18376 1
469283 368744 30249935 30263614 101 13780 1
469283 376992 31205533 31254640 101 49208 1
469283 469270 30929960 30993468 101 63609 1
469341 376992 1 49208 1 49208 1
469339 318770 1 84710 1 84710 1
469336 345105 1 18376 1 18376 1
469335 319456 1 93365 1 93365 1
469343 469270 1 63609 1 63609 1
469333 317523 1 111694 1 111694 1
469340 339816 1 25010 1 25010 1
469342 319047 1 145414 1 145414 1
469338 368744 1 13780 1 13780 1
469332 317348 1 118899 1 118899 1
469334 317101 1 162976 1 162976 1
469337 317332 1 118873 1 118873 1
469348 368744 657970 671649 101 13780 1
469348 317101 671650 834525 101 162976 1
469348 339816 834526 859435 101 25010 1
469348 317523 859436 971029 101 111694 1
469348 319047 971030 1116235 101 145306 1
469348 345105 1116236 1134611 1 18376 1
469348 318770 1134612 1219221 101 84710 1
469348 317332 1219222 1337994 101 118873 1
469348 469270 1337995 1401503 101 63609 1
469348 319456 1401504 1494768 101 93365 1
469348 317348 1494769 1613567 101 118899 1
469348 376992 1613568 1662675 101 49208 1
965899 965888 1 16571 1 16571 1
469271 965905 1 200 10 209 1
469271 965905 201 400 401 600 -1
469282 965905 1 100 701 800 -1
965906 469283 563 3444 31222465 31225346 1
965906 469283 397 562 31214121 31214286 1
965906 469283 100 396 31210077 31210373 1
3 2 1 235425 1 235425 1
7 8 1 349599 1 349599 1
11 12 1 5495278 1 5495278 1
4 13 1 224872 1 224872 1
1 469294 10000000 10500000 PAR 469283 30300000 30800000 1
2 469351 30500000 30599999 HAP 469283 30500000 30699999 1
1 synonym Alternate names for clone Synonyms
2 FISHmap FISH information FISH map
3 organisation Organisation sequencing clone
4 state Current state of clone
5 BACend_flag BAC end flags
6 embl_acc EMBL accession number
7 superctg Super contig id.
8 seq_len Accession length
9 fp_size FP size
10 note Note
11 positioned_by Positioned by
12 bac_acc BAC end accession
13 htg_phase HTG Phase High Throughput Genome Phase
14 toplevel Top Level Top Level Non-Redundant Sequence Region
15 _rna_edit RNA editing \N
16 _selenocysteine Selenocysteine \N
17 codon_table Codon Table Alternate codon table
18 non_ref Non Reference Non Reference Sequence Region
19 circular_seq Circular sequence Circular chromosome or plasmid molecule
1 embl_acc European Nucleotide Archive (was EMBL) accession \N
2 status Status \N
3 synonym Synonym \N
4 name Name Alternative/long name
5 type Type of feature \N
6 toplevel Top Level Top Level Non-Redundant Sequence Region
7 GeneCount Gene Count Total Number of Genes
8 KnownGeneCount Known Gene Count Total Number of Known Genes
9 PseudoGeneCount PseudoGene Count Total Number of PseudoGenes
10 SNPCount SNP Count Total Number of SNPs
11 codon_table Codon Table Alternate codon table
12 _selenocysteine Selenocysteine \N
13 bacend bacend \N
14 htg htg High Throughput phase attribute
15 miRNA Micro RNA Coordinates of the mature miRNA
16 non_ref Non Reference Non Reference Sequence Region
17 sanger_project Sanger Project name \N
18 clone_name Clone name \N
19 fish FISH location \N
21 org Sequencing centre \N
22 method Method \N
23 superctg Super contig id \N
24 inner_start Max start value \N
25 inner_end Min end value \N
26 state Current state of clone \N
27 organisation Organisation sequencing clone \N
28 seq_len Accession length \N
29 fp_size FP size \N
30 BACend_flag BAC end flags \N
31 fpc_clone_id fpc clone \N
32 KnwnPCCount protein_coding_KNOWN Number of Known Protein Coding
33 NovPCCount protein_coding_NOVEL Number of Novel Protein Coding
34 NovPTCount processed_transcript_NOVEL Number of Novel Processed Transcripts
35 PutPTCount processed_transcript_PUTATIVE Number of Putative Processed Transcripts
36 PredPCCount protein_coding_PREDICTED Number of Predicted Protein Coding
37 IGGeneCount IG_gene Number of IG Genes
38 IGPsGenCount IG_pseudogene Number of IG Pseudogenes
39 TotPsCount total_pseudogene Total Number of Pseudogenes
42 KnwnPCProgCount protein_coding_in_progress_KNOWN Number of Known Protein Coding in progress
43 NovPCProgCount protein_coding_in_progress_NOVEL Number of Novel Protein Coding in progress
44 AnnotSeqLength Annotated sequence length Annotated Sequence
45 TotCloneNum Total number of clones Total Number of Clones
46 NumAnnotClone Fully annotated clones Number of Fully Annotated Clones
47 ack Acknowledgement Acknowledgement for manual annotation
48 htg_phase High throughput phase High throughput genomic sequencing phase
49 description Description A general descriptive text attribute
50 chromosome Chromosome Chromosomal location for supercontigs that are not assembled
51 nonsense Nonsense Mutation Strain specific nonesense mutation
52 author Author Group resonsible for Vega annotation
53 author_email Author email address Author email address
54 remark Remark Annotation remark
55 transcr_class Transcript class Transcript class
56 KnwnPTCount processed_transcript_KNOWN Number of Known Processed Transcripts
57 ccds CCDS CCDS identifier
58 CCDS_PublicNote CCDS Public Note Public Note for CCDS identifier, provided by http://www.ncbi.nlm.nih.gov/CCDS
59 Frameshift Frameshift Frameshift modelled as intron
60 PTCount processed_transcript Number of Processed Transcripts
61 PredPTCount processed_transcript_PREDICTED Number of Predicted Processed Transcripts
62 ncRNA Structure RNA secondary structure line
63 skip_clone skip clone Skip clone in align_by_clone_identity.pl \N
64 coding_cnt Protein coding gene count Number of protein coding Genes
65 GeneNo_novCod novel protein_coding Gene Count Number of novel protein_coding Genes
66 GeneNo_rRNA rRNA Gene Count Number of rRNA Genes
67 pseudogene_cnt Pseudogene count Number of pseudogenes
68 GeneNo_snRNA snRNA Gene Count Number of snRNA Genes
69 GeneNo_snoRNA snoRNA Gene Count Number of snoRNA Genes
70 GeneNo_miRNA miRNA Gene Count Number of miRNA Genes
71 GeneNo_mscRNA misc_RNA Gene Count Number of misc_RNA Genes
72 GeneNo_scRNA scRNA Gene Count Number of scRNA Genes
73 GeneNo_MTrRNA Mt_rRNA Gene Count Number of Mt_rRNA Genes
74 GeneNo_MTtRNA Mt_tRNA Gene Count Number of Mt_tRNA Genes
75 GeneNo_RNA_pseu RNA_pseudogene Gene Count Number of RNA_pseudogene Genes
76 GeneNo_tRNA tRNA Gene Count Number of tRNA Genes
77 GeneNo_rettran retrotransposed Gene Count Number of retrotransposed Genes
78 GeneNo_snlRNA snlRNA Gene Count Number of snlRNA Genes
79 GeneNo_proc_tr processed_transcript Gene Count Number of processed transcript Genes
80 supercontig SuperContig name \N
81 well_name Well plate name \N
82 bacterial Bacterial \N
83 NovelCDSCount Novel CDS Count \N
84 NovelTransCount Novel Transcript Count \N
85 PutTransCount Putative Transcript Count \N
86 PredTransCount Predicted Transcript Count \N
87 UnclassPsCount Unclass Ps count \N
88 KnwnprogCount Known prog Count \N
89 NovCDSprogCount Novel CDS prog count \N
90 bacend_well_nam BACend well name \N
91 alt_well_name Alt well name \N
92 TranscriptEdge Transcript Edge \N
93 alt_embl_acc Alt European Nucleotide Archive (was EMBL) acc \N
94 alt_org Alt org \N
95 intl_clone_name International Clone Name \N
96 embl_version European Nucleotide Archive (was EMBL) Version \N
97 chr Chromosome Name Chromosome Name Contained in the Assembly
98 equiv_asm Equivalent EnsEMBL assembly For full chromosomes made from NCBI AGPs
99 GeneNo_ncRNA ncRNA Gene Count Number of ncRNA Genes
100 GeneNo_Ig Ig Gene Count Number of Ig Genes
109 HitSimilarity hit similarity percentage id to parent transcripts
110 HitCoverage hit coverage coverage of parent transcripts
111 PropNonGap proportion non gap proportion non gap
112 NumStops number of stops \N
113 GapExons gap exons number of gap exons
114 SourceTran source transcript source transcript
115 EndNotFound end not found end not found
116 StartNotFound start not found start not found
117 Frameshift Fra Frameshift modelled as intron \N
118 ensembl_name Ensembl name Name of equivalent Ensembl chromosome
119 NoAnnotation NoAnnotation Clones without manual annotation
120 hap_contig Haplotype contig Contig present on a haplotype
121 annotated Clone Annotation Status \N
122 keyword Clone Keyword \N
123 hidden_remark Hidden Remark \N
124 mRNA_start_NF mRNA start not found \N
125 mRNA_end_NF mRNA end not found \N
126 cds_start_NF CDS start not found \N
127 cds_end_NF CDS end not found \N
128 write_access Write access for Sequence Set 1 for writable , 0 for read-only
129 hidden Hidden Sequence Set \N
130 vega_name Vega name Vega seq_region.name
131 vega_export_mod Export mode E (External), I (Internal) etc
132 vega_release Vega release Vega release number
133 atag_CLE Clone_left_end Clone_lef_end feature marked in GAP database
134 atag_CRE Clone_right_end Clone_right_end feature marked in GAP database
135 atag_Misc Misc miscellaneous feature marked in GAP database
136 atag_Unsure Unsure region of uncertain DNA sequence marked in GAP database
137 MultAssem Multiple Assembled seq region Part of Seq Region is part of more than one assembly
140 wgs WGS contig WGS contig integrated into the map
141 bac AGP clones tiling path of clones
142 GeneGC Gene GC Percentage GC content for this gene
143 TotAssemblyLeng Finished sequence length Length of the assembly not counting sequence gaps
144 amino_acid_sub Amino acid substitution Some translations have been manually curated for amino acid substitiutions. For example a stop codon may be changed to an amino acid in order to prevent premature truncation, or one amino acid can be substituted for another.
145 _rna_edit rna_edit RNA edit
146 kill_reason Kill Reason Reason why a transcript has been killed
147 strip_UTR Strip UTR Transcript needs bad UTR removing
148 TotAssLength Finished sequence length Finished Sequence
149 PsCount pseudogene Number of Pseudogenes
152 TotPTCount total_processed_transcript Total Number of Processed Transcripts
153 TotPCCount total_protein_coding Total Number of Protein Coding
154 NovNcCount novel_non_coding Number of Novel Non Coding
155 KnwnPolyPsCount known_polymorphic Number of Known Polymorphic Pseudogenes
156 PolyPsCount polymorphic_pseudogene Number of Polymorphic Pseudogenes
157 TotIGGeneCount total_IG_gene Total Number of IG Genes
158 ProcPsCount proc_pseudogene Number of Processed Pseudogenes
159 UnPsCount unproc_pseudogene Number of Unprocessed Pseudogenes
160 TPsCount transcribed_pseudogene Number of Transcribed Pseudogenes
161 TECCount TEC Number of TEC Genes
162 KnwnIGGeneCount IG_gene_KNOWN Number of Known IG Genes
163 KnwnIGPsGeCount IG_pseudogene_KNOWN Number of Known IG Pseudogenes
164 IsoPoint Isoelectric point Pepstats attributes
165 Charge Charge Pepstats attributes
166 MolecularWeight Molecular weight Pepstats attributes
167 NumResidues Number of residues Pepstats attributes
168 AvgResWeight Ave. residue weight Pepstats attributes
170 initial_met Initial methionine Set first amino acid to methionine
171 NonGapHCov NonGapHCov \N
172 otter_support otter support Evidence ID that was used as supporting feature for building a gene in Vega
173 enst_link enst link Code to link a OTTT with an ENST when they both share the CDS of ENST
174 upstream_ATG upstream ATG Alternative ATG found upstream of the defined as start ATG for the transcript
175 TPPsCount transcribed_processed_pseudogene Number of Transcribed Processed Pseudogenes
176 TUPsCount transcribed_unprocessed_pseudogene Number of Transcribed Unprocessed Pseudogenes
177 UniPsCount unitary_pseudogene Number of Unitary Pseudogenes
178 KnwnTECCount TEC_KNOWN Number of Known TEC genes
179 TotTECGeneCount TEC_all Total number of TEC genes
180 TUyPsCount transcribed_unitary_pseudogene Number of Transcribed Unitary Pseudogenes
181 PolyCount polymorphic Number of Polymorphic Genes
182 KnwnPolyCount polymorphic Number of Known Polymorphic Genes
183 KnwnTRCount TR_gene_known Number of Known TR Genes
184 TRGeneCount TR_gene Number of TR Genes
185 TRPsCount TR_pseudo Number of TR Pseudogenes
186 tp_ott_support otter protein transcript support Evidence ID that was used as supporting feature for building a gene in Vega
187 td_ott_support otter dna transcript support Evidence ID that was used as supporting feature for building a gene in Vega
188 ep_ott_support otter protein exon support Evidence ID that was used as supporting feature for building a gene in Vega
189 ed_ott_support otter dna exon support Evidence ID that was used as supporting feature for building a gene in Vega
190 GeneNo_lincRNA lincRNA Gene Count Number of lincRNA Genes
191 StopGained SNP causes stop codon to be gained This transcript has a variant that causes a stop codon to be gained in at least 10 percent of a HapMap population
192 StopLost SNP causes stop codon to be lost This transcript has a variant that causes a stop codon to be lost in at least 10 percent of a HapMap population
193 GeneNo_class_I_ class_I_RNA Gene Count Number of class_I_RNA Genes
194 GeneNo_SRP_RNA SRP_RNA Gene Count Number of SRP_RNA Genes
195 GeneNo_class_II class_II_RNA Gene Count Number of class_II_RNA Genes
196 GeneNo_P_RNA RNase_P_RNA Gene Count Number of RNase_P_RNA Genes
197 GeneNo_RNase_MR RNase_MRP_RNA Gene Count Number of RNase_MRP_RNA Genes
198 lost_frameshift lost_frameshift Frameshift on the query sequence is lost in the target sequence
199 AltThreePrime Alternate three prime end The position of other possible three prime ends for the transcript
216 GeneInLRG Gene in LRG This gene is contained within an LRG region
217 GeneOverlapLRG Gene overlaps LRG This gene is partially overlapped by a LRG region (start or end outside LRG)
218 readthrough_tra readthrough transcript Havana readthrough transcripts
300 CNE Constitutive exon An exon that is always included in the mature mRNA, even in different mRNA isoforms
301 CE Cassette exon One exon is spliced out of the primary transcript together with its flanking introns
302 IR Intron retention A sequence is spliced out as an intron or remains in the mature mRNA transcript
303 MXE Mutually exclusive exons In the simpliest case, one or two consecutive exons are retained but not both
304 A3SS Alternative 3' sites Two or more splice sites are recognized at the 5' end of an exon. An alternative 3' splice junction (acceptor site) is used, changing the 5' boundary of the downstream exon
305 A5SS Alternative 5' sites Two or more splice sites are recognized at the 3' end of an exon. An alternative 5' splice junction (donor site) is used, changing the 3' boundary of the upstream exon
306 AFE Alternative first exon The second exons of each variant have identical boundaries, but the first exons do not overlap
307 ALE Alternative last exon Penultimate exons of each splice variant have identical boundaries, but the last exons do not overlap
308 II Intron isoform Alternative donor or acceptor splice sites lead to truncation or extension of introns, respectively
309 EI Exon isoform Alternative donor or acceptor splice sites leads to truncation or extension of exons, respectively
310 AI Alternative initiation Alternative choice of promoters
311 AT Alternative termination Alternative choice of polyadenylation sites
312 patch_fix Assembly Patch Fix Assembly patch that will, in the next assembly release, replace the corresponding sequence found in the current assembly
313 patch_novel Assembly Patch Novel Assembly patch that will, in the next assembly release, be retained as an alternate non-reference sequence in a similar way to haplotypes
314 LRG Locus Reference Genomic Locus Reference Genomic sequence
315 NoEvidence Evidence for transcript removed Supporting evidence for this projected transcript has been removed
316 circular_seq Circular sequence Circular chromosome or plasmid molecule
317 external_db External database External database to which seq_region name may be linked
318 split_tscript split_tscript split_tscript
319 Threep Three prime end Alternate three prime end
320 gene_cluster Gene cluster Havana annotated gene cluster
328 _rib_frameshift Ribosomal Frameshift Position and magnitude of frameshift
345 vega_ref_chrom Vega reference chromosome Haplotypes reference a regular chromosome (indicated in the value of the attribute)
346 PutPCCount protein_coding_PUTATIVE Number of Putative Protein Coding
347 proj_alt_seq Projection altered sequence Projected sequence differs from original
348 hav_gene_type Havana gene biotype Gene biotype assigned by Havana
349 GeneNo_asense antisense Gene Count Number of antisense Genes
350 GeneNo_sense_in sense_intronic Gene Count Number of sense_intronic Genes
351 GeneNo_amb_orf ambiguous_orf Gene Count Number of ambiguous_orf Genes
352 GeneNo_ret_int retained_intron Gene Count Number of retained_intron Genes
353 noncoding_cnt Non coding gene count Number of non coding genes
354 GeneNo_ncrna_h ncrna_host Gene Count Number of ncrna_host Genes
355 GeneNo_sens_ov sense_overlapping Gene Count Number of sense_overlapping Genes
356 GeneNo_3prime 3prime_overlapping Gene Count Number of 3prime_overlapping Genes
357 GeneNo_tmRNA tmRNA Gene Count Number of tmRNA Genes
358 PHIbase_mutant PHI-base mutant PHI-base phenotype of the mutants
359 GeneNo_ribozyme ribozyme Gene Count Number of ribozyme Genes
360 ncrna_host ncrna_host Havana ncrna_host gene
361 peptide-class Peptide classification The classification of the gene or transcript based on alignment to NR (values: TE WH NH)
362 working-set Working Gene Set High-confidence set of genes, composed of evidence-based genes and non-overlapping protein-coding ab initio gene models
363 filtered-set Filtered Gene Set v1 Working genes that are screened for TE content and orthology with sorghum and rice.
364 super-set Super Working Gene Set Set of all working gene set loci from both Builds 4a and 5a
365 projected4a2 Projected by alignment Temporary (Monday, August 23, 2010)
366 merged Merged species \N
367 karyotype_rank Rank in the karyotype For a given seq_region, if it is part of the species karyotype, will indicate its rank
368 noncoding_acnt Alternate non coding gene count Number of non coding genes on alternate sequences
369 coding_acnt Alternate protein coding gene count Number of protein coding genes on alternate sequences
370 pseudogene_acnt Alternate pseudogene count Number of pseudogenes on alternate sequences
371 clone_end Clone end Side of the contig on which a vector lies (enum:RIGHT, LEFT).
372 contig_scaffold Contig Scaffold Scaffold that contains mutually ordered contigs.
373 current_version Current Accession Version Identifies the most recent version of an accession.
374 seq_status Sequence Status Sequence status.
375 clone_vector Vector sequence A clone-end vector associated with a contig (enum:SP6, T7).
376 creation_date Creation date Creation date of annotation
377 update_date Update date Last update date of annotation
378 seq_date Sequence date Sequence date
379 has_stop_codon Contains stop codon Translation attribute
380 havana_cv Havana CV term Controlled vocabulary terms from Havana
381 TlPPsCount translated_processed_pseudogene Number of Translated Processed Pseudogenes
382 NoTransRefError No translations due to reference error This gene is believed to include protein coding transcripts, but no transcript has a translation due to a reference assembly error making specifying the translation impossible.
383 parent_exon_key parent_exon_key The exon key to identify a projected transcript's parent transcript.
386 parent_sid parent_sid The parent stable ID to identify a projected transcript's parent transcript. For internal statistics use only since this method does not work in all cases.
387 snoncoding_acnt Alternate short non coding gene count Number of short non coding genes on alternate sequences
388 lnoncoding_acnt Alternate long non coding gene count Number of long non coding genes on alternate sequences
389 snoncoding_cnt Short non coding gene count Number of short non coding genes
390 lnoncoding_cnt Long non coding gene count Number of long non coding genes
1 1 chromosome NCBI33 1 default_version
2 1 supercontig \N 2 default_version
3 1 clone \N 3 default_version
4 1 contig \N 4 default_version,sequence_level
6 1 chunk \N 5 default_version
7 1 alt_chrom \N 6 default_version
1 1 plasmid GCA_000292705.1 2 default_version
2 1 contig 4 default_version,sequence_level
3 1 chromosome GCA_000292705.1 1 default_version
1 1 469283 1 100 50.3
2 1 469283 101 200 68.8
3 1 469283 201 300 32
4 1 469283 301 400 90.9
5 1 469283 401 500 88.8
6 1 469283 501 600 12
7 2 469283 1 100 50.3
8 2 469283 101 200 68.8
9 2 469283 201 300 32
10 2 469283 301 400 90.9
11 2 469283 401 500 88.8
12 2 469283 501 600 12
3278356 162B08-6 ZZ13 1 GAAGCAAAACACTACAATGGCGGTGCGCTCGACGCGCC
3278355 198K08-11 ZZ13 1 TTTGATAATTTATTTCAGGCTCTGCGAGAATGAACTC
3278354 137M14-9 ZZ13 1 GCCTGTTAGATTCACCCCGGTCTCGAGTCTCTCTCTC
3278353 201K11-2 ZZ13 1 CCTTGAAATTAAGAAGAAGGGCTCGCCGCCGCCGCAC
3278352 151H04-2 ZZ13 1 GTTCAGTATGGTTTTCAAGAGGGTTAATTGAAAGAG
3278351 119K09-6 ZZ13 2 TGGAATCTCCATTATTATGAGCGCGCCGCCCCACTCC
3278350 187N08-1 ZZ13 1 GGAGGTGGATGAGAGGCCAACTCCAGCTCTCTCTCT
3278349 211G12-2 ZZ13 1 GGGCTGCGATTGCAGCTCACCCTTCGTTAGATAATCA
3278348 181D04-4 ZZ13 4 TTCTTTCTTTTTTAAAGGGGGTCACGGTCGCCAAACC
3278347 213E17-4 ZZ13 1 ACATGGGAGACATTTTGCCAAGGCAGCATCAAAAAAC
3278346 112H10-9 ZZ13 2 AGAGAGAGCTGGAGTTTTGGTAAGAGCATCCTAAGC
3278345 229D11-1 ZZ13 1 GGCTGGGATAGAGGAGGACGTCTCGAGTCTCTCTCTC
3278344 235I06-3 ZZ13 1 CGAGAAAAGGAGAGGAGCCTGCTGCCTCCGCCAGAC
3278343 191F14-10 ZZ13 1 TCCAGCTGCATGAGCAAGGGCCGGGGGCGCGTCAGACC
3278342 188P19-10 ZZ13 2 ATGGTTATGAGCTGCTCTGCGAGCGCGCGAGTGAGCA
3278341 224L18-4 ZZ13 1 GTGGCTCCAGGCATCCAGCATAAAATACATGTCTCAA
3278340 156G19-5 ZZ13 3 GAGAGAGCTGGAGTTTTTACGGAAGCTGCTGCCGCCC
3278339 189F08-4 ZZ13 1 CACGGCAATTCCTAGGTCCCGAGCGGAATAACAGGCCC
3278338 134O12-6 ZZ13 1 GGAGACTGCTGGGAGTCCTAAACTCCAGCTCTCTCTCT
3278337 219H13-10 ZZ13 1 TGCAGCCGCCCATCAAGAATGGGCTCTGGAGGCAGACC
4828567 3278337 1 469273 120635196 120635214 1 1503 1 19 1 19M L
4828568 3278337 1 469273 120640125 120640142 1 1503 20 37 1 18M R
4828699 3278342 1 469288 78423526 78423543 1 1503 1 18 1 18M L
4828700 3278342 1 469288 78482803 78482821 1 1503 19 37 1 19M R
4828469 3278343 1 469271 75028243 75028261 -1 1503 1 19 1 19M L
4828470 3278343 1 469271 75026001 75026018 -1 1503 20 37 1 18M R
4828725 3278346 1 469273 98254679 98254696 1 1503 1 18 1 18M L
4828726 3278346 1 469273 97923747 97923763 1 1503 20 36 1 17M R
4828557 3278347 1 469275 100525017 100525035 -1 1503 1 19 1 19M L
4828558 3278347 1 469275 100522073 100522091 -1 1503 19 37 1 19M R
4828761 3278348 1 469288 111529629 111529647 -1 1503 1 19 1 19M L
4828762 3278348 1 469288 111528958 111528976 -1 1503 19 37 1 19M R
4828459 3278351 1 469273 83236330 83236347 -1 1503 1 18 1 18M L
4828460 3278351 1 469273 83225874 83225891 -1 1503 19 36 1 18M R
4828401 3278355 1 469275 24825423 24825440 1 1503 1 18 1 18M L
4828402 3278355 1 469275 24838796 24838814 1 1503 19 37 1 19M R
4828583 3278356 1 469272 40626355 40626373 -1 1503 1 19 1 19M L
4828584 3278356 1 469272 40572274 40572291 -1 1503 20 37 1 18M R
161874 21717 4
161875 21717 1
161876 21716 2
161877 21716 6
161878 21717 3
161879 21716 5
161880 21716 3
161881 21716 1
161882 21716 4
161882 21717 2
161883 21718 5
161884 21718 6
161885 21718 1
161886 21718 3
161887 21718 4
161888 21718 2
161889 21719 3
161890 21719 1
161891 21719 5
161892 21719 4
161893 21719 2
161894 21720 1
161895 21720 2
161896 21721 6
161897 21722 1
161898 21722 8
161899 21721 2
161900 21722 10
161901 21721 8
161902 21721 9
161903 21721 3
161903 21722 3
161904 21722 7
161905 21721 7
161906 21722 6
161907 21722 2
161908 21722 5
161909 21721 5
161910 21721 4
161910 21722 4
161911 21721 10
161912 21721 1
161913 21722 9
161914 21723 2
161915 21723 7
161916 21723 10
161917 21723 9
161918 21723 5
161919 21723 8
161920 21723 11
161921 21723 12
161922 21723 3
161923 21723 4
161924 21723 13
161925 21723 6
161926 21723 1
161927 21724 14
161928 21724 10
161929 21724 17
161930 21724 6
161931 21724 18
161932 21724 4
161933 21724 8
161934 21724 16
161935 21724 2
161936 21724 3
161937 21724 15
161938 21724 9
161939 21724 13
161940 21724 7
161941 21724 1
161942 21724 12
161943 21724 11
161944 21724 5
161945 21725 1
161946 21726 3
161947 21726 2
161948 21726 1
161949 21727 2
161949 21728 2
161950 21727 5
161951 21727 3
161951 21728 3
161952 21727 4
161952 21728 4
161953 21728 6
161954 21728 5
161955 21727 1
161955 21728 1
161956 21728 7
161957 21729 2
161958 21729 3
161959 21729 8
161960 21729 6
161961 21729 1
161962 21729 7
161963 21729 9
161964 21729 5
161965 21729 4
161966 21731 1
161967 21731 12
161968 21731 2
161969 21730 2
161969 21731 8
161970 21730 4
161970 21731 10
161971 21731 3
161972 21730 5
161972 21731 11
161973 21730 1
161974 21730 6
161975 21731 5
161976 21730 3
161976 21731 9
161977 21731 4
161978 21731 7
161979 21731 6
161980 21733 3
161981 21732 6
161982 21733 2
161983 21732 4
161984 21733 1
161985 21732 3
161986 21732 7
161987 21732 2