Commit
b3c1f7ff
authored
16 years ago
by
edgrif
first version
parent
74f988e7
ZMAP_LACE_PROJECT/2009/zmap_lace.2009_03_06
+381
-0
381 additions, 0 deletions
==============================================================================
ZMap/Otterlace Development
Date: Thursday 12th Feb 2009
Attendees: jgrg, jla1, lw2, edgrif, st3, br2
------------------------------------------------------------------------------
CURRENT ITEMS
Items Completed
---------------
automating DE line creation
adding all checks from kj2's script to lace
3/ Mapping between assemblies, needed for the zebra fish DAS track data, is done.
High priority
-------------
1/ Tick boxes for controlled vocabulary
***** jla1 wants this to be top, top priority *****
jla1 said there is an urgent need to add "tick boxes" to the lace interface to
ensure that certain properties of annotated features can only be chosen from
a controlled vocabulary. lw2 to check whether "fragmented_loci" is included
in the tags. lw2 said all other tags are in the RT ticket: NNNNNNNNN
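A minimal sketch of the controlled vocabulary idea, not taken from lace itself. The tag names other than "fragmented_loci" are placeholders, and whether "fragmented_loci" belongs in the list is still to be checked by lw2; the real list is in the RT ticket.

```python
# Hypothetical vocabulary: only "fragmented_loci" is mentioned in the notes,
# and its membership is unconfirmed. The other names are placeholders.
ALLOWED_TAGS = frozenset({"fragmented_loci", "example_tag_a", "example_tag_b"})


def validate_tags(selected):
    """Return the selected tags that are not in the controlled vocabulary."""
    return sorted(set(selected) - ALLOWED_TAGS)


def tick_box_state(selected):
    """Map every allowed tag to True/False, as a row of tick boxes would."""
    unknown = validate_tags(selected)
    if unknown:
        raise ValueError(f"tags outside controlled vocabulary: {unknown}")
    return {tag: tag in selected for tag in sorted(ALLOWED_TAGS)}
```

Because the interface only ever offers the keys of the mapping, free-text misspellings cannot reach the database.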
1a/ Locus Finished button
st3 asked if there could be a tag on a Locus to say it was Finished,
implemented via a button so that the correct tag(s) were automatically
entered. jgrg to implement.
1b/ Clone Finished button
st3 would like a "Clone finished" button with same function as Locus Finished
button. jgrg to implement.
2/ Clone summary info / Quality Control
jgrg has done lots of work on this and summarised his progress in an
email circulated to us all.
Following on jla1 also suggested that it would be good to have
automated QC scripts trawling through the database regularly looking for
duff data. Tina Eyre wrote one that could be co-opted and st3 also has
some. This is becoming an important issue for Havana to ensure really
good quality data. Add automated checking against SwissProt for CDS.
jgrg said that much of the checking was done for annotation and he will circulate
an email summarising this. QC for saving data back to Otterlace still needs doing though.
We need an "end_missing" tag as well as the current "end_not_found" tag.
3/ SNP tracks
jla1 would like some of the DAS tracks & other data sources currently available
to be put into lace and hence zmap. jgrg said that this is not immediately
straightforward as they don't all say which assembly they are based on but some
can be done fairly soon. e.g. comparacon ? jgrg to investigate.
4/ Styles
James is working to introduce this now; the zmap code is all there. jgrg is to set a week
when he can work on this and edgrif will set aside the week also.
edgrif needs to do approx 4 weeks of acedb work in two 2-week lots so suggests
we set week beginning 16th March as the week to do styles (i.e. the week after
the RT course).
5/ Solexa reads
jla1 would like to get Solexa reads into pipeline including data from Bronwyn.
As an addition to this edgrif and rds will think about how we might give some
kind of "overview" for alignment columns that could show where the aligns are
without drawing them all.
Also requires looking at blixem to see if we can view this data in it.
In fact John Collins has initial data for gene models and confirmed introns
that can be added now without code changes. This data is coming from Simon
Whitehead.
James will indicate in display where sequence is missing.
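One possible form for the alignment "overview" mentioned above is a binned count of alignments across the region, drawn instead of the individual reads. This is only a sketch under that assumption; the function name and the (start, end) input format are hypothetical and nothing here reflects existing zmap code.

```python
def coverage_overview(aligns, region_start, region_end, n_bins):
    """Count how many alignments overlap each of n_bins equal-width bins.

    aligns is an iterable of (start, end) pairs, 1-based inclusive.
    """
    length = region_end - region_start + 1
    counts = [0] * n_bins
    for start, end in aligns:
        # Clip to the region; skip alignments entirely outside it.
        start = max(start, region_start)
        end = min(end, region_end)
        if start > end:
            continue
        first = (start - region_start) * n_bins // length
        last = (end - region_start) * n_bins // length
        for b in range(first, last + 1):
            counts[b] += 1
    return counts
```

The display then needs only n_bins numbers per column, however many reads there are.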
6/ Alias/renaming of Loci
lw2 to contact MGI as there are problems with IDs from them.
7/ clone path
lw2 would like the full clone extents displayed with the non-golden sections displayed.
Do we need the clone ends information for this, edgrif to check ?? Check with Leo's
smapped example with several sections of a single clone...
Would like this info. in navigator panel + navigator panel needs to display both
the foocanvas scrolled window area _and_ the actual area on the screen, and both
should be draggable...
edgrif will do tile path information/display, rds will do navigator bit.
8/ multi-view interactions
kj2 would like to click on a feature in one view and see it highlighted in another
so that she can look for genes present in more than one clone. edgrif to do this.
9/ RT numbers
It was agreed that where possible RT ticket numbers would be included in the
meeting notes. lw2, edgrif, jgrg to look up numbers.
NEEDS DOING.....
10/ feature grouping tags (e.g. for 5' and 3' EST read pairs)
wormdb uses paired tags specific to EST read pairs but we need a more flexible
generalisation of this to handle multiple features and different types of
feature. jgrg's group have been working on filtering hits in a better way and
so have more information about grouping for display.
Ed, James, Graham and Roy met to discuss design and edgrif has implemented
code in zmap to support this but adding it to the system will require a
time when both he and James can work on it.
11/ Naming of Alternative Alleles
st3 asked about naming of alternative alleles in different mouse strains / human
haplotypes. For loci that don't have HGNC/MGI names, these are incorrectly named after
the clones on the reference sequence. jla1 suggested correctly naming them after the
clones they are on, but making sure that the annotators can see the associated
'reference assembly' gene. st3 said this could be done via the alt_allele table, and
if it were done across the board, i.e. including KNOWN genes, then this would make Vega
prep easier.
Medium priority
---------------
0/ new column bump to show inconsistent matches
Often an annotator has many matches that fit against an existing transcript; it would
be good to have a mode that hid these and only showed the ones inconsistent with the
transcript's splices.
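A sketch of how "inconsistent" might be decided, assuming a match counts as consistent when every gap between its aligned blocks coincides exactly with an intron of the transcript. That definition, the function names and the data layout are assumptions for illustration, not the agreed design.

```python
def introns(blocks):
    """Return the set of (start, end) gaps between consecutive sorted blocks."""
    blocks = sorted(blocks)
    return {(a_end + 1, b_start - 1)
            for (_, a_end), (b_start, _) in zip(blocks, blocks[1:])}


def inconsistent_matches(transcript_exons, matches):
    """Return names of matches with at least one splice the transcript lacks.

    matches maps a match name to its list of (start, end) aligned blocks.
    """
    known = introns(transcript_exons)
    return [name for name, blocks in matches.items()
            if not introns(blocks) <= known]
```

The bump mode would then draw only the returned matches and hide the rest. Single-block matches have no gaps and so are treated as consistent here.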
1/ dotter error messages
lw2 said that sometimes dotter just does not appear. edgrif to check that dotter
is reporting errors properly and to make sure they show in dialog windows not on
the terminal which is often not available to the annotator.
2/ removing evidence already used *************
annotators would like to be able to remove from display homologies that
have already been used to annotate variants etc. Does this need to be
persistent in the database in some way ?? edgrif & jgrg will get
together to arrange this via styles so it can persist in a natural way
in the database.
**24526: Showing which evidence has been used
Differential coloring of matches that have been used already as evidence
for a transcript
mainly requires jgrg to mark features and then tell zmap to move the features
to a new column or repaint them with a new style.
3/ Locus list
jgrg to provide a list of loci as another tab window. + searching on ensembl ids.
5/ bug in acedb server
jgrg raised a bug in the server which was causing it to run out of memory, edgrif
to investigate. There is a ticket for this: 51894
edgrif to make sure jgrg has up to date binaries for dotter etc.
6/ popups/labels for transcripts
jla1 said that apollo had a neat way of showing a label for a transcript
that remained in one place on the screen as the window was scrolled. edgrif
to investigate + look at "tool tips" for transcripts....especially with
locus information.
7/ Best in Genome matches
jla1 also said she would like "best in genome" matches displayed. jgrg said this
is not easy as Otterlace works on a clone by clone basis. It was agreed
that it would be worthwhile to show at least "best in clone" or better to do
a crude "best in genome".
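As an illustration of "best in clone", the sketch below keeps the highest-scoring hit for each query sequence within one clone. The tuple layout and the use of a single score field are assumptions; the real criterion for "best" was not decided at the meeting.

```python
def best_in_clone(hits):
    """Keep the highest-scoring hit per query name.

    hits is an iterable of (query_name, start, end, score) tuples,
    all taken from the same clone.
    """
    best = {}
    for hit in hits:
        name, score = hit[0], hit[3]
        if name not in best or score > best[name][3]:
            best[name] = hit
    return best
```

A crude "best in genome" could apply the same function to hits pooled from several clones, provided the scores are comparable.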
------------------------------------------------------------------------------
BACK-BURNER ITEMS
ZMap/acedb
----------
1/ Interface issues:
jla1 and lw2 said they would like the marked area to be less obvious and also to
be a "greying" out rather than blue and with less dense dots. edgrif to implement.
jla1 said she would like to be able to click on an exon and see evidence (and
transcripts ?) with the same splice be highlighted. laurens also wants this
as it would often avoid having to open dotter to check.
2/ Display of multiple compara alignments
multiple alignments: edgrif is about a third of the way through implementing a
more general way of displaying arbitrary blocks. This will become a high
priority item as we move to haplotypes etc.
th said this would be needed soon so it should be moved up the priority list.
jgrg said they have mappings in lace that could be passed on to zmap easily
and also said that annotators can already annotate assemblies from variants
and different species alongside each other as needed.
We need to decide on the format for specifying the alignments.
3/ alternative translations: edgrif about half way through code to do this.
edgrif is doing this as part of the protein search code since this code
does translations itself. edgrif will talk to jgrg about how alternative
genetic codes can be specified with acedb.
We need a test database for this. jgrg said this would come soon.
edgrif will add field to transcript feature to hold alternative translation
table.
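To show what a per-transcript translation table field implies, here is a small sketch that translates the same DNA with the standard genetic code and with NCBI translation table 2 (vertebrate mitochondrial). The function and variable names are hypothetical and this is not the zmap or acedb code.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code in TCAG order (NCBI translation table 1).
STANDARD_AAS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD = {"".join(codon): aa
            for codon, aa in zip(product(BASES, repeat=3), STANDARD_AAS)}

# Vertebrate mitochondrial code (NCBI table 2), expressed as overrides.
VERT_MITO = {**STANDARD, "AGA": "*", "AGG": "*", "ATA": "M", "TGA": "W"}


def translate(dna, table=STANDARD):
    """Translate complete codons of dna; unknown codons become 'X'."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(table.get(dna[i:i + 3], "X") for i in range(0, usable, 3))
```

Storing only the overrides relative to the standard code keeps each alternative table short, which may suit a per-transcript field.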
4/ Blixem enhancements
two areas:
- display multiple overlapping transcripts better (includes removing the many
yellow lines introduced by this...clarify this point), have a scrolled window
of the transcripts. jgrg said that perhaps only the transcripts made by havana
should be displayed. jla1 said she would like to be able to dynamically update
the transcripts displayed.
- better interaction with zmap, e.g. click on things in zmap and see them
highlighted in blixem and vice versa....
we had better have a more generalised protocol for communicating with external
programs....
- blixem: dna searching is NOT DONE, edgrif to expedite. Also protein searches
will be added.
Perhaps one way to get this done would be to employ a good C programmer on a
short contract.
5/ acedb server performance
edgrif investigating two possibilities for improving performance:
- make sgifaceserver stream data rather than batch it up, would
save a lot of memory.
- deferred loading, only load features when needed and load in
zone requested by user....design done...now need to implement.
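The two ideas above can be sketched as follows: a generator that hands over features one at a time rather than building a full batch, and a column that only calls its loader when a zone is first requested. The line format, class name and loader callable are all hypothetical; the real work would be in sgifaceserver and zmap.

```python
def stream_features(lines):
    """Yield parsed (name, start, end) tuples one at a time.

    Nothing is accumulated, so memory use does not grow with the input.
    """
    for line in lines:
        name, start, end = line.split()
        yield name, int(start), int(end)


class DeferredColumn:
    """Load features for a zone only when that zone is first requested."""

    def __init__(self, loader):
        self._loader = loader   # hypothetical callable: (start, end) -> list
        self._cache = {}

    def features(self, start, end):
        key = (start, end)
        if key not in self._cache:
            self._cache[key] = self._loader(start, end)
        return self._cache[key]
```

Repeated requests for the same zone are served from the cache, so the server is contacted once per zone.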
6/ A new canvas
rds has been looking at alternative canvas implementations which offer an MVC
model. He has managed to get goocanvas developers to fix some bugs and make
some changes to support our needs.
the goocanvas MVC model will mean we do not have to copy data to split windows
meaning greatly reduced memory usage.
the goocanvas will cope automatically with the X Windows window size limit, this
combined with changes in the gtk scrolling model means we will be able to do away
with having two scroll bars.
We will introduce the new canvas this year.
Otterlace
---------
1/ Alternative alignment programs
There has been some discussion about using splice aware alignment programs.
jgrg is waiting for a fix to exonerate to support the new pipeline mustapha
has written.
edgrif and jgrg both commented that some changes to acedb data structures
would be needed to represent both HSPs that are "joined up" and
protein matches that start part of the way through a peptide. BUT one
possibility would be for zmap to access this data directly from a mysql
database thus sidestepping the need to put it in acedb first. gffv3 will also
be needed to represent this kind of joined up HSP data in a natural and
robust way.
Changes will also be required to represent codons that are spliced across
introns as perhaps surprisingly none of the acedb programs can cope with
this currently (and neither can zmap).
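To make the split-codon problem concrete, the sketch below joins exon sequences and reports which codons straddle an exon boundary. It assumes a forward-strand CDS that starts in phase 0 at the first base of the first exon; the names are hypothetical.

```python
def spliced_cds(genomic, exons):
    """Join exon sequences; exons are 1-based inclusive (start, end) pairs."""
    return "".join(genomic[s - 1:e] for s, e in sorted(exons))


def split_codons(exons):
    """Return 0-based indices of codons that span an exon boundary."""
    split, offset = [], 0
    for s, e in sorted(exons)[:-1]:
        offset += e - s + 1          # CDS bases consumed so far
        if offset % 3:               # boundary falls inside a codon
            split.append(offset // 3)
    return split
```

For example, with genomic sequence "ATGGCgtaagTTAA" and exons (1, 5) and (11, 14), the spliced CDS is ATG GCT TAA and the second codon (index 1) takes GC from the first exon and T from the second.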
2/ Spell checker
jla1 reported a problem that free text fields and some fixed text fields
have misspellings (is that a mis-spelling ?) and it would be good to have
some autocorrection facility. The ideal would be to have some widget that
allowed other dictionaries (e.g. science) to be attached to it and could thus
be used as a general text entry tool.
3/ Sequence exceptions
kj2 raised the subject of how to indicate sequence exceptions,
e.g. when bases are skipped in translations. kj2 wondered if alternative
translations could be registered as sequence exceptions, edgrif said he would
prefer a separate mechanism as much of the code is already done for this.
We should therefore include a mechanism in zmap for sequence exceptions,
this would require a similar mechanism in acedb. This is yet another reason
for GFF 3 which has standards for frame shifts and other things.
There should be a way of tagging transcripts where there are sequence
exceptions.
------------------------------------------------------------------------------
Next Meeting
Will be at 2pm, 19th March 2009
==============================================================================