From 393162cb8dd6914030184b3c4ce1472c67b0e485 Mon Sep 17 00:00:00 2001 From: edgrif <edgrif> Date: Fri, 13 Feb 2009 13:03:07 +0000 Subject: [PATCH] moved from parent dir. --- ZMAP_LACE_PROJECT/2006/zmap_lace.2006_06_21 | 150 +++++++ ZMAP_LACE_PROJECT/2006/zmap_lace.2006_07_12 | 181 ++++++++ ZMAP_LACE_PROJECT/2006/zmap_lace.2006_07_26 | 111 +++++ ZMAP_LACE_PROJECT/2006/zmap_lace.2006_09_13 | 160 +++++++ ZMAP_LACE_PROJECT/2006/zmap_lace.2006_10_11 | 140 ++++++ ZMAP_LACE_PROJECT/2006/zmap_lace.2006_11_02 | 174 ++++++++ ZMAP_LACE_PROJECT/2006/zmap_lace.2006_11_29 | 310 +++++++++++++ ZMAP_LACE_PROJECT/2007/zmap_lace.2007_01_24 | 239 ++++++++++ ZMAP_LACE_PROJECT/2007/zmap_lace.2007_02_01 | 203 +++++++++ ZMAP_LACE_PROJECT/2007/zmap_lace.2007_02_23 | 360 +++++++++++++++ ZMAP_LACE_PROJECT/2007/zmap_lace.2007_03_14 | 264 +++++++++++ ZMAP_LACE_PROJECT/2007/zmap_lace.2007_04_13 | 300 +++++++++++++ ZMAP_LACE_PROJECT/2007/zmap_lace.2007_04_23 | 234 ++++++++++ ZMAP_LACE_PROJECT/2007/zmap_lace.2007_05_09 | 234 ++++++++++ ZMAP_LACE_PROJECT/2007/zmap_lace.2007_06_06 | 265 +++++++++++ ZMAP_LACE_PROJECT/2007/zmap_lace.2007_06_14 | 126 ++++++ ZMAP_LACE_PROJECT/2007/zmap_lace.2007_07_04 | 223 ++++++++++ ZMAP_LACE_PROJECT/2007/zmap_lace.2007_08_15 | 190 ++++++++ ZMAP_LACE_PROJECT/2007/zmap_lace.2007_09_12 | 211 +++++++++ ZMAP_LACE_PROJECT/2007/zmap_lace.2007_09_26 | 218 +++++++++ ZMAP_LACE_PROJECT/2007/zmap_lace.2007_10_11 | 253 +++++++++++ ZMAP_LACE_PROJECT/2007/zmap_lace.2007_10_25 | 246 ++++++++++ ZMAP_LACE_PROJECT/2007/zmap_lace.2007_11_08 | 306 +++++++++++++ ZMAP_LACE_PROJECT/2007/zmap_lace.2007_11_22 | 283 ++++++++++++ ZMAP_LACE_PROJECT/2007/zmap_lace.2007_12_06 | 324 ++++++++++++++ ZMAP_LACE_PROJECT/2008/zmap_lace.2008_01_10 | 389 ++++++++++++++++ ZMAP_LACE_PROJECT/2008/zmap_lace.2008_02_07 | 469 ++++++++++++++++++++ ZMAP_LACE_PROJECT/2008/zmap_lace.2008_02_21 | 445 +++++++++++++++++++ ZMAP_LACE_PROJECT/2008/zmap_lace.2008_03_06 | 456 +++++++++++++++++++ 
ZMAP_LACE_PROJECT/2008/zmap_lace.2008_03_20 | 432 ++++++++++++++++++ ZMAP_LACE_PROJECT/2008/zmap_lace.2008_04_16 | 421 ++++++++++++++++++ ZMAP_LACE_PROJECT/2008/zmap_lace.2008_05_08 | 409 +++++++++++++++++ ZMAP_LACE_PROJECT/2008/zmap_lace.2008_05_22 | 426 ++++++++++++++++++ ZMAP_LACE_PROJECT/2008/zmap_lace.2008_06_05 | 425 ++++++++++++++++++ ZMAP_LACE_PROJECT/2008/zmap_lace.2008_06_26 | 422 ++++++++++++++++++ ZMAP_LACE_PROJECT/2008/zmap_lace.2008_07_10 | 421 ++++++++++++++++++ ZMAP_LACE_PROJECT/2008/zmap_lace.2008_07_24 | 452 +++++++++++++++++++ ZMAP_LACE_PROJECT/2008/zmap_lace.2008_08_07 | 452 +++++++++++++++++++ ZMAP_LACE_PROJECT/2008/zmap_lace.2008_09_25 | 450 +++++++++++++++++++ ZMAP_LACE_PROJECT/2008/zmap_lace.2008_10_09 | 417 +++++++++++++++++ ZMAP_LACE_PROJECT/2008/zmap_lace.2008_10_23 | 377 ++++++++++++++++ ZMAP_LACE_PROJECT/2008/zmap_lace.2008_11_20 | 376 ++++++++++++++++ ZMAP_LACE_PROJECT/2008/zmap_lace.2008_12_04 | 376 ++++++++++++++++ ZMAP_LACE_PROJECT/2009/zmap_lace.2009_01_15 | 373 ++++++++++++++++ ZMAP_LACE_PROJECT/2009/zmap_lace.2009_01_29 | 369 +++++++++++++++ 45 files changed, 14062 insertions(+) create mode 100755 ZMAP_LACE_PROJECT/2006/zmap_lace.2006_06_21 create mode 100755 ZMAP_LACE_PROJECT/2006/zmap_lace.2006_07_12 create mode 100755 ZMAP_LACE_PROJECT/2006/zmap_lace.2006_07_26 create mode 100755 ZMAP_LACE_PROJECT/2006/zmap_lace.2006_09_13 create mode 100755 ZMAP_LACE_PROJECT/2006/zmap_lace.2006_10_11 create mode 100755 ZMAP_LACE_PROJECT/2006/zmap_lace.2006_11_02 create mode 100755 ZMAP_LACE_PROJECT/2006/zmap_lace.2006_11_29 create mode 100755 ZMAP_LACE_PROJECT/2007/zmap_lace.2007_01_24 create mode 100755 ZMAP_LACE_PROJECT/2007/zmap_lace.2007_02_01 create mode 100755 ZMAP_LACE_PROJECT/2007/zmap_lace.2007_02_23 create mode 100755 ZMAP_LACE_PROJECT/2007/zmap_lace.2007_03_14 create mode 100755 ZMAP_LACE_PROJECT/2007/zmap_lace.2007_04_13 create mode 100755 ZMAP_LACE_PROJECT/2007/zmap_lace.2007_04_23 create mode 100755 
ZMAP_LACE_PROJECT/2007/zmap_lace.2007_05_09 create mode 100755 ZMAP_LACE_PROJECT/2007/zmap_lace.2007_06_06 create mode 100755 ZMAP_LACE_PROJECT/2007/zmap_lace.2007_06_14 create mode 100755 ZMAP_LACE_PROJECT/2007/zmap_lace.2007_07_04 create mode 100755 ZMAP_LACE_PROJECT/2007/zmap_lace.2007_08_15 create mode 100755 ZMAP_LACE_PROJECT/2007/zmap_lace.2007_09_12 create mode 100755 ZMAP_LACE_PROJECT/2007/zmap_lace.2007_09_26 create mode 100755 ZMAP_LACE_PROJECT/2007/zmap_lace.2007_10_11 create mode 100755 ZMAP_LACE_PROJECT/2007/zmap_lace.2007_10_25 create mode 100755 ZMAP_LACE_PROJECT/2007/zmap_lace.2007_11_08 create mode 100755 ZMAP_LACE_PROJECT/2007/zmap_lace.2007_11_22 create mode 100755 ZMAP_LACE_PROJECT/2007/zmap_lace.2007_12_06 create mode 100755 ZMAP_LACE_PROJECT/2008/zmap_lace.2008_01_10 create mode 100755 ZMAP_LACE_PROJECT/2008/zmap_lace.2008_02_07 create mode 100755 ZMAP_LACE_PROJECT/2008/zmap_lace.2008_02_21 create mode 100755 ZMAP_LACE_PROJECT/2008/zmap_lace.2008_03_06 create mode 100755 ZMAP_LACE_PROJECT/2008/zmap_lace.2008_03_20 create mode 100755 ZMAP_LACE_PROJECT/2008/zmap_lace.2008_04_16 create mode 100755 ZMAP_LACE_PROJECT/2008/zmap_lace.2008_05_08 create mode 100755 ZMAP_LACE_PROJECT/2008/zmap_lace.2008_05_22 create mode 100755 ZMAP_LACE_PROJECT/2008/zmap_lace.2008_06_05 create mode 100755 ZMAP_LACE_PROJECT/2008/zmap_lace.2008_06_26 create mode 100755 ZMAP_LACE_PROJECT/2008/zmap_lace.2008_07_10 create mode 100755 ZMAP_LACE_PROJECT/2008/zmap_lace.2008_07_24 create mode 100755 ZMAP_LACE_PROJECT/2008/zmap_lace.2008_08_07 create mode 100755 ZMAP_LACE_PROJECT/2008/zmap_lace.2008_09_25 create mode 100755 ZMAP_LACE_PROJECT/2008/zmap_lace.2008_10_09 create mode 100755 ZMAP_LACE_PROJECT/2008/zmap_lace.2008_10_23 create mode 100755 ZMAP_LACE_PROJECT/2008/zmap_lace.2008_11_20 create mode 100755 ZMAP_LACE_PROJECT/2008/zmap_lace.2008_12_04 create mode 100755 ZMAP_LACE_PROJECT/2009/zmap_lace.2009_01_15 create mode 100755 ZMAP_LACE_PROJECT/2009/zmap_lace.2009_01_29 
diff --git a/ZMAP_LACE_PROJECT/2006/zmap_lace.2006_06_21 b/ZMAP_LACE_PROJECT/2006/zmap_lace.2006_06_21 new file mode 100755 index 000000000..c99d96d34 --- /dev/null +++ b/ZMAP_LACE_PROJECT/2006/zmap_lace.2006_06_21 @@ -0,0 +1,150 @@ +============================================================================== +ZMap/Otterlace Development + + +Date: Wens 21st June 2006 + +Attendees: th, jla1, kj2, jgrg, edgrif + + +1) otterlace + zmap +=================== + + +Phase 1 +------- + +Strategy is to have Liz test for now as she is an experienced user +and we will be able to respond to her suggestions quickly. + +For this phase lace is set up so that xace/fmap and zmap can be +used side by side for comparison. + +ZMap issues arising: + + locus names missing + bumping (maybe solved, not tested) + 3 frame translation (roy) + spandit (fixed, not tested) + keyboard shortcuts: check against spandit + +otterlace issues: + + needs new tk for clone_descriptions + 'annotated' should be a status, not text + + +Phase 2 +------- + +Extend usage to other experienced but more varied users, Charlie and +Gavin suggested as good candidates. + + +Genefinding +----------- + +Some annotators make considerable usage of the built in genefinder in +fmap, acedb can export these features, zmap needs some additional drawing +routines to display them. + + +Linux/Mac builds only +--------------------- + +Currently because of /usr/local issues zmap cannot be built on the Alpha, +unless there is some big change in requirements zmap will only be supported +on Linux and the Mac. + + +Suggestion/Bug tracking +----------------------- + +Both acedb and zmap use the Request Tracker system, send emails to "acedb-bug" +and "zmap" respectively. + +Queues should also be set up for vega (with external connection) and anacode. + + +Xace usage +---------- + +Currently there is still a small and decreasing need for write access via +xace to allow comments/notes to be added to some acedb objects. 
This will +need to disappear when xace is removed. James is working on this. + + + +ZMap development issues +----------------------- + +- multiple segments: + +A lot of discussion about how this would work, there are several different +requirements and several ways this could work. + +James would like to keep the way lace extracts data (i.e. as a single +contiguous segment) the same as he has spent much time producing code +that does just that in a reliable way. Ed said that zmap could extract +regions from this in a way that would be useful to the annotator. + +When it comes to multiple alignments it may be best for zmap to retrieve +them from several separate otterlace databases. + +This needs much more discussion/refinement. + + +- navigation in zmap: + +Issue is gestures to allow easy navigation, James suggested a 'back' button +(like browser) going back to previous zoom and/or position ? + +Jen would like to be able to skip from exon to exon, Ed suggested overview +of gene with second pane that shows just one exon but can skip to the next. + + + + +Otterlace +--------- + +- External rollout: + +last non http removed +ssh access first +SSL access externally later +Jen needs by August + + +- DAS issues: + +otterlace can display das sources on clone +most of das is on chromosome coordinates +Jen wants to see ditags +James/Tim will work out a solution for this by time of next meeting + + +- Ensembl + +Conservation plots, when are they coming? +TH to check: Jen says Adam S's plan is excellent. 
+ + +New schema: + +end of July +more intelligent comparison of output + + +Realign_offtrack_genes + +producing random shifts +use previous versions of annotation to find large shifts +either +by genome genome alignment +by tracking missing evidence + +new schema will address this + +mouse is priority, but also problem in human (CCDS, merge) +============================================================================== diff --git a/ZMAP_LACE_PROJECT/2006/zmap_lace.2006_07_12 b/ZMAP_LACE_PROJECT/2006/zmap_lace.2006_07_12 new file mode 100755 index 000000000..2e5220d7a --- /dev/null +++ b/ZMAP_LACE_PROJECT/2006/zmap_lace.2006_07_12 @@ -0,0 +1,181 @@ +============================================================================== +ZMap/Otterlace Development + + +Date: Wens 12th July 2006 + +Attendees: th, jla1, kj2, jgrg, edgrif + + +1) otterlace + zmap progress +============================ + + +new features +------------ + +- display of locus names, supposed to be done. + +- bumping, supposed to be done + +both these items require new version, this highlighted +the following points: + +1) zmap team must deliver regular updates to anacode in a controlled +and visible way (via RT ticket pointing James to latest release). + +2) anacode team must then release new zmap + their code quickly. + +3) havana guinea pig testers need to ensure zmap + anacode teams +appreciate urgency/priority of requests, e.g. "I can't work without +this item". + + + +- liz's treeview editing requests + +Liz needs to see extra data about objects when editing, +James/Ed agreed that the correct way to do this is to have zmap +communicate the selected object id to lace and then for lace to +display the information in a separate window. This would mirror +exactly the existing situation whereby Liz selects an object +in fmap and its details are displayed in the acedb treeview display. + +This is a priority, since testing of zmap is blocked until this is +done. 
Roy has done his bit so that feature information is passed +back to lace, James to implement lace code in the next week. + + +- RT queues + +We need zmap/anacode queues linked so tickets can be transferred +between the two. + +James has started process of setting up RT - will be default for +anacode email. anacode-people for people. + +RT tickets: can't see other people's by default? Would be better +if you can. + + +- match/alignment display + +kerstin would like to be able to cluster/join up alignments that clearly +are all part of the same match. Ed says this will require a more generalised +mechanism than exists in acedb currently. We could use the Join_homols tag +with an Int field: + + Join_homols nnnn // where nnnn specifies an upper threshold in align + // gap for joining two consecutive homols of the + // same match. + + + +- blixem + +search for sequence string? Horrible code, so just simple searching [ed] + +fetching sequences for blixem: pfetch quite fast now, but would be +faster if pfetch cache? either in acedb or separate [] + + +- acedb compilation [ed]: + +compiling to gtk2 - now working +building for mac/linux; universal binaries soon. + + +- external users: + +requires version checking [james] +under apache can run multiple servers, so can handle version transition. + +- selenocysteine/alternative genetic codes : + +ed will check acedb code and make sure zmap can do the alternative +translations. + +james will implement selenocysteine in new schema + + +- zmap/otterlace demo + +In one week's time: Wednesday 19th. + + +- Zmap/otterlace new guinea pigs + +Charlie has been using it but stopped because he needs a bit of +help in orientation...Ed/Roy need to help here. + +Gavin would also be good but needs the Gene Finder Features, +Ed to expedite this...but will check with Kerstin/Gavin to +make sure zmap provides the right display. 
+ + +- multiple alignments + +Ed said he has done more work on the flexibility of zmap to +display these but is still unsure (as are we all ??) about +exactly what sort of display is required. Zmap plan is to +produce a straight forward display and get feedback. + + + + +2) other matters +================ + + +- DAS source and builds + +TH: DAS source for ditag data within 1 week (19th) +TH: will build 19_36 mouse db + 19_36 human db + + +- genelists were discussed: + +looked at genetracker developed by Roger/webteam: + +http://intweb.sanger.ac.uk/cgi-bin/utils/genetracker + +not being able to add extra columns a major issue. + + +- extra tags + +Discussion about tags on clones and genes indicating status. Need +some long term solution for this. Unclear if this should be in the +gene list tracking system or in otterlace. + +For now, using Annotated_remark- annotated and Annotated_remark- +inprogress for clones. Should be possible to set this status from +the interface [James] + +Similar tags for states for genes should also be added. + + + +- EMBL dumping + +Needs to be possible for annotators to submit a clone on demand [James] + +EMBL dumping needs to be aware of gene + clone status tags. If +genes are annotated, but clone is not set to 'annotated' should add a +comment to the header to warn that the annotation for the clone is +currently partial. + +Need to be able to lock objects (transcripts) if a tag is set - +i.e. 
CCDS tag [james says easy to fix] + + +============================================================================== + +-- + ------------------------------------------------------------------------ +| Ed Griffiths, Acedb development, Informatics Group, | +| The Morgan Building, Sanger Institute, Wellcome Trust Genome Campus | +| Hinxton, Cambridge CB10 1HH | +| | +| email: edgrif@sanger.ac.uk Tel: +44-1223-496844 Fax: +44-1223-494919 | + ------------------------------------------------------------------------ diff --git a/ZMAP_LACE_PROJECT/2006/zmap_lace.2006_07_26 b/ZMAP_LACE_PROJECT/2006/zmap_lace.2006_07_26 new file mode 100755 index 000000000..3f8265e5c --- /dev/null +++ b/ZMAP_LACE_PROJECT/2006/zmap_lace.2006_07_26 @@ -0,0 +1,111 @@ +============================================================================== +ZMap/Otterlace Development + + +Date: Wens 26th July 2006 + +Attendees: th, jla1, kj2, jgrg, edgrif + + +1) otterlace + zmap progress +============================ + + +new features +------------ + +- locus names, edgrif and eah agreed this was done but may need to add +names as text to display. + + +- bumping, the compact algorithm was discussed and the consensus was that +it will not do the job. Instead edgrif will write a variant of this that +does not allow matches to be interleaved. Matches from the same est will +have a background colour painted between them to make it obvious which +matches go with which. + + +- tickets: The only item to be resolved is that users cannot see all tickets +in a queue, this is essential. edgrif will submit a request to get this done +for acedb, anacode and zmap. + +- ests: + add 'intron' where a feature perfectly aligns over a splice, zmap needs to look + through matches and join up ones whose coords are completely correct. + + +- glyph for single base features: we need scale independent glyphs for a number +of features including the gene finder features. 
edgrif or rds will implement a +new scale independent drawing type as an add on for foocanvas, this is the most +natural/efficient way to implement this. + + +- alternative translations: edgrif about half way through code to do this. + + +- Gene finder features: we can export this data but we need a new drawing type +(see glyph item). + + +- multiple alignments: edgrif is about a third of the way through implementing +a more general way of displaying arbitrary blocks. + + +- zmap needs to get sequences that are in acedb from there for blixem, means +issuing a call to the server to get them but this is possible. NOTE that some +sequences are _only_ in acedb. + + +- we need a "back" button + + +Builds +------ + +- gtk2: is now done for acedb and zmap, all code now builds on the same level of +gtk2. BUT there are now no more alpha builds of either, only the mac and linux +are supported. + +- universal binaries: requires upgrade of software on our laptop which we will now +do. + + + +Blixem/pfetch +-------------- + +- blixem: dna searching is NOT DONE, edgrif to expedite. + + +- pfetch: caching? NOT DONE + where should we cache ? + + + +2) Other matters +================ + +- client/server: version checking NOT DONE + + +- alignments for genomic alignments: ?? 
+ + +- das - NOT DONE (th) + + +- genelists: + done for zebrafish + using mig list for mouse + no human + needs multiple list interfaces + + +- interface for tags: NOT DONE + + +- submit button for EMBL dumping + + +- gene_fragments tag - broken + +============================================================================== diff --git a/ZMAP_LACE_PROJECT/2006/zmap_lace.2006_09_13 b/ZMAP_LACE_PROJECT/2006/zmap_lace.2006_09_13 new file mode 100755 index 000000000..2b4626a32 --- /dev/null +++ b/ZMAP_LACE_PROJECT/2006/zmap_lace.2006_09_13 @@ -0,0 +1,160 @@ +============================================================================== +ZMap/Otterlace Development + + +Date: Wens 13th September 2006 + +Attendees: th, jla1, jgrg, edgrif + + +1) otterlace + zmap progress +============================ + + +new features +------------ + +- reverse complement code, the code is there and working in the test +otterlace system. BUT there is a bug in the interface to blixem which +results in blixem freezing when used on a reverse complemented sequence. + + +- for a combination of reasons the reverse complement fix did not make +it through to the annotators for 3 weeks. To try and fix this we will +do the following: + + 1) with each new release of the lace/zmap system we (a lace + person and a zmap person) will sit with one of the annotators + to check that everything is working + + 2) publish a list of the changes/fixes for that release + + +- 3 frame translation is high priority, Ed is working on this and will +have something working by the end of this week so that there can be a +release next week. + + +- Navigator with clickable locus names and a display of the clone assembly +is high priority, Roy is working on this now and it should be ready for +the next release. We need to allow the user to search for locus names within +the navigator window. + + +- locus names, as well as having locus names in the navigator, we need them +in a column in the main display. 
+ + +- bumping, the compact algorithm was discussed and the consensus was that +it will not do the job. Instead edgrif will write a variant of this that +does not allow matches to be interleaved. Matches from the same est will +have a background colour painted between them to make it obvious which +matches go with which. + + +- the release notes for each release and the code version numbers will be +put in a web page that will be viewable from either the lace or zmap +programs as a main menu item. This will help the annotators to be sure of +which version they are using and of the changes/updates made. + + +- ests: + add 'intron' where a feature perfectly aligns over a splice, zmap needs to look + through matches and join up ones whose coords are completely correct. + This is still being worked on. + + +- glyph for single base features: we need scale independent glyphs for a number +of features including the gene finder features. edgrif or rds will implement a +new scale independent drawing type as an add on for foocanvas, this is the most +natural/efficient way to implement this. +This has been implemented and we can display gene finder features, we need the +3 frame translation first before this can be properly used. + + +- alternative translations: edgrif about half way through code to do this. + + +- multiple alignments: edgrif is about a third of the way through implementing +a more general way of displaying arbitrary blocks. + + +- zmap needs to get sequences that are in acedb from there for blixem, means +issuing a call to the server to get them but this is possible. NOTE that some +sequences are _only_ in acedb. + + +- we need a "back" button + + +Builds +------ + +- universal binaries: we have an updated mac laptop with the latest XCode set up +which will allow us to produce the binaries. + + + +Blixem/pfetch +-------------- + +- blixem: dna searching is NOT DONE, edgrif to expedite. + + +- pfetch: caching? NOT DONE + where should we cache ? 
+ + + +multiple species +---------------- + +- James and Ed need to discuss how this will be done. Zmap can display +multiple sequences in one zmap window but there will need to be changes to +the supporting infrastructure including: + + - should the species go in one or multiple acedb databases + + - the lace <-> zmap protocol needs to allow specification of + multiple sequence display. + + - lace needs to allow the user to pick multiple species + + +Acedb +----- + +- Treeview has a bug whereby if you click on an item, the window is no longer +preserved. This is because of a fix by Ed and now needs to be refixed. + +- need to make sure that both acedb and zmap use the Mac "open" command to +show urls. You can just give the url and "open" will invoke the users default +browser. + + + + +2) Other matters +================ + + +sorry, this section is rather sparse...give me more information if you need to. + +- das - NOT DONE (th) + + +- genelists: + done for zebrafish + using mig list for mouse + no human + needs multiple list interfaces + + +- interface for tags: NOT DONE + + +- submit button for EMBL dumping + + +- gene_fragments tag - broken + +============================================================================== diff --git a/ZMAP_LACE_PROJECT/2006/zmap_lace.2006_10_11 b/ZMAP_LACE_PROJECT/2006/zmap_lace.2006_10_11 new file mode 100755 index 000000000..60e02af8e --- /dev/null +++ b/ZMAP_LACE_PROJECT/2006/zmap_lace.2006_10_11 @@ -0,0 +1,140 @@ +============================================================================== +ZMap/Otterlace Development + + +Date: Wens 11th October 2006 + +Attendees: th, jla1, jgrg, kj2, edgrif + + +1) otterlace + zmap progress +============================ + + +new features +------------ + +- Bug that led to blixem freezing for reverse complemented sequence +is fixed. 
+ + +- We will do a complete acedb/zmap build this week and after it is +integrated into lace we will do the following: + + 1) with each new release of the lace/zmap system we (a lace + person and a zmap person) will sit with one of the annotators + to check that everything is working + + 2) publish a list of the changes/fixes for that release + + +- 3 frame translation is done. + + +- Navigator with clickable locus names and a display of the clone assembly +is high priority. We need to allow the user to search for locus names within +the navigator window. The locus names will also be displayed in the main +column. Roy is working on this now and it should be ready for the next release. + + +- bumping of homols: Matches from the same est will have a background colour +painted between them to make it obvious which matches go with which and perfect +alignments will have "intron" connectors drawn between them. The intron connector +code still needs to be written. This is a high priority item. + + +- the release notes for each release and the code version numbers will be +put in a web page that will be viewable from either the lace or zmap +programs as a main menu item. This will help the annotators to be sure of +which version they are using and of the changes/updates made. This release +will be the first time we have this. + + +- glyph for single base features is done. + + +- alternative translations: edgrif about half way through code to do this. + + +- multiple alignments: edgrif is about a third of the way through implementing +a more general way of displaying arbitrary blocks. This is a high priority item. + + +- zmap needs to get sequences that are in acedb from there for blixem, means +issuing a call to the server to get them but this is possible. NOTE that some +sequences are _only_ in acedb. + + +- we need a "back" button + + +- DNA searching in zmap is done. 
+ + +- There is still a problem with the shutting down of the ace server which +leads to a broken pipe message. edgrif will check that server shuts down +in a timely way. + + +Builds +------ + +- universal binaries: we have an updated mac laptop with the latest XCode set up +which will allow us to produce the binaries. edgrif is pursuing both nfs mounts +for automated builds and how to import our standard build system into XCode. + + + +Blixem/pfetch +-------------- + +- blixem: dna searching is NOT DONE, edgrif to expedite. + + +- pfetch: caching? NOT DONE + where should we cache ? + + + +multiple species +---------------- + +- James and Ed need to discuss how this will be done. Zmap can display +multiple sequences in one zmap window but there will need to be changes to +the supporting infrastructure including: + + - should the species go in one or multiple acedb databases + + - the lace <-> zmap protocol needs to allow specification of + multiple sequence display. + + - lace needs to allow the user to pick multiple species + + +Acedb +----- + +- Treeview has a bug whereby if you click on an item, the window is no longer +preserved. This is because of a fix by Ed and now needs to be refixed. + +- need to make sure that both acedb and zmap use the Mac "open" command to +show urls. DONE + + + +2) Other matters +================ + + +sorry, this section is rather sparse...give me more information if you need to. + +- das - NOT DONE (th) + + +- interface for tags: NOT DONE but Leo working on related code for import of +Ensembl genes that will be directly applicable. + + +- submit button for EMBL dumping: not done. 
+ + +============================================================================== diff --git a/ZMAP_LACE_PROJECT/2006/zmap_lace.2006_11_02 b/ZMAP_LACE_PROJECT/2006/zmap_lace.2006_11_02 new file mode 100755 index 000000000..12ff05e98 --- /dev/null +++ b/ZMAP_LACE_PROJECT/2006/zmap_lace.2006_11_02 @@ -0,0 +1,174 @@ +============================================================================== +ZMap/Otterlace Development + + +Date: Thurs 2nd November 2006 + +Attendees: th, jla1, jgrg, kj2, edgrif + + +1) otterlace + zmap progress +============================ + + +new features +------------ + +- Bug 14058, blixem reverse coords (reported by Gavin) is fixed. + + +- Roy sat with Gavin (only) and that seemed to go well except for the above +problem. + +BUT there is still a problem with turn around of bugs/fixes meaning that +zmap is simply not being used enough. + +edgrif suggests that he and Roy sit with annotators this time to record bugs and +then immediately go and fix them. + + +- Navigator with clickable locus names and a display of the clone assembly +is high priority. We need to allow the user to search for locus names within +the navigator window. The locus names will also be displayed in the main +column. + +Roy pretty much has this finished now, will be in next release. + + +- bumping of homols: Matches from the same est will have a background colour +painted between them to make it obvious which matches go with which and perfect +alignments will have "intron" connectors drawn between them. The intron connector +code still needs to be written. This is a high priority item. + +edgrif raised performance as a problem when drawing _all_ homol matches as gapped. +Problem is that zmap can end up drawing tens/hundreds of thousands of boxes, most +of which are not needed. + +After some discussion it was agreed that code should sift out the "perfect" matches +as defined by a "slop" factor set in the feature style and only draw those as +gapped. 
They will need to be visually easily distinguishable by the annotators if +this is to work. + +edgrif agreed to do a prototype within 2 working days and demo to Jen/Kerstin/others. + + +- the release notes for each release and the code version numbers are now +available via the help menu in zmap. + +They include: + + - ZMap build version/release date + + - ZMap Request Tracker tickets resolved + + - acedb Request Tracker tickets resolved + + - summary of ZMap Changes/Fixes + + - summary of acedb Changes/Fixes + +this enables annotators to see what's been fixed and what code version they are +running at a glance. + + +- alternative translations: edgrif about half way through code to do this. + +- multiple alignments: edgrif is about a third of the way through implementing +a more general way of displaying arbitrary blocks. This is a high priority item. + +- zmap needs to get sequences that are in acedb from there for blixem, means +issuing a call to the server to get them but this is possible. NOTE that some +sequences are _only_ in acedb. + + +- we need a "back" button + + +- DNA searching in zmap is done. + +- There is still a problem with the shutting down of the ace server which +leads to a broken pipe message. edgrif will check that server shuts down +in a timely way. + +edgrif said he had trouble reproducing but James has added new information to +request tracker ticket and both jla1 and kj2 said it seemed to be linked to +long running sessions. edgrif will investigate further. + + +Builds +------ + +- we can now debug zmap using XCode environment + build universal binaries. + +- build system is still a problem as systems are dragging their feet about +nfs mounts on mac boxes, edgrif will have another go and then th will +help expedite. + + + +Blixem/pfetch +-------------- + +- blixem: dna searching is NOT DONE, edgrif to expedite. + + +- pfetch: caching? NOT DONE + where should we cache ? 
+ + + +multiple species +---------------- + +- James and Ed need to discuss how this will be done. Zmap can display +multiple sequences in one zmap window but there will need to be changes to +the supporting infrastructure including: + + - should the species go in one or multiple acedb databases + + - the lace <-> zmap protocol needs to allow specification of + multiple sequence display. + + - lace needs to allow the user to pick multiple species + +jla1 said a better test than Charlie and NOD mouse is Richard and haplotypes +of which he may have up to 6 (!). jla1 said is there any limit to number, edgrif +said "no, only machine memory....". To quickly test this edgrif will get data from +Richard and try it out to look at performance. Simplest way is to just display +data as is, this mimics what Richard does currently in fmap. edgrif said that +if we used Compara alignment data we could just display the "sub-blocks" of the +alignments that actually contained data. This would produce much more compact +displays. + + +Acedb +----- + +- Treeview "preserve" bug is fixed. + + + +2) Other matters +================ + + +sorry, this section is rather sparse...give me more information if you need to. + +- das - NOT DONE (th) + + +- interface for tags: this is now done thanks to Leo/James. + + +- submit button for EMBL dumping: needs testing by Jen. + + +kj2 raised a question about how alignments were done for otter, jgrg said using +BLAST and est2genome. Kerstin said that much more useful alignments could be made +using BLAT or exonerate or ??? In particular, other alignment methods join up their +matches where they are obviously "perfect" overall alignments (e.g. to consecutive +exons) and cope with alignments where they go over clone boundaries. + +jgrg to think about this. 
+ + +============================================================================== diff --git a/ZMAP_LACE_PROJECT/2006/zmap_lace.2006_11_29 b/ZMAP_LACE_PROJECT/2006/zmap_lace.2006_11_29 new file mode 100755 index 000000000..0a6f29519 --- /dev/null +++ b/ZMAP_LACE_PROJECT/2006/zmap_lace.2006_11_29 @@ -0,0 +1,310 @@ +============================================================================== +ZMap/Otterlace Development + + +Date: Weds 29th November 2006 + +Attendees: jla1, jgrg, kj2, edgrif + + +1) otterlace + zmap progress +============================ + + +new features +------------ + + +- new release: We've sat with Liz/Gavin and that seemed to work really well. + +Gavin gave us a list of requests and said that if we could fix the main ones +he would now rather work with zmap than with xace (!). + + +His list of fixes and status of them are given below, he said that none of them +are show stoppers: + +Higher priority: + +1/ Showing the co-ordinates/position when you highlight a piece of evidence +2/ Bumping; being able to designate a region based on object/evidence/gene +prediction/position so that everything within or partly within that region will be +clustered; one column per piece of evidence - like xace. Also to have some sort of +shading or connection between pieces of evidence that splice together. +3/ Zooming; being able to designate a region to zoom to, at the moment it's difficult +to zoom out to a particular region in a single click. +4/ CDS not showing up - fixed (?) +5/ Blastx - needs to be strand specific + +ALL THESE ARE FIXED. + + +Medium priority; + +6/ 3 frame translation; proteins aren't split up into the 3 separate frames. GF +details needn't be shown automatically in the 3 frames. + +Items 6, 8, and 13 below are all the same bug which edgrif will fix. + + +7/ DNA not highlighted when u click on a piece of evidence or object + + +8/ Genefinder shouldn't be on as standard. + +FIXED + + +9/ - ve's on the navigator bar when u rev. 
complement - fixed (?) + +FIXED + + +10/ DNA finder - not visible on the zmap. Would also be nice to be colour coded +depending on the strand. Also it shows the wrong co-ordinates when searching for seq +when you're rev. complemented. + + +11/ Differentiation between swissprot and trembl - separate columns or different +shading within that column (?) + +This is a data issue, we could easily colour these entries differently if that was +what was wanted by giving them separate methods. We'll do this when edgrif/rds +have written the new style code to replace acedb methods. + + + +Lower priority; + + +12/ when you click on a locus in the navigator bar for it to zoom to the full locus +rather than the start, (depending on how far you've zoomed in) + +FIXED + + +13/ DNA should be after the evidence rather than before it - actually a funny one - +the first time u turn DNA on it goes between the yellow DNA bar and the +genomic_canonical bar but if u turn it off then on again it goes at the right of the +screen where u want it (?) + + +14/ Colours when u highlight evidence, esp confusing as highlighted protein hits look +like cDNA's and vice versa at the moment. Shading like xace has at the moment for +highlighted evidence would be nice. + + +15/ Does the red band - Genomic_canonical have to be on as standard (?) + +FIXED + + + +some additional items from Gavin/Liz: + +- can zmap start up "maximised" vertically + +FIXED + + +- can there be a short cut to zoom to a feature + +FIXED - new "z" and "Z" short cuts for this. jgrg suggested that these +short cuts should be given on menus/mouse-over popups to remind the +user that they exist...a good idea. + + + +- can the coords for evidence be reported + +FIXED + + +- bumping, can it be still further improved + +FIXED edgrif has added a new method for bumping which shows only matches that overlap a transcript. +Works very well, much quicker and much easier to see stuff.
+ + +- cursor keys - can now cursor through cols and but for a small glitch could cursor +up and down features in a column. + + +- bumping of homols: Matches from the same est will have a background colour +painted between them to make it obvious which matches go with which and perfect +alignments will have "intron" connectors drawn between them. The intron connector +code still needs to be written. This is a high priority item. + +edgrif has written code to add colour coded bars between all the hits for a +single match sequence. + +The colours are: + + green = perfectly colinear +orange = colinear but with missing sequence + red = not colinear + +The threshold for a "perfect" match is set via a new tag, "Join_aligns" in the +method for the feature. + +There was some discussion about colours, whether we should be using "intron" like +connectors and other aspects but it was agreed to do some trials with users and +get feedback. + + +edgrif has also added code (following a discussion with annotators) to allow bumping +of _just_ those matches that overlap a selected feature (e.g. transcript). This, +combined with the new zoom function, makes for much more efficient zooming/bumping. + +This code needs to be extended to allow the user to select a region and do the bumping +just for that region, this is needed for when there is no existing transcript and the +annotator will make one for that region based on the evidence. + + + + +- edgrif raised performance as a problem when drawing _all_ homol matches as gapped. +Problem is that zmap can end up drawing tens/hundreds of thousands of boxes, most +of which are not needed. + +After some discussion it was agreed that code should sift out the "perfect" matches +as defined by a "slop" factor set in the feature style and only draw those as +gapped. They will need to be visually easily distinguishable by the annotators if +this is to work.
+ +edgrif agreed to do a prototype within 2 working days and demo to Jen/Kerstin/others. + +FIXED + + + +- alternative translations: edgrif about half way through code to do this. + +- multiple alignments: edgrif is about a third of the way through implementing +a more general way of displaying arbitrary blocks. This is a high priority item. + +- zmap needs to get sequences that are in acedb from there for blixem, means +issuing a call to the server to get them but this is possible. NOTE that some +sequences are _only_ in acedb. + + +- we need a "back" button, the new zoom functions help with this though. + + +- There is still a problem with the shutting down of the ace server which +leads to a broken pipe message. edgrif will check that server shuts down +in a timely way. + +edgrif said he had trouble reproducing but James has added new information to +request tracker ticket and both jla1 and kj2 said it seemed to be linked to +long running sessions. edgrif will investigate further. + + + +- kj2 would like more information available for features such as the species +and the DE lines. edgrif said they could display this via the current mechanism +of mouse-over popups in the "Info" line at the top of the ZMap. jgrg is going +to make sure this information is in the acedb objects so it can be exported +to zmap. + + + + +Builds +------ + +- build system is still a problem as systems are dragging their feet about +nfs mounts on mac boxes, edgrif will have another go and then th will +help expedite. + +FIXED NOW ?? + + +- next build: jgrg has a few features still to add to lace to allow xace to +be removed completely (EUCOMM & other genomic feature editing). He will do +this asap so that we can do a build next Tues to allow users to test on Weds +and Thurs before various people disappear for Christmas. edgrif said they +are ready with zmap and will do a trial build this week for annotators to +try out.
+ + +- kj2 said she would like to be able to select a contiguous set of hits and +use them to create a new transcript (much like the "Create Temp Gene" function +in acedb Gene Finder). edgrif and jgrg will look at the code required to do +this. + + + +Blixem/pfetch +-------------- + +- blixem: dna searching is NOT DONE, edgrif to expedite. + + + + +multiple species +---------------- + +- James and Ed need to discuss how this will be done. Zmap can display +multiple sequences in one zmap window but there will need to be changes to +the supporting infrastructure including: + + - should the species go in one or multiple acedb databases + + - the lace <-> zmap protocol needs to allow specification of + multiple sequence display. + + - lace needs to allow the user to pick multiple species + +jla1 said a better test than Charlie and NOD mouse is Richard and haplotypes +of which he may have up to 6 (!). jla1 said is there any limit to number, edgrif +said "no, only machine memory....". To quickly test this edgrif will get data from +Richard and try it out to look at performance. Simplest way is to just display +data as is, this mimics what Richard does currently in fmap. edgrif said that +if we used Compara alignment data we could just display the "sub-blocks" of the +alignments that actually contained data. This would produce much more compact +displays. + +HAD A MEETING AND ARE IMPLEMENTING CODE IN LACE AND ZMAP TO DO THIS. + + + + +Acedb +----- + +- Server bug still needs fixing. + + + +2) Other matters +================ + + +sorry, this section is rather sparse...give me more information if you need to. + + +- script for EMBL dumping: needs testing by Jen. + + + + +kj2 raised a question about how alignments were done for otter, jgrg said using +BLAST and est2genome. Kerstin said that much more useful alignments could be made +using BLAT or exonerate or ???
In particular, other alignment methods join up their +matches where they are obviously "perfect" overall alignments (e.g. to consecutive +exons) and cope with alignments where they go over clone boundaries. + +jgrg to think about this. + + + +3) Next Meeting +=============== + +Will be at 11am, 7/12/2006. This will be the last meeting with all of us here +before Christmas. + + +============================================================================== diff --git a/ZMAP_LACE_PROJECT/2007/zmap_lace.2007_01_24 b/ZMAP_LACE_PROJECT/2007/zmap_lace.2007_01_24 new file mode 100755 index 000000000..a27dfed2e --- /dev/null +++ b/ZMAP_LACE_PROJECT/2007/zmap_lace.2007_01_24 @@ -0,0 +1,239 @@ +============================================================================== +ZMap/Otterlace Development + + +Date: Thurs 11th January 2007 + +Attendees: jla1, jgrg, kj2, edgrif + + +1) otterlace + zmap progress +============================ + + +latest test_otterlace +--------------------- + +- we have a new release of zmap, jgrg just needs to check a couple of things +and then we give it to Liz to try. jgrg & edgrif need to do release notes to +accompany this release. + + +Demo to havana +-------------- + +A lively discussion on a number of items including: + +- where to find the docs we have written for zmap. DONE edgrif + +- showing alignment information for hits of a match in a list/edit window. DONE edgrif + +- how to tell otterlace that there has been a multiple select by the user. + +- sorting of dna matches, Adam would like the current 5' exon style sort. + +- fixed/frozen columns like in excel ? + +- order of split windows, the bumped window should be on the right when a +vertical split is done. + +- windows can be unlocked but can they be relocked ? + +- locus readability DONE rds + +- drag of mark region + +- zoom to show all protein when 3 frame translation is displayed.
+ +- select features by lasso + +- more zoom increments + +- proteins should be bumped by best score (not how acedb works in fact) DONE edgrif + +- put exons in capitals in exported sequence and consider allowing option +of colour coded export. + + +High priority +------------- + +1/ multiple alignments: edgrif is about a third of the way through +implementing a more general way of displaying arbitrary blocks. This is a high +priority item. + + +Medium priority +--------------- + +1/ Saturated_EST bug: because of some ambiguity in tag usage in acedb methods +these don't get shown in the right way. This will be properly fixed when we +have zmap styles which are completely separate from acedb methods. + + +2/ DNA finder - Needs to show its find results location on the zmap as in +fmap, would be nice to be colour coded depending on the strand. The finder +needs to be extended to also do protein searches, edgrif will talk to kj2 +about this and clarify what's required. + +Searching should be restricted to any marked region if one is set, edgrif to +do this. + +3/ Differentiation between swissprot and trembl - separate columns or +different shading within that column (?) + +This is a data issue, we could easily colour these entries differently if that +was what was wanted by giving them separate methods. We'll do this when +edgrif/rds have written the new style code to replace acedb methods. + +4/ kj2 would like more information available for features such as the species +and the DE lines. edgrif said they could display this via the current +mechanism of mouse-over popups in the "Info" line at the top of the ZMap. jgrg +is going to make sure this information is in the acedb objects so it can be +exported to zmap. + +5/ zmap needs to get sequences that are in acedb from there for blixem, means +issuing a call to the server to get them but this is possible. NOTE that some +sequences are _only_ in acedb.
+ +6/ we need a "back" button, the new zoom functions help with this though. + +7/ kj2 said she would like to be able to select a contiguous set of hits and +use them to create a new transcript (much like the "Create Temp Gene" function +in acedb Gene Finder). edgrif and jgrg will look at the code required to do +this. + +8/ There was a discussion about how much a user should be able to +configure. It was agreed that they should be able to configure which columns +are initially hidden. They should also be able to "save" the current settings +to set this up and to be able to restore the system or their currently saved +defaults. edgrif or rds will do this. + +9/ edgrif said users could now click on features to delete them. kj2 said she +would like a stack of sets of deleted features that she could step back +through, edgrif to implement this. + +10/ There was a discussion about good ways to interact with +alignments/transcripts for deletion and other actions. Leo and James have +already worked out a lot of this with Leo's transcript viewer. edgrif will get +the rules from Leo and James and then we can discuss them and decide which to +implement. + +11/ jgrg pointed out a bug in the menu which has a "mark" but not an "unmark" +item. Also, selecting "Set feature for Bump" only marks the selected exon, not +the whole transcript. edgrif will fix these. + +12/ jla1 asked if we could make the lines joining up matches much thinner and +have an invisible background that allows easy clicking. edgrif will do this. + +13/ Kerstin needs some acedb keyset like functions in lace to allow her to +perform operations on multiple features. jgrg to discuss/implement with kj2. + + + +Low priority +------------ + +1/ Feature highlight colour has been fixed but it doesn't work that well for +transcripts where often there is only an outline, we need to highlight more +intelligently so that for features that do not have a fill colour we do +highlight by filling the boxes.
+ +2/ jgrg suggested that short cuts should be given on menus/mouse-over popups +to remind the user that they exist...edgrif to do this. + +3/ cannot currently cursor through features in a column, there are some +technical problems here to do with raising items so they are visible. + +4/ edgrif fixed a performance problem when drawing _all_ homol matches as +gapped. But this relies on slop factor set for the 'Gapped' tag in the +method, edgrif will sort this out with jgrg. + +5/ alternative translations: edgrif about half way through code to do this. + + +Builds +------ + +- we now have the mac machine for overnight/regular builds but are having some +library problems which we hope to sort out, in the meantime we can do "hand" +builds via the laptop. + + + +Blixem/pfetch +-------------- + +- blixem: dna searching is NOT DONE, edgrif to expedite. Also protein searches +will be added. + + + + +multiple species +---------------- + +jgrg, edgrif, lg4 and rds met to discuss this and we need to revisit it to +check on progress. It requires some alterations to lace infrastructure to +support editing multiple species. + +jla1 asked if we could get some kind of demo going for a workshop on the +20/21st Jan, edgrif and jgrg will see what can be done. + +================== from last time ================================================= +James and Ed need to discuss how this will be done. Zmap can display multiple +sequences in one zmap window but there will need to be changes to the supporting +infrastructure including: + + - should the species go in one or multiple acedb databases + + - the lace <-> zmap protocol needs to allow specification of + multiple sequence display. + + - lace needs to allow the user to pick multiple species + +jla1 said a better test than Charlie and NOD mouse is Richard and haplotypes +of which he may have up to 6 (!). jla1 said is there any limit to number, +edgrif said "no, only machine memory....".
To quickly test this edgrif will +get data from Richard and try it out to look at performance. Simplest way is +to just display data as is, this mimics what Richard does currently in +fmap. edgrif said that if we used Compara alignment data we could just display +the "sub-blocks" of the alignments that actually contained data. This would +produce much more compact displays. +================== from last time ================================================= + + + +2) Other matters +================ + + +sorry, this section is rather sparse...give me more information if you need +to. + + +- script for EMBL dumping: needs testing by Jen. James thinks this all +works. There was a discussion about submissions more generally and it was +agreed that all regions going into VEGA should be submitted. This led to a +request for some kind of "Clone status" flag. jgrg said there used to be one +and that it should be recreated. + + +kj2 raised a question about how alignments were done for otter, jgrg said +using BLAST and est2genome. Kerstin said that much more useful alignments +could be made using BLAT or exonerate or ??? In particular, other alignment +methods join up their matches where they are obviously "perfect" overall +alignments (e.g. to consecutive exons) and cope with alignments where they go +over clone boundaries. + +jgrg to think about this. + + + +3) Next Meeting +=============== + +Will be at 2pm, 24/01/2007.
+ + +============================================================================== diff --git a/ZMAP_LACE_PROJECT/2007/zmap_lace.2007_02_01 b/ZMAP_LACE_PROJECT/2007/zmap_lace.2007_02_01 new file mode 100755 index 000000000..4f24c37da --- /dev/null +++ b/ZMAP_LACE_PROJECT/2007/zmap_lace.2007_02_01 @@ -0,0 +1,203 @@ +============================================================================== +ZMap/Otterlace Development + + +Date: Thurs 11th January 2007 + +Attendees: jla1, jgrg, kj2, edgrif + + +1) otterlace + zmap progress +============================ + + +latest test_otterlace +--------------------- + +- we have a new release of zmap, jgrg just needs to check a couple of things +and then we give it to Liz to try. jgrg & edgrif need to do release notes to +accompany this release. + + +High priority +------------- + +1/ multiple alignments: edgrif is about a third of the way through +implementing a more general way of displaying arbitrary blocks. This is a high +priority item. + + +Medium priority +--------------- + +1/ Saturated_EST bug: because of some ambiguity in tag usage in acedb methods +these don't get shown in the right way. This will be properly fixed when we +have zmap styles which are completely separate from acedb methods. + + +2/ DNA finder - Needs to show its find results location on the zmap as in +fmap, would be nice to be colour coded depending on the strand. The finder +needs to be extended to also do protein searches, edgrif will talk to kj2 +about this and clarify what's required. + +Searching should be restricted to any marked region if one is set, edgrif to +do this. + +3/ Differentiation between swissprot and trembl - separate columns or +different shading within that column (?) + +This is a data issue, we could easily colour these entries differently if that +was what was wanted by giving them separate methods. We'll do this when +edgrif/rds have written the new style code to replace acedb methods.
+ +4/ kj2 would like more information available for features such as the species +and the DE lines. edgrif said they could display this via the current +mechanism of mouse-over popups in the "Info" line at the top of the ZMap. jgrg +is going to make sure this information is in the acedb objects so it can be +exported to zmap. + +5/ zmap needs to get sequences that are in acedb from there for blixem, means +issuing a call to the server to get them but this is possible. NOTE that some +sequences are _only_ in acedb. + +6/ we need a "back" button, the new zoom functions help with this though. + +7/ kj2 said she would like to be able to select a contiguous set of hits and +use them to create a new transcript (much like the "Create Temp Gene" function +in acedb Gene Finder). edgrif and jgrg will look at the code required to do +this. + +8/ There was a discussion about how much a user should be able to +configure. It was agreed that they should be able to configure which columns +are initially hidden. They should also be able to "save" the current settings +to set this up and to be able to restore the system or their currently saved +defaults. edgrif or rds will do this. + +9/ edgrif said users could now click on features to delete them. kj2 said she +would like a stack of sets of deleted features that she could step back +through, edgrif to implement this. + +10/ There was a discussion about good ways to interact with +alignments/transcripts for deletion and other actions. Leo and James have +already worked out a lot of this with Leo's transcript viewer. edgrif will get +the rules from Leo and James and then we can discuss them and decide which to +implement. + +11/ jgrg pointed out a bug in the menu which has a "mark" but not an "unmark" +item. Also, selecting "Set feature for Bump" only marks the selected exon, not +the whole transcript. edgrif will fix these.
+ +12/ jla1 asked if we could make the lines joining up matches much thinner and +have an invisible background that allows easy clicking. edgrif will do this. + +13/ Kerstin needs some acedb keyset like functions in lace to allow her to +perform operations on multiple features. jgrg to discuss/implement with kj2. + + + +Low priority +------------ + +1/ Feature highlight colour has been fixed but it doesn't work that well for +transcripts where often there is only an outline, we need to highlight more +intelligently so that for features that do not have a fill colour we do +highlight by filling the boxes. + +2/ jgrg suggested that short cuts should be given on menus/mouse-over popups +to remind the user that they exist...edgrif to do this. + +3/ cannot currently cursor through features in a column, there are some +technical problems here to do with raising items so they are visible. + +4/ edgrif fixed a performance problem when drawing _all_ homol matches as +gapped. But this relies on slop factor set for the 'Gapped' tag in the +method, edgrif will sort this out with jgrg. + +5/ alternative translations: edgrif about half way through code to do this. + + +Builds +------ + +- we now have the mac machine for overnight/regular builds but are having some +library problems which we hope to sort out, in the meantime we can do "hand" +builds via the laptop. + + + +Blixem/pfetch +-------------- + +- blixem: dna searching is NOT DONE, edgrif to expedite. Also protein searches +will be added. + + + + +multiple species +---------------- + +jgrg, edgrif, lg4 and rds met to discuss this and we need to revisit it to +check on progress. It requires some alterations to lace infrastructure to +support editing multiple species. + +jla1 asked if we could get some kind of demo going for a workshop on the +20/21st Jan, edgrif and jgrg will see what can be done.
+ +================== from last time ================================================= +James and Ed need to discuss how this will be done. Zmap can display multiple +sequences in one zmap window but there will need to be changes to the supporting +infrastructure including: + + - should the species go in one or multiple acedb databases + + - the lace <-> zmap protocol needs to allow specification of + multiple sequence display. + + - lace needs to allow the user to pick multiple species + +jla1 said a better test than Charlie and NOD mouse is Richard and haplotypes +of which he may have up to 6 (!). jla1 said is there any limit to number, +edgrif said "no, only machine memory....". To quickly test this edgrif will +get data from Richard and try it out to look at performance. Simplest way is +to just display data as is, this mimics what Richard does currently in +fmap. edgrif said that if we used Compara alignment data we could just display +the "sub-blocks" of the alignments that actually contained data. This would +produce much more compact displays. +================== from last time ================================================= + + + +2) Other matters +================ + + +sorry, this section is rather sparse...give me more information if you need +to. + + +- script for EMBL dumping: needs testing by Jen. James thinks this all +works. There was a discussion about submissions more generally and it was +agreed that all regions going into VEGA should be submitted. This led to a +request for some kind of "Clone status" flag. jgrg said there used to be one +and that it should be recreated. + + +kj2 raised a question about how alignments were done for otter, jgrg said +using BLAST and est2genome. Kerstin said that much more useful alignments +could be made using BLAT or exonerate or ??? In particular, other alignment +methods join up their matches where they are obviously "perfect" overall +alignments (e.g.
to consecutive exons) and cope with alignments where they go +over clone boundaries. + +jgrg to think about this. + + + +3) Next Meeting +=============== + +Will be at 2pm, 24/01/2007. + + +============================================================================== diff --git a/ZMAP_LACE_PROJECT/2007/zmap_lace.2007_02_23 b/ZMAP_LACE_PROJECT/2007/zmap_lace.2007_02_23 new file mode 100755 index 000000000..6eec9c247 --- /dev/null +++ b/ZMAP_LACE_PROJECT/2007/zmap_lace.2007_02_23 @@ -0,0 +1,360 @@ +============================================================================== +ZMap/Otterlace Development + + +Date: Friday 23rd February 2007 + +Attendees: lw2, jgrg, th, edgrif + + +1) otterlace + zmap progress +============================ + +jla1 sent a list of bugs reported by users for the latest test_otterlace +system, I have incorporated these below with comments: + + +High priority +------------- + +1/ multiple alignments: edgrif is about a third of the way through +implementing a more general way of displaying arbitrary blocks. This is a high +priority item. + + +2/ Rev strand issues: +>> Saving gives wrong co-ordinates, object goes to the wrong +>> location ??..sorted by re-synch (possibly fixed for next release?) + +this is fixed + + +>> Using evidence for co-ordinates wrong, 3 frame translation wrong (maybe +>> fixed in next release). + +No one has mentioned this until now, edgrif/rds to investigate + + + +Medium priority +--------------- + +1/ Saturated_EST bug: because of some ambiguity in tag usage in acedb methods +these don't get shown in the right way. This will be properly fixed when we +have zmap styles which are completely separate from acedb methods. + +edgrif is implementing the new styles now. + + +2/ DNA finder - Needs to show its find results location on the zmap as in +fmap, would be nice to be colour coded depending on the strand. The finder +needs to be extended to also do protein searches.
Searching should be +restricted to any marked region if one is set, edgrif to do this. + +edgrif is implementing protein search now, location results to come (note that +currently the user is shown a list of all hits and clicking on a hit takes +the user to that hit). + + +3/ kj2 would like more information available for features such as the species +and the DE lines. edgrif said they could display this via the current +mechanism of mouse-over popups in the "Info" line at the top of the ZMap. jgrg +is going to make sure this information is in the acedb objects so it can be +exported to zmap. + +jgrg, lw2 and edgrif to meet to establish "policy" for which data is displayed +by zmap and which by lace. edgrif said that they will need to establish a +"tag - value" system to make information display more generalised. + + +4/ zmap needs to get sequences that are in acedb from there for blixem, means +issuing a call to the server to get them but this is possible. NOTE that some +sequences are _only_ in acedb. + + +5/ kj2 said she would like to be able to select a contiguous set of hits and +use them to create a new transcript (much like the "Create Temp Gene" function +in acedb Gene Finder). + +rds code will allow lace to do this, jgrg to sort out lace implementation to +build the transcript. + + +6/ There was a discussion about how much a user should be able to +configure. It was agreed that they should be able to configure which columns +are initially hidden. They should also be able to "save" the current settings +to set this up and to be able to restore the system or their currently saved +defaults. + +edgrif or rds will do this. + + +7/ There was a discussion about good ways to interact with +alignments/transcripts for deletion and other actions. Leo and James have +already worked out a lot of this with Leo's transcript viewer. edgrif will get +the rules from Leo and James and then we can discuss them and decide which to +implement.
+ + +8/ jgrg pointed out a bug in the menu which has a "mark" but not an "unmark" +item. Also, selecting "Set feature for Bump" only marks the selected exon, not +the whole transcript. edgrif will fix these. + + +9/ Kerstin needs some acedb keyset like functions in lace to allow her to +perform operations on multiple features. jgrg to discuss/implement with kj2. + +jgrg and edgrif agreed that the way to do this is to allow users to "detach" +lace from a database, the user then uses xace to do their keyset stuff and +then lace reattaches. This is viable because most users do not do this +kind of operation so why spend large amounts of time reimplementing xace +function. + + +10/ Display comes up full screen and if you click on it or resize it before +>> it loads the data it will crash. + +the resizing is fixed, but it still comes up full screen. Ed coded this to +come up full size of the screen in only the vertical direction and about 60% +in the horizontal. Unfortunately a bug in the gnome window manager for the debian +version we're using means it's full screen in both. This is not an issue if +using KDE or gnome under ubuntu for example. edgrif will fix resizing for +faulty window managers + investigate the crash. + + +11/>> 3. Can't get otterid in zmap. +>> +>> 4. Halfwise hits: No info when you click on them. +>> + +These are both data display issues and will be covered by item 3/ above. + + +12/ >> 5. No crosshairs. + +just needs user to use correct short cut, all in the help pages. + + +13/ >> 6. Can't display Swissprot and TrEmbl in one blixem in zmap. +>> Is this because they are now in separate columns? + +Yes, edgrif will alter code so that for proteins user has choice of one of +these columns or both when invoking blixem. In an ideal world we would allow +user to select multiple columns on which to run blixem, edgrif might do this +if simple. + + +14/ >> 7. Clone boundaries unclear. + +This is a user issue, the information is available from zmap and lace now.
+ + +15/ >> 8. Evidence co-ordinates come up as genomic co-ordinates (rather than cDNA +>> base numbers) aren't visible on alignments (also clicking on evidence +>> centres the display). + +OK, issue here is that coord data is not displayed by lace so needs to be +displayed by zmap when displaying an alignment feature. edgrif/rds will fix. + + +16/ >> 9. Highlighting an object hides the CDS. + +This will be fixed when new styles come in. + + + +17/ >> 10. Bumping: +>> Gaps between columns need to be removed. +>> Coloured bars are incorrect when in reverse. +>> Need to discuss how to prioritise bumping i.e. most 5' first as in fmap. + +edgrif will sort out gaps stuff, needs to be bigger for transcripts but is +already zero for matches, perceived gap is because matches have different +widths according to score. + + +18/ >> 11. Gene finder not there! + +This is probably an otterlace setup bug... + + +19/ >> 12. Evidence does not display that it has been used with an associated +>> object. + +Requires lace to pass zmap extra information, jgrg to work on this. + + +20/ >> 13. 3-frame and DNA does not get highlighted when you click on +evidence, +>> object or prediction. + +edgrif to check with rds if this working, it should be. + + +21/ >> 14. V-split issues (update of v-split window?): +>> If you save an existing object again, you will get a duplicated object. +>> Cannot display CDS in V-split, have to remove v-split window then save CDS. +>> Duplicated objects (shadowing) when saving to existing objects in 2nd +>> window (correct in 1st window), solved by re-synch. +>> Very unstable when using v-split. + +Looks like a problem with updating multiple windows, edgrif/rds to fix. + + +22/ >> 15. Clicking on gene predictions doesn't let you paste co-ordinates in +>> spandit window on both strands. + +About 50% coded, will possibly require coding from James although I'm +hoping not. + + +23/ >> 16. Protein CDS includes stop codon in the aa count.
+ +edgrif will fix. + + +24/ >> 17. Flanking region of marked area needs to be increased. + +User can already do this via lasso mechanism. + + + +Low priority +------------ + +1/ Feature highlight colour has been fixed but it doesn't work that well for +transcripts where often there is only an outline, we need to highlight more +intelligently so that for features that do not have a fill colour we do +highlight by filling the boxes. + +this will be fixed when edgrif has finished the new styles. + + +2/ jgrg suggested that short cuts should be given on menus/mouse-over popups +to remind the user that they exist...edgrif to do this. + + +3/ cannot currently cursor through features in a column, there are some +technical problems here to do with raising items so they are visible. + + +4/ edgrif fixed a performance problem when drawing _all_ homol matches as +gapped. But this relies on slop factor set for the 'Gapped' tag in the +method. + +edgrif will sort this out with jgrg, it just requires that the slop factor +is set in the relevant methods. + + +5/ alternative translations: edgrif about half way through code to do this. + +edgrif is doing this as part of the protein search code since this code +does translations itself. edgrif will talk to jgrg about how alternative +genetic codes can be specified with acedb. + + +6/ we need a "back" button, the new zoom functions make this a lower priority +than before. + + +Changing the Assembly +--------------------- + +th made the good but frightening point that the world of annotation has +changed to one where the annotator may well be making changes to the +underlying assembly, he said that ensembl supports this and lace/zmap will +need to. The consequences feel alarming..... + +edgrif and rds to think about this: would it mean that zmap would have to +change the position of features on the fly as the user edits the +assembly...some tricky pathological cases here.... 
+ + +Builds +------ + +- we now have the mac machine for overnight/regular builds but are having some +library problems which we hope to sort out, in the meantime we can do "hand" +builds via the laptop. We continue to suffer from the macs not properly being +on the network in the sense of being able to see nfs mounted home directories. + + + +Blixem/pfetch +-------------- + +- blixem: dna searching is NOT DONE, edgrif to expedite. Also protein searches +will be added. + + + + +multiple species +---------------- + +jgrg, edgrif, lg4 and rds met to discuss this and we need to revisit it to +check on progress. It requires some alterations to lace infrastructure to +support editing multiple species. + +edgrif has inserted the code to allow the user (i.e. lace) to specify that +particular sequences be fetched from particular servers. This allows zmap to +display different species/haplotypes/whatever in split windows. Once the +multiple alignment code is done then these could also be displayed in one +window. + +The below notes from a previous meeting are still relevant: + +============================================================================== +James and Ed need to discuss how this will be done. Zmap can display multiple +sequences in one zmap window but there will need to be changes to the supporting +infrastructure including: + + - should the species go in one or multiple acedb databases + + - the lace <-> zmap protocol needs to allow specification of + multiple sequence display. + + - lace needs to allow the user to pick multiple species + +jla1 said a better test than Charlie and NOD mouse is Richard and haplotypes +of which he may have up to 6 (!). jla1 said is there any limit to number, +edgrif said "no, only machine memory....". To quickly test this edgrif will +get data from Richard and try it out to look at performance. Simplest way is +to just display data as is, this mimics what Richard does currently in +fmap.
edgrif said that if we used Compara alignment data we could just display +the "sub-blocks" of the alignments that actually contained data. This would +produce much more compact displays. +================== from last time ================================================= + + + +2) Other matters +================ + +Both these items are still pending: + +- script for EMBL dumping: needs testing by Jen. James thinks this all +works. There was a discussion about submissions more generally and it was +agreed that all regions going into VEGA should be submitted. This led into a +request for some kind of "Clone status" flag. jgrg said there used to be one +and that it should be recreated. + + +kj2 raised a question about how alignments were done for otter, jgrg said +using BLAST and est2genome. Kerstin said that much more useful alignments +could be made using BLAT or exonerate or ??? In particular, other alignment +methods join up their matches where they are obviously "perfect" overall +alignments (e.g. to consecutive exons) and cope with alignments where they go +over clone boundaries. + +jgrg to think about this. + + + +3) Next Meeting +=============== + +Will be at XXXXXXXXX + + +============================================================================== diff --git a/ZMAP_LACE_PROJECT/2007/zmap_lace.2007_03_14 b/ZMAP_LACE_PROJECT/2007/zmap_lace.2007_03_14 new file mode 100755 index 000000000..8831bc85c --- /dev/null +++ b/ZMAP_LACE_PROJECT/2007/zmap_lace.2007_03_14 @@ -0,0 +1,264 @@ +============================================================================== +ZMap/Otterlace Development + + +Date: Wednesday 14th March 2007 + +Attendees: lw2, jgrg, th, edgrif, te3, st3 + + +Steve Trevanion started attending the meetings today to provide a closer link +between annotation efforts in havana/zebra fish and the Vega web site. + + +1) otterlace + zmap progress +============================ + +High priority +------------- + +1/ Can't get otterid in zmap.
+ + +2/ Halfwise hits: No info when you click on them. + + +3/ Need to redo bumping yet again, Adam has surveyed Havana and says +they "want it to work like it always did". edgrif to sort this out. +Included in this will be taking into account the strand of the nucleotide +sequence. Also the width of matches needs to be checked, it's not informative +enough at the moment. + + +4/ Evidence does not display that it has been used with an associated object. + + +5/ Clicking on gene predictions doesn't let you paste co-ordinates in spandit +window on both strands. + + +6/ highlighting of exons/cds/dna/protein etc does not work. + + + +Medium priority +--------------- + +1/ Saturated_EST bug: because of some ambiguity of tag usage in acedb methods +these don't get shown in the right way. This will be properly fixed when we +have zmap styles which are completely separate from acedb methods. + +edgrif is implementing the new styles now. + + +2/ DNA finder - Needs to show its find results location on the zmap as in +fmap, would be nice to be colour coded depending on the strand. The finder +needs to be extended to also do protein searches. Searching should be +restricted to any marked region if one is set, edgrif to do this. + +edgrif is implementing protein search now, location results to come (note that +currently the user is shown a list of all hits and clicking on a hit takes +the user to that hit). + + +3/ kj2 would like more information available for features such as the species +and the DE lines. edgrif said they could display this via the current +mechanism of mouse-over popups in the "Info" line at the top of the ZMap. jgrg +is going to make sure this information is in the acedb objects so it can be +exported to zmap. + +jgrg, lw2 and edgrif to meet to establish "policy" for which data is displayed +by zmap and which by lace. edgrif said that they will need to establish a +"tag - value" system to make information display more generalised.
+ + +4/ zmap needs to get sequences that are in acedb from there for blixem, means +issuing a call to the server to get them but this is possible. NOTE that some +sequences are _only_ in acedb. + + +5/ There was a discussion about how much a user should be able to +configure. It was agreed that they should be able to configure which columns +are initially hidden. They should also be able to "save" the current settings +to set this up and to be able to restore the system or their currently saved +defaults. + +edgrif or rds will do this. + + +6/ There was a discussion about good ways to interact with +alignments/transcripts for deletion and other actions. Leo and James have +already worked out a lot of this with Leo's transcript viewer. edgrif will get +the rules from Leo and James and then we can discuss them and decide which to +implement. + + +7/ jgrg pointed out a bug in the menu which has a "mark" but not an "unmark" +item. Also, selecting "Set feature for Bump" only marks the selected exon, not +the whole transcript. edgrif will fix these. + + +8/ Kerstin needs some acedb keyset like functions in lace to allow her to +perform operations on multiple features. jgrg to discuss/implement with kj2. + +jgrg and edgrif agreed that the way to do this is to allow users to "detach" +lace from a database, the user then uses xace to do their keyset stuff and +then lace reattaches. This is viable because most users do not do this +kind of operation so why spend large amounts of time reimplementing xace +function. + + +9/ Highlighting an object hides the CDS. + +This will be fixed when new styles come in. + + + +Low priority +------------ + +1/ jgrg suggested that short cuts should be given on menus/mouse-over popups +to remind the user that they exist...edgrif to do this. + + +2/ cannot currently cursor through features in a column, there are some +technical problems here to do with raising items so they are visible.
+ + +3/ edgrif fixed a performance problem when drawing _all_ homol matches as +gapped. But this relies on slop factor set for the 'Gapped' tag in the +method. + +edgrif will sort this out with jgrg, it just requires that the slop factor +is set in the relevant methods. + + +4/ alternative translations: edgrif about half way through code to do this. + +edgrif is doing this as part of the protein search code since this code +does translations itself. edgrif will talk to jgrg about how alternative +genetic codes can be specified with acedb. + + +5/ we need a "back" button, the new zoom functions make this a lower priority +than before. + + + +Multiple alignments +------------------- + +multiple alignments: edgrif is about a third of the way through +implementing a more general way of displaying arbitrary blocks. This is a high +priority item. + + + + +Changing the Assembly +--------------------- + +th made the good but frightening point that the world of annotation has +changed to one where the annotator may well be making changes to the +underlying assembly, he said that ensembl supports this and lace/zmap will +need to. The consequences feel alarming..... + +edgrif and rds to think about this: would it mean that zmap would have to +change the position of features on the fly as the user edits the +assembly...some tricky pathological cases here.... + +We will need smapping support in zmap to do this kind of thing. edgrif has +a plan for extracting the smap code from acedb and making it a stand alone +package. + + + +Builds +------ + +- ISG have now built us a separate software stack of the latest stable GTK +release. We are moving our builds to use this on Linux and the Mac. + +- there is still a problem that we do not have a mac on the network that can +do proper single sign on, home directories etc etc. + + + +Blixem/pfetch +-------------- + +- blixem: dna searching is NOT DONE, edgrif to expedite. Also protein searches +will be added.
+ + + + +multiple species +---------------- + +jgrg, edgrif, lg4 and rds met to discuss this and we need to revisit it to +check on progress. It requires some alterations to lace infrastructure to +support editing multiple species. + +edgrif has inserted the code to allow the user (i.e. lace) to specify that +particular sequences be fetched from particular servers. This allows zmap to +display different species/haplotypes/whatever in split windows. Once the +multiple alignment code is done then these could also be displayed in one +window. + +The below notes from a previous meeting are still relevant: + +============================================================================== +James and Ed need to discuss how this will be done. Zmap can display multiple +sequences in one zmap window but there will need to be changes to the supporting +infrastructure including: + + - should the species go in one or multiple acedb databases + + - the lace <-> zmap protocol needs to allow specification of + multiple sequence display. + + - lace needs to allow the user to pick multiple species + +jla1 said a better test than Charlie and NOD mouse is Richard and haplotypes +of which he may have up to 6 (!). jla1 said is there any limit to number, +edgrif said "no, only machine memory....". To quickly test this edgrif will +get data from Richard and try it out to look at performance. Simplest way is +to just display data as is, this mimics what Richard does currently in +fmap. edgrif said that if we used Compara alignment data we could just display +the "sub-blocks" of the alignments that actually contained data. This would +produce much more compact displays. +================== from last time ================================================= + + + +2) Other matters +================ + +Both these items are still pending: + +- script for EMBL dumping: needs testing by Jen. James thinks this all +works.
There was a discussion about submissions more generally and it was +agreed that all regions going into VEGA should be submitted. This led into a +request for some kind of "Clone status" flag. jgrg said there used to be one +and that it should be recreated. + + +kj2 raised a question about how alignments were done for otter, jgrg said +using BLAST and est2genome. Kerstin said that much more useful alignments +could be made using BLAT or exonerate or ??? In particular, other alignment +methods join up their matches where they are obviously "perfect" overall +alignments (e.g. to consecutive exons) and cope with alignments where they go +over clone boundaries. + +jgrg to think about this. + + + +3) Next Meeting +=============== + +Will be at 2pm, 28th March 2007 + + +============================================================================== diff --git a/ZMAP_LACE_PROJECT/2007/zmap_lace.2007_04_13 b/ZMAP_LACE_PROJECT/2007/zmap_lace.2007_04_13 new file mode 100755 index 000000000..46474df66 --- /dev/null +++ b/ZMAP_LACE_PROJECT/2007/zmap_lace.2007_04_13 @@ -0,0 +1,300 @@ +============================================================================== +ZMap/Otterlace Development + + +Date: Friday 13th April 2007 + +Attendees: lw2, jgrg, rds, te3, kj2 + + +1) otterlace + zmap progress +============================ + +High priority +------------- + +1/ Can't get otterid in zmap. + +kj2 raised the point that searching zmap/lace for an otter id is a +requirement. ZFIN Requests are based on otter ids. jgrg suggested +lace is a good place to implement this. lace knows the data best, can +search for otter id and instruct zmap to spotlight the feature. +Displaying the information in zmap depends on the ability to show tag +- value (below). + + +2/ Halfwise hits: No info when you click on them. + +rds asked if they have a URL object. lw2 pointed out that a domain +description is more important than the URL as it takes time to go to +the webpage for each hit.
jgrg suggested it's another use for the tag +- value system (below) + + +3/ Need to redo bumping yet again, Adam has surveyed Havana and says +they "want it to work like it always did". edgrif to sort this out. +Included in this will be taking into account the strand of the nucleotide +sequence. Also the width of matches needs to be checked, it's not informative +enough at the moment. + +edgrif has added a further menu option to the bump menu to include +this fMap style bumping. This new version is currently available in +otterlace. The hits are sorted by strand first and then by the +position and overlap with the marked region. This is effectively what +fMap does as it has an implicit marked region which is the area of the +display it's zoomed into. + + +4/ Evidence does not display that it has been used with an associated object. + +We need something to replace the xref of fmap which is displayable in +the treeview. Discussion on whether the new styles are enough +followed, but although showing evidence in the same column goes some +way, an explicit link would be better. kj2 remarked that some method +of removing the feature from the display once it has been used as +evidence, allowing the visualisation of what's left to build objects +with, could be the ideal solution. + +5/ Clicking on gene predictions doesn't let you paste co-ordinates in spandit +window on both strands. + +we believe this is fixed. + +6/ highlighting of exons/cds/dna/protein etc does not work. + +rds has 2/3 of the code for this done. Some of this depends on the new +zmap styles. + + +Medium priority +--------------- + +1/ Saturated_EST bug: because of some ambiguity of tag usage in acedb methods +these don't get shown in the right way. This will be properly fixed when we +have zmap styles which are completely separate from acedb methods. + +edgrif has implemented the new styles now. Needs discussion with jgrg +on how and when to do the big switch.
+ +2/ DNA finder - Needs to show its find results location on the zmap as in +fmap, would be nice to be colour coded depending on the strand. The finder +needs to be extended to also do protein searches. Searching should be +restricted to any marked region if one is set, edgrif to do this. + +edgrif is implementing protein search now, location results to come (note that +currently the user is shown a list of all hits and clicking on a hit takes +the user to that hit). + +3/ kj2 would like more information available for features such as the species +and the DE lines. edgrif said they could display this via the current +mechanism of mouse-over popups in the "Info" line at the top of the ZMap. jgrg +is going to make sure this information is in the acedb objects so it can be +exported to zmap. + +jgrg, lw2 and edgrif to meet to establish "policy" for which data is displayed +by zmap and which by lace. edgrif said that they will need to establish a +"tag - value" system to make information display more generalised. +rds to set up meeting. + +4/ zmap needs to get sequences that are in acedb from there for blixem, means +issuing a call to the server to get them but this is possible. NOTE that some +sequences are _only_ in acedb. + + +5/ There was a discussion about how much a user should be able to +configure. It was agreed that they should be able to configure which columns +are initially hidden. They should also be able to "save" the current settings +to set this up and to be able to restore the system or their currently saved +defaults. + +edgrif or rds will do this. +nothing done, but it was noted that NO enduser configuration of +colours should be possible! + +6/ There was a discussion about good ways to interact with +alignments/transcripts for deletion and other actions. Leo and James have +already worked out a lot of this with Leo's transcript viewer. edgrif will get +the rules from Leo and James and then we can discuss them and decide which to +implement.
+ + +7/ jgrg pointed out a bug in the menu which has a "mark" but not an "unmark" +item. Also, selecting "Set feature for Bump" only marks the selected exon, not +the whole transcript. edgrif will fix these. + +rds mentioned there's a possibility that this has been fixed by edgrif. + +8/ Kerstin needs some acedb keyset like functions in lace to allow her to +perform operations on multiple features. jgrg to discuss/implement with kj2. + +jgrg and edgrif agreed that the way to do this is to allow users to "detach" +lace from a database, the user then uses xace to do their keyset stuff and +then lace reattaches. This is viable because most users do not do this +kind of operation so why spend large amounts of time reimplementing xace +function. + + +9/ Highlighting an object hides the CDS. + +Should be fixed, and depends on the new styles. + +10/ Annotation needs to have a history across assembly changes. + +kj2 requested that when an assembly changes the objects which get +transferred should have a history as to what they were (otter id) +previously. lw2 remarked that it's possible to search for the old id +in the lace interface, but impossible to retrieve the old object and +display it. jgrg acknowledged there is a bug in the lace searching, +but that providing the history would be difficult. Further +discussion/thought on how to implement this is needed. + +Low priority +------------ + +Consensus was nothing has changed in this section. + +1/ jgrg suggested that short cuts should be given on menus/mouse-over popups +to remind the user that they exist...edgrif to do this. + + +2/ cannot currently cursor through features in a column, there are some +technical problems here to do with raising items so they are visible. + + +3/ edgrif fixed a performance problem when drawing _all_ homol matches as +gapped. But this relies on slop factor set for the 'Gapped' tag in the +method. 
+ +edgrif will sort this out with jgrg, it just requires that the slop factor +is set in the relevant methods. + + +4/ alternative translations: edgrif about half way through code to do this. + +edgrif is doing this as part of the protein search code since this code +does translations itself. edgrif will talk to jgrg about how alternative +genetic codes can be specified with acedb. + + +5/ we need a "back" button, the new zoom functions make this a lower priority +than before. + + + +Multiple alignments +------------------- + +multiple alignments: edgrif is about a third of the way through +implementing a more general way of displaying arbitrary blocks. This is a high +priority item. + + + + +Changing the Assembly +--------------------- + +th made the good but frightening point that the world of annotation has +changed to one where the annotator may well be making changes to the +underlying assembly, he said that ensembl supports this and lace/zmap will +need to. The consequences feel alarming..... + +edgrif and rds to think about this: would it mean that zmap would have to +change the position of features on the fly as the user edits the +assembly...some tricky pathological cases here.... + +We will need smapping support in zmap to do this kind of thing. edgrif has +a plan for extracting the smap code from acedb and making it a stand alone +package. + + + +Builds +------ + +- ISG have now built us a separate software stack of the latest stable GTK +release. We are moving our builds to use this on Linux and the Mac. + +- there is still a problem that we do not have a mac on the network that can +do proper single sign on, home directories etc etc. + + + +Blixem/pfetch +-------------- + +- blixem: dna searching is NOT DONE, edgrif to expedite. Also protein searches +will be added. + + + + +multiple species +---------------- + +jgrg, edgrif, lg4 and rds met to discuss this and we need to revisit it to +check on progress.
It requires some alterations to lace infrastructure to +support editing multiple species. + +edgrif has inserted the code to allow the user (i.e. lace) to specify that +particular sequences be fetched from particular servers. This allows zmap to +display different species/haplotypes/whatever in split windows. Once the +multiple alignment code is done then these could also be displayed in one +window. + +The below notes from a previous meeting are still relevant: + +============================================================================== +James and Ed need to discuss how this will be done. Zmap can display multiple +sequences in one zmap window but there will need to be changes to the supporting +infrastructure including: + + - should the species go in one or multiple acedb databases + + - the lace <-> zmap protocol needs to allow specification of + multiple sequence display. + + - lace needs to allow the user to pick multiple species + +jla1 said a better test than Charlie and NOD mouse is Richard and haplotypes +of which he may have up to 6 (!). jla1 said is there any limit to number, +edgrif said "no, only machine memory....". To quickly test this edgrif will +get data from Richard and try it out to look at performance. Simplest way is +to just display data as is, this mimics what Richard does currently in +fmap. edgrif said that if we used Compara alignment data we could just display +the "sub-blocks" of the alignments that actually contained data. This would +produce much more compact displays. +================== from last time ================================================= + + + +2) Other matters +================ + +Both these items are still pending: + +- script for EMBL dumping: needs testing by Jen. James thinks this all +works. There was a discussion about submissions more generally and it was +agreed that all regions going into VEGA should be submitted. This led into a +request for some kind of "Clone status" flag.
jgrg said there used to be one +and that it should be recreated. + + +kj2 raised a question about how alignments were done for otter, jgrg said +using BLAST and est2genome. Kerstin said that much more useful alignments +could be made using BLAT or exonerate or ??? In particular, other alignment +methods join up their matches where they are obviously "perfect" overall +alignments (e.g. to consecutive exons) and cope with alignments where they go +over clone boundaries. + +jgrg to think about this. + + + +3) Next Meeting +=============== + +Will be at 2pm, 25th April 2007 + + +============================================================================== diff --git a/ZMAP_LACE_PROJECT/2007/zmap_lace.2007_04_23 b/ZMAP_LACE_PROJECT/2007/zmap_lace.2007_04_23 new file mode 100755 index 000000000..f4eefef88 --- /dev/null +++ b/ZMAP_LACE_PROJECT/2007/zmap_lace.2007_04_23 @@ -0,0 +1,234 @@ +============================================================================== +ZMap/Otterlace Development + + +Date: Wednesday 23rd April 2007 + +Attendees: jgrg, edgrif, te3, st3 + + +1) otterlace + zmap progress +============================ + +High priority +------------- + +1/ Can't get otterid in zmap. + +kj2 raised the point that searching zmap/lace for an otter id is a requirement. +ZFIN Requests are based on otter ids. jgrg suggested lace is a good place to +implement this. lace knows the data best, can search for otter id and instruct +zmap to spotlight the feature. Displaying the information in zmap depends on +the ability to show tag - value (below). + + +2/ Display of tag-values by zmap + +edgrif, jgrg, rds and lg4 have met and agreed a plan for this. edgrif is +implementing a new version of the individual feature display code which will +do the following: + +- make reuse of existing feature windows the default to avoid loads of +windows. + +- make the feature display window into a "tabbed" style window to avoid the +window being too huge.
+ +- add tag-value information that will be cut/pastable in a generalised way +that is readily extensible + +An important development of this is the realisation that instead of passing +all information across via the server for every feature, zmap will ask lace +for the information for the individual feature. This is more flexible and +avoids excessive amounts of data being passed across when zmap starts up. + + +The following are all candidates to be displayed: + +- Halfwise hits (rds asked if they have a URL object. lw2 pointed out that a +domain description is more important than the URL as it takes time to go to +the webpage for each hit). + +- kj2 would like more information available for features such as the species +and the DE lines. edgrif said they could display this via the current +mechanism of mouse-over popups in the "Info" line at the top of the ZMap. jgrg +is going to make sure this information is in the acedb objects so it can be +exported to zmap. + + +- Evidence does not display that it has been used with an associated object. +We need something to replace the xref of fmap which is displayable in the +treeview. Discussion on whether the new styles are enough followed, but +although showing evidence in the same column goes some way, an explicit link +would be better. kj2 remarked that some method of removing the feature from +the display once it has been used as evidence, allowing the visualisation of +what's left to build objects with, could be the ideal solution. + + +3/ highlighting of exons/cds/dna/protein etc does not work. + +rds has 2/3 of the code for this done. Some of this depends on the new zmap +styles. + + +4/ New zmap styles + +edgrif has now implemented this (with inheritance), jgrg and edgrif need +to decide when to do the big swop. + +This will fix the following problems: + +Saturated_EST bug: because of some ambiguity of tag usage in acedb methods +these don't get shown in the right way.
This will be properly fixed when we +have zmap styles which are completely separate from acedb methods. + + + +Medium priority +--------------- + +1/ DNA finder - Needs to show its find results location on the zmap as in +fmap, would be nice to be colour coded depending on the strand. The finder +needs to be extended to also do protein searches. Searching should be +restricted to any marked region if one is set, edgrif to do this. + +edgrif is implementing protein search now, location results to come (note that +currently the user is shown a list of all hits and clicking on a hit takes +the user to that hit). + + +2/ zmap needs to get sequences that are in acedb from there for blixem, means +issuing a call to the server to get them but this is possible. NOTE that some +sequences are _only_ in acedb. + + +3/ There was a discussion about how much a user should be able to +configure. It was agreed that they should be able to configure which columns +are initially hidden. They should also be able to "save" the current settings +to set this up and to be able to restore the system or their currently saved +defaults. + +edgrif or rds will do this. +nothing done, but it was noted that NO enduser configuration of +colours should be possible! + +4/ There was a discussion about good ways to interact with +alignments/transcripts for deletion and other actions. Leo and James have +already worked out a lot of this with Leo's transcript viewer. edgrif will get +the rules from Leo and James and then we can discuss them and decide which to +implement. + + +5/ Annotation needs to have a history across assembly changes. + +kj2 requested that when an assembly changes the objects which get +transferred should have a history as to what they were (otter id) +previously. lw2 remarked that it's possible to search for the old id +in the lace interface, but impossible to retrieve the old object and +display it.
jgrg acknowledged there is a bug in the lace searching, +but that providing the history would be difficult. Further +discussion/thought on how to implement this is needed. + + +Low priority +------------ + +1/ jgrg suggested that short cuts should be given on menus/mouse-over popups +to remind the user that they exist...edgrif to do this. + + +2/ cannot currently cursor through features in a column, there are some +technical problems here to do with raising items so they are visible. + + +3/ alternative translations: edgrif about half way through code to do this. + +edgrif is doing this as part of the protein search code since this code +does translations itself. edgrif will talk to jgrg about how alternative +genetic codes can be specified with acedb. + + +4/ we need a "back" button, the new zoom functions make this a lower priority +than before. + + + +Multiple alignments +------------------- + +multiple alignments: edgrif is about a third of the way through implementing a +more general way of displaying arbitrary blocks. This will become a high +priority item as we move to haplotypes etc. + + + + +Changing the Assembly +--------------------- + +th made the good but frightening point that the world of annotation has +changed to one where the annotator may well be making changes to the +underlying assembly, he said that ensembl supports this and lace/zmap will +need to. The consequences feel alarming..... + +edgrif and rds to think about this: would it mean that zmap would have to +change the position of features on the fly as the user edits the +assembly...some tricky pathological cases here.... + +We will need smapping support in zmap to do this kind of thing. edgrif has +a plan for extracting the smap code from acedb and making it a stand alone +package. + + + +Builds +------ + +- We now have the latest GTK (2.10) for linux and are moving to it. We need to +move to it for macs as well.
+ + +- there is still a problem that we do not have a mac on the network that can +do proper single sign on, home directories etc etc. + + + +Blixem/pfetch +-------------- + +- blixem: dna searching is NOT DONE, edgrif to expedite. Also protein searches +will be added. + + + + +2) Other matters +================ + +Both these items are still pending: + +- script for EMBL dumping: needs testing by Jen. James thinks this all +works. There was a discussion about submissions more generally and it was +agreed that all regions going into VEGA should be submitted. This led into a +request for some kind of "Clone status" flag. jgrg said there used to be one +and that it should be recreated. + + +kj2 raised a question about how alignments were done for otter, jgrg said +using BLAST and est2genome. Kerstin said that much more useful alignments +could be made using BLAT or exonerate or ??? In particular, other alignment +methods join up their matches where they are obviously "perfect" overall +alignments (e.g. to consecutive exons) and cope with alignments where they go +over clone boundaries. + +jgrg to think about this. + + + +3) Next Meeting +=============== + +Will be at 2pm, 9th May 2007 + + +============================================================================== diff --git a/ZMAP_LACE_PROJECT/2007/zmap_lace.2007_05_09 b/ZMAP_LACE_PROJECT/2007/zmap_lace.2007_05_09 new file mode 100755 index 000000000..d638447be --- /dev/null +++ b/ZMAP_LACE_PROJECT/2007/zmap_lace.2007_05_09 @@ -0,0 +1,234 @@ +============================================================================== +ZMap/Otterlace Development + + +Date: Wednesday 23rd April 2007 + +Attendees: jgrg, edgrif, te3, st3 + + +1) otterlace + zmap progress +============================ + +High priority +------------- + +1/ Can't get otterid in zmap. + +kj2 raised the point that searching zmap/lace for an otter id is a requirement. +ZFIN Requests are based on otter ids.
jgrg suggested lace is a good place to +implement this. lace knows the data best, can search for otter id and instruct +zmap to spotlight the feature. Displaying the information in zmap depends on +the ability to show tag - value (below). + + +2/ Display of tag-values by zmap + +edgrif, jgrg, rds and lg4 have met and agreed a plan for this. edgrif is +implementing a new version of the individual feature display code which will +do the following: + +- make reuse of existing feature windows the default to avoid loads of +windows. + +- make the feature display window into a "tabbed" style window to avoid the +window being too huge. + +- add tag-value information that will be cut/pastable in a generalised way +that is readily extensible + +An important development of this is the realisation that instead of passing +all information across via the server for every feature, zmap will ask lace +for the information for the individual feature. This is more flexible and +avoids excessive amounts of data being passed across when zmap starts up. + + +The following are all candidates to be displayed: + +- Halfwise hits (rds asked if they have a URL object. lw2 pointed out that a +domain description is more important than the URL as it takes time to go to +the webpage for each hit. + +- kj2 would like more information available for features such as the species +and the DE lines. edgrif said they could display this via the current +mechanism of mouse-over popups in the "Info" line at the top of the ZMap. jgrg +is going to make sure this information is in the acedb objects so it can be +exported to zmap. + + +- Evidence does not display that it has been used with an associated object. +We need something to replace the xref of fmap which is displayable in the +treeview. Disucssion on whether the new styles are enough followed, but +although showing evidence in the same column goes some way, an explicit link +would be better. 
kj2 remarked that some method of removing the feature from +the display once it has been used as evidence, allowing the visualisation of +what's left to build objects with, could be the ideal solution. + + +3/ highlighting of exons/cds/dna/protein etc does not work. + +rds has 2/3 of the code for this done. Some of this depends on the new zmap +styles. + + +4/ New zmap styles + +edgrif has now implemented this (with inheritance), jgrg and edgrif need +to decide when to do the big swop. + +This will fix the following problems: + +Saturated_EST bug: because of some ambiguity if tag usage in acedb methods +these don't get shown in the right way. This will be properly fixed when we +have zmap styles which are completely separate from acedb methods. + + + +Medium priority +--------------- + +1/ DNA finder - Needs to show its find results location on the zmap as in +fmap, would be nice to be colour coded depending on the strand. The finder +needs to be extended to also do protein searches. Searching should be +restricted to any marked region is one is set, edgrif to do this. + +edgrif is implementing protein search now, location results to come (note that +currently the user is shown a list of all hits and clicking on a hit takes +the user to that hit). + + +2/ zmap needs to get sequences that are in acedb from there for blixem, means +issuing a call to the server to get them but this is possible. NOTE that some +sequences are _only_ in acedb. + + +3/ There was a discussion about how much a user should be able to +configure. It was agreed that they should be able to configure which columns +are initially hidden. They should also be able to "save" the current settings +to set this up and to be able to restore the system or there currently saved +defaults. + +edgrif or rds will do this. +nothing done, but it was noted that NO enduser configuration of +colours should be possible! 
+ +4/ There was a discussion about good ways to interact with +alignments/transcripts for deletion and other actions. Leo and James have +already worked out a lot of this with Leo's transcript viewer. edgrif will get +the rules from Leo and James and then we can discuss them and decide which to +implement. + + +5/ Annotation needs to have a history across assembly changes. + +kj2 requested that when an assembly changes the objects which get +transferred should have a history as to what they were (otter id) +previously. lw2 remarked that it's possible to search for the old id +in the lace interface, but impossible to retrieve the old object and +display it. jgrg acknowledged there is a bug in the lace searching, +but that providing the history would be difficult. Further +discussion/thought on how to implement this is needed. + + +Low priority +------------ + +1/ jgrg suggested that short cuts should be given on menus/mouse-over popups +to remind the user that they exist...edgrif to do this. + + +2/ cannot currently cursor through features in a column, there are some +technical problems here to do with raising items so they are visible. + + +3/ alternative translations: edgrif about half way through code to do this. + +edgrif is doing this as part of the protein search code since this code +does translations itself. edgrif will talk to jgrg about how alternative +genetic codes can be specified with acedb. + + +4/ we need a "back" button, the new zoom functions make this a lower priority +than before. + + + +Multiple alignments +------------------- + +multiple alignments: edgrif is about a third of the way through implementing a +more general way of displaying arbitrary blocks. This will become a high +priority item as we move to haplotypes etc.
+ + + + +Changing the Assembly +--------------------- + +th made the good but frightening point that the world of annotation has +changed to one where the annotator may well be making changes to the +underlying assembly, he said that ensembl supports this and lace/zmap will +need to. The consequences feel alarming..... + +edgrif and rds to think about this: would it mean that zmap would have to +change the position of features on the fly as the user edits the +assembly...some tricky pathological cases here.... + +We will need smapping support in zmap to do this kind of thing. edgrif has +a plan for extracting the smap code from acedb and making it a stand alone +package. + + + +Builds +------ + +- We now have the latest GTK (2.10) for linux and are moving to it. We need to +move to it for macs as well. + + +- there is still a problem that we do not have a mac on the network that can +so proper single sign on, home directories etc etc. + + + +Blixem/pfetch +-------------- + +- blixem: dna searching is NOT DONE, edgrif to expedite. Also protein searches +will be added. + + + + +2) Other matters +================ + +Both these items are still pending: + +- script for EMBL dumping: needs testing by Jen. James thinks this all +works. There was a discussion about submissions more generally and it was +agreed that all regions going into VEGA should be submitted. This let into a +request for some kind of "Clone status" flag. jgrg said there used to be one +and that it should be recreated. + + +kj2 raised a question about how alignments were done for otter, jgrg said +using BLAST and est2genome. Kerstin said that much more useful alignments +could be made using BLAT or exonerate or ??? In particular, other alignment +methods join up their matches where they are obviously "perfect" overall +alignments (e.g. to consecutive exons) and cope with alignments where they go +over clone boundaries. + +jgrg to think about this. 
+ + + +3) Next Meeting +=============== + +Will be at 2pm, 23rd May 2007 + + +============================================================================== diff --git a/ZMAP_LACE_PROJECT/2007/zmap_lace.2007_06_06 b/ZMAP_LACE_PROJECT/2007/zmap_lace.2007_06_06 new file mode 100755 index 000000000..f644ed6b4 --- /dev/null +++ b/ZMAP_LACE_PROJECT/2007/zmap_lace.2007_06_06 @@ -0,0 +1,265 @@ +============================================================================== +ZMap/Otterlace Development + + +Date: Wednesday 6th June 2007 + +Attendees: jgrg, edgrif, te3, st3............ + + +1) otterlace + zmap progress +============================ + +High priority +------------- + +0/ Performance + +Issue raised by Laurens and others that we have "turned a blind eye to" until +now. + +- basic performance + +We knew there was a problem in glib, part of the gtk library we use for graphics. +This is now fixed and produces dramatic speed ups in parsing of data in zmap. + +In addition as we have now moved on to gtk 2.10 we use their new improved slice +allocator for memory allocation which improves performance substantially. + + +- a new canvas + +rds has been looking at alternative canvas implementations which offer an MVC +model. He has managed to get goocanvas developers to fix some bugs and make +some changes to support our needs. + +the goocanvas MVC model will mean we do not have to copy data to split windows +meaning greatly reduced memory usage. + +the goocanvas will cope automatically with the X Windows window size limit, this +combined with changes in the gtk scrolling model means we will be able to do away +with having two scroll bars. + +We plan to introduce this in the Autumn. + + + +1/ Can't get otterid in zmap. + +kj2 raised the point that searching zmap/lace for a otter id is a requirement. +ZFIN Requests are based on otter ids. jgrg suggested lace is a good place to +implement this. 
lace knows the data best, can search for otter id and instruct +zmap to spotlight the feature. Displaying the information in zmap depends on +the ability to show tag - value (below). + + +2/ Display of tag-values by zmap + +I'm working on this now...... + +edgrif, jgrg, rds and lg4 have met and agreed a plan for this. edgrif is +implementing a new version of the individual feature display code which will +do the following: + +- make reuse of existing feature windows the default to avoid loads of +windows. + +- make the feature display window into a "tabbed" style window to avoid the +window being too huge. + +- add tag-value information that will be cut/pastable in a generalised way +that is readily extensible + +An important development of this is the realisation that instead of passing +all information across via the server for every feature, zmap will ask lace +for the information for the individual feature. This is more flexible and +avoids excessive amounts of data being passed across when zmap starts up. + + +The following are all candidates to be displayed: + +- Halfwise hits (rds asked if they have a URL object. lw2 pointed out that a +domain description is more important than the URL as it takes time to go to +the webpage for each hit. + +- kj2 would like more information available for features such as the species +and the DE lines. edgrif said they could display this via the current +mechanism of mouse-over popups in the "Info" line at the top of the ZMap. jgrg +is going to make sure this information is in the acedb objects so it can be +exported to zmap. + + +- Evidence does not display that it has been used with an associated object. +We need something to replace the xref of fmap which is displayable in the +treeview. Disucssion on whether the new styles are enough followed, but +although showing evidence in the same column goes some way, an explicit link +would be better. 
kj2 remarked that some method of removing the feature from +the display once it has been used as evidence, allowing the visualisation of +what's left to build objects with, could be the ideal solution. + + +3/ highlighting of exons/cds/dna/protein etc does not work. + +rds has fixed this...yippeeee + + +4/ New zmap styles + +edgrif has now implemented this (with inheritance), jgrg and edgrif need +to decide when to do the big swop. + +This will fix the following problems: + +Saturated_EST bug: because of some ambiguity of tag usage in acedb methods +these don't get shown in the right way. This will be properly fixed when we +have zmap styles which are completely separate from acedb methods. + + + +Medium priority +--------------- + +0/ Homology gaps display - in future homology gaps will not be displayed +normally. Instead they will be displayed when the user bumps a column. +This will save a lot of mostly wasted processing, especially if the user +makes use of the "mark" method of restricting how much is bumped. + +1/ DNA finder - Needs to show its find results location on the zmap as in +fmap, would be nice to be colour coded depending on the strand. The finder +needs to be extended to also do protein searches. Searching should be +restricted to any marked region if one is set, edgrif to do this. + +edgrif is implementing protein search now, location results to come (note that +currently the user is shown a list of all hits and clicking on a hit takes +the user to that hit). + + +2/ zmap needs to get sequences that are in acedb from there for blixem, means +issuing a call to the server to get them but this is possible. NOTE that some +sequences are _only_ in acedb. + + +3/ There was a discussion about how much a user should be able to +configure. It was agreed that they should be able to configure which columns +are initially hidden.
They should also be able to "save" the current settings +to set this up and to be able to restore the system or there currently saved +defaults. + +edgrif or rds will do this. +nothing done, but it was noted that NO enduser configuration of +colours should be possible! + +4/ There was a discussion about good ways to interact with +alignments/transcripts for deletion and other actions. Leo and James have +already worked out a lot of this with Leos transcript viewer. edgrif will get +the rules from Leo and James and then we can discuss them and decide which to +implement. + + +5/ Annotation needs to have a history across assembly changes. + +kj2 requested that when an assembly changes the objects which get +transferred should have a history as to what they were (otter id) +previously. lw2 remarked that it's possible to search for the old id +in the lace interface, but impossible to retrieve the old object and +display it. jgrg acknowledged there is a bug in the lace searching, +but that providing the history would be difficult. Further +discussion/thought on how to implement this is needed. + + +Low priority +------------ + +1/ jgrg suggested that short cuts should be given on menus/mouse-over popups +to remind the user that they exist...edgrif to do this. + + +2/ cannot currently cursor through features in a column, there are some +technical problems here to do with raising items so they are visible. + + +3/ alternative translations: edgrif about half way through code to do this. + +edgrif is doing this as part of the protein search code since this code +does translations itself. edgrif will talk to jgrg about how alternative +genetic codes can be specified with acedb. + + +4/ we need a "back" button, the new zoom functions make this a lower priority +than before. + + + +Multiple alignments +------------------- + +multiple alignments: edgrif is about a third of the way through implementing a +more general way of displaying arbitrary blocks. 
This will become a high +priority item as we move to haplotypes etc. + + + + +Changing the Assembly +--------------------- + +th made the good but frightening point that the world of annotation has +changed to one where the annotator may well be making changes to the +underlying assembly, he said that ensembl supports this and lace/zmap will +need to. The consequences feel alarming..... + +edgrif and rds to think about this: would it mean that zmap would have to +change the position of features on the fly as the user edits the +assembly...some tricky pathological cases here.... + +We will need smapping support in zmap to do this kind of thing. edgrif has +a plan for extracting the smap code from acedb and making it a stand alone +package. + + + +Builds +------ + +- We now have the latest GTK (2.10) for linux and are moving to it. We need to +move to it for macs as well. + + +- there is still a problem that we do not have a mac on the network that can +so proper single sign on, home directories etc etc. Ed to raise a help ticket. + + + +Blixem/pfetch +-------------- + +- blixem: dna searching is NOT DONE, edgrif to expedite. Also protein searches +will be added. + + + + +2) Other matters +================ + +Both these items are still pending: + + +kj2 raised a question about how alignments were done for otter, jgrg said +using BLAST and est2genome. Kerstin said that much more useful alignments +could be made using BLAT or exonerate or ??? In particular, other alignment +methods join up their matches where they are obviously "perfect" overall +alignments (e.g. to consecutive exons) and cope with alignments where they go +over clone boundaries. + +Mustapha is working on this. 
+ + + +3) Next Meeting +=============== + +Will be at 2pm, 20th June 2007 + + +============================================================================== diff --git a/ZMAP_LACE_PROJECT/2007/zmap_lace.2007_06_14 b/ZMAP_LACE_PROJECT/2007/zmap_lace.2007_06_14 new file mode 100755 index 000000000..b456768ed --- /dev/null +++ b/ZMAP_LACE_PROJECT/2007/zmap_lace.2007_06_14 @@ -0,0 +1,126 @@ +============================================================================== +ZMap/lace sub-meeting + +14/06/2007 + +jgrg, rds, edgrif + + +1) ZMap <-> lace communication + +We need to check that lace and zmap behave correctly when either +quits in a controlled way. This is vital if annotators are to be +able to restart zmap from lace in a controlled way. + +Ed/Roy have done quite a lot of work on zmap to ensure that it sends +commands in the correct way to clear up. + +James to check that lace also clears up correctly. + +Roy has done a lot of work to improve the xremote codes handling and +tracking of its own state. + +NOTE that we still have a hole in that we don't have a good way for +lace to "notice" that zmap has died unexpectedly. + +lace does not support the "single_select" request from zmap, nor +does it make "single_select" requests to zmap. This needs doing to +support feature highlighting. + + +2) otterids + +Display of otterids is still outstanding and is essential. The +current plan is for this to happen in lace, not in zmap (although +with the new tagvalue display the otterid can be displayed in zmap +as well). + + +3) TagValue display + +Ed has been working on this and its now complete. The data is displayed +in the usual "notebook" or "tabbed" style window. Some of the data +comes directly from zmap and some of it can come from lace. + +When the user selects the feature menu "Show Feature Details" item, +zmap sends an xremote request to lace for extra feature details. 
lace +can then return information as xml in the form of pages containing +paragraphs containing tagvalues in a number of formats. + +lace will need to support the new "feature_details" request and +reply with data in the form: + +<zmap> + <response> + <notebook> + <page name="NNN"> + <paragraph name="NNN" type="TTT"> + <tagvalue name="NNN" type="TTT"> + The contents of first tag + </tagvalue> + </paragraph> + </page> + </notebook> + </response> +</zmap> + + +The page, paragraph and tagvalue elements can be repeated as often as +required but must be nested as above. + + +Page element: + +The "name" attribute is compulsory. + + +Paragraph element: + +The "name" attribute is optional. + +The type attribute must be one of: + +"simple" tagvalues will be a simple vertical list. +"tagvalue_table" tagvalues will be aligned in a table. +"homogenous" as for "tagvalue_table" but all tagvalues should + have the same tag which will only be displayed once. +(default is "simple") + +Tagvalue element: + +The "name" attribute is compulsory. + +The type attribute must be one of: + +"simple" the value is displayed in a single line text widget +"scrolled_text" the value is displayed in a multiline/scrolled window +(default is "simple") + +The content of the tagvalue must be text. + + + +4) New styles + +Code is all waiting, we just need to do the big switch. + + + +5) Mac builds + +Both Roy and Ed have worked on the build system to make it easier/quicker +to do mac builds. Currently we are waiting to get GTK 2.10 installed on +our iMac and then the build should be automated. + + + +6) The plan + +James is to get a "test otterlace" system of some sort going before his +C++ course next week so Roy and Ed can try the latest zmap out with a +couple of annotators. + +Ed is to finish the tagvalue code. + +After the C++ course James, Ed and Roy will tie up all the current loose +ends that are preventing annotators from using zmap.
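Editorial illustration (not part of the original minutes): the "feature_details" reply format described in section 3 above can be checked with a few lines of code. The following is a minimal sketch in Python using only the standard library; it is not the zmap or lace implementation. The function name `parse_feature_details` and the example page and tag names ("Details", "Species") are hypothetical; only the element names, attribute names and permitted type values are taken from the notes.

```python
import xml.etree.ElementTree as ET

# Permitted "type" values, as listed in the minutes.
PARAGRAPH_TYPES = {"simple", "tagvalue_table", "homogenous"}
TAGVALUE_TYPES = {"simple", "scrolled_text"}


def parse_feature_details(xml_text):
    """Parse a reply into [(page_name, [(para_name, para_type, [(tag, type, text)])])].

    Hypothetical helper: it only checks the nesting and attribute rules
    stated in the notes and raises ValueError when one is broken.
    """
    root = ET.fromstring(xml_text)
    if root.tag != "zmap":
        raise ValueError("root element must be <zmap>")
    notebook = root.find("./response/notebook")
    if notebook is None:
        raise ValueError("expected <zmap><response><notebook> nesting")

    pages = []
    for page in notebook.findall("page"):
        page_name = page.get("name")
        if page_name is None:
            raise ValueError("page 'name' attribute is compulsory")
        paragraphs = []
        for para in page.findall("paragraph"):
            # Paragraph name is optional; type defaults to "simple".
            para_type = para.get("type", "simple")
            if para_type not in PARAGRAPH_TYPES:
                raise ValueError("unknown paragraph type: %s" % para_type)
            tagvalues = []
            for tv in para.findall("tagvalue"):
                tv_name = tv.get("name")
                if tv_name is None:
                    raise ValueError("tagvalue 'name' attribute is compulsory")
                tv_type = tv.get("type", "simple")
                if tv_type not in TAGVALUE_TYPES:
                    raise ValueError("unknown tagvalue type: %s" % tv_type)
                # The content of a tagvalue must be text.
                tagvalues.append((tv_name, tv_type, (tv.text or "").strip()))
            paragraphs.append((para.get("name"), para_type, tagvalues))
        pages.append((page_name, paragraphs))
    return pages


# Hypothetical example reply; the page and tag names are made up.
EXAMPLE = """
<zmap>
  <response>
    <notebook>
      <page name="Details">
        <paragraph type="tagvalue_table">
          <tagvalue name="Species" type="simple">Homo sapiens</tagvalue>
        </paragraph>
      </page>
    </notebook>
  </response>
</zmap>
"""

if __name__ == "__main__":
    for page_name, paragraphs in parse_feature_details(EXAMPLE):
        print(page_name, paragraphs)
```

A check of this kind could be run on the lace side before a reply is sent, so that a missing "name" attribute or an unknown type is caught before zmap tries to build the tabbed window.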
diff --git a/ZMAP_LACE_PROJECT/2007/zmap_lace.2007_07_04 b/ZMAP_LACE_PROJECT/2007/zmap_lace.2007_07_04 new file mode 100755 index 000000000..0f7a394e1 --- /dev/null +++ b/ZMAP_LACE_PROJECT/2007/zmap_lace.2007_07_04 @@ -0,0 +1,223 @@ +============================================================================== +ZMap/Otterlace Development + + +Date: Wednesday 4th July 2007 + +Attendees: th, edgrif, te3, lw2, eah + + +1) otterlace + zmap progress +============================ + +High priority +------------- + +1/ Can't get otterid in zmap. + +kj2 raised the point that searching zmap/lace for a otter id is a requirement. +ZFIN Requests are based on otter ids. jgrg suggested lace is a good place to +implement this. lace knows the data best, can search for otter id and instruct +zmap to spotlight the feature. Displaying the information in zmap depends on +the ability to show tag - value (below). + + +2/ Display of tag-values by zmap + +The code has been implemented in zmap to dynamically build tabbed pages from +xml content sent from lace. The following remain to be added: + +- Halfwise hits (rds asked if they have a URL object. lw2 pointed out that a +domain description is more important than the URL as it takes time to go to +the webpage for each hit. + +- kj2 would like more information available for features such as the species +and the DE lines. edgrif said they could display this via the current +mechanism of mouse-over popups in the "Info" line at the top of the ZMap. jgrg +is going to make sure this information is in the acedb objects so it can be +exported to zmap. + +- Evidence does not display that it has been used with an associated object. +We need something to replace the xref of fmap which is displayable in the +treeview. Discussion on whether the new styles are enough followed, but +although showing evidence in the same column goes some way, an explicit link +would be better. 
kj2 remarked that some method of removing the feature from +the display once it has been used as evidence, allowing the visualisation of +what's left to build objects with, could be the ideal solution. + + +3/ New zmap styles + +edgrif has now implemented this (with inheritance), jgrg needs +to decide when to do the big swop. + + +4/ Single select + +lace does not support the "single_select" request from zmap, nor +does it make "single_select" requests to zmap. This needs doing to +support feature highlighting. + + +5/ Demo + +edgrif & eah will organise a zmap/havana demo to remind everyone of +the way to use zmap. + + + +Medium priority +--------------- + +1/ DNA finder - Needs to show its find results location on the zmap as in +fmap, would be nice to be colour coded depending on the strand. The finder +needs to be extended to also do protein searches. Searching should be +restricted to any marked region if one is set, edgrif to do this. + +edgrif is implementing protein search now, location results to come (note that +currently the user is shown a list of all hits and clicking on a hit takes +the user to that hit). + + +2/ zmap needs to get sequences that are in acedb from there for blixem, means +issuing a call to the server to get them but this is possible. NOTE that some +sequences are _only_ in acedb. edgrif is doing this now. + + +3/ There was a discussion about good ways to interact with +alignments/transcripts for deletion and other actions. Leo and James have +already worked out a lot of this with Leo's transcript viewer. edgrif will get +the rules from Leo and James and then we can discuss them and decide which to +implement. + + +4/ Annotation needs to have a history across assembly changes. + +kj2 requested that when an assembly changes the objects which get +transferred should have a history as to what they were (otter id) +previously.
lw2 remarked that it's possible to search for the old id +in the lace interface, but impossible to retrieve the old object and +display it. jgrg acknowledged there is a bug in the lace searching, +but that providing the history would be difficult. Further +discussion/thought on how to implement this is needed. + + + +Low priority +------------ + +1/ jgrg suggested that short cuts should be given on menus/mouse-over popups +to remind the user that they exist...edgrif to do this. + + +2/ cannot currently cursor through features in a column, there are some +technical problems here to do with raising items so they are visible. +edgrif doing this now. + + +3/ alternative translations: edgrif about half way through code to do this. + +edgrif is doing this as part of the protein search code since this code +does translations itself. edgrif will talk to jgrg about how alternative +genetic codes can be specified with acedb. + + +4/ we need a "back" button, the new zoom functions make this a lower priority +than before. + +5/ There was a discussion about how much a user should be able to +configure. It was agreed that they should be able to configure which columns +are initially hidden. They should also be able to "save" the current settings +to set this up and to be able to restore the system or there currently saved +defaults. + +edgrif or rds will do this. +nothing done, but it was noted that NO enduser configuration of +colours should be possible! + + + + +A new canvas +------------ + +rds has been looking at alternative canvas implementations which offer an MVC +model. He has managed to get goocanvas developers to fix some bugs and make +some changes to support our needs. + +the goocanvas MVC model will mean we do not have to copy data to split windows +meaning greatly reduced memory usage. 
+ +the goocanvas will cope automatically with the X Windows window size limit, this +combined with changes in the gtk scrolling model means we will be able to do away +with having two scroll bars. + +We plan to introduce this in the Autumn. + + + +Multiple alignments +------------------- + +multiple alignments: edgrif is about a third of the way through implementing a +more general way of displaying arbitrary blocks. This will become a high +priority item as we move to haplotypes etc. + + + + +Changing the Assembly +--------------------- + +th made the good but frightening point that the world of annotation has +changed to one where the annotator may well be making changes to the +underlying assembly, he said that ensembl supports this and lace/zmap will +need to. The consequences feel alarming..... + +edgrif and rds to think about this: would it mean that zmap would have to +change the position of features on the fly as the user edits the +assembly...some tricky pathological cases here.... + +We will need smapping support in zmap to do this kind of thing. edgrif has +a plan for extracting the smap code from acedb and making it a stand alone +package. + + + +Builds +------ + +We now have a properly networked Mac, we need to include universal binaries. + + + +Blixem/pfetch +-------------- + +- blixem: dna searching is NOT DONE, edgrif to expedite. Also protein searches +will be added. + + + + +2) Other matters +================ + +kj2 raised a question about how alignments were done for otter, jgrg said +using BLAST and est2genome. Kerstin said that much more useful alignments +could be made using BLAT or exonerate or ??? In particular, other alignment +methods join up their matches where they are obviously "perfect" overall +alignments (e.g. to consecutive exons) and cope with alignments where they go +over clone boundaries. + +Mustapha is working on this. 
+ + + +3) Next Meeting +=============== + +Will be at 2pm, 18th July 2007 + + +============================================================================== diff --git a/ZMAP_LACE_PROJECT/2007/zmap_lace.2007_08_15 b/ZMAP_LACE_PROJECT/2007/zmap_lace.2007_08_15 new file mode 100755 index 000000000..393bef8b5 --- /dev/null +++ b/ZMAP_LACE_PROJECT/2007/zmap_lace.2007_08_15 @@ -0,0 +1,190 @@ +============================================================================== +ZMap/Otterlace Development + + +Date: Wednesday 15th August 2007 + +Attendees: jgrg, lw2, edgrif, te3 + + +1) otterlace + zmap progress +============================ + +High priority +------------- + + +0/ ZMap performance + +The only issue here is caused by the huge numbers of homologies. There is much +confusion here as there are several confounding factors: + +- slow machines + +- slow network meaning slow startup/perceived time + +- otter pipeline problems meaning that vast numbers of useless homologies +sometimes get through + +- bumping is adversely affected by duplicated regions, current algorithm tries +to join up the regions, EG to deal with this. + +I plan to do some profiling/statistics to get some facts/figures. + +I have a couple of plans to deal with issues such as repeated regions which +create problems for my bumping algorithm. + + + +1/ New zmap styles ************* + +edgrif has now implemented this (with inheritance), jgrg needs +to decide when to do the big swop. + + + +Medium priority +--------------- + +1/ DNA finder - Needs to show its find results location on the zmap as in +fmap, would be nice to be colour coded depending on the strand. The finder +needs to be extended to also do protein searches. Searching should be +restricted to any marked region if one is set, edgrif to do this. + +edgrif is implementing protein search now, location results to come (note that +currently the user is shown a list of all hits and clicking on a hit takes +the user to that hit).
+ + +2/ There was a discussion about good ways to interact with +alignments/transcripts for deletion and other actions. Leo and James have +already worked out a lot of this with Leo's transcript viewer. edgrif will get +the rules from Leo and James and then we can discuss them and decide which to +implement. + + +3/ Annotation needs to have a history across assembly changes. + +kj2 requested that when an assembly changes the objects which get +transferred should have a history as to what they were (otter id) +previously. lw2 remarked that it's possible to search for the old id +in the lace interface, but impossible to retrieve the old object and +display it. jgrg acknowledged there is a bug in the lace searching, +but that providing the history would be difficult. Further +discussion/thought on how to implement this is needed. + + + +Low priority +------------ + +1/ jgrg suggested that short cuts should be given on menus/mouse-over popups +to remind the user that they exist...edgrif to do this. + + +2/ alternative translations: edgrif about half way through code to do this. + +edgrif is doing this as part of the protein search code since this code +does translations itself. edgrif will talk to jgrg about how alternative +genetic codes can be specified with acedb. + + +3/ we need a "back" button, the new zoom functions make this a lower priority +than before. + +4/ There was a discussion about how much a user should be able to +configure. It was agreed that they should be able to configure which columns +are initially hidden. They should also be able to "save" the current settings +to set this up and to be able to restore the system or their currently saved +defaults. + +edgrif or rds will do this. +nothing done, but it was noted that NO end-user configuration of +colours should be possible! + + + + +A new canvas +------------ + +rds has been looking at alternative canvas implementations which offer an MVC +model.
He has managed to get goocanvas developers to fix some bugs and make +some changes to support our needs. + +the goocanvas MVC model will mean we do not have to copy data to split windows +meaning greatly reduced memory usage. + +the goocanvas will cope automatically with the X Windows window size limit, this +combined with changes in the gtk scrolling model means we will be able to do away +with having two scroll bars. + +We plan to introduce this in the Autumn. + + + +Multiple alignments +------------------- + +multiple alignments: edgrif is about a third of the way through implementing a +more general way of displaying arbitrary blocks. This will become a high +priority item as we move to haplotypes etc. + + + + +Changing the Assembly +--------------------- + +th made the good but frightening point that the world of annotation has +changed to one where the annotator may well be making changes to the +underlying assembly, he said that ensembl supports this and lace/zmap will +need to. The consequences feel alarming..... + +edgrif and rds to think about this: would it mean that zmap would have to +change the position of features on the fly as the user edits the +assembly...some tricky pathological cases here.... + +We will need smapping support in zmap to do this kind of thing. edgrif has +a plan for extracting the smap code from acedb and making it a stand alone +package. + + + +Builds +------ + +We now have a properly networked Mac, we need to include universal binaries. +jgrg, edgrif & rds to work on this. + + +Blixem/pfetch +-------------- + +- blixem: dna searching is NOT DONE, edgrif to expedite. Also protein searches +will be added. + + + + +2) Other matters +================ + +kj2 raised a question about how alignments were done for otter, jgrg said +using BLAST and est2genome. Kerstin said that much more useful alignments +could be made using BLAT or exonerate or ??? 
In particular, other alignment +methods join up their matches where they are obviously "perfect" overall +alignments (e.g. to consecutive exons) and cope with alignments where they go +over clone boundaries. + +Mustapha is working on this. + + + +3) Next Meeting +=============== + +Will be at 2pm, 29th August 2007 + + +============================================================================== diff --git a/ZMAP_LACE_PROJECT/2007/zmap_lace.2007_09_12 b/ZMAP_LACE_PROJECT/2007/zmap_lace.2007_09_12 new file mode 100755 index 000000000..b6f54a862 --- /dev/null +++ b/ZMAP_LACE_PROJECT/2007/zmap_lace.2007_09_12 @@ -0,0 +1,211 @@ +============================================================================== +ZMap/Otterlace Development + + +Date: Wednesday 12th September 2007 + +Attendees: th, edgrif, rds, jgrg, lw2, te3, st3 + + +1) otterlace + zmap progress +============================ + +High priority +------------- + + +0/ ZMap performance + +The only issue here is caused by the huge numbers of homologies. + +rds and edgrif have applied a number of fixes and we await feedback +from users, initial feedback has been good. + +There is still a problem with the pipeline and pathological conditions +that produce huge numbers of essentially useless homologies. + + + +1/ New zmap styles ************* + +edgrif has now implemented this (with inheritance), jgrg needs +to decide when to do the big swop. Waiting for jgrg. + + +2/ Display of otter information + +edgrif and rds have implemented code in zmap to deal with otter ids etc but lace +needs to export these so we can display them. + +3/ Vega + +can vega be built using the new otter schema ? + + +4/ Ticket priority + +edgrif and rds need to sit down with Havana and reassign ticket priorities as there are +too many on priority 7 making it impossible to prioritise.
+ + +Medium priority +--------------- + +1/ DNA finder - Needs to show its find results location on the zmap as in +fmap, would be nice to be colour coded depending on the strand. The finder +needs to be extended to also do protein searches. Searching should be +restricted to any marked region if one is set, edgrif to do this. + +edgrif is implementing protein search now, location results to come (note that +currently the user is shown a list of all hits and clicking on a hit takes +the user to that hit). + + +2/ There was a discussion about good ways to interact with +alignments/transcripts for deletion and other actions. Leo and James have +already worked out a lot of this with Leo's transcript viewer. edgrif will get +the rules from Leo and James and then we can discuss them and decide which to +implement. + + +3/ Annotation needs to have a history across assembly changes. + +kj2 requested that when an assembly changes the objects which get +transferred should have a history as to what they were (otter id) +previously. lw2 remarked that it's possible to search for the old id +in the lace interface, but impossible to retrieve the old object and +display it. jgrg acknowledged there is a bug in the lace searching, +but that providing the history would be difficult. Further +discussion/thought on how to implement this is needed. + + +4/ look no mouse + +edgrif has implemented various ways of navigating which mean that +zmap can largely be used without touching the mouse. + + +5/ removing evidence already used + +annotators would like to be able to remove from display homologies that +have already been used to annotate variants etc. Does this need to be +persistent in the database in some way ?? + + + +Low priority +------------ + +1/ jgrg suggested that short cuts should be given on menus/mouse-over popups +to remind the user that they exist...edgrif to do this. + + +2/ alternative translations: edgrif about half way through code to do this.
+ +edgrif is doing this as part of the protein search code since this code +does translations itself. edgrif will talk to jgrg about how alternative +genetic codes can be specified with acedb. + +We need a test database for this. + + +3/ There was a discussion about how much a user should be able to +configure. It was agreed that they should be able to configure which columns +are initially hidden. They should also be able to "save" the current settings +to set this up and to be able to restore the system or their currently saved +defaults. + +edgrif or rds will do this. +nothing done, but it was noted that NO end-user configuration of +colours should be possible! + + + + +A new canvas +------------ + + +rds has been looking at alternative canvas implementations which offer an MVC +model. He has managed to get goocanvas developers to fix some bugs and make +some changes to support our needs. + +the goocanvas MVC model will mean we do not have to copy data to split windows +meaning greatly reduced memory usage. + +the goocanvas will cope automatically with the X Windows window size limit, this +combined with changes in the gtk scrolling model means we will be able to do away +with having two scroll bars. + +We plan to introduce this in the Autumn by which time it is likely to have become +part of gtk as _the_ gtk canvas. + + + +Multiple alignments +------------------- + +multiple alignments: edgrif is about a third of the way through implementing a +more general way of displaying arbitrary blocks. This will become a high +priority item as we move to haplotypes etc. + + + + +Changing the Assembly +--------------------- + +th made the good but frightening point that the world of annotation has +changed to one where the annotator may well be making changes to the +underlying assembly, he said that ensembl supports this and lace/zmap will +need to. The consequences feel alarming.....
+ +edgrif and rds to think about this: would it mean that zmap would have to +change the position of features on the fly as the user edits the +assembly...some tricky pathological cases here.... + +We will need smapping support in zmap to do this kind of thing. edgrif has +a plan for extracting the smap code from acedb and making it a stand alone +package. + + + +Builds +------ + +We now have a properly networked Mac, we need to include universal binaries. +jgrg, edgrif & rds to work on this. + + +Blixem/pfetch +-------------- + +- blixem: dna searching is NOT DONE, edgrif to expedite. Also protein searches +will be added. + +- another issue is to incorporate James proxy into blixem and zmap, input +needed from jgrg for this. + + + +2) Other matters +================ + +kj2 raised a question about how alignments were done for otter, jgrg said +using BLAST and est2genome. Kerstin said that much more useful alignments +could be made using BLAT or exonerate or ??? In particular, other alignment +methods join up their matches where they are obviously "perfect" overall +alignments (e.g. to consecutive exons) and cope with alignments where they go +over clone boundaries. + +Mustapha is working on this. + + + +3) Next Meeting +=============== + +Will be at 2pm, 26th September 2007 + + +============================================================================== diff --git a/ZMAP_LACE_PROJECT/2007/zmap_lace.2007_09_26 b/ZMAP_LACE_PROJECT/2007/zmap_lace.2007_09_26 new file mode 100755 index 000000000..b121108ba --- /dev/null +++ b/ZMAP_LACE_PROJECT/2007/zmap_lace.2007_09_26 @@ -0,0 +1,218 @@ +============================================================================== +ZMap/Otterlace Development + + +Date: Wednesday 26th September 2007 + +Attendees: th, eah, edgrif, jgrg, lw2, te3, st3 + + +1) otterlace + zmap progress +============================ + +High priority +------------- + + +0/ ZMap performance + +edgrif & rds still awaiting feedback. 
+ +There is still a problem with the pipeline and pathological conditions +that produce huge numbers of essentially useless homologies. jgrg is +thinking about how to do this. edgrif said he could add a warning message +to report excessive numbers of hits so that at least users would be aware +of any problem. + + + +1/ New zmap styles ************* + +edgrif has now implemented this (with inheritance), jgrg needs +to decide when to do the big swop. jgrg said with the new schema +work now almost done he is in a position to do this. + + +2/ Display of otter information + +edgrif and rds have implemented code in zmap to deal with otter ids etc but lace +needs to export these so we can display them. jgrg said with the new schema +work now almost done he is in a position to do this. + +3/ Vega + +can vega be built using the new otter schema ? st3 said not yet but it's on the way. + + +4/ Ticket priority + +edgrif and rds went through all tickets with eah and jpa and recategorised +all tickets. This resulted in many being resolved. Currently we are down to +15 tickets with none in category 7. edgrif explained that he added a new +ticket field which was a text version of the seven ticket categories so that +Havana could just set one of these and forget all the other fields. edgrif +will readvertise this over the next couple of weeks. + + +Medium priority +--------------- + +1/ locator column: zmap needs a locator column as per fmap which could be +used to display dna and peptide search results and other information. + + +2/ There was a discussion about good ways to interact with +alignments/transcripts for deletion and other actions. Leo and James have +already worked out a lot of this with Leo's transcript viewer. edgrif will get +the rules from Leo and James and then we can discuss them and decide which to +implement. + + +3/ Annotation needs to have a history across assembly changes.
+ +kj2 requested that when an assembly changes the objects which get +transferred should have a history as to what they were (otter id) +previously. lw2 remarked that it's possible to search for the old id +in the lace interface, but impossible to retrieve the old object and +display it. jgrg acknowledged there is a bug in the lace searching, +but that providing the history would be difficult. Further +discussion/thought on how to implement this is needed. + +jgrg said that this should be more possible with the new ensembl schema. + + +4/ look no mouse + +edgrif has implemented various ways of navigating which mean that +zmap can largely be used without touching the mouse. Waiting for feedback. +edgrif to readvertise shortcuts with next zmap release. Will also check +about lassoing and why it isn't just button 1 down/drag. + + +5/ removing evidence already used + +annotators would like to be able to remove from display homologies that +have already been used to annotate variants etc. Does this need to be +persistent in the database in some way ?? edgrif & jgrg will get +together to arrange this via styles so it can persist in a natural way +in the database. + + + +Low priority +------------ + +1/ jgrg suggested that short cuts should be given on menus/mouse-over popups +to remind the user that they exist...edgrif to do this. + + +2/ alternative translations: edgrif about half way through code to do this. + +edgrif is doing this as part of the protein search code since this code +does translations itself. edgrif will talk to jgrg about how alternative +genetic codes can be specified with acedb. + +We need a test database for this. jgrg said this would come soon. + + +3/ There was a discussion about how much a user should be able to +configure. It was agreed that they should be able to configure which columns +are initially hidden. 
They should also be able to "save" the current settings +to set this up and to be able to restore the system or their currently saved +defaults. + +edgrif or rds will do this but are waiting for eah and lw2 to report back with +list of what users would like to be able to configure. + + + + +A new canvas +------------ + + +rds has been looking at alternative canvas implementations which offer an MVC +model. He has managed to get goocanvas developers to fix some bugs and make +some changes to support our needs. + +the goocanvas MVC model will mean we do not have to copy data to split windows +meaning greatly reduced memory usage. + +the goocanvas will cope automatically with the X Windows window size limit, this +combined with changes in the gtk scrolling model means we will be able to do away +with having two scroll bars. + +We plan to introduce this in the Autumn when it has become part of gtk +as _the_ gtk canvas. + + + +Multiple alignments +------------------- + +multiple alignments: edgrif is about a third of the way through implementing a +more general way of displaying arbitrary blocks. This will become a high +priority item as we move to haplotypes etc. + +th said this would be needed soon so it should be moved up the priority list. + + +Changing the Assembly +--------------------- + +th made the good but frightening point that the world of annotation has +changed to one where the annotator may well be making changes to the +underlying assembly, he said that ensembl supports this and lace/zmap will +need to. The consequences feel alarming..... + +edgrif and rds to think about this: would it mean that zmap would have to +change the position of features on the fly as the user edits the +assembly...some tricky pathological cases here.... + +We will need smapping support in zmap to do this kind of thing. edgrif has +a plan for extracting the smap code from acedb and making it a stand alone +package.
+ + + +Builds +------ + +We now have a properly networked Mac, we need to include universal binaries. +jgrg, edgrif & rds to work on this. edgrif working on installing autoconf +stuff on the mac, needed to build our special foocanvas version. Once that +has been done we will harmonise with jgrg's script for building lace installs. + + +Blixem/pfetch +-------------- + +- blixem: dna searching is NOT DONE, edgrif to expedite. Also protein searches +will be added. + +- another issue is to incorporate James's proxy into blixem and zmap, input +needed from jgrg for this. + + + +2) Other matters +================ + +kj2 raised a question about how alignments were done for otter, jgrg said +using BLAST and est2genome. Kerstin said that much more useful alignments +could be made using BLAT or exonerate or ??? In particular, other alignment +methods join up their matches where they are obviously "perfect" overall +alignments (e.g. to consecutive exons) and cope with alignments where they go +over clone boundaries. + +Mustapha is working on this. + + + +3) Next Meeting +=============== + +Will be at 2pm, 10th October 2007 + + +============================================================================== diff --git a/ZMAP_LACE_PROJECT/2007/zmap_lace.2007_10_11 b/ZMAP_LACE_PROJECT/2007/zmap_lace.2007_10_11 new file mode 100755 index 000000000..96170cebe --- /dev/null +++ b/ZMAP_LACE_PROJECT/2007/zmap_lace.2007_10_11 @@ -0,0 +1,253 @@ +============================================================================== +ZMap/Otterlace Development + + +Date: Thursday 11 October 2007 + +Attendees: th, eah, edgrif, jgrg, lw2, te3, st3, jla1 + + +1) otterlace + zmap progress +============================ + +High priority +------------- + + +0/ ZMap performance + +edgrif & rds still awaiting feedback. + +There is still a problem with the pipeline and pathological conditions +that produce huge numbers of essentially useless homologies. jgrg is +thinking about how to do this.
+ +edgrif explained two possibilities for improving performance: + + - make sgifaceserver stream data rather than batch it up, would + save a lot of memory. + + - deferred loading, only load features when needed and load in + zone requested by user. + +jla1 said there was still a problem with bumping, edgrif to investigate +further. + +edgrif is to follow up installing X performance tools on linux/mac so we can +measure performance and identify poorly performing/badly configured boxes. + + +1/ New zmap styles ************* + +edgrif has now implemented this (with inheritance), jgrg needs +to decide when to do the big swop. jgrg said with the new schema +work now almost done he is in a position to do this. + +should be done this week. + + +2/ Display of otter information + +edgrif and rds have implemented code in zmap to deal with otter ids etc but lace +needs to export these so we can display them. jgrg said with the new schema +work now almost done he is in a position to do this. + +should be done this week. + + +3/ Vega + +can vega be built using the new otter schema ? st3 said not yet but it's on the way, +should be here in a week or two. + + +4/ Ticket priority + +eah will talk about RT and raising tickets at next Havana meeting. edgrif +will implement a menu to take users directly to relevant pages to raise +tickets. + +5/ CDS translation + +rds is working on showing all possible translations etc in zmap window. +This is his current top priority. + +6/ Blixem + +edgrif will set up an option so that blixem can be run on swissprot and trembl +entries simultaneously. + + +Medium priority +--------------- + +1/ locator column: zmap needs a locator column as per fmap which could be +used to display dna and peptide search results and other information. + + +2/ There was a discussion about good ways to interact with +alignments/transcripts for deletion and other actions. Leo and James have +already worked out a lot of this with Leo's transcript viewer.
edgrif will get +the rules from Leo and James and then we can discuss them and decide which to +implement. + + +3/ Annotation needs to have a history across assembly changes. + +kj2 requested that when an assembly changes the objects which get +transferred should have a history as to what they were (otter id) +previously. lw2 remarked that it's possible to search for the old id +in the lace interface, but impossible to retrieve the old object and +display it. jgrg acknowledged there is a bug in the lace searching, +but that providing the history would be difficult. Further +discussion/thought on how to implement this is needed. + +jgrg said that this should be more possible with the new ensembl schema. + + +4/ look no mouse + +edgrif has implemented various ways of navigating which mean that +zmap can largely be used without touching the mouse. Waiting for feedback. +edgrif to readvertise shortcuts with next zmap release. Will also check +about lassoing and why it isn't just button 1 down/drag. + +we discussed the option of having a flag to remove mouse usage, would be +good for training and also for experienced users to get them to use +the keyboard short cuts. + +5/ removing evidence already used + +annotators would like to be able to remove from display homologies that +have already been used to annotate variants etc. Does this need to be +persistent in the database in some way ?? edgrif & jgrg will get +together to arrange this via styles so it can persist in a natural way +in the database. + + +6/ zmap width + +Because of the way zmap works, screen space can be used up by columns +that have nothing in them for the coordinate range the user is looking +at. edgrif will look at a compress button to hide columns where this +is true. + + +Low priority +------------ + + +1/ jgrg suggested that short cuts should be given on menus/mouse-over popups +to remind the user that they exist...edgrif to do this. 
+ + +2/ alternative translations: edgrif about half way through code to do this. + +edgrif is doing this as part of the protein search code since this code +does translations itself. edgrif will talk to jgrg about how alternative +genetic codes can be specified with acedb. + +We need a test database for this. jgrg said this would come soon. + + +3/ There was a discussion about how much a user should be able to +configure. It was agreed that they should be able to configure which columns +are initially hidden. They should also be able to "save" the current settings +to set this up and to be able to restore the system or their currently saved +defaults. + +edgrif or rds will do this but are waiting for eah and lw2 to report back with +list of what users would like to be able to configure. + + + + +A new canvas +------------ + + +rds has been looking at alternative canvas implementations which offer an MVC +model. He has managed to get goocanvas developers to fix some bugs and make +some changes to support our needs. + +the goocanvas MVC model will mean we do not have to copy data to split windows +meaning greatly reduced memory usage. + +the goocanvas will cope automatically with the X Windows window size limit, this +combined with changes in the gtk scrolling model means we will be able to do away +with having two scroll bars. + +We plan to introduce this in the Autumn when it has become part of gtk +as _the_ gtk canvas. + + + +Multiple alignments +------------------- + +multiple alignments: edgrif is about a third of the way through implementing a +more general way of displaying arbitrary blocks. This will become a high +priority item as we move to haplotypes etc. + +th said this would be needed soon so it should be moved up the priority list.
+ + +Changing the Assembly +--------------------- + +th made the good but frightening point that the world of annotation has +changed to one where the annotator may well be making changes to the +underlying assembly, he said that ensembl supports this and lace/zmap will +need to. The consequences feel alarming..... + +edgrif and rds to think about this: would it mean that zmap would have to +change the position of features on the fly as the user edits the +assembly...some tricky pathological cases here.... + +We will need smapping support in zmap to do this kind of thing. edgrif has +a plan for extracting the smap code from acedb and making it a stand alone +package. + + + +Builds +------ + +We now have a properly networked Mac, we need to include universal binaries. +jgrg, edgrif & rds to work on this. edgrif working on installing autoconf +stuff on the mac, needed to build our special foocanvas version. Once that +has been done we will harmonise with jgrg's script for building lace installs. + + +Blixem/pfetch +-------------- + +- blixem: dna searching is NOT DONE, edgrif to expedite. Also protein searches +will be added. + +- another issue is to incorporate James's proxy into blixem and zmap, input +needed from jgrg for this. + + + +2) Other matters +================ + +kj2 raised a question about how alignments were done for otter, jgrg said +using BLAST and est2genome. Kerstin said that much more useful alignments +could be made using BLAT or exonerate or ??? In particular, other alignment +methods join up their matches where they are obviously "perfect" overall +alignments (e.g. to consecutive exons) and cope with alignments where they go +over clone boundaries. + +Mustapha is working on this.
+ + + +3) Next Meeting +=============== + +Will be at 2pm, 10th October 2007 + + +============================================================================== diff --git a/ZMAP_LACE_PROJECT/2007/zmap_lace.2007_10_25 b/ZMAP_LACE_PROJECT/2007/zmap_lace.2007_10_25 new file mode 100755 index 000000000..7ef6230f3 --- /dev/null +++ b/ZMAP_LACE_PROJECT/2007/zmap_lace.2007_10_25 @@ -0,0 +1,246 @@ +============================================================================== +ZMap/Otterlace Development + + +Date: Thursday 25th October 2007 + +Attendees: edgrif, jgrg, lw2, st3, jla1 + + +1) otterlace + zmap progress +============================ + + +High priority +------------- + + +0/ ZMap performance + +edgrif & rds still awaiting feedback. + +There is still a problem with the pipeline and pathological conditions +that produce huge numbers of essentially useless homologies. jgrg is +thinking about how to do this. + +edgrif investigating two possibilities for improving performance: + + - make sgifaceserver stream data rather than batch it up, would + save a lot of memory. + + - deferred loading, only load features when needed and load in + zone requested by user. + +edgrif is to follow up installing X performance tools on linux/mac so we can +measure performance and identify poorly performing/badly configured boxes. + + +1/ Bumping + +jla1 had reported performance problems with bumping, edgrif has implemented +"compress" option, we will wait for user feedback. + + +2/ New zmap styles ************* + +edgrif has now implemented this (with inheritance), jgrg needs +to decide when to do the big swop. jgrg said with the new schema +work now almost done he is in a position to do this. + +should be done this week. + + +3/ Display of otter information + +edgrif and rds have implemented code in zmap to deal with otter ids etc but lace +needs to export these so we can display them. jgrg said with the new schema +work now almost done he is in a position to do this.
+ +should be done this week. + +lw2 reported a bug in that the "transcript" section of the feature display is +always empty, edgrif to look at this. + +lw2 also asked about DE line information display, edgrif asked lw2 to collect +together requirements for information to be displayed. + + +4/ CDS translation + +rds is working on showing all possible translations etc in zmap window. +This is his current top priority. edgrif to ask where we are with this. + + + +5/ There was a discussion about how much a user should be able to +configure. It was agreed that they should be able to configure which columns +are initially hidden. They should also be able to "save" the current settings +to set this up and to be able to restore the system or their currently saved +defaults. + +edgrif is implementing this now as there are a couple of outstanding requests +for this. + +edgrif & rds still waiting for eah and lw2 to report back with +list of what users would like to be able to configure. + + + + +Medium priority +--------------- + + +1/ locator column: zmap needs a locator column as per fmap which could be +used to display dna and peptide search results and other information. + + +2/ There was a discussion about good ways to interact with +alignments/transcripts for deletion and other actions. Leo and James have +already worked out a lot of this with Leo's transcript viewer. edgrif will get +the rules from Leo and James and then we can discuss them and decide which to +implement. + + +3/ Annotation needs to have a history across assembly changes. + +kj2 requested that when an assembly changes the objects which get +transferred should have a history as to what they were (otter id) +previously. lw2 remarked that it's possible to search for the old id +in the lace interface, but impossible to retrieve the old object and +display it. jgrg acknowledged there is a bug in the lace searching, +but that providing the history would be difficult.
Further +discussion/thought on how to implement this is needed. + +jgrg said that this should be more possible with the new ensembl schema. + + +4/ lasso with left button down only + +edgrif explained that this is not simple because we use left button to +select objects, he is looking at a possible way to implement this. + +we discussed the option of having a flag to remove mouse usage, would be +good for training and also for experienced users to get them to use +the keyboard short cuts. edgrif to investigate. + + +5/ removing evidence already used + +annotators would like to be able to remove from display homologies that +have already been used to annotate variants etc. Does this need to be +persistent in the database in some way ?? edgrif & jgrg will get +together to arrange this via styles so it can persist in a natural way +in the database. + + +6/ Multiple alignments + +multiple alignments: edgrif is about a third of the way through implementing a +more general way of displaying arbitrary blocks. This will become a high +priority item as we move to haplotypes etc. + +th said this would be needed soon so it should be moved up the priority list. + + +7/ pfetch proxy + +jgrg has provided pseudo code, rds to implement. + + + +Low priority +------------ + + +0/ jgrg asked if edgrif could check what level of gtk acedb is built against. + + +1/ jgrg suggested that short cuts should be given on menus/mouse-over popups +to remind the user that they exist...edgrif to do this. + + +2/ alternative translations: edgrif about half way through code to do this. + +edgrif is doing this as part of the protein search code since this code +does translations itself. edgrif will talk to jgrg about how alternative +genetic codes can be specified with acedb. + +We need a test database for this. jgrg said this would come soon. + + + +A new canvas +------------ + + +rds has been looking at alternative canvas implementations which offer an MVC +model. 
He has managed to get goocanvas developers to fix some bugs and make +some changes to support our needs. + +the goocanvas MVC model will mean we do not have to copy data to split windows +meaning greatly reduced memory usage. + +the goocanvas will cope automatically with the X Windows window size limit, this +combined with changes in the gtk scrolling model means we will be able to do away +with having two scroll bars. + +We plan to introduce this in the Autumn when it has become part of gtk +as _the_ gtk canvas. + + +Changing the Assembly +--------------------- + +th made the good but frightening point that the world of annotation has +changed to one where the annotator may well be making changes to the +underlying assembly, he said that ensembl supports this and lace/zmap will +need to. The consequences feel alarming..... + +edgrif and rds to think about this: would it mean that zmap would have to +change the position of features on the fly as the user edits the +assembly...some tricky pathological cases here.... + +We will need smapping support in zmap to do this kind of thing. edgrif has +started work on an autoconf'd package to do this. + + +Builds +------ + +We now have a properly networked Mac, we need to include universal binaries. +jgrg, edgrif & rds to work on this. edgrif working on installing autoconf +stuff on the mac, needed to build our special foocanvas version. Once that +has been done we will harmonise with jgrg's script for building lace installs. + + +Blixem/pfetch +-------------- + +- blixem: dna searching is NOT DONE, edgrif to expedite. Also protein searches +will be added. + + + + +2) Other matters +================ + +kj2 raised a question about how alignments were done for otter, jgrg said +using BLAST and est2genome. Kerstin said that much more useful alignments +could be made using BLAT or exonerate or ??? In particular, other alignment +methods join up their matches where they are obviously "perfect" overall +alignments (e.g. 
to consecutive exons) and cope with alignments where they go +over clone boundaries. + +Mustapha is working on this. + + + +3) Next Meeting +=============== + +Will be at 2pm, 8th November 2007 + + +============================================================================== diff --git a/ZMAP_LACE_PROJECT/2007/zmap_lace.2007_11_08 b/ZMAP_LACE_PROJECT/2007/zmap_lace.2007_11_08 new file mode 100755 index 000000000..aad233fc1 --- /dev/null +++ b/ZMAP_LACE_PROJECT/2007/zmap_lace.2007_11_08 @@ -0,0 +1,306 @@ +============================================================================== +ZMap/Otterlace Development + + +Date: Thursday 8th November 2007 + +Attendees: edgrif, lw2, jla1, jgrg, eah, st3, te3 + + +1) otterlace + zmap progress +============================ + + +High priority +------------- + + +0/ ZMap performance + +edgrif & rds still awaiting feedback. + +There is still a problem with the pipeline and pathological conditions +that produce huge numbers of essentially useless homologies. jgrg is +thinking about how to do this. + +edgrif investigating two possibilities for improving performance: + + - make sgifaceserver stream data rather than batch it up, would + save a lot of memory. + + - deferred loading, only load features when needed and load in + zone requested by user. + +rds has been making changes to the way the foocanvas draws which will +make operations like reverse complement significantly faster. + +edgrif is to follow up installing X performance tools on linux/mac so we +can measure performance and identify poorly performing/badly configured boxes. In +particular this is going to be done on the new boxes. + + +1/ New zmap styles ************* + +edgrif has now implemented this (with inheritance), jgrg needs +to decide when to do the big swop. jgrg said with the new schema +work now almost done he is in a position to do this. + +should be done this week.
+ + +2/ Display of otter information + +edgrif and rds have implemented code in zmap to deal with otter ids etc but lace +needs to export these so we can display them. jgrg said with the new schema +work now almost done he is in a position to do this. + +should be done this week. + +lw2 reported a bug in that the "transcript" section of the feature display is +always empty, edgrif to look at this. + +lw2 also asked about DE line information display, edgrif asked lw2 to collect +together requirements for information to be displayed. jla1 to chase up. + +39329: PFAM info +Possible to show a description associated with Halfwise (Pfam) objects in Zmap? +Currently in lace we see the domain description as well as the pfam accession number +(EAH) jgrg will pass this information over to zmap, in particular Ensembl URLs will +be passed over. + +BLAST evidence info: +Would it be possible to see what organism a piece of evidence belongs to, without +having to pfetch each one individually? Also no info (field Title in Fmap) on EST/mRNA +hits unless you pfetch (RJK, AF2, MT4) We need species and taxon data but some of +this needs new database fields (jgrg to implement). + + +3/ CDS translation + +TOP PRIORITY rds is working on showing all possible translations etc in zmap window. +This is his current top priority. edgrif to ask where we are with this. + +10155 & 40988: Show CDS translation in Zmap + +40989: 3 frame translation on the Zmap +It would be nice when clicking on a transcript to have highlighted the correct frame +in the 3 frame translation that corresponds to a particular exon +- 3 frame translation did not work at all for forward or revcomp (RJK) + +rds has implemented fmap functionality but to go further requires styles. + + + +4/ There was a discussion about how much a user should be able to +configure. It was agreed that they should be able to configure which columns +are initially hidden. 
They should also be able to "save" the current settings +to set this up and to be able to restore the system or their currently saved +defaults. + +edgrif has implemented code for this but now needs eah and lw2 to report back with +list of what users would like to be able to configure. + +jla1 said that configuration is required more at the group or DB level, we can +already do this via lace setting up zmap's configuration files. + +lw2 suggested users would like to configure column order, he will put in a low +priority ticket for this. + + + + +Medium priority +--------------- + + +1/ locator column: zmap needs a locator column as per fmap which could be +used to display dna and peptide search results and other information. + + +2/ There was a discussion about good ways to interact with +alignments/transcripts for deletion and other actions. Leo and James have +already worked out a lot of this with Leo's transcript viewer. edgrif will get +the rules from Leo and James and then we can discuss them and decide which to +implement. + + +3/ Annotation needs to have a history across assembly changes. + +kj2 requested that when an assembly changes the objects which get +transferred should have a history as to what they were (otter id) +previously. lw2 remarked that it's possible to search for the old id +in the lace interface, but impossible to retrieve the old object and +display it. jgrg acknowledged there is a bug in the lace searching, +but that providing the history would be difficult. Further +discussion/thought on how to implement this is needed. + +jgrg said that this should be more possible with the new ensembl schema. + + +4/ clone overlap display + +lw2 said users need to see more clone overlap data, jgrg has provided a +suggestion for how this might be displayed. + +They would also like to be able to list features by clone rather than just +by hit name or whatever. + +They would also like to see the clone name in lists of hsps.... 
+ + + +5/ removing evidence already used + +TOP PRIORITY + +annotators would like to be able to remove from display homologies that +have already been used to annotate variants etc. Does this need to be +persistent in the database in some way ?? edgrif & jgrg will get +together to arrange this via styles so it can persist in a natural way +in the database. + +**24526: Showing which evidence has been used +Differential coloring of matches that have been used already as evidence +for a transcript + + +6/ Multiple alignments + +multiple alignments: edgrif is about a third of the way through implementing a +more general way of displaying arbitrary blocks. This will become a high +priority item as we move to haplotypes etc. + +th said this would be needed soon so it should be moved up the priority list. + + +7/ pfetch proxy + +jgrg has provided pseudo code, rds to implement. + + +8/ Interface issues: + + +extending marked region: + +jla1, eah and lw2 said users would like some way of interactively extending the +marked area. edgrif to look at this perhaps using mouse action over the marked area. + +jla1 and lw2 said they would like the marked area to be less obvious and also to +be a "greying" out rather than blue. edgrif to implement. + + +Zooming set up: +It would be very useful for large genes if the evidence, ensembl objects etc. did not +disappear when you zoom out. This happens faster in Zmap than in Fmap, but the havana +objects do not disappear in either case. (MMS) + +edgrif explained that we need the new styles to fix this. + + + +Would it be possible to have an undo or back (like on a web browser) button? (AF2) + +rds is implementing this. + + + + +Low priority +------------ + +0/ jgrg asked if edgrif could check what level of gtk acedb is built against. + + +1/ jgrg suggested that short cuts should be given on menus/mouse-over popups +to remind the user that they exist...edgrif to do this.
+ + +2/ alternative translations: edgrif about half way through code to do this. + +edgrif is doing this as part of the protein search code since this code +does translations itself. edgrif will talk to jgrg about how alternative +genetic codes can be specified with acedb. + +We need a test database for this. jgrg said this would come soon. + + +3/ loutre schema and data representation changes + +We need some way to make sure st3 knows about changes anacode are making. +jgrg and st3 to communicate more over this. + + +A new canvas +------------ + +rds has been looking at alternative canvas implementations which offer an MVC +model. He has managed to get goocanvas developers to fix some bugs and make +some changes to support our needs. + +the goocanvas MVC model will mean we do not have to copy data to split windows +meaning greatly reduced memory usage. + +the goocanvas will cope automatically with the X Windows window size limit, this +combined with changes in the gtk scrolling model means we will be able to do away +with having two scroll bars. + +We plan to introduce this in the Autumn when it has become part of gtk +as _the_ gtk canvas. + + + +Changing the Assembly +--------------------- + +th made the good but frightening point that the world of annotation has +changed to one where the annotator may well be making changes to the +underlying assembly, he said that ensembl supports this and lace/zmap will +need to. The consequences feel alarming..... + +edgrif and rds to think about this: would it mean that zmap would have to +change the position of features on the fly as the user edits the +assembly...some tricky pathological cases here.... + +We will need smapping support in zmap to do this kind of thing. edgrif has +started work on an autoconf'd package to do this. + + +Builds +------ + +We now have a properly networked Mac, we need to include universal binaries. +jgrg, edgrif & rds to work on this. 
edgrif working on installing autoconf +stuff on the mac, needed to build our special foocanvas version. Once that +has been done we will harmonise with jgrg's script for building lace installs. + + +Blixem/pfetch +-------------- + +- blixem: dna searching is NOT DONE, edgrif to expedite. Also protein searches +will be added. + + + +2) Other matters +================ + +kj2 raised a question about how alignments were done for otter, jgrg said +using BLAST and est2genome. Kerstin said that much more useful alignments +could be made using BLAT or exonerate or ??? In particular, other alignment +methods join up their matches where they are obviously "perfect" overall +alignments (e.g. to consecutive exons) and cope with alignments where they go +over clone boundaries. + +Mustapha is working on this. + + + +3) Next Meeting +=============== + +Will be at 2pm, 22nd November 2007 + + +============================================================================== diff --git a/ZMAP_LACE_PROJECT/2007/zmap_lace.2007_11_22 b/ZMAP_LACE_PROJECT/2007/zmap_lace.2007_11_22 new file mode 100755 index 000000000..7698f587f --- /dev/null +++ b/ZMAP_LACE_PROJECT/2007/zmap_lace.2007_11_22 @@ -0,0 +1,283 @@ +============================================================================== +ZMap/Otterlace Development + + +Date: Thursday 22nd November 2007 + +Attendees: edgrif, eah + + +1) otterlace + zmap progress +============================ + + +High priority +------------- + + +0/ ZMap performance - contd..... + +There is still a problem with the pipeline and pathological conditions +that produce huge numbers of essentially useless homologies. jgrg is +thinking about how to do this. + +edgrif investigating two possibilities for improving performance: + + - make sgifaceserver stream data rather than batch it up, would + save a lot of memory. + + - deferred loading, only load features when needed and load in + zone requested by user....design done...now need to implement. 
+ +Have found X performance tools, very comprehensive. Plan is to run them +on my machine as a benchmark and compare. Will produce noddy script that +records the machine's configuration, runs the X performance tool and produces +a comparison with the benchmark. This way we can profile anyone's machine +who is having problems with performance. + + + +1/ New zmap styles ************* + +edgrif has now implemented this (with inheritance), jgrg needs +to decide when to do the big swop. jgrg said with the new schema +work now almost done he is in a position to do this. + +should be done this week. + + +2/ Display of otter information + +Otter ids are now displayed. edgrif is reworking the feature display to match +requirements from Mark who coordinated Havana input on this. ZMap code should be +done this week and should include: + +lw2 also asked about DE line information display, edgrif asked lw2 to collect +together requirements for information to be displayed. jla1 to chase up. + +39329: PFAM info +Possible to show a description associated with Halfwise (Pfam) objects in Zmap? +Currently in lace we see the domain description as well as the pfam accession number +(EAH) jgrg will pass this information over to zmap, in particular Ensembl URLs will +be passed over. + +BLAST evidence info: +Would it be possible to see what organism a piece of evidence belongs to, without +having to pfetch each one individually? Also no info (field Title in Fmap) on EST/mRNA +hits unless you pfetch (RJK, AF2, MT4) We need species and taxon data but some of +this needs new database fields (jgrg to implement). + + +3/ CDS translation + +Will be in next build which is early next week. + + + +4/ There was a discussion about how much a user should be able to +configure. It was agreed that they should be able to configure which columns +are initially hidden. 
They should also be able to "save" the current settings +to set this up and to be able to restore the system or their currently saved +defaults. + +edgrif has implemented code for this but now needs eah and lw2 to report back with +list of what users would like to be able to configure. + +jla1 said that configuration is required more at the group or DB level, we can +already do this via lace setting up zmap's configuration files. + + + +Medium priority +--------------- + + +1/ locator column: zmap needs a locator column as per fmap which could be +used to display dna and peptide search results and other information. + + +2/ There was a discussion about good ways to interact with +alignments/transcripts for deletion and other actions. Leo and James have +already worked out a lot of this with Leo's transcript viewer. edgrif will get +the rules from Leo and James and then we can discuss them and decide which to +implement. + + +3/ Annotation needs to have a history across assembly changes. + +kj2 requested that when an assembly changes the objects which get +transferred should have a history as to what they were (otter id) +previously. lw2 remarked that it's possible to search for the old id +in the lace interface, but impossible to retrieve the old object and +display it. jgrg acknowledged there is a bug in the lace searching, +but that providing the history would be difficult. Further +discussion/thought on how to implement this is needed. + +jgrg said that this should be more possible with the new ensembl schema. + + +4/ clone overlap display + +lw2 said users need to see more clone overlap data, jgrg has provided a +suggestion for how this might be displayed. + +They would also like to be able to list features by clone rather than just +by hit name or whatever. + +They would also like to see the clone name in lists of hsps.... 
+ + + +5/ removing evidence already used + +TOP PRIORITY + +annotators would like to be able to remove from display homologies that +have already been used to annotate variants etc. Does this need to be +persistent in the database in some way ?? edgrif & jgrg will get +together to arrange this via styles so it can persist in a natural way +in the database. + +**24526: Showing which evidence has been used +Differential coloring of matches that have been used already as evidence +for a transcript + + +6/ Multiple alignments + +multiple alignments: edgrif is about a third of the way through implementing a +more general way of displaying arbitrary blocks. This will become a high +priority item as we move to haplotypes etc. + +th said this would be needed soon so it should be moved up the priority list. + + +7/ pfetch proxy + +jgrg has provided pseudo code, rds to implement. + + +8/ Interface issues: + + +extending marked region: + +jla1, eah and lw2 said users would like some way of interactively extending the +marked area. edgrif to look at this perhaps using mouse action over the marked area. + +jla1 and lw2 said they would like the marked area to be less obvious and also to +be a "greying" out rather than blue. edgrif to implement. + + +Zooming set up: +It would be very useful for large genes if the evidence, ensembl objects etc. did not +disappear when you zoom out. This happens faster in Zmap than in Fmap, but the havana +objects do not disappear in either case. (MMS) + +edgrif explained that we need the new styles to fix this. + + + +Would it be possible to have an undo or back (like on a web browser) button? (AF2) + +rds is implementing this. + + + + +Low priority +------------ + + +1/ jgrg suggested that short cuts should be given on menus/mouse-over popups +to remind the user that they exist...edgrif to do this. + + +2/ alternative translations: edgrif about half way through code to do this.
+ +edgrif is doing this as part of the protein search code since this code +does translations itself. edgrif will talk to jgrg about how alternative +genetic codes can be specified with acedb. + +We need a test database for this. jgrg said this would come soon. + + +3/ loutre schema and data representation changes + +We need some way to make sure st3 knows about changes anacode are making. +jgrg and st3 to communicate more over this. + + +A new canvas +------------ + +rds has been looking at alternative canvas implementations which offer an MVC +model. He has managed to get goocanvas developers to fix some bugs and make +some changes to support our needs. + +the goocanvas MVC model will mean we do not have to copy data to split windows +meaning greatly reduced memory usage. + +the goocanvas will cope automatically with the X Windows window size limit, this +combined with changes in the gtk scrolling model means we will be able to do away +with having two scroll bars. + +We plan to introduce this in the Autumn when it has become part of gtk +as _the_ gtk canvas. + + + +Changing the Assembly +--------------------- + +th made the good but frightening point that the world of annotation has +changed to one where the annotator may well be making changes to the +underlying assembly, he said that ensembl supports this and lace/zmap will +need to. The consequences feel alarming..... + +edgrif and rds to think about this: would it mean that zmap would have to +change the position of features on the fly as the user edits the +assembly...some tricky pathological cases here.... + +We will need smapping support in zmap to do this kind of thing. edgrif has +started work on an autoconf'd package to do this. + + +Builds +------ + +rds has done the universal binary builds. So we can provide hand builds of them +but still need to incorporate the system into our overnight build procedure. +A labour of love not helped by fairly poor docs from Apple. But it is now +working. 
+ + +Blixem/pfetch +-------------- + +- blixem: dna searching is NOT DONE, edgrif to expedite. Also protein searches +will be added. + + + +2) Other matters +================ + +kj2 raised a question about how alignments were done for otter, jgrg said +using BLAST and est2genome. Kerstin said that much more useful alignments +could be made using BLAT or exonerate or ??? In particular, other alignment +methods join up their matches where they are obviously "perfect" overall +alignments (e.g. to consecutive exons) and cope with alignments where they go +over clone boundaries. + +Mustapha is working on this. + + + +3) Next Meeting +=============== + +Will be at 2pm, 22nd November 2007 + + +============================================================================== diff --git a/ZMAP_LACE_PROJECT/2007/zmap_lace.2007_12_06 b/ZMAP_LACE_PROJECT/2007/zmap_lace.2007_12_06 new file mode 100755 index 000000000..732079499 --- /dev/null +++ b/ZMAP_LACE_PROJECT/2007/zmap_lace.2007_12_06 @@ -0,0 +1,324 @@ +============================================================================== +ZMap/Otterlace Development + + +Date: Thursday 6th December 2007 + +Attendees: edgrif, eah, jgrg, br2, st3, jla1, lw2 + + +1) otterlace + zmap progress +============================ + + +High priority +------------- + + +0/ ZMap performance - contd..... + +There is still a problem with the pipeline and pathological conditions +that produce huge numbers of essentially useless homologies. jgrg is +thinking about how to do this. + +edgrif investigating two possibilities for improving performance: + + - make sgifaceserver stream data rather than batch it up, would + save a lot of memory. + + - deferred loading, only load features when needed and load in + zone requested by user....design done...now need to implement. + +Have found X performance tools, very comprehensive. Plan is to run them +on my machine as a benchmark and compare.
Will produce noddy script that +records the machine's configuration, runs the X performance tool and produces +a comparison with the benchmark. This way we can profile anyone's machine +who is having problems with performance. + + + +1/ New zmap styles ************* + +edgrif has now implemented this (with inheritance), jgrg needs +to decide when to do the big swop. jgrg said with the new schema +work now almost done he is in a position to do this. + +should be done this week. edgrif to mail jgrg with info. about how +to turn styles "on" in zmap. + + +2/ Display of otter information + +Otter ids are now displayed. edgrif is reworking the feature display to match +requirements from Mark who coordinated Havana input on this. ZMap code should be +done this week and should include: + +lw2 also asked about DE line information display, edgrif asked lw2 to collect +together requirements for information to be displayed. jla1 to chase up. + +39329: PFAM info +Possible to show a description associated with Halfwise (Pfam) objects in Zmap? +Currently in lace we see the domain description as well as the pfam accession number +(EAH) jgrg will pass this information over to zmap, in particular Ensembl URLs will +be passed over. + +BLAST evidence info: +Would it be possible to see what organism a piece of evidence belongs to, without +having to pfetch each one individually? Also no info (field Title in Fmap) on EST/mRNA +hits unless you pfetch (RJK, AF2, MT4) We need species and taxon data but some of +this needs new database fields (jgrg to implement). + + +3/ CDS translation + +Will be in next build which is early next week. + + + +4/ There was a discussion about how much a user should be able to +configure. It was agreed that they should be able to configure which columns +are initially hidden. They should also be able to "save" the current settings +to set this up and to be able to restore the system or their currently saved +defaults. 
+ +edgrif has implemented code for this but now needs eah and lw2 to report back with +list of what users would like to be able to configure. + +jla1 said that configuration is required more at the group or DB level, we can +already do this via lace setting up zmap's configuration files. + + +5/ parsing names with embedded spaces... + +jgrg said there is a problem with zmap parsing gff where there are feature names +with embedded spaces, edgrif to fix this asap. Also, the text following the object +name needs to be displayed in the info. line when the feature is selected. + + +6/ Consistency Test using ZMap + +jla1 said there will be a test in February using zmap, not xace. edgrif said he +and Roy will make sure there is a stable version for then. + + + +Medium priority +--------------- + +0/ 5'and 3' EST read pairs + +we need these to be marked in zmap as in acedb, requires new tags in database in +the same way as in worm database. + + +1/ locator column: zmap needs a locator column as per fmap which could be +used to display dna and peptide search results and other information. + + +3/ Annotation needs to have a history across assembly changes. + +kj2 requested that when an assembly changes the objects which get +transferred should have a history as to what they were (otter id) +previously. lw2 remarked that it's possible to search for the old id +in the lace interface, but impossible to retrieve the old object and +display it. jgrg acknowledged there is a bug in the lace searching, +but that providing the history would be difficult. Further +discussion/thought on how to implement this is needed. + +jgrg said that this should be more possible with the new ensembl schema. + + +4/ clone overlap display + +lw2 said users need to see more clone overlap data, jgrg has provided a +suggestion for how this might be displayed. + +They would also like to be able to list features by clone rather than just +by hit name or whatever. 
+ +They would also like to see the clone name in lists of hsps.... + + + +5/ removing evidence already used + +TOP PRIORITY + +annotators would like to be able to remove from display homologies that +have already been used to annotate variants etc. Does this need to be +persistent in the database in some way ?? edgrif & jgrg will get +together to arrange this via styles so it can persist in a natural way +in the database. + +**24526: Showing which evidence has been used +Differential coloring of matches that have been used already as evidence +for a transcript + + +6/ Multiple alignments + +multiple alignments: edgrif is about a third of the way through implementing a +more general way of displaying arbitrary blocks. This will become a high +priority item as we move to haplotypes etc. + +th said this would be needed soon so it should be moved up the priority list. + + +7/ pfetch proxy + +jgrg has provided pseudo code, rds to implement. + + +8/ Interface issues: + + +extending marked region: + +jla1, eah and lw2 said users would like some way of interactively extending the +marked area. edgrif to look at this perhaps using mouse action over the marked area. + +jla1 and lw2 said they would like the marked area to be less obvious and also to +be a "greying" out rather than blue. edgrif to implement. + +jla1 said she would like to be able to click on an exon and see evidence (and +transcripts ?) with the same splice be highlighted. + + +Zooming set up: +It would be very useful for large genes if the evidence, ensembl objects etc. did not +disappear when you zoom out. This happens faster in Zmap than in Fmap, but the havana +objects do not disappear in either case. (MMS) + +edgrif explained that we need the new styles to fix this. + + + +Would it be possible to have an undo or back (like on a web browser) button? (AF2) + +rds is implementing this. 
+ + +9/ Future stuff + +edgrif said he would like annotators to start thinking/reporting two things: + +- repetitive tasks that could be automated. + +- new ways to highlight/select data to help build/annotate transcripts. + + + + +Low priority +------------ + + +1/ jgrg suggested that short cuts should be given on menus/mouse-over popups +to remind the user that they exist...edgrif to do this. + + +2/ alternative translations: edgrif about half way through code to do this. + +edgrif is doing this as part of the protein search code since this code +does translations itself. edgrif will talk to jgrg about how alternative +genetic codes can be specified with acedb. + +We need a test database for this. jgrg said this would come soon. + + +3/ loutre schema and data representation changes + +We need some way to make sure st3 knows about changes anacode are making. +jgrg and st3 to communicate more over this. + + + +2) Back To The Future +===================== + + +Leo's transcript display +------------------------ + +There was a discussion about good ways to interact with +alignments/transcripts for deletion and other actions. Leo and James have +already worked out a lot of this with Leo's transcript viewer. edgrif will get +the rules from Leo and James and then we can discuss them and decide which to +implement. edgrif suggested that it might be good to have a separate window +but jgrg said he thought it could be done within existing zmap by adding +some new features. + + +A new canvas +------------ + +rds has been looking at alternative canvas implementations which offer an MVC +model. He has managed to get goocanvas developers to fix some bugs and make +some changes to support our needs. + +the goocanvas MVC model will mean we do not have to copy data to split windows +meaning greatly reduced memory usage.
+ +the goocanvas will cope automatically with the X Windows window size limit, this +combined with changes in the gtk scrolling model means we will be able to do away +with having two scroll bars. + +We plan to introduce this in the Autumn when it has become part of gtk +as _the_ gtk canvas. + + + +Changing the Assembly +--------------------- + +th made the good but frightening point that the world of annotation has +changed to one where the annotator may well be making changes to the +underlying assembly, he said that ensembl supports this and lace/zmap will +need to. The consequences feel alarming..... + +edgrif and rds to think about this: would it mean that zmap would have to +change the position of features on the fly as the user edits the +assembly...some tricky pathological cases here.... + +We will need smapping support in zmap to do this kind of thing. edgrif has +started work on an autoconf'd package to do this. + + +Builds +------ + +rds has done the universal binary builds. So we can provide hand builds of them +but still need to incorporate the system into our overnight build procedure. +A labour of love not helped by fairly poor docs from Apple. But it is now +working. + + +Blixem/pfetch +-------------- + +- blixem: dna searching is NOT DONE, edgrif to expedite. Also protein searches +will be added. + + + +2) Other matters +================ + +kj2 raised a question about how alignments were done for otter, jgrg said +using BLAST and est2genome. Kerstin said that much more useful alignments +could be made using BLAT or exonerate or ??? In particular, other alignment +methods join up their matches where they are obviously "perfect" overall +alignments (e.g. to consecutive exons) and cope with alignments where they go +over clone boundaries. + +Mustapha is working on this. 
+ + + +3) Next Meeting +=============== + +Will be at 2pm, 3rd January 2008 + + +============================================================================== diff --git a/ZMAP_LACE_PROJECT/2008/zmap_lace.2008_01_10 b/ZMAP_LACE_PROJECT/2008/zmap_lace.2008_01_10 new file mode 100755 index 000000000..070d51981 --- /dev/null +++ b/ZMAP_LACE_PROJECT/2008/zmap_lace.2008_01_10 @@ -0,0 +1,389 @@ +============================================================================== +ZMap/Otterlace Development + + +Date: Thursday 10th January 2008 + +Attendees: rds, eah, jgrg, kj2, st3, jla1, lw2 + + +1) otterlace + zmap progress +============================ + + +High priority +------------- + + +0/ ZMap performance - contd..... + +Although performance is still proving to be an issue in some cases, it +is felt that incorporating 'missing' features is a higher priority. +The plan is to remove xace from the otterlace environment as soon as +possible, so features that are absolutely essential to the annotators +working should be addressed before the issue of performance. + +There is still a problem with the pipeline and pathological conditions +that produce huge numbers of essentially useless homologies. jgrg is +thinking about how to do this. + +edgrif investigating two possibilities for improving performance: + + - make sgifaceserver stream data rather than batch it up, would + save a lot of memory. + + - deferred loading, only load features when needed and load in + zone requested by user....design done...now need to implement. + +Have found X performance tools, very comprehensive. Plan is to run them +on my machine as a benchmark and compare. Will produce noddy script that +records the machine's configuration, runs the X performance tool and produces +a comparison with the benchmark. This way we can profile the machine of anyone +who is having problems with performance. 
+ + + +1/ New zmap styles ************* + +edgrif has now implemented this (with inheritance), jgrg needs +to decide when to do the big swop. jgrg said with the new schema +work now almost done he is in a position to do this. + +The lack of zmap styles is holding up development and implementation +of certain zmap features. This makes this a high priority item. + +Should be done this week. edgrif & jgrg to discuss how to turn styles +"on" in zmap. + +2/ Display of otter information + +In particular the Feature Details dialog/replicating some of treeview +(ticket #49520). Mark provided a very good mock up of the way the +dialog should look which edgrif is working to. + +Otter ids are now displayed. edgrif is reworking the feature display to match +requirements from Mark who coordinated Havana input on this. ZMap code should be +done this week and should include: + +lw2 also asked about DE line information display, edgrif asked lw2 to collect +together requirements for information to be displayed. jla1 to chase up. + +39329: PFAM info +Possible to show a description associated with Halfwise (Pfam) objects in Zmap? +Currently in lace we see the domain description as well as the pfam accession number +(EAH) jgrg will pass this information over to zmap, in particular Ensembl URLs will +be passed over. + +BLAST evidence info: +Would it be possible to see what organism a piece of evidence belongs to, without +having to pfetch each one individually? Also no info (field Title in Fmap) on EST/mRNA +hits unless you pfetch (RJK, AF2, MT4) We need species and taxon data but some of +this needs new database fields (jgrg to implement). + +st3 raised a user request to add the author field into this display. + +3/ Clone summary info/Automating DE line creation. + +eah raised the point that certain summary information for clones is +not available when using zmap. e.g. the number of CpG islands. These +are used for the authoring of DE lines. 
There is a script for +automating this which kj2 wrote for zebrafish, but it's specific to +zebrafish and requires running on the command line. jgrg said this +could be integrated into the clone editing window in lace. jgrg to +action. + + +4/ CDS translation + +From Jane + +Showing the peptide translation of an object next to the object in +zmap with the residue numbers next to the exon boundaries is still not +working. It took me about 5 times as long to check my object without +this functionality. + +Will be in next build which is early next week. Actually it's not +quite finished as it's currently quite unstable. + +rds is working on this. the code will depend on zmap styles to work. + + +5/ parsing names with embedded spaces... + +This is causing a problem saving EUCOMM Exons. + +jgrg said there is a problem with zmap parsing gff where there are feature names +with embedded spaces, edgrif to fix this asap. Also, the text following the object +name needs to be displayed in the info. line when the feature is selected. + +6/ Consistency Test using ZMap + +jla1 said there will be a test in February using zmap, not xace. edgrif said he +and Roy will make sure there is a stable version for then. + +jla1 said this would be only possible when points 1, 2 & 4 have been actioned. + + + +Medium priority +--------------- + +0/ 5'and 3' EST read pairs + +we need these to be marked in zmap as in acedb, requires new tags in database in +the same way as in worm database. + + +1/ locator column: zmap needs a locator column as per fmap which could be +used to display dna and peptide search results and other information. + + +2/ Annotation needs to have a history across assembly changes. + +kj2 requested that when an assembly changes the objects which get +transferred should have a history as to what they were (otter id) +previously. lw2 remarked that it's possible to search for the old id +in the lace interface, but impossible to retrieve the old object and +display it. 
jgrg acknowledged there is a bug in the lace searching, +but that providing the history would be difficult. Further +discussion/thought on how to implement this is needed. + +jgrg said that this should be more possible with the new ensembl schema. + + +3/ clone overlap display + +lw2 said users need to see more clone overlap data, jgrg has provided a +suggestion for how this might be displayed. + +They would also like to be able to list features by clone rather than just +by hit name or whatever. + +They would also like to see the clone name in lists of hsps.... + + + +4/ removing evidence already used + +*** TOP PRIORITY *** + +annotators would like to be able to remove from display homologies that +have already been used to annotate variants etc. Does this need to be +persistent in the database in some way ?? edgrif & jgrg will get +together to arrange this via styles so it can persist in a natural way +in the database. + +**24526: Showing which evidence has been used +Differential coloring of matches that have been used already as evidence +for a transcript + + +5/ Multiple alignments + +multiple alignments: edgrif is about a third of the way through implementing a +more general way of displaying arbitrary blocks. This will become a high +priority item as we move to haplotypes etc. + +th said this would be needed soon so it should be moved up the priority list. + + +6/ pfetch proxy + +jgrg has provided pseudo code, rds to implement. + + +7/ Interface issues: + + +extending marked region: + +jla1, eah and lw2 said users would like some way of interactively +extending the marked area. edgrif to look at this perhaps using mouse +action over the marked area. eah also requested that this should +alter the bump column so that evidence that has been hidden, as it did +not overlap, gets shown. + + +jla1 and lw2 said they would like the marked area to be less obvious and also to +be a "greying" out rather than blue. edgrif to implement. 
+ +jla1 said she would like to be able to click on an exon and see evidence (and +transcripts ?) with the same splice be highlighted. + + +Zooming set up: +It would be very useful for large genes if the evidence, ensembl objects etc. did not +disappear when you zoom out. This happens faster in Zmap than in Fmap, but the havana +objects do not disappear in either case. (MMS) + +edgrif explained that we need the new styles to fix this. + + +lw2 said the lasso is still too sensitive. edgrif +has decreased this so that the lasso must be dragged >20 pixels. This +can be changed, but we need to have a consensus on this. + +Would it be possible to have an undo or back (like on a web browser) button? (AF2) + +rds said that this has been provided to enable stepping back 1 event, +which can either be a mark, unmark, or zoom operation. + + +8/ There was a discussion about how much a user should be able to +configure. It was agreed that they should be able to configure which columns +are initially hidden. They should also be able to "save" the current settings +to set this up and to be able to restore the system or their currently saved +defaults. + +edgrif has implemented code for this but now needs eah and lw2 to report back with +list of what users would like to be able to configure. + +jla1 said that configuration is required more at the group or DB level, we can +already do this via lace setting up zmap's configuration files. + + + +9/ Future stuff + +edgrif said he would like annotators to start thinking/reporting two things: + +- repetitive tasks that could be automated. + +- new ways to highlight/select data to help build/annotate transcripts. + + + + +Low priority +------------ + + +1/ jgrg suggested that short cuts should be given on menus/mouse-over popups +to remind the user that they exist...edgrif to do this. + + +2/ alternative translations: edgrif about half way through code to do this. 
+ +edgrif is doing this as part of the protein search code since this code +does translations itself. edgrif will talk to jgrg about how alternative +genetic codes can be specified with acedb. + +We need a test database for this. jgrg said this would come soon. + + +3/ loutre schema and data representation changes + +We need some way to make sure st3 knows about changes anacode are making. +jgrg and st3 to communicate more over this. + + + +2) Back To The Future +===================== + + +Leo's transcript display +------------------------ + +There was a discussion about good ways to interact with +alignments/transcripts for deletion and other actions. Leo and James have +already worked out a lot of this with Leo's transcript viewer. edgrif will get +the rules from Leo and James and then we can discuss them and decide which to +implement. edgrif suggested that it might be good to have a separate window +but jgrg said he thought it could be done within existing zmap by adding +some new features. + + +A new canvas +------------ + +rds has been looking at alternative canvas implementations which offer an MVC +model. He has managed to get goocanvas developers to fix some bugs and make +some changes to support our needs. + +the goocanvas MVC model will mean we do not have to copy data to split windows +meaning greatly reduced memory usage. + +the goocanvas will cope automatically with the X Windows window size limit, this +combined with changes in the gtk scrolling model means we will be able to do away +with having two scroll bars. + +We plan to introduce this in the Autumn when it has become part of gtk +as _the_ gtk canvas. + + + +Changing the Assembly +--------------------- + +th made the good but frightening point that the world of annotation has +changed to one where the annotator may well be making changes to the +underlying assembly, he said that ensembl supports this and lace/zmap will +need to. The consequences feel alarming..... 
+ +edgrif and rds to think about this: would it mean that zmap would have to +change the position of features on the fly as the user edits the +assembly...some tricky pathological cases here.... + +We will need smapping support in zmap to do this kind of thing. edgrif has +started work on an autoconf'd package to do this. + + +Builds +------ + +rds has done the universal binary builds. So we can provide hand builds of them +but still need to incorporate the system into our overnight build procedure. +A labour of love not helped by fairly poor docs from Apple. But it is now +working. + +update has been incorporated into overnight builds, but we still have +library clashes with those in the current otterlace (xace only) +distribution. James and I are working to resolve these. + +jgrg working on mac distribution to incorporate zmap. + + +Blixem/pfetch +-------------- + +- blixem: dna searching is NOT DONE, edgrif to expedite. Also protein searches +will be added. + + +Turning off xace +---------------- + +The plan is to turn off xace in the otterlace client as soon as +possible. This requires actioning of 1, 2 and 4. Inevitably there +will be a need to either quickly switch back or have a version which +still incorporates xace. jgrg to think how to do this. + +How many external users is this going to affect? Do they need training? + + +2) Other matters +================ + +kj2 raised a question about how alignments were done for otter, jgrg said +using BLAST and est2genome. Kerstin said that much more useful alignments +could be made using BLAT or exonerate or ??? In particular, other alignment +methods join up their matches where they are obviously "perfect" overall +alignments (e.g. to consecutive exons) and cope with alignments where they go +over clone boundaries. + +Mustapha is working on this. + + +- jgrg, st3 & jla1 discussed the upcoming Vega (mouse) release + including which schema version and assembly version it should be + built with. 
jla1 asked about updating to assembly NCBI37. + + +3) Next Meeting +=============== + + + + +============================================================================== diff --git a/ZMAP_LACE_PROJECT/2008/zmap_lace.2008_02_07 b/ZMAP_LACE_PROJECT/2008/zmap_lace.2008_02_07 new file mode 100755 index 000000000..2689dfeea --- /dev/null +++ b/ZMAP_LACE_PROJECT/2008/zmap_lace.2008_02_07 @@ -0,0 +1,469 @@ +============================================================================== +ZMap/Otterlace Development + + +Date: Thursday 7th February 2008 + +Attendees: jgrg, kj2, st3, jla1, lw2, edgrif + + +1) otterlace + zmap progress +============================ + + +High priority +------------- + + +0/ ZMap performance - contd..... + +edgrif investigating two possibilities for improving performance: + + - make sgifaceserver stream data rather than batch it up, would + save a lot of memory. + + - deferred loading, only load features when needed and load in + zone requested by user....design done...now need to implement. + +We have a performance script that we can now use to measure X Windows as +compared to edgrif's machine. The script records the salient parts of a +machine's configuration as well. + +Need to check script works on Mac machines. edgrif to follow this up. + +edgrif has also been in contact with Tim Nickerson who says the performance +problem with the new machines is not just the display drivers but something +more fundamental. Hopefully fixed with a new linux build due out shortly. +jla1 pointed out that Adam Frankish's machine had been fine so perhaps systems +should look at that installation. edgrif to follow this up. + + + +1/ Consistency Test using ZMap + +A number of items are required for the annotation test, they are identified with +"*************" and must be in place _prior_ to the start of the test. + +jla1 said there will be a test in February using zmap, not xace. edgrif said he +and Roy will make sure there is a stable version for then. 
+ +jla1 said this would be only possible when points 1, 2 & 4 have been actioned. + + + +2/ New zmap styles ************* + +jgrg working on this with edgrif, should be finished this week. Then we can start +to take advantage of the styles and also fix a number of outstanding problems. + + +3/ Display of otter information ************* + +In particular the Feature Details dialog/replicating some of treeview +(ticket #49520). Mark provided a very good mock up of the way the +dialog should look which edgrif is working to. + +Otter ids are now displayed. edgrif is reworking the feature display to match +requirements from Mark who coordinated Havana input on this. ZMap code should be +done this week and should include: + +lw2 also asked about DE line information display, edgrif asked lw2 to collect +together requirements for information to be displayed. jla1 to chase up. + +39329: PFAM info +Possible to show a description associated with Halfwise (Pfam) objects in Zmap? +Currently in lace we see the domain description as well as the pfam accession number +(EAH) jgrg will pass this information over to zmap, in particular Ensembl URLs will +be passed over. + +BLAST evidence info: +Would it be possible to see what organism a piece of evidence belongs to, without +having to pfetch each one individually? Also no info (field Title in Fmap) on EST/mRNA +hits unless you pfetch (RJK, AF2, MT4) We need species and taxon data but some of +this needs new database fields (jgrg to implement). + +st3 raised a user request to add the author field into this display. + + +edgrif and jgrg to check that what data is displayed and also _where_ data is +displayed (e.g. in zmap or lace) provides a consistent/useful display. + + + +4/ Clone summary info/Automating DE line creation. + +eah raised the point that certain summary information for clones is +not available when using zmap. e.g. the number of CpG islands. These +are used for the authoring of DE lines. 
There is a script for +automating this which kj2 wrote for zebrafish, but it's specific to +zebrafish and requires running on the command line. jgrg said this +could be integrated into the clone editing window in lace. jgrg to +action. + + + + +5/ CDS translation ************* + +From Jane + +Showing the peptide translation of an object next to the object in +zmap with the residue numbers next to the exon boundaries is still not +working. It took me about 5 times as long to check my object without +this functionality. + +Will be in next build which is early next week. Actually it's not +quite finished as it's currently quite unstable. + +rds has nearly finished this and it will be in the build for the test +set up. + + + +6/ parsing names with embedded spaces... ************* + +This is causing a problem saving EUCOMM Exons. + +jgrg said there is a problem with zmap parsing gff where there are feature names +with embedded spaces, edgrif to fix this asap. Also, the text following the object +name needs to be displayed in the info. line when the feature is selected. + + + +7/ removing evidence already used ************* + +annotators would like to be able to remove from display homologies that +have already been used to annotate variants etc. Does this need to be +persistent in the database in some way ?? edgrif & jgrg will get +together to arrange this via styles so it can persist in a natural way +in the database. + +**24526: Showing which evidence has been used +Differential coloring of matches that have been used already as evidence +for a transcript + +mainly requires jgrg to mark features and then tell zmap to move the features +to a new column or repaint them with a new style. + + + + + +Medium priority +--------------- + + +0/ alternative translations: edgrif about half way through code to do this. + +edgrif is doing this as part of the protein search code since this code +does translations itself. 
edgrif will talk to jgrg about how alternative +genetic codes can be specified with acedb. + +We need a test database for this. jgrg said this would come soon. + + +1/ locator column: zmap needs a locator column as per fmap which could be +used to display dna and peptide search results and other information. + + +2/ Annotation needs to have a history across assembly changes. + +kj2 requested that when an assembly changes the objects which get +transferred should have a history as to what they were (otter id) +previously. lw2 remarked that it's possible to search for the old id +in the lace interface, but impossible to retrieve the old object and +display it. jgrg acknowledged there is a bug in the lace searching, +but that providing the history would be difficult. Further +discussion/thought on how to implement this is needed. + +jgrg said that this should be more possible with the new ensembl schema. + + +3/ clone overlap display + +Discussions with Mindi showed that main requirement is for annotators to be +able to see what features are in the overlap region of the section of clone +_not_ mapped. This can be done currently using the ZMap -> File -> New Sequence +and specifying the clone for the new sequence. The resulting display shows +the "non-golden" column which marks which section(s) of the clone were not +mapped allowing the annotator to identify which features lie in that zone +and hence are not mapped themselves. + +edgrif to publicise this facility... + + +They would also like to be able to list features by clone rather than just +by hit name or whatever. + +They would also like to see the clone name in lists of hsps.... + +edgrif to add to GFF dump data from acedb server to zmap to allow this data +to be viewed. + + +4/ Multiple alignments + +multiple alignments: edgrif is about a third of the way through implementing a +more general way of displaying arbitrary blocks. This will become a high +priority item as we move to haplotypes etc. 
+ +th said this would be needed soon so it should be moved up the priority list. + + +5/ pfetch proxy + +jgrg has provided pseudo code, rds to implement. + + +6/ Spell checker + +jla1 reported a problem that free text fields and some fixed text fields +have misspellings (is that a mis-spelling ?) and it would be good to have +some autocorrection facility. The ideal would be to have some widget that +allowed other dictionaries (e.g. science) to be attached to it and could thus +be used as a general text entry tool. + + +7/ Quality Control + +Following on from 6/ jla1 also suggested that it would be good to have +automated QC scripts trawling through the database regularly looking for +duff data. Tina Eyre wrote one that could be co-opted and st3 also has +some. This is becoming an important issue for Havana to ensure really +good quality data. + + +8/ Interface issues: + + +extending marked region: + +jla1, eah and lw2 said users would like some way of interactively +extending the marked area. edgrif to look at this perhaps using mouse +action over the marked area. eah also requested that this should +alter the bump column so that evidence that has been hidden, as it did +not overlap, gets shown. + + +jla1 and lw2 said they would like the marked area to be less obvious and also to +be a "greying" out rather than blue. edgrif to implement. + +jla1 said she would like to be able to click on an exon and see evidence (and +transcripts ?) with the same splice be highlighted. + + +Zooming set up: +It would be very useful for large genes if the evidence, ensembl objects etc. did not +disappear when you zoom out. This happens faster in Zmap than in Fmap, but the havana +objects do not disappear in either case. (MMS) + +edgrif explained that we need the new styles to fix this. + + +lw2 said the lasso is still too sensitive. edgrif +has decreased this so that the lasso must be dragged >20 pixels. This +can be changed, but we need to have a consensus on this. 
+ +Would it be possible to have an undo or back (like on a web browser) button? (AF2) + +rds said that this has been provided to enable stepping back 1 event, +which can either be a mark, unmark, or zoom operation. + + +9/ There was a discussion about how much a user should be able to +configure. It was agreed that they should be able to configure which columns +are initially hidden. They should also be able to "save" the current settings +to set this up and to be able to restore the system or their currently saved +defaults. + +edgrif has implemented code for this but now needs eah and lw2 to report back with +list of what users would like to be able to configure. + +jla1 said that configuration is required more at the group or DB level, we can +already do this via lace setting up zmap's configuration files. + + + +10/ Future stuff + +edgrif said he would like annotators to start thinking/reporting two things: + +- repetitive tasks that could be automated. + +- new ways to highlight/select data to help build/annotate transcripts. + + +We will revisit this after the annotation test. + + +11/ 5'and 3' EST read pairs and Ditags + +we need these to be marked in zmap as in acedb, requires new tags in database in +the same way as in worm database. + + + +12/ Dumping features + +kj2 asked if zmap could dump features, edgrif said that it could dump in GFFv2 +format but that some work was needed to dump subsets of features (e.g. dump +all the features from a search results window), this should +be an easy extension. + +edgrif also said that he was extending the acedb dumper to dump gffv3 and would +make zmap dump gffv3 too. + + + + +Low priority +------------ + + +1/ jgrg suggested that short cuts should be given on menus/mouse-over popups +to remind the user that they exist...edgrif to do this. + + +2/ loutre schema and data representation changes + +We need some way to make sure st3 knows about changes anacode are making. +jgrg and st3 to communicate more over this. 
+ + +3/ Alternative alignment programs + +There has been some discussion about using splice aware alignment programs. +jgrg is waiting for a fix to exonerate to support the new pipeline Mustapha +has written. + +edgrif and jgrg both commented that some changes to acedb data structures +would be needed to represent both HSPs that are "joined up" but also +protein matches that start part of the way through a peptide. BUT one +possibility would be for zmap to access this data directly from a mysql +database thus sidestepping the need to put it in acedb first. + + + + +2) Back To The Future +===================== + + +Leo's transcript display +------------------------ + +A discussion about this showed that most of it was not used but some features +such as highlighting all matches that exactly align to an exon would be very +useful. These should be imported to zmap (need to think about how to do this +in terms of short cuts and colour used for highlighting...should it be a mask ?). + + +The discussion went on to talk about enhancements to blixem in two areas: + +- display multiple overlapping transcripts better (includes removing the many +yellow lines introduced by this...clarify this point) + +- better interaction with zmap, e.g. click on things in zmap and see them +highlighted in blixem and vice versa.... + +we had better have a more generalised protocol for communicating with external +programs.... + +- blixem: dna searching is NOT DONE, edgrif to expedite. Also protein searches +will be added. + + + +A new canvas +------------ + +rds has been looking at alternative canvas implementations which offer an MVC +model. He has managed to get goocanvas developers to fix some bugs and make +some changes to support our needs. + +the goocanvas MVC model will mean we do not have to copy data to split windows +meaning greatly reduced memory usage. 
+ +the goocanvas will cope automatically with the X Windows window size limit, this +combined with changes in the gtk scrolling model means we will be able to do away +with having two scroll bars. + +We will introduce the new canvas this year. + + + +Builds +------ + +rds has done the universal binary builds. So we can provide hand builds of them +but still need to incorporate the system into our overnight build procedure. +A labour of love not helped by fairly poor docs from Apple. But it is now +working. + +update has been incorporated into overnight builds, but we still have +library clashes with those in the current otterlace (xace only) +distribution. James and I are working to resolve these. + +jgrg working on mac distribution to incorporate zmap. + + + +Turning off xace +---------------- + +The plan is to turn off xace in the otterlace client as soon as +possible. This requires actioning of 1, 2 and 4. Inevitably there +will be a need to either quickly switch back or have a version which +still incorporates xace. jgrg to think how to do this. + +How many external users is this going to affect? Do they need training? + +jgrg said he would make xace readonly and turn it off by default in the +next release. + + +Comparison of annotation viewers +-------------------------------- + +jla1 made the excellent suggestion that we could organise a 2 day meeting +of gbrowse, apollo, artemis, zmap/otterlace to compare and contrast. The +aim being to assess the state of the art and pick up tips. + + + + +2) Other matters +================ + + +- jgrg, st3 & jla1 discussed the upcoming Vega (mouse) release + including which schema version and assembly version it should be + built with. jla1 asked about updating to assembly NCBI37. + + +- jla1 said that the anacode RT queue is becoming unusable because so +many tickets remain unresolved. jgrg agreed and they will meet to clean +up the queue. 
edgrif commented that it should be possible to import +any useful custom fields from the zmap/acedb queues as necessary. +There is also an issue with tickets going missing, this is being +investigated. + + +- jla1 requested that redundant external annotation should not be shown +in Ensembl as it was out of date and inaccurate and gave Vega a bad name. +This annotation will be removed in the future. + + +- jla1 asked about doing pfam analysis on the fly, Rob Finn was to contact +her and James....as it happened edgrif saw Rob later in the day and he is +on the case. He had an issue with how to produce a meaningful score which +he has pretty much sorted out now. + + +3) Next Meeting +=============== + +Will be at 2pm, 21st February 2008 + + +============================================================================== diff --git a/ZMAP_LACE_PROJECT/2008/zmap_lace.2008_02_21 b/ZMAP_LACE_PROJECT/2008/zmap_lace.2008_02_21 new file mode 100755 index 000000000..11e346e47 --- /dev/null +++ b/ZMAP_LACE_PROJECT/2008/zmap_lace.2008_02_21 @@ -0,0 +1,445 @@ +============================================================================== +ZMap/Otterlace Development + + +Date: Thursday 21st February 2008 + +Attendees: jgrg, st3, jla1, lw2, edgrif + + +1) otterlace + zmap progress +============================ + + +High priority +------------- + + +0/ ZMap performance - contd..... + +edgrif investigating two possibilities for improving performance: + + - make sgifaceserver stream data rather than batch it up, would + save a lot of memory. + + - deferred loading, only load features when needed and load in + zone requested by user....design done...now need to implement. + +We have a performance script that we can now use to measure X Windows as +compared to edgrif's machine. The script records the salient parts of a +machines configuration as well. + +Need to check script works on Mac machines. edgrif to follow this up. 
+ +systems have found a fix for the poor display performance and it should +all be ok now. + +edgrif to email havana about performance measuring script. + + + +1/ Consistency Test using ZMap + +A number of items are required for the annotation test, they are identified with +"*************" and must be in place _prior_ to the start of the test. + +jla1 said there will be a test in February using zmap, not xace. edgrif said he +and Roy will make sure there is a stable version for then. + +jla1 said this would be only possible when points 1, 2 & 4 have been actioned. + + + +2/ Display of otter information ************* + +zmap feature display is done but needs further details to be passed from +otterlace: + +- lw2 also asked about DE line information display, edgrif asked lw2 to collect +together requirements for information to be displayed. jla1 to chase up. + +- Several users want species info. for matches. + +- 39329: PFAM info +Possible to show a description associated with Halfwise (Pfam) objects in Zmap? +Currently in lace we see the domain description as well as the pfam accession number +(EAH) jgrg will pass this information over to zmap, in particular Ensembl URLs will +be passed over. + +- BLAST evidence info: +Would it be possible to see what organism a piece of evidence belongs to, without +having to pfetch each one individually? Also no info (field Title in Fmap) on EST/mRNA +hits unless you pfetch (RJK, AF2, MT4) We need species and taxon data but some of +this needs new database fields (jgrg to implement). + + +3/ Clone summary info/Automating DE line creation. + +eah raised the point that certain summary information for clones is +not available when using zmap. e.g. the number of CpG islands. These +are used for the authoring of DE lines. There is a script for +automating this which kj2 wrote for zebrafish, but it's specific to +zebrafish and requires running on the command line.
jgrg said this +could be integrated into the clone editing window in lace. jgrg to +action. + + + + +4/ CDS translation ************* + +From Jane + +Showing the peptide translation of an object next to the object in +zmap with the residue numbers next to the exon boundaries is still not +working. It took me about 5 times as long to check my object without +this functionality. + +Will be in next build which is early next week. Actually it's not +quite finished as it's currently quite unstable. + +rds has nearly finished this and it will be in the build for the test +set up. + + + +5/ removing evidence already used ************* + +annotators would like to be able to remove from display homologies that +have already been used to annotate variants etc. Does this need to be +persistent in the database in some way ?? edgrif & jgrg will get +together to arrange this via styles so it can persist in a natural way +in the database. + +**24526: Showing which evidence has been used +Differential coloring of matches that have been used already as evidence +for a transcript + +mainly requires jgrg to mark features and then tell zmap to move the features +to a new column or repaint them with a new style. + + + + + +Medium priority +--------------- + + +0/ alternative translations: edgrif about half way through code to do this. + +edgrif is doing this as part of the protein search code since this code +does translations itself. edgrif will talk to jgrg about how alternative +genetic codes can be specified with acedb. + +We need a test database for this. jgrg said this would come soon. + + +1/ locator column: zmap needs a locator column as per fmap which could be +used to display dna and peptide search results and other information. + + +2/ Annotation needs to have a history across assembly changes. + +kj2 requested that when an assembly changes the objects which get +transferred should have a history as to what they were (otter id) +previously. 
lw2 remarked that it's possible to search for the old id +in the lace interface, but impossible to retrieve the old object and +display it. jgrg acknowledged there is a bug in the lace searching, +but that providing the history would be difficult. Further +discussion/thought on how to implement this is needed. + +jgrg said Mustapha is working on this now. + + + +3/ clone overlap display + +Discussions with Mindi showed that main requirement is for annotators to be +able to see what features are in the overlap region of the section of clone +_not_ mapped. This can be done currently using the ZMap -> File -> New Sequence +and specifying the clone for the new sequence. The resulting display shows +the "non-golden" column which marks which section(s) of the clone were not +mapped allowing the annotator to identify which features lie in that zone +and hence are not mapped themselves. + +rds has mailed round about how to do this. jgrg said he would like to add +this facility to otterlace but the zmap route is ok for now. + + + +4/ Multiple alignments + +multiple alignments: edgrif is about a third of the way through implementing a +more general way of displaying arbitrary blocks. This will become a high +priority item as we move to haplotypes etc. + +th said this would be needed soon so it should be moved up the priority list. + + +5/ pfetch proxy + +jgrg has provided pseudo code, rds to implement. + + +6/ Spell checker + +jla1 reported a problem that free text fields and some fixed text fields +have misspellings (is that a mis-spelling ?) and it would be good to have +some autocorrection facility. The ideal would be to have some widget that +allowed other dictionaries (e.g. science) to be attached to it and could thus +be used as a general text entry tool. + + +7/ Quality Control + +Following on from 7/ jla1 also suggested that it would be good to have +automated QC scripts trawling through the database regularly looking for +duff data. 
Tina Eyre wrote one that could be co-opted and st3 also has +some. This is becoming an important issue for Havana to ensure really +good quality data. + + +8/ Interface issues: + + +extending marked region: + +jla1, eah and lw2 said users would like some way of interactively +extending the marked area. edgrif to look at this perhaps using mouse +action over the marked area. eah also requested that this should +alter the bump column so that evidence that has been hidden, as it did +not overlap, gets shown. + + +jla1 and lw2 said they would like the marked area to be less obvious and also to +be a "greying" out rather than blue. edgrif to implement. + +jla1 said she would like to be able to click on an exon and see evidence (and +transcripts ?) with the same splice be highlighted. + + +Zooming set up: +It would be very useful for large genes if the evidence, ensembl objects etc. did not +disappear when you zoom out. This happens faster in Zmap than in Fmap, but the havana +objects do not disappear in either case. (MMS) + +edgrif explained that we need the new styles to fix this. + + + + +9/ There was a discussion about how much a user should be able to +configure. It was agreed that they should be able to configure which columns +are initially hidden. They should also be able to "save" the current settings +to set this up and to be able to restore the system or their currently saved +defaults. edgrif to implement and improve column turning on/off. + +edgrif has implemented code for this but now need eah and lw2 to report back with +list of what users would like to be able to configure. + +jla1 said that configuration is required more at the group or DB level, we can +already do this via lace setting up zmap's configuration files. + + + +10/ Future stuff + +edgrif said he would like annotators to start thinking/reporting two things: + +- repetitive tasks that could be automated. + +- new ways to highlight/select data to help build/annotate transcripts.
+ + +We will revisit this after the annotation test. + + +5'and 3' EST read pairs and Ditags + +we need these to be marked in zmap as in acedb, requires new tags in database in +the same way as in worm database. + + + +11/ Dumping features + +kj2 asked if zmap could dump features, edgrif said that it could dump in GFFv2 +format but that some work was needed to dump subsets of features (e.g. dump +all the features from a search results window), this should +be an easy extension. + +edgrif also said that was extending the acedb dumper to dump gffv3 and would +make zmap dump gffv3 too. + + +12/ bug in acedb server + +jgrg raised a bug in the server which was causing it run out of memory, edgrif +to investigate. There is a ticket for this: 51894 + + + +Low priority +------------ + + +1/ jgrg suggested that short cuts should be given on menus/mouse-over popups +to remind the user that they exist...edgrif to do this. + + +2/ loutre schema and data representation changes + +We need some way to make sure st3 knows about changes anacode are making. +jgrg and st3 to communicate more over this. + + +3/ Alternative alignment programs + +There has been some discussion about using splice aware alignment programs. +jgrg is waiting for a fix to exonerate to support the new pipeline mustapha +has written. + +edgrif and jgrg both commented that some changes to acedb data structures +would be needed to represent both HSP's that are "joined up" but also +protein matches that start part of the way through a peptide. BUT one +possibility would be for zmap to access this data directly from a mysql +database thus sidestepping the need to put it in acedb first. + + + + +2) Back To The Future +===================== + + +Leo's transcript display +------------------------ + +A discussion about this showed that most of it was not used but some features +such as highlighting all matches that exactly align to an exon would be very +useful. 
These should be imported to zmap (need to think about how to do this +in terms of short cuts and colour used for highlighting...should it be a mask ?). + + +The discussion went on to talk about enhancements to blixem in two areas: + +- display multiple overlapping transcripts better (includes removing the many +yellow lines introduced by this...clarify this point), have a scrolled window +of the transcripts. jgrg said that perhaps only the transcripts made by havana +should be displayed. jla1 said she would like to be able to dynamically update +the transcripts displayed. + +- better interaction with zmap, e.g. click on things in zmap and see them +highlighted in blixem and vice versa.... + +we had better have a more generalised protocol for communicating with external +programs.... + +- blixem: dna searching is NOT DONE, edgrif to expedite. Also protein searches +will be added. + + + + + +A new canvas +------------ + +rds has been looking at alternative canvas implementations which offer an MVC +model. He has managed to get goocanvas developers to fix some bugs and make +some changes to support our needs. + +the goocanvas MVC model will mean we do not have to copy data to split windows +meaning greatly reduced memory usage. + +the goocanvas will cope automatically with the X Windows window size limit, this +combined with changes in the gtk scrolling model means we will be able to do away +with having two scroll bars. + +We will introduce the new canvas this year. + + + +Builds +------ + +rds has done the universal binary builds. So we can provide hand builds of them +but still need to incorporate the system into our overnight build procedure. +A labour of love not helped by fairly poor docs from Apple. But it is now +working. + +update has been incorporated into overnight builds, but we still have +library clashes with those in the current otterlace (xace only) +distribution. James and I are working to resolve these.
+ +jgrg working on mac distribution to incorporate zmap. + +edgrif to start getting systems to take this on. + +jgrg and rds seem to have got things to a stage where we can reliably +build for local installs and for James's install package. Be good to pass +some of this on to systems. + + + +Turning off xace +---------------- + +The plan is to turn off xace in the otterlace client as soon as +possible. This requires actioning of 1, 2 and 4. Inevitably there +will be a need to either quickly switch back or have a version which +still incorporates xace. jgrg to think how to do this. + +How many external users is this going to affect? Do they need training? + +jgrg said he would make xace readonly and turn it off by default in the +next release. + + +Comparison of annotation viewers +-------------------------------- + +jla1 made the excellent suggestion that we could organise a 2 day meeting +of gbrowse, apollo, artemis, zmap/otterlace to compare and contrast. The +aim being to assess the state of the art and pick up tips. + +jla1 said there was money available for this so we should target a date +later this year. + + +2) Other matters +================ + + +- jgrg, st3 & jla1 discussed the upcoming Vega (mouse) release + including which schema version and assembly version it should be + built with. jla1 asked about updating to assembly NCBI37. + + +- jla1 said that the anacode RT queue is becoming unusable because so +many tickets remain unresolved. jgrg agreed and they will meet to clean +up the queue. edgrif commented that it should be possible to import +any useful custom fields from the zmap/acedb queues as necessary. +There is also an issue with tickets going missing, this is being +investigated. + + +- jla1 requested that redundant external annotation should not be shown +in Ensembl as it was out of date and inaccurate and gave Vega a bad name. +This annotation will be removed in the future.
+ + +- jla1 asked about doing pfam analysis on the fly, Rob Finn was to contact +her and James....as it happened edgrif saw Rob later in the day and he is +on the case. He had an issue with how to produce a meaningful score which +he has pretty much sorted out now. + + +3) Next Meeting +=============== + +Will be at 2pm, 6th March 2008 + + +============================================================================== diff --git a/ZMAP_LACE_PROJECT/2008/zmap_lace.2008_03_06 b/ZMAP_LACE_PROJECT/2008/zmap_lace.2008_03_06 new file mode 100755 index 000000000..1ae373390 --- /dev/null +++ b/ZMAP_LACE_PROJECT/2008/zmap_lace.2008_03_06 @@ -0,0 +1,456 @@ +============================================================================== +ZMap/Otterlace Development + + +Date: Thursday 6th March 2008 + +Attendees: jgrg, st3, jla1, kj2, lw2, edgrif + + +------------------------------------------------------------------------------ +Current Issues + + +High priority +------------- + + +0/ ZMap performance - contd..... + +edgrif reported experience of sorting out one users machine where +browsers, mail readers and stray sgifaceservers were taking nearly +1GB of memory. Several measures are being taken to avoid this problem: + +- jgrg is to send an email to Havana about how to save the current +session so it can be restarted rapidly. This means annotators can +be encouraged to log off at least every couple of days to reset +memory usage of long running programs (e.g. X server). + +- edgrif noted that memory usage by a systems recommended mail reader +(iceXXX) was enormous and perhaps other alternatives should be looked +at. + +- edgrif asked If to put in a ticket requesting that programs like Mozilla +could be run on cbi4 so that annotators could ssh into it and run such +programs there. + +- edgrif has mailed havana about his performance script. 
+ + +1/ Consistency Test using ZMap + +A number of items are required for the annotation test, they are identified with +"*************" and must be in place _prior_ to the start of the test. + +jla1 said there will be a test in February using zmap, not xace. edgrif said he +and Roy will make sure there is a stable version for then. + +jla1 said this would be only possible when points 1, 2 & 4 have been actioned. + + + +2/ Display of otter information ************* + +zmap feature display is done but needs further details to be passed from +otterlace: + +- lw2 also asked about DE line information display, edgrif asked lw2 to collect +together requirements for information to be displayed. jla1 to chase up. + +- Several users want species info. for matches. + +- 39329: PFAM info +Possible to show a description associated with Halfwise (Pfam) objects in Zmap? +Currently in lace we see the domain description as well as the pfam accession number +(EAH) jgrg will pass this information over to zmap, in particular Ensembl URLs will +be passed over. + +- BLAST evidence info: +Would it be possible to see what organism a piece of evidence belongs to, without +having to pfetch each one individually? Also no info (field Title in Fmap) on EST/mRNA +hits unless you pfetch (RJK, AF2, MT4) We need species and taxon data but some of +this needs new database fields (jgrg to implement). + + +edgrif said all the zmap code is in place, jgrg agreed to put this top priority. +edgrif will fix any zmap problems as a top priority. + + + +3/ Clone summary info/Automating DE line creation. + +eah raised the point that certain summary information for clones is +not available when using zmap. e.g. the number of CpG islands. These +are used for the authoring of DE lines. There is a script for +automating this which kj2 wrote for zebrafish, but it's specific to +zebrafish and requires running on the command line. jgrg said this +could be integrated into the clone editing window in lace.
jgrg to +action. + + + +4/ CDS translation ************* + +From Jane + +Showing the peptide translation of an object next to the object in +zmap with the residue numbers next to the exon boundaries is still not +working. It took me about 5 times as long to check my object without +this functionality. + +Will be in next build which is early next week. Actually it's not +quite finished as it's currently quite unstable. + +edgrif talked to rds, this is still not done and needs to be. + + + +5/ removing evidence already used ************* + +annotators would like to be able to remove from display homologies that +have already been used to annotate variants etc. Does this need to be +persistent in the database in some way ?? edgrif & jgrg will get +together to arrange this via styles so it can persist in a natural way +in the database. + +**24526: Showing which evidence has been used +Differential coloring of matches that have been used already as evidence +for a transcript + +mainly requires jgrg to mark features and then tell zmap to move the features +to a new column or repaint them with a new style. + + +6/ Styles + +jla1 said that styles are not a priority for the annotation test so jgrg +is to turn them off until the other priorities are fixed. + +jgrg said he was confused by the all the bump options and how to specify them, +edgrif will clarify and also check that the menu ticks for which bumping is +currently active are correct. + + +7/ Blixem multiple cols + +jla1 said that running multiple match columns in blixem is essential. +edgrif said he thought this was working in zmap, jgrg to investigate. + + +8/ Builds + +edgrif said that both he/rds and jgrg had spent a lot of time on this and +that it was time to hand it over to systems. He has emailed th about this +and will discuss it with him. + +We now seem to have a fully working build system for macs now. 
+ + +9/ Turning off xace + +lw2 said there was only one thing that needed to be edited in xace +now (part of locus naming ??), otherwise xace can be made read only +and then turned off. + + + + + +Medium priority +--------------- + + +0/ alternative translations: edgrif about half way through code to do this. + +edgrif is doing this as part of the protein search code since this code +does translations itself. edgrif will talk to jgrg about how alternative +genetic codes can be specified with acedb. + +We need a test database for this. jgrg said this would come soon. + + +1/ Annotation needs to have a history across assembly changes. + +kj2 requested that when an assembly changes the objects which get +transferred should have a history as to what they were (otter id) +previously. lw2 remarked that it's possible to search for the old id +in the lace interface, but impossible to retrieve the old object and +display it. jgrg acknowledged there is a bug in the lace searching, +but that providing the history would be difficult. Further +discussion/thought on how to implement this is needed. + +jgrg said Mustapha is working on this now. + + + +2/ Multiple alignments + +multiple alignments: edgrif is about a third of the way through implementing a +more general way of displaying arbitrary blocks. This will become a high +priority item as we move to haplotypes etc. + +th said this would be needed soon so it should be moved up the priority list. + + +3/ pfetch proxy + +jgrg has provided pseudo code, rds to implement. + + +4/ Quality Control + +Following on from 7/ jla1 also suggested that it would be good to have +automated QC scripts trawling through the database regularly looking for +duff data. Tina Eyre wrote one that could be co-opted and st3 also has +some. This is becoming an important issue for Havana to ensure really +good quality data.
+ + +5/ Interface issues: + + +extending marked region: + +jla1, eah and lw2 said users would like some way of interactively +extending the marked area. edgrif to look at this perhaps using mouse +action over the marked area. eah also requested that this should +alter the bump column so that evidence that has been hidden, as it did +not overlap, gets shown. + + +jla1 and lw2 said they would like the marked area to be less obvious and also to +be a "greying" out rather than blue. edgrif to implement. + +jla1 said she would like to be able to click on an exon and see evidence (and +transcripts ?) with the same splice be highlighted. + + +Zooming set up: +It would be very useful for large genes if the evidence, ensembl objects etc. did not +disappear when you zoom out. This happens faster in Zmap than in Fmap, but the havana +objects do not disappear in either case. (MMS) + +edgrif explained that we need the new styles to fix this. + + + + +6/ There was a discussion about how much a user should be able to +configure. It was agreed that they should be able to configure which columns +are initially hidden. They should also be able to "save" the current settings +to set this up and to be able to restore the system or their currently saved +defaults. edgrif to implement and improve column turning on/off. + +edgrif has implemented code for this but now need eah and lw2 to report back with +list of what users would like to be able to configure. + +jla1 said that configuration is required more at the group or DB level, we can +already do this via lace setting up zmap's configuration files. + + + + +7/ 5'and 3' EST read pairs + +we need these to be marked in zmap as in acedb, requires new tags in database in +the same way as in worm database. + + + + +8/ Dumping features + +kj2 asked if zmap could dump features, edgrif said that it could dump in GFFv2 +format but that some work was needed to dump subsets of features (e.g.
dump +all the features from a search results window), this should +be an easy extension. + +edgrif also said that was extending the acedb dumper to dump gffv3 and would +make zmap dump gffv3 too. + + +9/ bug in acedb server + +jgrg raised a bug in the server which was causing it run out of memory, edgrif +to investigate. There is a ticket for this: 51894 + + +10/ blixem reverse coords problem + +there is a bug in blixem whereby when reversed it does not pass on the coords +to dotter properly...needs fixing.... + + + +Low priority +------------ + +0/ locator column: zmap needs a locator column as per fmap which could be +used to display dna and peptide search results and other information. + + +1/ jgrg suggested that short cuts should be given on menus/mouse-over popups +to remind the user that they exist...edgrif to do this. + + +2/ loutre schema and data representation changes + +We need some way to make sure st3 knows about changes anacode are making. +jgrg and st3 to communicate more over this. + + +3/ Alternative alignment programs + +There has been some discussion about using splice aware alignment programs. +jgrg is waiting for a fix to exonerate to support the new pipeline mustapha +has written. + +edgrif and jgrg both commented that some changes to acedb data structures +would be needed to represent both HSP's that are "joined up" but also +protein matches that start part of the way through a peptide. BUT one +possibility would be for zmap to access this data directly from a mysql +database thus sidestepping the need to put it in acedb first. + + + +------------------------------------------------------------------------------ +Futures + + +1/ Spell checker + +jla1 reported a problem that free text fields and some fixed text fields +have misspellings (is that a mis-spelling ?) and it would be good to have +some autocorrection facility. The ideal would be to have some widget that +allowed other dictionaries (e.g. 
science) to be attached to it and could thus +be used as a general text entry tool. + + + +2/ Leo's transcript display + +A discussion about this showed that most of it was not used but some features +such as highlighting all matches that exactly align to an exon would be very +useful. These should be imported to zmap (need to think about how to do this +in terms of short cuts and colour used for highlighting...should it be a mask ?). + + + +3/ Blixem enhancements + +two areas: + +- display multiple overlapping transcripts better (includes removing the many +yellow lines introduced by this...clarify this point), have a scrolled window +of the transcripts. jgrg said that perhaps only the transcripts made by havana +should be displayed. jla1 said she would like to be able to dynamically update +the transcripts displayed. + +- better interaction with zmap, e.g. click on things in zmap and see them +highlighted in blixem and vice versa.... + +we had better have a more generalised protocol for communicating with external +programs.... + +- blixem: dna searching is NOT DONE, edgrif to expedite. Also protein searches +will be added. + + +4/ Future stuff + +edgrif said he would like annotators to start thinking/reporting two things: + +- repetitive tasks that could be automated. + +- new ways to highlight/select data to help build/annotate transcripts. + + +We will revisit this after the annotation test. + + +5/ acedb server performance + +edgrif investigating two possibilities for improving performance: + + - make sgifaceserver stream data rather than batch it up, would + save a lot of memory. + + - deferred loading, only load features when needed and load in + zone requested by user....design done...now need to implement. + + + +6/ A new canvas + +rds has been looking at alternative canvas implementations which offer an MVC +model. He has managed to get goocanvas developers to fix some bugs and make +some changes to support our needs.
+ +the goocanvas MVC model will mean we do not have to copy data to split windows +meaning greatly reduced memory usage. + +the goocanvas will cope automatically with the X Windows window size limit, this +combined with changes in the gtk scrolling model means we will be able to do away +with having two scroll bars. + +We will introduce the new canvas this year. + + + + +7/ Comparison of annotation viewers + +jla1 made the excellent suggestion that we could organise a 2 day meeting +of gbrowse, apollo, artemis, zmap/otterlace to compare and contrast. The +aim being to assess the state of the art and pick up tips. + +jla1 said there was money available for this so we should target a date +later this year. + + + +8/ Sequence exceptions + +kj2 raised the subject of how to indicate sequence exceptions, +e.g. when bases are skipped in translations. kj2 wondered if alternative +translations could be registered as sequence exceptions, edgrif said he +prefer a separate mechanism as much of the code is already done for this. +We should therefore include a mechanism in zmap for sequence exceptions, +this would require a similar mechanism in acedb. + + + + + + +------------------------------------------------------------------------------ +Other matters + + + +- jla1 said that the anacode RT queue is becoming unusable because so +many tickets remain unresolved. jgrg agreed and they will meet to clean +up the queue. edgrif commented that it should be possible to import +any useful custom fields from the zmap/acedb queues as necessary. +There is also an issue with tickets going missing, this is being +investigated. + + +- jla1 asked about doing pfam analysis on the fly, Rob Finn was to contact +her and James....as it happened edgrif saw Rob later in the day and he is +on the case. He had an issue with how to produce a meaningful score which +he has pretty much sorted out now. 
+ + +- kj2 asked about embl dumping for zebrafish, jgrg said they were awaiting +answers to questions they had asked about the required header information. + + + +------------------------------------------------------------------------------ +Next Meeting + +Will be at 2pm, 20th March 2008 + + +============================================================================== diff --git a/ZMAP_LACE_PROJECT/2008/zmap_lace.2008_03_20 b/ZMAP_LACE_PROJECT/2008/zmap_lace.2008_03_20 new file mode 100755 index 000000000..962eac027 --- /dev/null +++ b/ZMAP_LACE_PROJECT/2008/zmap_lace.2008_03_20 @@ -0,0 +1,432 @@ +============================================================================== +ZMap/Otterlace Development + + +Date: Thursday 20th March 2008 + +Attendees: jla1, jgrg, lw2, st3, edgrif + + +------------------------------------------------------------------------------ +Current Issues + + +High priority +------------- + + +0/ ZMap performance - contd..... + +- edgrif noted that memory usage by a systems recommended mail reader +(icedove) was enormous and perhaps other alternatives should be looked +at. This is still a problem (Rhoda's iceDove process was 127MB). + +- systems have said that browsers etc should not be run on the cluster machines +so jla1 will ask systems for more memory in machines since memory is cheap. + + + +1/ Consistency Test using ZMap/WashU visit + +A number of items are required for the annotation test, they are identified with +"*************" and must be in place _prior_ to the start of the test. + +jla1 said that a bigger priority is now the visit by WashU which will be week +beginning 7th April. ZMap/lace _must_ be in a good working state for that. + + +jla1 said this would be only possible when points 1, 2 & 4 have been actioned.
+ + + +2/ Display of otter information ************* + +zmap feature display is done but needs further details to be passed from +otterlace: + +- 39329: PFAM info +Possible to show a description associated with Halfwise (Pfam) objects in Zmap? +Currently in lace we see the domain description as well as the pfam accession number +(EAH) jgrg will pass this information over to zmap, in particular Ensembl URLs will +be passed over. + +- BLAST evidence info: +Would it be possible to see what organism a piece of evidence belongs to, without +having to pfetch each one individually? Also no info (field Title in Fmap) on EST/mRNA +hits unless you pfetch (RJK, AF2, MT4) We need species and taxon data but some of +this needs new database fields (jgrg to implement). + +- author info required for transcripts. + +edgrif said all the zmap code is in place, jgrg agreed to put this top priority. +edgrif will fix any zmap problems as a top priority. + + + +3/ Clone summary info/Automating DE line creation. + +eah raised the point that certain summary information for clones is +not available when using zmap. e.g. the number of CpG islands. These +are used for the authoring of DE lines. There is a script for +automating this which kj2 wrote for zebrafish, but it's specific to +zebrafish and requires running on the command line. jgrg said this +could be integrated into the clone editing window in lace. jgrg to +action. + + + +4/ removing evidence already used ************* + +annotators would like to be able to remove from display homologies that +have already been used to annotate variants etc. Does this need to be +persistent in the database in some way ?? edgrif & jgrg will get +together to arrange this via styles so it can persist in a natural way +in the database. 
+ +**24526: Showing which evidence has been used +Differential coloring of matches that have been used already as evidence +for a transcript + +mainly requires jgrg to mark features and then tell zmap to move the features +to a new column or repaint them with a new style. + + +5/ Styles + +jla1 said that styles are not a priority for the annotation test so jgrg +is to turn them off until the other priorities are fixed. + +jgrg said he was confused by all the bump options and how to specify them, +edgrif will clarify and also check that the menu ticks for which bumping is +currently active are correct. + + +6/ Builds + +edgrif said that both he/rds and jgrg had spent a lot of time on this and +that it was time to hand it over to systems. edgrif emailed th who was +lukewarm about this, edgrif to email again as there is a level of software +required by jgrg and edgrif that systems should support, e.g. automake. + +We now seem to have a fully working build system for macs. + + + + +7/ Turning off xace + +lw2 said there was only one thing that needed to be edited in xace +now (part of locus naming ??), otherwise xace can be made read only +and then turned off. + +jgrg is implementing a dialog to do this now. + + + + +Medium priority +--------------- + + +0/ alternative translations: edgrif about half way through code to do this. + +edgrif is doing this as part of the protein search code since this code +does translations itself. edgrif will talk to jgrg about how alternative +genetic codes can be specified with acedb. + +We need a test database for this. jgrg said this would come soon. + +edgrif will add field to transcript feature to hold alternative translation +table. + + +1/ Multiple alignments + +multiple alignments: edgrif is about a third of the way through implementing a +more general way of displaying arbitrary blocks. This will become a high +priority item as we move to haplotypes etc. 
+ +th said this would be needed soon so it should be moved up the priority list. +jgrg said they have mappings in lace that could be passed on to zmap easily +and also said that annotators can already annotate assemblies from variants +and different species alongside each other as needed. + +We need to decide on the format for specifying the alignments. + + +2/ pfetch proxy + +jgrg has provided pseudo code, rds to implement. rds is working on this +now. + + +3/ Quality Control + +Following on from 7/ jla1 also suggested that it would be good to have +automated QC scripts trawling through the database regularly looking for +duff data. Tina Eyre wrote one that could be co-opted and st3 also has +some. This is becoming an important issue for Havana to ensure really +good quality data. + +jla1 said this needs to be run dynamically as annotators enter their data. +jla1, st3 and jgrg will meet separately to discuss how to implement this. + + +4/ Interface issues: + + +extending marked region: + +jla1, eah and lw2 said users would like some way of interactively +extending the marked area. edgrif to look at this perhaps using mouse +action over the marked area. + + +jla1 and lw2 said they would like the marked area to be less obvious and also to +be a "greying" out rather than blue. edgrif to implement. + +jla1 said she would like to be able to click on an exon and see evidence (and +transcripts ?) with the same splice highlighted. + + +Zooming set up: +It would be very useful for large genes if the evidence, ensembl objects etc. did not +disappear when you zoom out. This happens faster in Zmap than in Fmap, but the havana +objects do not disappear in either case. (MMS) + +edgrif explained that we need the new styles to fix this. + + + + +5/ There was a discussion about how much a user should be able to +configure. It was agreed that they should be able to configure which columns +are initially hidden. 
They should also be able to "save" the current settings +to set this up and to be able to restore the system or their currently saved +defaults. edgrif to implement and improve column turning on/off. + +edgrif has implemented code for this but now needs eah and lw2 to report back with +a list of what users would like to be able to configure. + +jla1 said that configuration is required more at the group or DB level, we can +already do this via lace setting up zmap's configuration files. + + + + +6/ 5' and 3' EST read pairs + +we need these to be marked in zmap as in acedb, requires new tags in database in +the same way as in worm database. + +edgrif explained that acedb loses the match strand information which will be +needed to implement this cleanly. edgrif is changing acedb code so it holds this +information and also dumps it in gff v2 and v3 (it is required for the latter). + + +7/ Dumping features + +kj2 asked if zmap could dump features, edgrif said that it could dump in GFFv2 +format but that some work was needed to dump subsets of features (e.g. dump +all the features from a search results window), this should +be an easy extension. + +edgrif also said that he was extending the acedb dumper to dump gffv3 and would +make zmap dump gffv3 too. + + +8/ bug in acedb server + +jgrg raised a bug in the server which was causing it to run out of memory, edgrif +to investigate. There is a ticket for this: 51894 + +edgrif to make sure jgrg has up to date binaries for dotter etc. + + +9/ blixem reverse coords problem + +there is a bug in blixem whereby when reversed it does not pass on the coords +to dotter properly...needs fixing.... + + +10/ locator column: zmap needs a locator column as per fmap which could be +used to display dna and peptide search results and other information. + + +Low priority +------------ + + +1/ jgrg suggested that short cuts should be given on menus/mouse-over popups +to remind the user that they exist...edgrif to do this. 
+ + +2/ loutre schema and data representation changes + +We need some way to make sure st3 knows about changes anacode are making. +jgrg and st3 to communicate more over this. + + +3/ Alternative alignment programs + +There has been some discussion about using splice aware alignment programs. +jgrg is waiting for a fix to exonerate to support the new pipeline mustapha +has written. + +edgrif and jgrg both commented that some changes to acedb data structures +would be needed to represent both HSPs that are "joined up" and +protein matches that start part of the way through a peptide. BUT one +possibility would be for zmap to access this data directly from a mysql +database thus sidestepping the need to put it in acedb first. gffv3 will also +be needed to represent this kind of joined up HSP data in a natural and +robust way. + + + +------------------------------------------------------------------------------ +Futures + + +1/ Spell checker + +jla1 reported a problem that free text fields and some fixed text fields +have misspellings (is that a mis-spelling ?) and it would be good to have +some autocorrection facility. The ideal would be to have some widget that +allowed other dictionaries (e.g. science) to be attached to it and could thus +be used as a general text entry tool. + + + +2/ Leo's transcript display + +A discussion about this showed that most of it was not used but some features +such as highlighting all matches that exactly align to an exon would be very +useful. These should be imported to zmap (need to think about how to do this +in terms of short cuts and colour used for highlighting...should it be a mask ?). + + + +3/ Blixem enhancements + +two areas: + +- display multiple overlapping transcripts better (includes removing the many +yellow lines introduced by this...clarify this point), have a scrolled window +of the transcripts. jgrg said that perhaps only the transcripts made by havana +should be displayed. 
jla1 said she would like to be able to dynamically update +the transcripts displayed. + +- better interaction with zmap, e.g. click on things in zmap and see them +highlighted in blixem and vice versa.... + +we had better have a more generalised protocol for communicating with external +programs.... + +- blixem: dna searching is NOT DONE, edgrif to expedite. Also protein searches +will be added. + + +4/ Future stuff + +edgrif said he would like annotators to start thinking/reporting two things: + +- repetitive tasks that could be automated. + +- new ways to highlight/select data to help build/annotate transcripts. + + +We will revisit this after the annotation test in the May Havana meeting. + + +5/ acedb server performance + +edgrif investigating two possibilities for improving performance: + + - make sgifaceserver stream data rather than batch it up, would + save a lot of memory. + + - deferred loading, only load features when needed and load in + zone requested by user....design done...now need to implement. + + + +6/ A new canvas + +rds has been looking at alternative canvas implementations which offer an MVC +model. He has managed to get goocanvas developers to fix some bugs and make +some changes to support our needs. + +the goocanvas MVC model will mean we do not have to copy data to split windows +meaning greatly reduced memory usage. + +the goocanvas will cope automatically with the X Windows window size limit, this +combined with changes in the gtk scrolling model means we will be able to do away +with having two scroll bars. + +We will introduce the new canvas this year. + + + + +7/ Comparison of annotation viewers + +jla1 made the excellent suggestion that we could organise a 2 day meeting +of gbrowse, apollo, artemis, zmap/otterlace to compare and contrast. The +aim being to assess the state of the art and pick up tips. + +jla1 said there was money available for this so we should target a date +later this year. 
+ +We should tag a meeting onto Genome Informatics which will be in Hinxton +from September 10 - 14. An alternative would be to do it at BOSC. + + + +8/ Sequence exceptions + +kj2 raised the subject of how to indicate sequence exceptions, +e.g. when bases are skipped in translations. kj2 wondered if alternative +translations could be registered as sequence exceptions, edgrif said he +would prefer a separate mechanism as much of the code is already done for this. +We should therefore include a mechanism in zmap for sequence exceptions, +this would require a similar mechanism in acedb. This is yet another reason +for GFF 3 which has standards for frame shifts and other things. + + + + + + +------------------------------------------------------------------------------ +Other matters + + + +- jla1 said that the anacode RT queue is becoming unusable because so +many tickets remain unresolved. jgrg agreed and they will meet to clean +up the queue. edgrif commented that it should be possible to import +any useful custom fields from the zmap/acedb queues as necessary. +There is also an issue with tickets going missing, this is being +investigated. + +After Easter jla1, lw2 and jgrg will weed out redundant tickets. + + +- jla1 asked about doing pfam analysis on the fly, Rob Finn was to contact +her and James....as it happened edgrif saw Rob later in the day and he is +on the case. He had an issue with how to produce a meaningful score which +he has pretty much sorted out now. jla1 to email rdf and jgrg. + + +- kj2 asked about embl dumping for zebrafish, jgrg said they were awaiting +answers to questions they had asked about the required header information. 
+ + + +------------------------------------------------------------------------------ +Next Meeting + +Will be at 2pm, 3rd April 2008 + + +============================================================================== diff --git a/ZMAP_LACE_PROJECT/2008/zmap_lace.2008_04_16 b/ZMAP_LACE_PROJECT/2008/zmap_lace.2008_04_16 new file mode 100755 index 000000000..83caaba96 --- /dev/null +++ b/ZMAP_LACE_PROJECT/2008/zmap_lace.2008_04_16 @@ -0,0 +1,421 @@ +============================================================================== +ZMap/Otterlace Development + + +Date: Thursday 16th April 2008 + +Attendees: jla1, jgrg, lw2, st3, edgrif, kj2 + + +------------------------------------------------------------------------------ +Current Issues + + +High priority +------------- + + +0/ ZMap performance - contd..... + +- problems with performance have been much less since annotators started +logging out more frequently. There is still a problem with memory though +and jla1 will ask systems for more memory in machines. + + + +1/ Consistency Test using ZMap + +The consistency test will be 1st/2nd May, jgrg & edgrif to make sure +code is ready. A build will be done by 22nd April that will (barring serious +bugs) be the one used for this test. + + +2/ Display of otter information ************* + +zmap feature display is done but needs further details to be passed from +otterlace, this is now the top priority item as it is the major missing +function: + +- 39329: PFAM info +Possible to show a description associated with Halfwise (Pfam) objects in Zmap? +Currently in lace we see the domain description as well as the pfam accession number +(EAH) jgrg will pass this information over to zmap, in particular Ensembl URLs will +be passed over. + +- BLAST evidence info: +Would it be possible to see what organism a piece of evidence belongs to, without +having to pfetch each one individually? 
Also no info (field Title in Fmap) on EST/mRNA +hits unless you pfetch (RJK, AF2, MT4). We need species and taxon data but some of +this needs new database fields (jgrg to implement). + +- author info required for transcripts. + +edgrif said all the zmap code is in place, jgrg agreed to put this top priority. +edgrif will fix any zmap problems as a top priority. + +- URL display + +edgrif said that some code is needed in zmap to display urls from the feature display +window. + + +3/ Clone summary info/Automating DE line creation / Quality Control + +eah raised the point that certain summary information for clones is +not available when using zmap. e.g. the number of CpG islands. These +are used for the authoring of DE lines. There is a script for +automating this which kj2 wrote for zebrafish, but it's specific to +zebrafish and requires running on the command line. jgrg said this +could be integrated into the clone editing window in lace. jgrg to +action. + +Following on from 7/ jla1 also suggested that it would be good to have +automated QC scripts trawling through the database regularly looking for +duff data. Tina Eyre wrote one that could be co-opted and st3 also has +some. This is becoming an important issue for Havana to ensure really +good quality data. + +jla1 said this needs to be run dynamically as annotators enter their data. +jla1, st3 and jgrg will meet separately to discuss how to implement this. + +jla1 also said she would like to see "best in genome" displayed. jgrg said this +is not easy as Otterlace works on a clone by clone basis. It was agreed +that it would be worthwhile to show at least "best in clone" or better to do +a crude "best in genome". + + +4/ Builds + +edgrif said that both he/rds and jgrg had spent a lot of time on this and +that it was time to hand it over to systems. edgrif has spoken to Guy Coates +about this and he is keen to pick up the scripts, edgrif has contacted +Tim Cutts who now looks after the Mac stuff. 
+ + +5/ Turning off xace + +lw2 said there was only one thing that needed to be edited in xace +now (part of locus naming ??), otherwise xace can be made read only +and then turned off. jgrg has implemented this, lw2 to test. jgrg & +edgrif to get together to check zmap will work for this as well, it +will require redrawing/renaming of a number of features. + + + + +Medium priority +--------------- + + +0/ alternative translations: edgrif about half way through code to do this. + +edgrif is doing this as part of the protein search code since this code +does translations itself. edgrif will talk to jgrg about how alternative +genetic codes can be specified with acedb. + +We need a test database for this. jgrg said this would come soon. + +edgrif will add field to transcript feature to hold alternative translation +table. + + +1/ Multiple alignments + +multiple alignments: edgrif is about a third of the way through implementing a +more general way of displaying arbitrary blocks. This will become a high +priority item as we move to haplotypes etc. + +th said this would be needed soon so it should be moved up the priority list. +jgrg said they have mappings in lace that could be passed on to zmap easily +and also said that annotators can already annotate assemblies from variants +and different species alongside each other as needed. + +We need to decide on the format for specifying the alignments. + + +2/ pfetch proxy + +jgrg has provided pseudo code, rds is half way through implementing this. + + +3/ Speed + +Marie has said there is still a problem with the speed of zmap when showing long +genes, edgrif to investigate. + + +4/ removing evidence already used ************* + +annotators would like to be able to remove from display homologies that +have already been used to annotate variants etc. Does this need to be +persistent in the database in some way ?? edgrif & jgrg will get +together to arrange this via styles so it can persist in a natural way +in the database. 
+ +**24526: Showing which evidence has been used +Differential coloring of matches that have been used already as evidence +for a transcript + +mainly requires jgrg to mark features and then tell zmap to move the features +to a new column or repaint them with a new style. + + +5/ Styles + +jla1 said that styles are not a priority for the annotation test so jgrg +is to turn them off until the other priorities are fixed. + +jgrg said he was confused by all the bump options and how to specify them, +edgrif will clarify and also check that the menu ticks for which bumping is +currently active are correct. + + + +6/ Interface issues: + + +extending marked region: + +jla1, eah and lw2 said users would like some way of interactively +extending the marked area. edgrif to look at this perhaps using mouse +action over the marked area. + + +jla1 and lw2 said they would like the marked area to be less obvious and also to +be a "greying" out rather than blue. edgrif to implement. + +jla1 said she would like to be able to click on an exon and see evidence (and +transcripts ?) with the same splice highlighted. + + +Zooming set up: +It would be very useful for large genes if the evidence, ensembl objects etc. did not +disappear when you zoom out. This happens faster in Zmap than in Fmap, but the havana +objects do not disappear in either case. (MMS) + +edgrif explained that we need the new styles to fix this. + + +clone path: +lw2 would like the full clone extents displayed with the non-golden sections displayed. +Do we need the clone ends information for this, edgrif to check ?? Check with Leo's +smapped example with several sections of a single clone... + + + +7/ User configuration + +This came up once more, edgrif has implemented better column control, +but lw2 to check with Havana and report what is required in an RT ticket. 
+ + +8/ 5' and 3' EST read pairs + +we need these to be marked in zmap as in acedb, requires new tags in database in +the same way as in worm database. + +edgrif explained that acedb loses the match strand information which will be +needed to implement this cleanly. edgrif is changing acedb code so it holds this +information and also dumps it in gff v2 and v3 (it is required for the latter). + +kj2 would also like DITAG information displayed. + + +9/ Dumping features + +kj2 asked if zmap could dump features, edgrif said that it could dump in GFFv2 +format but that some work was needed to dump subsets of features (e.g. dump +all the features from a search results window), this should +be an easy extension. + +edgrif also said that he was extending the acedb dumper to dump gffv3 and would +make zmap dump gffv3 too. + + +10/ bug in acedb server + +jgrg raised a bug in the server which was causing it to run out of memory, edgrif +to investigate. There is a ticket for this: 51894 + +edgrif to make sure jgrg has up to date binaries for dotter etc. + + + +Low priority +------------ + + +1/ jgrg suggested that short cuts should be given on menus/mouse-over popups +to remind the user that they exist...edgrif to do this. + + +2/ loutre schema and data representation changes + +We need some way to make sure st3 knows about changes anacode are making. +jgrg and st3 to communicate more over this. + + +3/ Alternative alignment programs + +There has been some discussion about using splice aware alignment programs. +jgrg is waiting for a fix to exonerate to support the new pipeline mustapha +has written. + +edgrif and jgrg both commented that some changes to acedb data structures +would be needed to represent both HSPs that are "joined up" and +protein matches that start part of the way through a peptide. BUT one +possibility would be for zmap to access this data directly from a mysql +database thus sidestepping the need to put it in acedb first. 
gffv3 will also +be needed to represent this kind of joined up HSP data in a natural and +robust way. + + + +------------------------------------------------------------------------------ +Futures + + +1/ Spell checker + +jla1 reported a problem that free text fields and some fixed text fields +have misspellings (is that a mis-spelling ?) and it would be good to have +some autocorrection facility. The ideal would be to have some widget that +allowed other dictionaries (e.g. science) to be attached to it and could thus +be used as a general text entry tool. + + + +2/ Leo's transcript display + +A discussion about this showed that most of it was not used but some features +such as highlighting all matches that exactly align to an exon would be very +useful. These should be imported to zmap (need to think about how to do this +in terms of short cuts and colour used for highlighting...should it be a mask ?). + + + +3/ Blixem enhancements + +two areas: + +- display multiple overlapping transcripts better (includes removing the many +yellow lines introduced by this...clarify this point), have a scrolled window +of the transcripts. jgrg said that perhaps only the transcripts made by havana +should be displayed. jla1 said she would like to be able to dynamically update +the transcripts displayed. + +- better interaction with zmap, e.g. click on things in zmap and see them +highlighted in blixem and vice versa.... + +we had better have a more generalised protocol for communicating with external +programs.... + +- blixem: dna searching is NOT DONE, edgrif to expedite. Also protein searches +will be added. + + +4/ Future stuff + +edgrif said he would like annotators to start thinking/reporting two things: + +- repetitive tasks that could be automated. + +- new ways to highlight/select data to help build/annotate transcripts. + + +We will revisit this after the annotation test in the May Havana meeting. 
+ + +5/ acedb server performance + +edgrif investigating two possibilities for improving performance: + + - make sgifaceserver stream data rather than batch it up, would + save a lot of memory. + + - deferred loading, only load features when needed and load in + zone requested by user....design done...now need to implement. + + + +6/ A new canvas + +rds has been looking at alternative canvas implementations which offer an MVC +model. He has managed to get goocanvas developers to fix some bugs and make +some changes to support our needs. + +the goocanvas MVC model will mean we do not have to copy data to split windows +meaning greatly reduced memory usage. + +the goocanvas will cope automatically with the X Windows window size limit, this +combined with changes in the gtk scrolling model means we will be able to do away +with having two scroll bars. + +We will introduce the new canvas this year. + + + + +7/ Comparison of annotation viewers + +jla1 made the excellent suggestion that we could organise a 2 day meeting +of gbrowse, apollo, artemis, zmap/otterlace to compare and contrast. The +aim being to assess the state of the art and pick up tips. + +jla1 said there was money available for this so we should target a date +later this year. + +We should tag a meeting on Genome Informatics which will be in Hinxton +from September 10 - 14. an alternative would be to do it at BOSC. + + + +8/ Sequence exceptions + +kj2 raised the subject of how to indicate sequence exceptions, +e.g. when bases are skipped in translations. kj2 wondered if alternative +translations could be registered as sequence exceptions, edgrif said he +prefer a separate mechanism as much of the code is already done for this. +We should therefore include a mechanism in zmap for sequence exceptions, +this would require a similar mechanism in acedb. This is yet another reason +for GFF 3 which has standards for frame shifts and other things. 
+ + + + + + +------------------------------------------------------------------------------ +Other matters + + + +- jla1 said that the anacode RT queue is becoming unusable because so +many tickets remain unresolved. jgrg agreed and they will meet to clean +up the queue. edgrif commented that it should be possible to import +any useful custom fields from the zmap/acedb queues as necessary. +There is also an issue with tickets going missing, this is being +investigated. + +After Easter jla1, lw2 and jgrg will weed out redundant tickets. + + +- jla1 asked about doing pfam analysis on the fly, Rob Finn was to contact +her and James....as it happened edgrif saw Rob later in the day and he is +on the case. He had an issue with how to produce a meaningful score which +he has pretty much sorted out now. jla1 to email rdf and jgrg. + + +- kj2 asked about embl dumping for zebrafish, jgrg said they were awaiting +answers to questions they had asked about the required header information. +kj2 will ask them what is needed. + + + +------------------------------------------------------------------------------ +Next Meeting + +Will be at 2pm, 1st May 2008 + + +============================================================================== diff --git a/ZMAP_LACE_PROJECT/2008/zmap_lace.2008_05_08 b/ZMAP_LACE_PROJECT/2008/zmap_lace.2008_05_08 new file mode 100755 index 000000000..0d52e4a4f --- /dev/null +++ b/ZMAP_LACE_PROJECT/2008/zmap_lace.2008_05_08 @@ -0,0 +1,409 @@ +============================================================================== +ZMap/Otterlace Development + + +Date: Thursday 8th May 2008 + +Attendees: jla1, jgrg, lw2, st3, edgrif, kj2 + + +------------------------------------------------------------------------------ +Current Issues + + +High priority +------------- + + + +1/ Display of otter information ************* + + +- need improvements to transcript/genes display, need to make +better use of tabbed windows. jgrg, lw2, lg4 to coordinate. 
+ + +- author info required for transcripts. + +edgrif said all the zmap code is in place, jgrg agreed to put this top priority. +edgrif will fix any zmap problems as a top priority. + +- URL display + +edgrif said that some code is needed in zmap to display urls from the feature display +window. + + +2/ Clone summary info/Automating DE line creation / Quality Control + +eah raised the point that certain summary information for clones is +not available when using zmap. e.g. the number of CpG islands. These +are used for the authoring of DE lines. There is a script for +automating this which kj2 wrote for zebrafish, but it's specific to +zebrafish and requires running on the command line. jgrg said this +could be integrated into the clone editing window in lace. jgrg to +action. + +Following on from 7/ jla1 also suggested that it would be good to have +automated QC scripts trawling through the database regularly looking for +duff data. Tina Eyre wrote one that could be co-opted and st3 also has +some. This is becoming an important issue for Havana to ensure really +good quality data. + +jla1 said this needs to be run dynamically as annotators enter their data. +jla1, st3 and jgrg will meet separately to discuss how to implement this. + +jla1 also said she would like to see "best in genome" displayed. jgrg said this +is not easy as Otterlace works on a clone by clone basis. It was agreed +that it would be worthwhile to show at least "best in clone" or better to do +a crude "best in genome". + + +3/ Builds + +edgrif said that both he/rds and jgrg had spent a lot of time on this and +that it was time to hand it over to systems. edgrif has spoken to Guy Coates +about this and he is keen to pick up the scripts, edgrif has contacted +Tim Cutts who now looks after the Mac stuff. + +We agreed to do new builds just after these meetings from now on so that +there would be almost two weeks of testing prior to the next zlave meeting. 
+This did not happen this time, we'll try to make it happen this time ! + + +4/ Turning off xace + +Can we do this now ? + + +5/ Styles + +Now the annotation test is over we need to get this done. + +jgrg said he was confused by all the bump options and how to specify them, +edgrif will clarify and also check that the menu ticks for which bumping is +currently active are correct. + + +6/ annotation test + +There seemed to be no show stopping problems with zmap but a number of issues +did come up that need addressing, Adam (af2) is collating and sending in tickets. + + +Medium priority +--------------- + + +0/ alternative translations: edgrif about half way through code to do this. + +edgrif is doing this as part of the protein search code since this code +does translations itself. edgrif will talk to jgrg about how alternative +genetic codes can be specified with acedb. + +We need a test database for this. jgrg said this would come soon. + +edgrif will add field to transcript feature to hold alternative translation +table. + + +1/ Multiple alignments + +multiple alignments: edgrif is about a third of the way through implementing a +more general way of displaying arbitrary blocks. This will become a high +priority item as we move to haplotypes etc. + +th said this would be needed soon so it should be moved up the priority list. +jgrg said they have mappings in lace that could be passed on to zmap easily +and also said that annotators can already annotate assemblies from variants +and different species alongside each other as needed. + +We need to decide on the format for specifying the alignments. + + +2/ pfetch proxy + +jgrg has provided pseudo code, rds is still testing, when he has finished +he will pass code on to edgrif to incorporate in blixem. + + +3/ Speed + +Marie has said there is still a problem with the speed of zmap when showing long +genes, edgrif to investigate. She has also said that long genes do not display +as well when zoomed out as in fmap. 
+ + +4/ removing evidence already used ************* + +annotators would like to be able to remove from display homologies that +have already been used to annotate variants etc. Does this need to be +persistent in the database in some way ?? edgrif & jgrg will get +together to arrange this via styles so it can persist in a natural way +in the database. + +**24526: Showing which evidence has been used +Differential coloring of matches that have been used already as evidence +for a transcript + +mainly requires jgrg to mark features and then tell zmap to move the features +to a new column or repaint them with a new style. + + +5/ Interface issues: + + +extending marked region: + +rds has implemented first part of this to change marked region, we are +currently working on the recalculate code required (e.g. for rebumping +columns). + +jla1 and lw2 said they would like the marked area to be less obvious and also to +be a "greying" out rather than blue. edgrif to implement. + +jla1 said she would like to be able to click on an exon and see evidence (and +transcripts ?) with the same splice highlighted. + + +Zooming set up: +It would be very useful for large genes if the evidence, ensembl objects etc. did not +disappear when you zoom out. This happens faster in Zmap than in Fmap, but the havana +objects do not disappear in either case. (MMS) + +edgrif explained that we need the new styles to fix this. + + +clone path: +lw2 would like the full clone extents displayed with the non-golden sections displayed. +Do we need the clone ends information for this, edgrif to check ?? Check with Leo's +smapped example with several sections of a single clone... + + + +7/ User configuration + +This came up once more, edgrif has implemented better column control, +but lw2 to check with Havana and report what is required in an RT ticket. + + +8/ 5' and 3' EST read pairs + +we need these to be marked in zmap as in acedb, requires new tags in database in +the same way as in worm database. 
+ +edgrif explained that acedb loses the match strand information which will be +to implement this cleanly. edgrif is changing acedb code so it holds this +information and also dumps it in gff v2 and v3 (it is required for the latter). + +kj2 would also like DITAG information displayed. + + +9/ Dumping features + +kj2 asked if zmap could dump features, edgrif said that it could dump in GFFv2 +format but that some work was needed to dump subsets of features (e.g. dump +all the features from a search results window), this should +be an easy extension. + +edgrif also said that was extending the acedb dumper to dump gffv3 and would +make zmap dump gffv3 too. + + +10/ bug in acedb server + +jgrg raised a bug in the server which was causing it run out of memory, edgrif +to investigate. There is a ticket for this: 51894 + +edgrif to make jgrg has up to date binaries for dotter etc. + + +11/ popups/labels for transcripts + +jla1 said that apollo had a neat way of showing a label for a transcript +that remained in one place on the screen as the window was scrolled. edgrif +to investigate + look at "tool tips" for transcripts....especially with +locus information. + + +Low priority +------------ + + +1/ jgrg suggested that short cuts should be given on menus/mouse-over popups +to remind the user that they exist...edgrif to do this. + + +2/ loutre schema and data representation changes + +We need some way to make sure st3 knows about changes anacode are making. +jgrg and st3 to communicate more over this. + + +3/ Alternative alignment programs + +There has been some discussion about using splice aware alignment programs. +jgrg is waiting for a fix to exonerate to support the new pipeline mustapha +has written. + +edgrif and jgrg both commented that some changes to acedb data structures +would be needed to represent both HSP's that are "joined up" but also +protein matches that start part of the way through a peptide. 
BUT one +possibility would be for zmap to access this data directly from a mysql +database thus sidestepping the need to put it in acedb first. gffv3 will also +be needed to represent this kind of joined up HSP data in a natural and +robust way. + + + +------------------------------------------------------------------------------ +Futures + + +1/ Spell checker + +jla1 reported a problem that free text fields and some fixed text fields +have misspellings (is that a mis-spelling ?) and it would be good to have +some autocorrection facility. The ideal would be to have some widget that +allowed other dictionaries (e.g. science) to be attached to it and could thus +be used as a general text entry tool. + + + +2/ Leo's transcript display + +A discussion about this showed that most of it was not used but some features +such as highlighting all matches that exactly align to an exon would be very +useful. These should be imported to zmap (need to think about how to do this +in terms of short cuts and colour used for highlighting...should it be a mask ?). + + + +3/ Blixem enhancements + +two areas: + +- display multiple overlapping transcripts better (includes removing the many +yellow lines introduced by this...clarify this point), have a scrolled window +of the transcripts. jgrg said that perhaps only the transcripts made by havana +should be displayed. jla1 said she would like to be able to dynamically update +the transcripts displayed. + +- better interaction with zmap, e.g. click on things in zmap and see them +highlighted in blixem and vice versa.... + +we had better have a more generalised protocol for communicating with external +programs.... + +- blixem: dna searching is NOT DONE, edgrif to expedite. Also protein searches +will be added. + + +4/ Future stuff + +edgrif said he would like annotators to start thinking/reporting two things: + +- repetitive tasks that could be automated. + +- new ways to highlight/select data to help build/annotate transcripts. 
+ + +We will revisit this after the annotation test in the May Havana meeting. + + +5/ acedb server performance + +edgrif investigating two possibilities for improving performance: + + - make sgifaceserver stream data rather than batch it up, would + save a lot of memory. + + - deferred loading, only load features when needed and load in + zone requested by user....design done...now need to implement. + + + +6/ A new canvas + +rds has been looking at alternative canvas implementations which offer an MVC +model. He has managed to get goocanvas developers to fix some bugs and make +some changes to support our needs. + +the goocanvas MVC model will mean we do not have to copy data to split windows +meaning greatly reduced memory usage. + +the goocanvas will cope automatically with the X Windows window size limit, this +combined with changes in the gtk scrolling model means we will be able to do away +with having two scroll bars. + +We will introduce the new canvas this year. + + + + +7/ Comparison of annotation viewers + +jla1 made the excellent suggestion that we could organise a 2 day meeting +of gbrowse, apollo, artemis, zmap/otterlace to compare and contrast. The +aim being to assess the state of the art and pick up tips. + +jla1 said there was money available for this so we should target a date +later this year. + +We should tag a meeting on Genome Informatics which will be in Hinxton +from September 10 - 14. an alternative would be to do it at BOSC. + + + +8/ Sequence exceptions + +kj2 raised the subject of how to indicate sequence exceptions, +e.g. when bases are skipped in translations. kj2 wondered if alternative +translations could be registered as sequence exceptions, edgrif said he +prefer a separate mechanism as much of the code is already done for this. +We should therefore include a mechanism in zmap for sequence exceptions, +this would require a similar mechanism in acedb. 
This is yet another reason +for GFF 3 which has standards for frame shifts and other things. + +There should be a way of tagging transcripts where there are sequence +exceptions. + + + + +------------------------------------------------------------------------------ +Other matters + + + +- jla1 said that the anacode RT queue is becoming unusable because so +many tickets remain unresolved. jgrg agreed and they will meet to clean +up the queue. edgrif commented that it should be possible to import +any useful custom fields from the zmap/acedb queues as necessary. +There is also an issue with tickets going missing, this is being +investigated. + + +- jla1 asked about doing pfam analysis on the fly, Rob Finn was to contact +her and James....as it happened edgrif saw Rob later in the day and he is +on the case. He had an issue with how to produce a meaningful score which +he has pretty much sorted out now. jla1 to email rdf and jgrg. + + +- kj2 asked about embl dumping for zebrafish, jgrg said they were awaiting +answers to questions they had asked about the required header information. +kj2 will ask them what is needed. + + +- st3 asked if there could be a tag on a Locus to say it was Finished, +implemented via a button so that the correct tag(s) were automatically +entered. 
+ + +------------------------------------------------------------------------------ +Next Meeting + +Will be at 2pm, 22nd May 2008 + + +============================================================================== diff --git a/ZMAP_LACE_PROJECT/2008/zmap_lace.2008_05_22 b/ZMAP_LACE_PROJECT/2008/zmap_lace.2008_05_22 new file mode 100755 index 000000000..2183ddbfa --- /dev/null +++ b/ZMAP_LACE_PROJECT/2008/zmap_lace.2008_05_22 @@ -0,0 +1,426 @@ +============================================================================== +ZMap/Otterlace Development + + +Date: Thursday 22nd May 2008 + +Attendees: jla1, jgrg, lw2, st3, edgrif, kj2 + + +------------------------------------------------------------------------------ +Current Issues + + +High priority +------------- + + + +1/ URL display + +edgrif said that some code is needed in zmap to display urls from the feature display +window. + + +2/ Clone summary info/Automating DE line creation / Quality Control + +eah raised the point that certain summary information for clones is +not available when using zmap. e.g. the number of CpG islands. These +are used for the authoring of DE lines. There is a script for +automating this which kj2 wrote for zebrafish, but it's specific to +zebrafish and requires running on the command line. jgrg said this +could be integrated into the clone editing window in lace. jgrg to +action. + +Following on from 7/ jla1 also suggested that it would be good to have +automated QC scripts trawling through the database regularly looking for +duff data. Tina Eyre wrote one that could be co-opted and st3 also has +some. This is becoming an important issue for Havana to ensure really +good quality data. + +jla1 said this needs to be run dynamically as annotators enter their data. +jla1, st3 and jgrg will meet separately to discuss how to impleent this. + +jla1 also said she would like to "best in genome" displayed. jgrg said this +is not easy as Otterlace works on a clone by clone basis. 
It was agreed +that would be worthwhile to show at least "best in clone" or better to do +a crude "best in genome". + + +3/ Builds + +edgrif said that both he/rds and jgrg had spent a lot of time on this and +that it was time to hand it over to systems. edgrif has spoken to Guy Coates +about this and he is keen to pick up the scripts, edgrif has contacted +Tim Cutts who now looks after the Mac stuff. + +We agreed to do new builds just after these meetings from now on so that +there would be almost two weeks of testing prior to the next zlave meeting. +This did not happen this time, we'll try to make it happen this time ! + + +5/ Styles + +Now the annotation test is over we need to get this done. + +jgrg said he was confused by the all the bump options and how to specify them, +edgrif will clarify and also check that the menu ticks for which bumping is +currently active are correct. + + +6/ annotation test + +There seemed to be no show stopping problems with zmap but a number of issues +did come up that need addressing, Adam (af2) is collating and sending in tickets. + + +7/ Locus Finished button + +st3 asked if there could be a tag on a Locus to say it was Finished, +implemented via a button so that the correct tag(s) were automatically +entered. + + + + + +Medium priority +--------------- + + +0/ Interactive reposition of columns + + +1/ Display of otter information ************* + + +- need improvments to transcript/genes display, need to make +better use of tabbed windows. jgrg, lw2, lg4 to coordinate. + + + + +0/ alternative translations: edgrif about half way through code to do this. + +edgrif is doing this as part of the protein search code since this code +does translations itself. edgrif will talk to jgrg about how alternative +genetic codes can be specified with acedb. + +We need a test database for this. jgrg said this would come soon. + +edgrif will add field to transcript feature to hold alternative translation +table. 
+ + +1/ Multiple alignments + +multiple alignments: edgrif is about a third of the way through implementing a +more general way of displaying arbitrary blocks. This will become a high +priority item as we move to haplotypes etc. + +th said this would be needed soon so it should be moved up the priority list. +jgrg said they have mappings in lace that could be passed on to zmap easily +and also said that annotators can already annotate assemblies from variants +and different species alongside each other as needed. + +We need to decide on the format for specifying the alignments. + + +2/ pfetch proxy + +jgrg has provided pseudo code, rds is still testing, when he has finished +he will pass code on to edgrif to incorporate in blixem. + + +3/ Speed + +Marie has said there is still a problem with the speed of zmap when showing long +genes, edgrif to investigate. She has also said that long genes do not display +as well when zoomed out as in fmap. + + +4/ removing evidence already used ************* + +annotators would like to be able to remove from display homologies that +have already been used to annotate variants etc. Does this need to be +persistent in the database in some way ?? edgrif & jgrg will get +together to arrange this via styles so it can persist in a natural way +in the database. + +**24526: Showing which evidence has been used +Differential coloring of matches that have been used already as evidence +for a transcript + +mainly requires jgrg to mark features and then tell zmap to move the features +to a new column or repaint them with a new style. + + +5/ Interface issues: + + +extending marked region: + +rds has implemented first part of this to change marked region, we are +currently working on the recalculate code required (e.g. for rebumping +columns). + +jla1 and lw2 said they would like the marked area to be less obvious an also to +be a "greying" out rather than blue. edgrif to implement. 
+ +jla1 said she would like to be able to click on an exon and see evidence (and +transcripts ?) with the same splice be highlighted. + + +Zooming set up: +It would be very useful for large genes if the evidence, ensembl objects etc. did not +disappear when you zoom out. This happens faster in Zmap than in Fmap, but the havana +objects do not disappear in either case. (MMS) + +edgrif explained that we need the new styles to fix this. + + +clone path: +lw2 would like the full clone extents displayed with the non-golden sections displayed. +Do we need the clone ends information for this, edgrif to check ?? Check with Leo's +smapped example with several sections of a single clone... + + +graph colun: + +edgrif implemented this a long time ago but needs testing again. + + +7/ User configuration + +This came up once more, edgrif has implemented better column control, +but lw2 to check with Havana and report what is required in an RT ticket. + + +8/ 5'and 3' EST read pairs + +we need these to be marked in zmap as in acedb, requires new tags in database in +the same way as in worm database. + +edgrif explained that acedb loses the match strand information which will be +to implement this cleanly. edgrif is changing acedb code so it holds this +information and also dumps it in gff v2 and v3 (it is required for the latter). + +kj2 would also like DITAG information displayed. + + +9/ Dumping features + +kj2 asked if zmap could dump features, edgrif said that it could dump in GFFv2 +format but that some work was needed to dump subsets of features (e.g. dump +all the features from a search results window), this should +be an easy extension. + +edgrif also said that was extending the acedb dumper to dump gffv3 and would +make zmap dump gffv3 too. + + +10/ bug in acedb server + +jgrg raised a bug in the server which was causing it run out of memory, edgrif +to investigate. There is a ticket for this: 51894 + +edgrif to make jgrg has up to date binaries for dotter etc. 
+ + +11/ popups/labels for transcripts + +jla1 said that apollo had a neat way of showing a label for a transcript +that remained in one place on the screen as the window was scrolled. edgrif +to investigate + look at "tool tips" for transcripts....especially with +locus information. + + +12/ Naming of Alternative Alleles + +-st3 asked about naming of alternative alleles in different mouse strains / human +haplotypes. For loci that don't have HGNC/MGI names, these are incorrectly named after +the clones on the reference sequence. jla1 suggested correctly naming them after the +clones they are on, but making sure that the annotators can see the associated +'reference assembly' gene. st3 said this could be done via the alt_allele table, and +if it were done across the board, i.e. including KNOWN genes, then this would make Vega +prep easier. + + +Low priority +------------ + + +1/ jgrg suggested that short cuts should be given on menus/mouse-over popups +to remind the user that they exist...edgrif to do this. Help page needs +checking also. + + +2/ loutre schema and data representation changes + +We need some way to make sure st3 knows about changes anacode are making. +jgrg and st3 to communicate more over this. + + +3/ Alternative alignment programs + +There has been some discussion about using splice aware alignment programs. +jgrg is waiting for a fix to exonerate to support the new pipeline mustapha +has written. + +edgrif and jgrg both commented that some changes to acedb data structures +would be needed to represent both HSPs that are "joined up" and +protein matches that start part of the way through a peptide. BUT one +possibility would be for zmap to access this data directly from a mysql +database thus sidestepping the need to put it in acedb first. gffv3 will also +be needed to represent this kind of joined up HSP data in a natural and +robust way. 
+ + + +------------------------------------------------------------------------------ +Futures + + +1/ Spell checker + +jla1 reported a problem that free text fields and some fixed text fields +have misspellings (is that a mis-spelling ?) and it would be good to have +some autocorrection facility. The ideal would be to have some widget that +allowed other dictionaries (e.g. science) to be attached to it and could thus +be used as a general text entry tool. + + + +2/ Leo's transcript display + +A discussion about this showed that most of it was not used but some features +such as highlighting all matches that exactly align to an exon would be very +useful. These should be imported to zmap (need to think about how to do this +in terms of short cuts and colour used for highlighting...should it be a mask ?). + + + +3/ Blixem enhancements + +two areas: + +- display multiple overlapping transcripts better (includes removing the many +yellow lines introduced by this...clarify this point), have a scrolled window +of the transcripts. jgrg said that perhaps only the transcripts made by havana +should be displayed. jla1 said she would like to be able to dynamically update +the transcripts displayed. + +- better interaction with zmap, e.g. click on things in zmap and see them +highlighted in blixem and vice versa.... + +we had better have a more generalised protocol for communicating with external +programs.... + +- blixem: dna searching is NOT DONE, edgrif to expedite. Also protein searches +will be added. + + +4/ Future stuff + +edgrif said he would like annotators to start thinking/reporting two things: + +- repetitive tasks that could be automated. + +- new ways to highlight/select data to help build/annotate transcripts. + + +We will revisit this after the annotation test in the May Havana meeting. 
+ + +5/ acedb server performance + +edgrif investigating two possibilities for improving performance: + + - make sgifaceserver stream data rather than batch it up, would + save a lot of memory. + + - deferred loading, only load features when needed and load in + zone requested by user....design done...now need to implement. + + + +6/ A new canvas + +rds has been looking at alternative canvas implementations which offer an MVC +model. He has managed to get goocanvas developers to fix some bugs and make +some changes to support our needs. + +the goocanvas MVC model will mean we do not have to copy data to split windows +meaning greatly reduced memory usage. + +the goocanvas will cope automatically with the X Windows window size limit, this +combined with changes in the gtk scrolling model means we will be able to do away +with having two scroll bars. + +We will introduce the new canvas this year. + + + + +7/ Comparison of annotation viewers + +jla1 made the excellent suggestion that we could organise a 2 day meeting +of gbrowse, apollo, artemis, zmap/otterlace to compare and contrast. The +aim being to assess the state of the art and pick up tips. + +jla1 said there was money available for this so we should target a date +later this year. + +We should tag a meeting on Genome Informatics which will be in Hinxton +from September 10 - 14. an alternative would be to do it at BOSC. + + + +8/ Sequence exceptions + +kj2 raised the subject of how to indicate sequence exceptions, +e.g. when bases are skipped in translations. kj2 wondered if alternative +translations could be registered as sequence exceptions, edgrif said he +prefer a separate mechanism as much of the code is already done for this. +We should therefore include a mechanism in zmap for sequence exceptions, +this would require a similar mechanism in acedb. This is yet another reason +for GFF 3 which has standards for frame shifts and other things. 
+ +There should be a way of tagging transcripts where there are sequence +exceptions. + + + + +------------------------------------------------------------------------------ +Other matters + + + +- jla1 said that the anacode RT queue is becoming unusable because so +many tickets remain unresolved. jgrg agreed and they will meet to clean +up the queue. edgrif commented that it should be possible to import +any useful custom fields from the zmap/acedb queues as necessary. +There is also an issue with tickets going missing, this is being +investigated. + + +- jla1 asked about doing pfam analysis on the fly, Rob Finn was to contact +her and James....as it happened edgrif saw Rob later in the day and he is +on the case. He had an issue with how to produce a meaningful score which +he has pretty much sorted out now. jla1 to email rdf and jgrg. + + +- kj2 asked about embl dumping for zebrafish, jgrg said they were awaiting +answers to questions they had asked about the required header information. +kj2 will ask them what is needed. + + + +------------------------------------------------------------------------------ +Next Meeting + +Will be at 2pm, 22nd May 2008 + + +============================================================================== diff --git a/ZMAP_LACE_PROJECT/2008/zmap_lace.2008_06_05 b/ZMAP_LACE_PROJECT/2008/zmap_lace.2008_06_05 new file mode 100755 index 000000000..0028a2c11 --- /dev/null +++ b/ZMAP_LACE_PROJECT/2008/zmap_lace.2008_06_05 @@ -0,0 +1,425 @@ +============================================================================== +ZMap/Otterlace Development + + +Date: Thursday 5th June 2008 + +Attendees: jla1, jgrg, lw2, st3, edgrif, kj2 + + +------------------------------------------------------------------------------ +Current Issues + + +High priority +------------- + +0/ jgrg suggested that short cuts should be given on menus/mouse-over popups +to remind the user that they exist...edgrif to do this and to check that online +docs are up to date. 
+ + +1/ URL display + +edgrif said that some code is needed in zmap to display urls from the feature display +window. Needed for pfam/ensembl entries/display. + + +2/ Clone summary info/Automating DE line creation / Quality Control + +eah raised the point that certain summary information for clones is +not available when using zmap. e.g. the number of CpG islands. These +are used for the authoring of DE lines. There is a script for +automating this which kj2 wrote for zebrafish, but it's specific to +zebrafish and requires running on the command line. jgrg said this +could be integrated into the clone editing window in lace. jgrg to +action. + +Following on from 7/ jla1 also suggested that it would be good to have +automated QC scripts trawling through the database regularly looking for +duff data. Tina Eyre wrote one that could be co-opted and st3 also has +some. This is becoming an important issue for Havana to ensure really +good quality data. + +jla1 said this needs to be run dynamically as annotators enter their data. +jla1, st3 and jgrg will meet separately to discuss how to impleent this. + +jla1 also said she would like to "best in genome" displayed. jgrg said this +is not easy as Otterlace works on a clone by clone basis. It was agreed +that would be worthwhile to show at least "best in clone" or better to do +a crude "best in genome". + + +3/ Builds + +edgrif said that both he/rds and jgrg had spent a lot of time on this and +that it was time to hand it over to systems. edgrif has spoken to Guy Coates +about this and he is keen to pick up the scripts, edgrif has contacted +Tim Cutts who now looks after the Mac stuff. + +We agreed to do new builds just after these meetings from now on so that +there would be almost two weeks of testing prior to the next zlave meeting. +This did not happen this time, we'll try to make it happen this time ! + +edgrif to check both zmap and acedb 64-bit builds to make sure they are complete. 
+ + +4/ Styles + +Now the annotation test is over we need to get this done. + +jgrg said he was confused by the all the bump options and how to specify them, +edgrif will clarify and also check that the menu ticks for which bumping is +currently active are correct. + + +5/ Locus Finished button + +st3 asked if there could be a tag on a Locus to say it was Finished, +implemented via a button so that the correct tag(s) were automatically +entered. jgrg to implement. + + + + + +Medium priority +--------------- + +0/ new column bump to show inconsistent matches + +Often annotator has many matches that fit against an existing transcript, be good +to have a mode that hid these and only showed the ones inconsistent with the +transcripts splices. + + +1/ alternative translations: edgrif about half way through code to do this. + +edgrif is doing this as part of the protein search code since this code +does translations itself. edgrif will talk to jgrg about how alternative +genetic codes can be specified with acedb. + +We need a test database for this. jgrg said this would come soon. + +edgrif will add field to transcript feature to hold alternative translation +table. + + +2/ Locus list + +*sorry I have forgotten what this item was about....!!* + + +3/ Display of multiple compara alignments + +multiple alignments: edgrif is about a third of the way through implementing a +more general way of displaying arbitrary blocks. This will become a high +priority item as we move to haplotypes etc. + +th said this would be needed soon so it should be moved up the priority list. +jgrg said they have mappings in lace that could be passed on to zmap easily +and also said that annotators can already annotate assemblies from variants +and different species alongside each other as needed. + +We need to decide on the format for specifying the alignments. + + +4/ pfetch proxy + +rds has implemented this for zmap, edgrif needs to incorporate in blixem. 
+ + +5/ Speed + +Marie has said there is still a problem with the speed of zmap when showing long +genes, edgrif to investigate. She has also said that long genes do not display +as well when zoomed out as in fmap. + + +6/ removing evidence already used ************* + +annotators would like to be able to remove from display homologies that +have already been used to annotate variants etc. Does this need to be +persistent in the database in some way ?? edgrif & jgrg will get +together to arrange this via styles so it can persist in a natural way +in the database. + +**24526: Showing which evidence has been used +Differential coloring of matches that have been used already as evidence +for a transcript + +mainly requires jgrg to mark features and then tell zmap to move the features +to a new column or repaint them with a new style. + + +7/ Interface issues: + + +extending marked region: + +rds has implemented first part of this to change marked region, we are +currently working on the recalculate code required (e.g. for rebumping +columns). + +jla1 and lw2 said they would like the marked area to be less obvious an also to +be a "greying" out rather than blue. edgrif to implement. + + +jla1 said she would like to be able to click on an exon and see evidence (and +transcripts ?) with the same splice be highlighted. laurens also wants this +as it would often avoid having to open dotter to check. + + +Zooming set up: +It would be very useful for large genes if the evidence, ensembl objects etc. did not +disappear when you zoom out. This happens faster in Zmap than in Fmap, but the havana +objects do not disappear in either case. (MMS) + +edgrif explained that we need the new styles to fix this. + + +Master Region + +need zmap to display what this and information about it...we currently have a sliding window +that shows position etc. but it's not very good. + + +8/ blixem launch + +edgrif to add blixem launch to column menu (only in feature menu currently). 
+ + +9/ clone path: +lw2 would like the full clone extents displayed with the non-golden sections displayed. +Do we need the clone ends information for this, edgrif to check ?? Check with Leo's +smapped example with several sections of a single clone... + + +10/ checking multiple genes + +kj2 would like a way to check positioning of multiple genes, currently would require +multiple lace sessions, requires more discussion. + + + +11/ User configuration + +This came up once more, edgrif has implemented better column control, +but lw2 to check with Havana and report what is required in an RT ticket. + + +12/ 5' and 3' EST read pairs + +we need these to be marked in zmap as in acedb, requires new tags in database in +the same way as in worm database. + +edgrif explained that acedb loses the match strand information which will be +needed to implement this cleanly. edgrif is changing acedb code so it holds this +information and also dumps it in gff v2 and v3 (it is required for the latter). + +kj2 would also like DITAG information displayed. + + +13/ Dumping features + +kj2 asked if zmap could dump features, edgrif said that it could dump in GFFv2 +format but that some work was needed to dump subsets of features (e.g. dump +all the features from a search results window), this should +be an easy extension. + +edgrif also said that he was extending the acedb dumper to dump gffv3 and would +make zmap dump gffv3 too. + + +14/ bug in acedb server + +jgrg raised a bug in the server which was causing it to run out of memory, edgrif +to investigate. There is a ticket for this: 51894 + +edgrif to make sure jgrg has up to date binaries for dotter etc. + + +15/ popups/labels for transcripts + +jla1 said that apollo had a neat way of showing a label for a transcript +that remained in one place on the screen as the window was scrolled. edgrif +to investigate + look at "tool tips" for transcripts....especially with +locus information. 
+ + +16/ Naming of Alternative Alleles + +-st3 asked about naming of alternative alleles in different mouse strains / human +haplotypes. For loci that don't have HGNC/MGI names, these are incorrectly named after +the clones on the reference sequence. jla1 suggested correctly naming them after the +clones they are on, but making sure that the annotators can see the associated +'reference assembly' gene. st3 said this could be done via the alt_allele table, and +if it were done across the board, i.e. including KNOWN genes, then this would make Vega +prep easier. + + +Low priority +------------ + + +1/ loutre schema and data representation changes + +We need some way to make sure st3 knows about changes anacode are making. +jgrg and st3 to communicate more over this. + + +2/ Alternative alignment programs + +There has been some discussion about using splice aware alignment programs. +jgrg is waiting for a fix to exonerate to support the new pipeline mustapha +has written. + +edgrif and jgrg both commented that some changes to acedb data structures +would be needed to represent both HSPs that are "joined up" and +protein matches that start part of the way through a peptide. BUT one +possibility would be for zmap to access this data directly from a mysql +database thus sidestepping the need to put it in acedb first. gffv3 will also +be needed to represent this kind of joined up HSP data in a natural and +robust way. + + + +------------------------------------------------------------------------------ +Futures + + +1/ Spell checker + +jla1 reported a problem that free text fields and some fixed text fields +have misspellings (is that a mis-spelling ?) and it would be good to have +some autocorrection facility. The ideal would be to have some widget that +allowed other dictionaries (e.g. science) to be attached to it and could thus +be used as a general text entry tool. 
+ + + +2/ Blixem enhancements + +two areas: + +- display multiple overlapping transcripts better (includes removing the many +yellow lines introduced by this...clarify this point), have a scrolled window +of the transcripts. jgrg said that perhaps only the transcripts made by havana +should be displayed. jla1 said she would like to be able to dynamically update +the transcripts displayed. + +- better interaction with zmap, e.g. click on things in zmap and see them +highlighted in blixem and vice versa.... + +we had better have a more generalised protocol for communicating with external +programs.... + +- blixem: dna searching is NOT DONE, edgrif to expedite. Also protein searches +will be added. + + +3/ Future stuff + +edgrif said he would like annotators to start thinking/reporting two things: + +- repetitive tasks that could be automated. + +- new ways to highlight/select data to help build/annotate transcripts. + + +We will revisit this after the annotation test in the May Havana meeting. + + +4/ acedb server performance + +edgrif investigating two possibilities for improving performance: + + - make sgifaceserver stream data rather than batch it up, would + save a lot of memory. + + - deferred loading, only load features when needed and load in + zone requested by user....design done...now need to implement. + + + +5/ A new canvas + +rds has been looking at alternative canvas implementations which offer an MVC +model. He has managed to get goocanvas developers to fix some bugs and make +some changes to support our needs. + +the goocanvas MVC model will mean we do not have to copy data to split windows +meaning greatly reduced memory usage. + +the goocanvas will cope automatically with the X Windows window size limit, this +combined with changes in the gtk scrolling model means we will be able to do away +with having two scroll bars. + +We will introduce the new canvas this year. 
+ + + + +6/ Comparison of annotation viewers + +jla1 made the excellent suggestion that we could organise a 2 day meeting +of gbrowse, apollo, artemis, zmap/otterlace to compare and contrast. The +aim being to assess the state of the art and pick up tips. + +jla1 said there was money available for this so we should target a date +later this year. + +We should tag a meeting onto Genome Informatics which will be in Hinxton +from September 10 - 14. An alternative would be to do it at BOSC. + + + +7/ Sequence exceptions + +kj2 raised the subject of how to indicate sequence exceptions, +e.g. when bases are skipped in translations. kj2 wondered if alternative +translations could be registered as sequence exceptions, edgrif said he +would prefer a separate mechanism as much of the code is already done for this. +We should therefore include a mechanism in zmap for sequence exceptions, +this would require a similar mechanism in acedb. This is yet another reason +for GFF 3 which has standards for frame shifts and other things. + +There should be a way of tagging transcripts where there are sequence +exceptions. + + + + +------------------------------------------------------------------------------ +Other matters + + + +- jla1 said that the anacode RT queue is becoming unusable because so +many tickets remain unresolved. jgrg agreed and they will meet to clean +up the queue. edgrif commented that it should be possible to import +any useful custom fields from the zmap/acedb queues as necessary. +There is also an issue with tickets going missing, this is being +investigated. + + + +- there is an issue with old tickets getting lost in RT for zmap & acedb, +edgrif to check these. + + + +- jla1 asked about doing pfam analysis on the fly, Rob Finn was to contact +her and James....as it happened edgrif saw Rob later in the day and he is +on the case. He had an issue with how to produce a meaningful score which +he has pretty much sorted out now. jla1 to email rdf and jgrg.
+ + + +------------------------------------------------------------------------------ +Next Meeting + +Will be at 2pm, 19th June 2008 + + +============================================================================== diff --git a/ZMAP_LACE_PROJECT/2008/zmap_lace.2008_06_26 b/ZMAP_LACE_PROJECT/2008/zmap_lace.2008_06_26 new file mode 100755 index 000000000..7d15dfca8 --- /dev/null +++ b/ZMAP_LACE_PROJECT/2008/zmap_lace.2008_06_26 @@ -0,0 +1,422 @@ +============================================================================== +ZMap/Otterlace Development + + +Date: Thursday 26th June 2008 + +Attendees: jla1, jgrg, lw2, st3, edgrif + + + +------------------------------------------------------------------------------ +Current Issues + + +High priority +------------- + +0/ jgrg suggested that short cuts should be given on menus/mouse-over popups +to remind the user that they exist...edgrif to do this and to check that online +docs are up to date by 15th July for annotation jamboree etc. + + +1/ URL display + +edgrif said that some code is needed in zmap to display urls from the feature display +window. Needed for pfam/ensembl entries/display. + + +2/ Clone summary info/Automating DE line creation / Quality Control + +eah raised the point that certain summary information for clones is +not available when using zmap. e.g. the number of CpG islands. These +are used for the authoring of DE lines. There is a script for +automating this which kj2 wrote for zebrafish, but it's specific to +zebrafish and requires running on the command line. jgrg said this +could be integrated into the clone editing window in lace. jgrg to +action. + +Following on from 7/ jla1 also suggested that it would be good to have +automated QC scripts trawling through the database regularly looking for +duff data. Tina Eyre wrote one that could be co-opted and st3 also has +some. This is becoming an important issue for Havana to ensure really +good quality data. 
+ +jla1 said this needs to be run dynamically as annotators enter their data. +jla1, st3 and jgrg will meet separately to discuss how to implement this. + +jla1 also said she would like to see "best in genome" displayed. jgrg said this +is not easy as Otterlace works on a clone by clone basis. It was agreed +that it would be worthwhile to show at least "best in clone" or better to do +a crude "best in genome". + + +3/ Builds + +edgrif said that both he/rds and jgrg had spent a lot of time on this and +that it was time to hand it over to systems. edgrif has spoken to Guy Coates +about this and he is keen to pick up the scripts, edgrif has contacted +Tim Cutts who now looks after the Mac stuff. + +We agreed to do new builds just after these meetings from now on so that +there would be almost two weeks of testing prior to the next zlave meeting. +This did not happen this time, we'll try to make it happen next time ! + +edgrif to check both zmap and acedb 64-bit builds to make sure they are complete. + + +4/ Styles + +Now the annotation test is over we need to get this done. + +jgrg said he was confused by all the bump options and how to specify them, +edgrif will clarify and also check that the menu ticks for which bumping is +currently active are correct. + + +5/ Locus Finished button + +st3 asked if there could be a tag on a Locus to say it was Finished, +implemented via a button so that the correct tag(s) were automatically +entered. jgrg to implement. + +st3 would also like a "Clone finished" button. + + + + + +Medium priority +--------------- + +0/ new column bump to show inconsistent matches + +Often an annotator has many matches that fit against an existing transcript, be good +to have a mode that hid these and only showed the ones inconsistent with the +transcript's splices. + + +1/ alternative translations: edgrif about half way through code to do this. + +edgrif is doing this as part of the protein search code since this code +does translations itself.
edgrif will talk to jgrg about how alternative +genetic codes can be specified with acedb. + +We need a test database for this. jgrg said this would come soon. + +edgrif will add field to transcript feature to hold alternative translation +table. + + +2/ Locus list + +jgrg to provide a list of loci as another tab window. + searching on ensembl ids. + + +3/ Display of multiple compara alignments + +multiple alignments: edgrif is about a third of the way through implementing a +more general way of displaying arbitrary blocks. This will become a high +priority item as we move to haplotypes etc. + +th said this would be needed soon so it should be moved up the priority list. +jgrg said they have mappings in lace that could be passed on to zmap easily +and also said that annotators can already annotate assemblies from variants +and different species alongside each other as needed. + +We need to decide on the format for specifying the alignments. + + +4/ pfetch proxy + +rds has implemented this for zmap, but we need to check if internal users can +still go direct to pfetch. edgrif has almost finished this for blixem where +it is possible to choose between direct to pfetch or http proxy. + + +5/ removing evidence already used ************* + +annotators would like to be able to remove from display homologies that +have already been used to annotate variants etc. Does this need to be +persistent in the database in some way ?? edgrif & jgrg will get +together to arrange this via styles so it can persist in a natural way +in the database. + +**24526: Showing which evidence has been used +Differential coloring of matches that have been used already as evidence +for a transcript + +mainly requires jgrg to mark features and then tell zmap to move the features +to a new column or repaint them with a new style. + + +6/ Interface issues: + + +jla1 and lw2 said they would like the marked area to be less obvious and also to +be a "greying" out rather than blue. edgrif to implement.
+ + +jla1 said she would like to be able to click on an exon and see evidence (and +transcripts ?) with the same splice highlighted. laurens also wants this +as it would often avoid having to open dotter to check. + + +Zooming set up: +It would be very useful for large genes if the evidence, ensembl objects etc. did not +disappear when you zoom out. This happens faster in Zmap than in Fmap, but the havana +objects do not disappear in either case. (MMS) + +edgrif explained that we need the new styles to fix this. + + +Master Region + +need zmap to display what this is and information about it...we currently have a sliding window +that shows position etc. but it's not very good. + + +8/ blixem launch + +edgrif to add blixem launch to column menu (only in feature menu currently). + + +9/ clone path: +lw2 would like the full clone extents displayed with the non-golden sections displayed. +Do we need the clone ends information for this, edgrif to check ?? Check with Leo's +smapped example with several sections of a single clone... + + +10/ checking multiple genes + +kj2 would like a way to check positioning of multiple genes, currently would require +multiple lace sessions, requires more discussion. + + + +11/ User configuration + +This came up once more, edgrif has implemented better column control, +but lw2 to check with Havana and report what is required in an RT ticket. + + +12/ 5' and 3' EST read pairs + +we need these to be marked in zmap as in acedb, requires new tags in database in +the same way as in worm database. + +edgrif explained that acedb loses the match strand information which will be +needed to implement this cleanly. edgrif is changing acedb code so it holds this +information and also dumps it in gff v2 and v3 (it is required for the latter). + +kj2 would also like DITAG information displayed.
+ + +13/ Dumping features + +kj2 asked if zmap could dump features, edgrif said that it could dump in GFFv2 +format but that some work was needed to dump subsets of features (e.g. dump +all the features from a search results window), this should +be an easy extension. + +edgrif also said that he was extending the acedb dumper to dump gffv3 and would +make zmap dump gffv3 too. + + +14/ bug in acedb server + +jgrg raised a bug in the server which was causing it to run out of memory, edgrif +to investigate. There is a ticket for this: 51894 + +edgrif to make sure jgrg has up to date binaries for dotter etc. + + +15/ popups/labels for transcripts + +jla1 said that apollo had a neat way of showing a label for a transcript +that remained in one place on the screen as the window was scrolled. edgrif +to investigate + look at "tool tips" for transcripts....especially with +locus information. + + +16/ Naming of Alternative Alleles + +-st3 asked about naming of alternative alleles in different mouse strains / human +haplotypes. For loci that don't have HGNC/MGI names, these are incorrectly named after +the clones on the reference sequence. jla1 suggested correctly naming them after the +clones they are on, but making sure that the annotators can see the associated +'reference assembly' gene. st3 said this could be done via the alt_allele table, and +if it were done across the board, i.e. including KNOWN genes, then this would make Vega +prep easier. + + +Low priority +------------ + + +1/ loutre schema and data representation changes + +We need some way to make sure st3 knows about changes anacode are making. +jgrg and st3 to communicate more over this. + + +2/ Alternative alignment programs + +There has been some discussion about using splice aware alignment programs. +jgrg is waiting for a fix to exonerate to support the new pipeline mustapha +has written.
+ +edgrif and jgrg both commented that some changes to acedb data structures +would be needed to represent both HSP's that are "joined up" but also +protein matches that start part of the way through a peptide. BUT one +possibility would be for zmap to access this data directly from a mysql +database thus sidestepping the need to put it in acedb first. gffv3 will also +be needed to represent this kind of joined up HSP data in a natural and +robust way. + +Changes will also be required to represent codons that are spliced across +introns as perhaps surprisingly none of the acedb programs can cope with +this currently (and neither can zmap). + + +------------------------------------------------------------------------------ +Futures + + +1/ Spell checker + +jla1 reported a problem that free text fields and some fixed text fields +have misspellings (is that a mis-spelling ?) and it would be good to have +some autocorrection facility. The ideal would be to have some widget that +allowed other dictionaries (e.g. science) to be attached to it and could thus +be used as a general text entry tool. + + + +2/ Blixem enhancements + +two areas: + +- display multiple overlapping transcripts better (includes removing the many +yellow lines introduced by this...clarify this point), have a scrolled window +of the transcripts. jgrg said that perhaps only the transcripts made by havana +should be displayed. jla1 said she would like to be able to dynamically update +the transcripts displayed. + +- better interaction with zmap, e.g. click on things in zmap and see them +highlighted in blixem and vice versa.... + +we had better have a more generalised protocol for communicating with external +programs.... + +- blixem: dna searching is NOT DONE, edgrif to expedite. Also protein searches +will be added. + + +3/ Future stuff + +edgrif said he would like annotators to start thinking/reporting two things: + +- repetitive tasks that could be automated. 
+ +- new ways to highlight/select data to help build/annotate transcripts. + + +We will revisit this after the annotation test in the May Havana meeting. + + +4/ acedb server performance + +edgrif investigating two possibilities for improving performance: + + - make sgifaceserver stream data rather than batch it up, would + save a lot of memory. + + - deferred loading, only load features when needed and load in + zone requested by user....design done...now need to implement. + + + +5/ A new canvas + +rds has been looking at alternative canvas implementations which offer an MVC +model. He has managed to get goocanvas developers to fix some bugs and make +some changes to support our needs. + +the goocanvas MVC model will mean we do not have to copy data to split windows +meaning greatly reduced memory usage. + +the goocanvas will cope automatically with the X Windows window size limit, this +combined with changes in the gtk scrolling model means we will be able to do away +with having two scroll bars. + +We will introduce the new canvas this year. + + + + +6/ Comparison of annotation viewers + +jla1 made the excellent suggestion that we could organise a 2 day meeting +of gbrowse, apollo, artemis, zmap/otterlace to compare and contrast. The +aim being to assess the state of the art and pick up tips. + +jla1 said there was money available for this so we should target a date +later this year. + +We should tag a meeting onto Genome Informatics which will be in Hinxton +from September 10 - 14. An alternative would be to do it at BOSC. + + + +7/ Sequence exceptions + +kj2 raised the subject of how to indicate sequence exceptions, +e.g. when bases are skipped in translations. kj2 wondered if alternative +translations could be registered as sequence exceptions, edgrif said he +would prefer a separate mechanism as much of the code is already done for this. +We should therefore include a mechanism in zmap for sequence exceptions, +this would require a similar mechanism in acedb.
This is yet another reason +for GFF 3 which has standards for frame shifts and other things. + +There should be a way of tagging transcripts where there are sequence +exceptions. + + + + +------------------------------------------------------------------------------ +Other matters + + + +- jla1 said that the anacode RT queue is becoming unusable because so +many tickets remain unresolved. jgrg agreed and they will meet to clean +up the queue. edgrif commented that it should be possible to import +any useful custom fields from the zmap/acedb queues as necessary. +There is also an issue with tickets going missing, this is being +investigated. + + + +- there is an issue with old tickets getting lost in RT for zmap & acedb, +edgrif to check these. + + + +- jla1 asked about doing pfam analysis on the fly, Rob Finn was to contact +her and James....as it happened edgrif saw Rob later in the day and he is +on the case. He had an issue with how to produce a meaningful score which +he has pretty much sorted out now. jla1 to email rdf and jgrg. + + +- kj2 and jla1 would like to get Solexa reads into pipeline. 
+ + +------------------------------------------------------------------------------ +Next Meeting + +Will be at 2pm, 10th July 2008 + + +============================================================================== diff --git a/ZMAP_LACE_PROJECT/2008/zmap_lace.2008_07_10 b/ZMAP_LACE_PROJECT/2008/zmap_lace.2008_07_10 new file mode 100755 index 000000000..16e1bce13 --- /dev/null +++ b/ZMAP_LACE_PROJECT/2008/zmap_lace.2008_07_10 @@ -0,0 +1,421 @@ +============================================================================== +ZMap/Otterlace Development + + +Date: Thursday 10th July 2008 + +Attendees: jla1, jgrg, lw2, st3, br2, edgrif + + + +------------------------------------------------------------------------------ +Current Issues + + +High priority +------------- + +0/ jgrg suggested that short cuts should be given on menus/mouse-over popups +to remind the user that they exist... + + +1/ URL display + +edgrif said that some code is needed in zmap to display urls from the feature display +window. Needed for pfam/ensembl entries/display. + + +2/ Clone summary info/Automating DE line creation / Quality Control + +eah raised the point that certain summary information for clones is +not available when using zmap. e.g. the number of CpG islands. These +are used for the authoring of DE lines. There is a script for +automating this which kj2 wrote for zebrafish, but it's specific to +zebrafish and requires running on the command line. jgrg said this +could be integrated into the clone editing window in lace. jgrg to +action. + +Following on from 7/ jla1 also suggested that it would be good to have +automated QC scripts trawling through the database regularly looking for +duff data. Tina Eyre wrote one that could be co-opted and st3 also has +some. This is becoming an important issue for Havana to ensure really +good quality data. + +jla1 said this needs to be run dynamically as annotators enter their data. 
+jla1, st3 and jgrg will meet separately to discuss how to implement this. + +jla1 also said she would like to see "best in genome" displayed. jgrg said this +is not easy as Otterlace works on a clone by clone basis. It was agreed +that it would be worthwhile to show at least "best in clone" or better to do +a crude "best in genome". + + +3/ Builds + +edgrif said that both he/rds and jgrg had spent a lot of time on this and +that it was time to hand it over to systems. edgrif has spoken to Guy Coates +about this and he is keen to pick up the scripts, edgrif has contacted +Tim Cutts who now looks after the Mac stuff. edgrif is working on this at home +to test the script out on his home machine. + +We agreed to do new builds just after these meetings from now on so that +there would be almost two weeks of testing prior to the next zlave meeting. +This did not happen this time, we'll try to make it happen next time ! + +edgrif to check both zmap and acedb 64-bit builds to make sure they are complete. + + +4/ Styles + +Now the annotation test is over we need to get this done, edgrif and jgrg to +attempt this week beginning 21/7/2008. + +jgrg said he was confused by all the bump options and how to specify them, +edgrif will clarify and also check that the menu ticks for which bumping is +currently active are correct. + + +5/ Locus Finished button + +st3 asked if there could be a tag on a Locus to say it was Finished, +implemented via a button so that the correct tag(s) were automatically +entered. jgrg to implement. + +st3 would also like a "Clone finished" button. + + +6/ kj2 and jla1 would like to get Solexa reads into pipeline and Jen would +like data from the comparacon ensembl track. + + +Medium priority +--------------- + +0/ new column bump to show inconsistent matches + +Often an annotator has many matches that fit against an existing transcript, be good +to have a mode that hid these and only showed the ones inconsistent with the +transcript's splices.
+ + +1/ alternative translations: edgrif about half way through code to do this. + +edgrif is doing this as part of the protein search code since this code +does translations itself. edgrif will talk to jgrg about how alternative +genetic codes can be specified with acedb. + +We need a test database for this. jgrg said this would come soon. + +edgrif will add field to transcript feature to hold alternative translation +table. + + +2/ Locus list + +jgrg to provide a list of loci as another tab window. + searching on ensembl ids. + + +3/ Display of multiple compara alignments + +multiple alignments: edgrif is about a third of the way through implementing a +more general way of displaying arbitrary blocks. This will become a high +priority item as we move to haplotypes etc. + +th said this would be needed soon so it should be moved up the priority list. +jgrg said they have mappings in lace that could be passed on to zmap easily +and also said that annotators can already annotate assemblies from variants +and different species alongside each other as needed. + +We need to decide on the format for specifying the alignments. + + +4/ removing evidence already used ************* + +annotators would like to be able to remove from display homologies that +have already been used to annotate variants etc. Does this need to be +persistent in the database in some way ?? edgrif & jgrg will get +together to arrange this via styles so it can persist in a natural way +in the database. + +**24526: Showing which evidence has been used +Differential coloring of matches that have been used already as evidence +for a transcript + +mainly requires jgrg to mark features and then tell zmap to move the features +to a new column or repaint them with a new style. + + +6/ Interface issues: + + +jla1 and lw2 said they would like the marked area to be less obvious an also to +be a "greying" out rather than blue and with less dense dots. edgrif to implement. 
+ + +jla1 said she would like to be able to click on an exon and see evidence (and +transcripts ?) with the same splice highlighted. laurens also wants this +as it would often avoid having to open dotter to check. + + +Zooming set up: +It would be very useful for large genes if the evidence, ensembl objects etc. did not +disappear when you zoom out. This happens faster in Zmap than in Fmap, but the havana +objects do not disappear in either case. (MMS) + +edgrif explained that we need the new styles to fix this. + + +Master Region + +need zmap to display what this is and information about it...we currently have a sliding window +that shows position etc. but it's not very good. + + +8/ blixem launch + +edgrif to add blixem launch to column menu (only in feature menu currently). + + +9/ clone path: +lw2 would like the full clone extents displayed with the non-golden sections displayed. +Do we need the clone ends information for this, edgrif to check ?? Check with Leo's +smapped example with several sections of a single clone... + +Would like this info. in navigator panel + navigator panel needs to display both +the foocanvas scrolled window area _and_ the actual area on the screen, and both +should be draggable... + + +10/ checking multiple genes + +kj2 would like a way to check positioning of multiple genes, currently would require +multiple lace sessions, requires more discussion. + + + +11/ User configuration + +This came up once more, edgrif has implemented better column control, +but lw2 to check with Havana and report what is required in an RT ticket. + + +12/ 5' and 3' EST read pairs + +we need these to be marked in zmap as in acedb, requires new tags in database in +the same way as in worm database. + +edgrif explained that acedb loses the match strand information which will be +needed to implement this cleanly. edgrif is changing acedb code so it holds this +information and also dumps it in gff v2 and v3 (it is required for the latter).
+ +kj2 would also like DITAG information displayed. + + +13/ Dumping features + +kj2 asked if zmap could dump features, edgrif said that it could dump in GFFv2 +format but that some work was needed to dump subsets of features (e.g. dump +all the features from a search results window), this should +be an easy extension. + +edgrif also said that he was extending the acedb dumper to dump gffv3 and would +make zmap dump gffv3 too. + + +14/ bug in acedb server + +jgrg raised a bug in the server which was causing it to run out of memory, edgrif +to investigate. There is a ticket for this: 51894 + +edgrif to make sure jgrg has up to date binaries for dotter etc. + + +15/ popups/labels for transcripts + +jla1 said that apollo had a neat way of showing a label for a transcript +that remained in one place on the screen as the window was scrolled. edgrif +to investigate + look at "tool tips" for transcripts....especially with +locus information. + + +16/ Naming of Alternative Alleles + +-st3 asked about naming of alternative alleles in different mouse strains / human +haplotypes. For loci that don't have HGNC/MGI names, these are incorrectly named after +the clones on the reference sequence. jla1 suggested correctly naming them after the +clones they are on, but making sure that the annotators can see the associated +'reference assembly' gene. st3 said this could be done via the alt_allele table, and +if it were done across the board, i.e. including KNOWN genes, then this would make Vega +prep easier. + + +Low priority +------------ + + +1/ loutre schema and data representation changes + +We need some way to make sure st3 knows about changes anacode are making. +jgrg and st3 to communicate more over this. + + +2/ Alternative alignment programs + +There has been some discussion about using splice aware alignment programs. +jgrg is waiting for a fix to exonerate to support the new pipeline mustapha +has written.
+ +edgrif and jgrg both commented that some changes to acedb data structures +would be needed to represent both HSP's that are "joined up" but also +protein matches that start part of the way through a peptide. BUT one +possibility would be for zmap to access this data directly from a mysql +database thus sidestepping the need to put it in acedb first. gffv3 will also +be needed to represent this kind of joined up HSP data in a natural and +robust way. + +Changes will also be required to represent codons that are spliced across +introns as perhaps surprisingly none of the acedb programs can cope with +this currently (and neither can zmap). + + +------------------------------------------------------------------------------ +Futures + + +1/ Spell checker + +jla1 reported a problem that free text fields and some fixed text fields +have misspellings (is that a mis-spelling ?) and it would be good to have +some autocorrection facility. The ideal would be to have some widget that +allowed other dictionaries (e.g. science) to be attached to it and could thus +be used as a general text entry tool. + + + +2/ Blixem enhancements + +two areas: + +- display multiple overlapping transcripts better (includes removing the many +yellow lines introduced by this...clarify this point), have a scrolled window +of the transcripts. jgrg said that perhaps only the transcripts made by havana +should be displayed. jla1 said she would like to be able to dynamically update +the transcripts displayed. + +- better interaction with zmap, e.g. click on things in zmap and see them +highlighted in blixem and vice versa.... + +we had better have a more generalised protocol for communicating with external +programs.... + +- blixem: dna searching is NOT DONE, edgrif to expedite. Also protein searches +will be added. + +Perhaps one way to get this done would be employ a good C programmer on a +short contract. 
+ + + +3/ Future stuff + +edgrif said he would like annotators to start thinking/reporting two things: + +- repetitive tasks that could be automated. + +- new ways to highlight/select data to help build/annotate transcripts. + + +We will revisit this after the annotation test in the May Havana meeting. + + +4/ acedb server performance + +edgrif investigating two possibilities for improving performance: + + - make sgifaceserver stream data rather than batch it up, would + save a lot of memory. + + - deferred loading, only load features when needed and load in + zone requested by user....design done...now need to implement. + + + +5/ A new canvas + +rds has been looking at alternative canvas implementations which offer an MVC +model. He has managed to get goocanvas developers to fix some bugs and make +some changes to support our needs. + +the goocanvas MVC model will mean we do not have to copy data to split windows +meaning greatly reduced memory usage. + +the goocanvas will cope automatically with the X Windows window size limit, this +combined with changes in the gtk scrolling model means we will be able to do away +with having two scroll bars. + +We will introduce the new canvas this year. + + + + +6/ Comparison of annotation viewers + +jla1 made the excellent suggestion that we could organise a 2 day meeting +of gbrowse, apollo, artemis, zmap/otterlace to compare and contrast. The +aim being to assess the state of the art and pick up tips. + +jla1 said there was money available for this so we should target a date +later this year. + +We should tag a meeting on Genome Informatics which will be in Hinxton +from September 10 - 14. an alternative would be to do it at BOSC. + + + +7/ Sequence exceptions + +kj2 raised the subject of how to indicate sequence exceptions, +e.g. when bases are skipped in translations. 
kj2 wondered if alternative +translations could be registered as sequence exceptions, edgrif said he would +prefer a separate mechanism as much of the code is already done for this. +We should therefore include a mechanism in zmap for sequence exceptions, +this would require a similar mechanism in acedb. This is yet another reason +for GFF 3 which has standards for frame shifts and other things. + +There should be a way of tagging transcripts where there are sequence +exceptions. + + + + +------------------------------------------------------------------------------ +Other matters + + + +- jla1 said that the anacode RT queue is becoming unusable because so +many tickets remain unresolved. jgrg agreed and they will meet to clean +up the queue. edgrif commented that it should be possible to import +any useful custom fields from the zmap/acedb queues as necessary. +There is also an issue with tickets going missing, this is being +investigated. + +- jla1 asked about doing pfam analysis on the fly, Rob Finn was to contact +her and James....as it happened edgrif saw Rob later in the day and he is +on the case. He had an issue with how to produce a meaningful score which +he has pretty much sorted out now. jgrg will set this up, he has example +code. + +- there was some discussion about Leo leaving the anacode group for the EBI, +this now leaves the annotation software groups (zmap/acedb & anacode) somewhat +understrength.
+ + + +------------------------------------------------------------------------------ +Next Meeting + +Will be at 2pm, 24th July 2008 + + +============================================================================== diff --git a/ZMAP_LACE_PROJECT/2008/zmap_lace.2008_07_24 b/ZMAP_LACE_PROJECT/2008/zmap_lace.2008_07_24 new file mode 100755 index 000000000..540ecd4d5 --- /dev/null +++ b/ZMAP_LACE_PROJECT/2008/zmap_lace.2008_07_24 @@ -0,0 +1,452 @@ +============================================================================== +ZMap/Otterlace Development + + +Date: Thursday 24th July 2008 + +Attendees: jla1, jgrg, lw2, st3, kj2, edgrif + + + +------------------------------------------------------------------------------ +Current Issues + + +High priority +------------- + + +1/ DAS tracks + +jla1 would like some of the DAS tracks currently available to be put into +lace and hence zmap. jgrg said that this is not immediately straight forward +as they don't all say which assembly they are based on but some can be done +fairly soon. e.g. comparacon ? + + +2/ Tick boxed for controlled vocabulary + +jla1 said there is an urgent need to add "tick boxes" to the lace interface to +ensure that certain properties of annotated features can only be chosen from +a controlled vocabulary. + + +3/ jgrg suggested that short cuts should be given on menus/mouse-over popups +to remind the user that they exist... + + +4/ URL display + +edgrif said that some code is needed in zmap to display urls from the feature display +window. Needed for pfam/ensembl entries/display. + + +5/ Clone summary info/Automating DE line creation / Quality Control + +There is a script for automating this which kj2 wrote for zebrafish, +currently it must be run from the command line but jgrg is integrating +into the clone editing window in lace. + +Following on from 7/ jla1 also suggested that it would be good to have +automated QC scripts trawling through the database regularly looking for +duff data. 
Tina Eyre wrote one that could be co-opted and st3 also has +some. This is becoming an important issue for Havana to ensure really +good quality data. + + +6/ Build script + +edgrif said that both he/rds and jgrg had spent a lot of time on this and +that it was time to hand it over to systems. edgrif has spoken to Guy Coates +about this and he is keen to pick up the scripts, edgrif has contacted +Tim Cutts who now looks after the Mac stuff. edgrif is working on this at home +to test the script out on his home machine. rds and edgrif to do final testing. + + +7/ Styles + +edgrif working on this now and will give jgrg a document describing all the +changes. + +jgrg said he was confused by the all the bump options and how to specify them, +edgrif will clarify and also check that the menu ticks for which bumping is +currently active are correct. + + +8/ Locus Finished button + +st3 asked if there could be a tag on a Locus to say it was Finished, +implemented via a button so that the correct tag(s) were automatically +entered. jgrg to implement. + + +9/ Clone finished button + +st3 would like a "Clone finished" button with same function as Locus Finished +button. jgrg to implement. + + +10/ kj2 and jla1 would like to get Solexa reads into pipeline but this is a lot +of data and will require zmap to be able to do dynamic fetches of subranges of +data otherwise we will be swamped by it. rds is to do some design work on the +dynamic loading. Initially we could only load those alignments within a marked +range. + +As an addition to this edgrif and rds will think about how we might give some +kind of "overview" for alignment columns that could show where the aligns are +without drawing them all. + + + + +Medium priority +--------------- + +0/ new column bump to show inconsistent matches + +Often annotator has many matches that fit against an existing transcript, be good +to have a mode that hid these and only showed the ones inconsistent with the +transcripts splices. 
+ + +1/ alternative translations: edgrif about half way through code to do this. + +edgrif is doing this as part of the protein search code since this code +does translations itself. edgrif will talk to jgrg about how alternative +genetic codes can be specified with acedb. + +We need a test database for this. jgrg said this would come soon. + +edgrif will add field to transcript feature to hold alternative translation +table. + + +2/ Locus list + +jgrg to provide a list of loci as another tab window. + searching on ensembl ids. + + +3/ Display of multiple compara alignments + +multiple alignments: edgrif is about a third of the way through implementing a +more general way of displaying arbitrary blocks. This will become a high +priority item as we move to haplotypes etc. + +th said this would be needed soon so it should be moved up the priority list. +jgrg said they have mappings in lace that could be passed on to zmap easily +and also said that annotators can already annotate assemblies from variants +and different species alongside each other as needed. + +We need to decide on the format for specifying the alignments. + + +4/ removing evidence already used ************* + +annotators would like to be able to remove from display homologies that +have already been used to annotate variants etc. Does this need to be +persistent in the database in some way ?? edgrif & jgrg will get +together to arrange this via styles so it can persist in a natural way +in the database. + +**24526: Showing which evidence has been used +Differential coloring of matches that have been used already as evidence +for a transcript + +mainly requires jgrg to mark features and then tell zmap to move the features +to a new column or repaint them with a new style. + + +5/ Interface issues: + + +jla1 and lw2 said they would like the marked area to be less obvious an also to +be a "greying" out rather than blue and with less dense dots. edgrif to implement. 
+ + +jla1 said she would like to be able to click on an exon and see evidence (and +transcripts ?) with the same splice be highlighted. laurens also wants this +as it would often avoid having to open dotter to check. + + +Zooming set up: +It would be very useful for large genes if the evidence, ensembl objects etc. did not +disappear when you zoom out. This happens faster in Zmap than in Fmap, but the havana +objects do not disappear in either case. (MMS) + +edgrif explained that we need the new styles to fix this. + + +Master Region + +need zmap to display what this is and information about it...we currently have a sliding window +that shows position etc. but it's not very good. + + +6/ blixem launch + +edgrif to add blixem launch to column menu (only in feature menu currently). + + +7/ clone path: +lw2 would like the full clone extents displayed with the non-golden sections displayed. +Do we need the clone ends information for this, edgrif to check ?? Check with Leo's +smapped example with several sections of a single clone... + +Would like this info. in navigator panel + navigator panel needs to display both +the foocanvas scrolled window area _and_ the actual area on the screen, and both +should be draggable... + +rds will work on this. + + + +8/ checking multiple genes + +kj2 would like a way to check positioning of multiple genes, currently would require +multiple lace sessions, requires more discussion. + + + +9/ User configuration + +This came up once more, edgrif has implemented better column control, +but lw2 to check with Havana and report what is required in an RT ticket. + +kj2 would like to be able to specify initial columns, needs jgrg to modify +styles and all will be well. + + +10/ 5' and 3' EST read pairs + +we need these to be marked in zmap as in acedb, requires new tags in database in +the same way as in worm database. + +edgrif explained that acedb loses the match strand information which will be +needed to implement this cleanly.
edgrif is changing acedb code so it holds this +information and also dumps it in gff v2 and v3 (it is required for the latter). + +kj2 would also like DITAG information displayed. + +We can use worm tags but adjust to be more generic, e.g. "Read_pairs" + + + +11/ Dumping features + +kj2 asked if zmap could dump features, edgrif said that it could dump in GFFv2 +format but that some work was needed to dump subsets of features (e.g. dump +all the features from a search results window), this should +be an easy extension. + +edgrif also said that he was extending the acedb dumper to dump gffv3 and would +make zmap dump gffv3 too. + + +12/ bug in acedb server + +jgrg raised a bug in the server which was causing it to run out of memory, edgrif +to investigate. There is a ticket for this: 51894 + +edgrif to make sure jgrg has up to date binaries for dotter etc. + + +13/ popups/labels for transcripts + +jla1 said that apollo had a neat way of showing a label for a transcript +that remained in one place on the screen as the window was scrolled. edgrif +to investigate + look at "tool tips" for transcripts....especially with +locus information. + + +14/ Naming of Alternative Alleles + +-st3 asked about naming of alternative alleles in different mouse strains / human +haplotypes. For loci that don't have HGNC/MGI names, these are incorrectly named after +the clones on the reference sequence. jla1 suggested correctly naming them after the +clones they are on, but making sure that the annotators can see the associated +'reference assembly' gene. st3 said this could be done via the alt_allele table, and +if it were done across the board, ie including KNOWN genes, then this would make Vega +prep easier + + +15/ Best in Genome matches + +jla1 also said she would like "best in genome" displayed. jgrg said this +is not easy as Otterlace works on a clone by clone basis. It was agreed +that it would be worthwhile to show at least "best in clone" or better to do +a crude "best in genome".
+ + + + + +Low priority +------------ + + +1/ loutre schema and data representation changes + +We need some way to make sure st3 knows about changes anacode are making. +jgrg and st3 to communicate more over this. + + +2/ Alternative alignment programs + +There has been some discussion about using splice aware alignment programs. +jgrg is waiting for a fix to exonerate to support the new pipeline mustapha +has written. + +edgrif and jgrg both commented that some changes to acedb data structures +would be needed to represent both HSP's that are "joined up" but also +protein matches that start part of the way through a peptide. BUT one +possibility would be for zmap to access this data directly from a mysql +database thus sidestepping the need to put it in acedb first. gffv3 will also +be needed to represent this kind of joined up HSP data in a natural and +robust way. + +Changes will also be required to represent codons that are spliced across +introns as perhaps surprisingly none of the acedb programs can cope with +this currently (and neither can zmap). + + +------------------------------------------------------------------------------ +Futures + + +1/ Spell checker + +jla1 reported a problem that free text fields and some fixed text fields +have misspellings (is that a mis-spelling ?) and it would be good to have +some autocorrection facility. The ideal would be to have some widget that +allowed other dictionaries (e.g. science) to be attached to it and could thus +be used as a general text entry tool. + + + +2/ Blixem enhancements + +two areas: + +- display multiple overlapping transcripts better (includes removing the many +yellow lines introduced by this...clarify this point), have a scrolled window +of the transcripts. jgrg said that perhaps only the transcripts made by havana +should be displayed. jla1 said she would like to be able to dynamically update +the transcripts displayed. + +- better interaction with zmap, e.g. 
click on things in zmap and see them +highlighted in blixem and vice versa.... + +we had better have a more generalised protocol for communicating with external +programs.... + +- blixem: dna searching is NOT DONE, edgrif to expedite. Also protein searches +will be added. + +Perhaps one way to get this done would be employ a good C programmer on a +short contract. + + + +3/ Future stuff + +edgrif said he would like annotators to start thinking/reporting two things: + +- repetitive tasks that could be automated. + +- new ways to highlight/select data to help build/annotate transcripts. + + +We will revisit this after the annotation test in the May Havana meeting. + + +4/ acedb server performance + +edgrif investigating two possibilities for improving performance: + + - make sgifaceserver stream data rather than batch it up, would + save a lot of memory. + + - deferred loading, only load features when needed and load in + zone requested by user....design done...now need to implement. + + + +5/ A new canvas + +rds has been looking at alternative canvas implementations which offer an MVC +model. He has managed to get goocanvas developers to fix some bugs and make +some changes to support our needs. + +the goocanvas MVC model will mean we do not have to copy data to split windows +meaning greatly reduced memory usage. + +the goocanvas will cope automatically with the X Windows window size limit, this +combined with changes in the gtk scrolling model means we will be able to do away +with having two scroll bars. + +We will introduce the new canvas this year. + + + + +6/ Comparison of annotation viewers + +jla1 made the excellent suggestion that we could organise a 2 day meeting +of gbrowse, apollo, artemis, zmap/otterlace to compare and contrast. The +aim being to assess the state of the art and pick up tips. + +jla1 said there was money available for this so we should target a date +later this year. 
+ +We should tag a meeting on Genome Informatics which will be in Hinxton +from September 10 - 14. An alternative would be to do it at BOSC. + + + +7/ Sequence exceptions + +kj2 raised the subject of how to indicate sequence exceptions, +e.g. when bases are skipped in translations. kj2 wondered if alternative +translations could be registered as sequence exceptions, edgrif said he would +prefer a separate mechanism as much of the code is already done for this. +We should therefore include a mechanism in zmap for sequence exceptions, +this would require a similar mechanism in acedb. This is yet another reason +for GFF 3 which has standards for frame shifts and other things. + +There should be a way of tagging transcripts where there are sequence +exceptions. + + + + +------------------------------------------------------------------------------ +Other matters + + + +- jla1 said that the anacode RT queue is becoming unusable because so +many tickets remain unresolved. jgrg agreed and they will meet to clean +up the queue. edgrif commented that it should be possible to import +any useful custom fields from the zmap/acedb queues as necessary. +There is also an issue with tickets going missing, this is being +investigated. + +- jla1 asked about doing pfam analysis on the fly, Rob Finn was to contact +her and James....as it happened edgrif saw Rob later in the day and he is +on the case. He had an issue with how to produce a meaningful score which +he has pretty much sorted out now. jgrg will set this up, he has example +code. + +- there was some discussion about Leo leaving the anacode group for the EBI, +this now leaves the annotation software groups (zmap/acedb & anacode) somewhat +understrength.
+ + + +------------------------------------------------------------------------------ +Next Meeting + +Will be at 2pm, 7th August 2008 + + +============================================================================== diff --git a/ZMAP_LACE_PROJECT/2008/zmap_lace.2008_08_07 b/ZMAP_LACE_PROJECT/2008/zmap_lace.2008_08_07 new file mode 100755 index 000000000..540ecd4d5 --- /dev/null +++ b/ZMAP_LACE_PROJECT/2008/zmap_lace.2008_08_07 @@ -0,0 +1,452 @@ +============================================================================== +ZMap/Otterlace Development + + +Date: Thursday 24th July 2008 + +Attendees: jla1, jgrg, lw2, st3, kj2, edgrif + + + +------------------------------------------------------------------------------ +Current Issues + + +High priority +------------- + + +1/ DAS tracks + +jla1 would like some of the DAS tracks currently available to be put into +lace and hence zmap. jgrg said that this is not immediately straight forward +as they don't all say which assembly they are based on but some can be done +fairly soon. e.g. comparacon ? + + +2/ Tick boxed for controlled vocabulary + +jla1 said there is an urgent need to add "tick boxes" to the lace interface to +ensure that certain properties of annotated features can only be chosen from +a controlled vocabulary. + + +3/ jgrg suggested that short cuts should be given on menus/mouse-over popups +to remind the user that they exist... + + +4/ URL display + +edgrif said that some code is needed in zmap to display urls from the feature display +window. Needed for pfam/ensembl entries/display. + + +5/ Clone summary info/Automating DE line creation / Quality Control + +There is a script for automating this which kj2 wrote for zebrafish, +currently it must be run from the command line but jgrg is integrating +into the clone editing window in lace. + +Following on from 7/ jla1 also suggested that it would be good to have +automated QC scripts trawling through the database regularly looking for +duff data. 
Tina Eyre wrote one that could be co-opted and st3 also has +some. This is becoming an important issue for Havana to ensure really +good quality data. + + +6/ Build script + +edgrif said that both he/rds and jgrg had spent a lot of time on this and +that it was time to hand it over to systems. edgrif has spoken to Guy Coates +about this and he is keen to pick up the scripts, edgrif has contacted +Tim Cutts who now looks after the Mac stuff. edgrif is working on this at home +to test the script out on his home machine. rds and edgrif to do final testing. + + +7/ Styles + +edgrif working on this now and will give jgrg a document describing all the +changes. + +jgrg said he was confused by the all the bump options and how to specify them, +edgrif will clarify and also check that the menu ticks for which bumping is +currently active are correct. + + +8/ Locus Finished button + +st3 asked if there could be a tag on a Locus to say it was Finished, +implemented via a button so that the correct tag(s) were automatically +entered. jgrg to implement. + + +9/ Clone finished button + +st3 would like a "Clone finished" button with same function as Locus Finished +button. jgrg to implement. + + +10/ kj2 and jla1 would like to get Solexa reads into pipeline but this is a lot +of data and will require zmap to be able to do dynamic fetches of subranges of +data otherwise we will be swamped by it. rds is to do some design work on the +dynamic loading. Initially we could only load those alignments within a marked +range. + +As an addition to this edgrif and rds will think about how we might give some +kind of "overview" for alignment columns that could show where the aligns are +without drawing them all. + + + + +Medium priority +--------------- + +0/ new column bump to show inconsistent matches + +Often annotator has many matches that fit against an existing transcript, be good +to have a mode that hid these and only showed the ones inconsistent with the +transcripts splices. 
+ + +1/ alternative translations: edgrif about half way through code to do this. + +edgrif is doing this as part of the protein search code since this code +does translations itself. edgrif will talk to jgrg about how alternative +genetic codes can be specified with acedb. + +We need a test database for this. jgrg said this would come soon. + +edgrif will add field to transcript feature to hold alternative translation +table. + + +2/ Locus list + +jgrg to provide a list of loci as another tab window. + searching on ensembl ids. + + +3/ Display of multiple compara alignments + +multiple alignments: edgrif is about a third of the way through implementing a +more general way of displaying arbitrary blocks. This will become a high +priority item as we move to haplotypes etc. + +th said this would be needed soon so it should be moved up the priority list. +jgrg said they have mappings in lace that could be passed on to zmap easily +and also said that annotators can already annotate assemblies from variants +and different species alongside each other as needed. + +We need to decide on the format for specifying the alignments. + + +4/ removing evidence already used ************* + +annotators would like to be able to remove from display homologies that +have already been used to annotate variants etc. Does this need to be +persistent in the database in some way ?? edgrif & jgrg will get +together to arrange this via styles so it can persist in a natural way +in the database. + +**24526: Showing which evidence has been used +Differential coloring of matches that have been used already as evidence +for a transcript + +mainly requires jgrg to mark features and then tell zmap to move the features +to a new column or repaint them with a new style. + + +5/ Interface issues: + + +jla1 and lw2 said they would like the marked area to be less obvious an also to +be a "greying" out rather than blue and with less dense dots. edgrif to implement. 
+ + +jla1 said she would like to be able to click on an exon and see evidence (and +transcripts ?) with the same splice be highlighted. laurens also wants this +as it would often avoid having to open dotter to check. + + +Zooming set up: +It would be very useful for large genes if the evidence, ensembl objects etc. did not +disappear when you zoom out. This happens faster in Zmap than in Fmap, but the havana +objects do not disappear in either case. (MMS) + +edgrif explained that we need the new styles to fix this. + + +Master Region + +need zmap to display what this is and information about it...we currently have a sliding window +that shows position etc. but it's not very good. + + +6/ blixem launch + +edgrif to add blixem launch to column menu (only in feature menu currently). + + +7/ clone path: +lw2 would like the full clone extents displayed with the non-golden sections displayed. +Do we need the clone ends information for this, edgrif to check ?? Check with Leo's +smapped example with several sections of a single clone... + +Would like this info. in navigator panel + navigator panel needs to display both +the foocanvas scrolled window area _and_ the actual area on the screen, and both +should be draggable... + +rds will work on this. + + + +8/ checking multiple genes + +kj2 would like a way to check positioning of multiple genes, currently would require +multiple lace sessions, requires more discussion. + + + +9/ User configuration + +This came up once more, edgrif has implemented better column control, +but lw2 to check with Havana and report what is required in an RT ticket. + +kj2 would like to be able to specify initial columns, needs jgrg to modify +styles and all will be well. + + +10/ 5' and 3' EST read pairs + +we need these to be marked in zmap as in acedb, requires new tags in database in +the same way as in worm database. + +edgrif explained that acedb loses the match strand information which will be +needed to implement this cleanly.
edgrif is changing acedb code so it holds this +information and also dumps it in gff v2 and v3 (it is required for the latter). + +kj2 would also like DITAG information displayed. + +We can use worm tags but adjust to be more generic, e.g. "Read_pairs" + + + +11/ Dumping features + +kj2 asked if zmap could dump features, edgrif said that it could dump in GFFv2 +format but that some work was needed to dump subsets of features (e.g. dump +all the features from a search results window), this should +be an easy extension. + +edgrif also said that he was extending the acedb dumper to dump gffv3 and would +make zmap dump gffv3 too. + + +12/ bug in acedb server + +jgrg raised a bug in the server which was causing it to run out of memory, edgrif +to investigate. There is a ticket for this: 51894 + +edgrif to make sure jgrg has up to date binaries for dotter etc. + + +13/ popups/labels for transcripts + +jla1 said that apollo had a neat way of showing a label for a transcript +that remained in one place on the screen as the window was scrolled. edgrif +to investigate + look at "tool tips" for transcripts....especially with +locus information. + + +14/ Naming of Alternative Alleles + +-st3 asked about naming of alternative alleles in different mouse strains / human +haplotypes. For loci that don't have HGNC/MGI names, these are incorrectly named after +the clones on the reference sequence. jla1 suggested correctly naming them after the +clones they are on, but making sure that the annotators can see the associated +'reference assembly' gene. st3 said this could be done via the alt_allele table, and +if it were done across the board, ie including KNOWN genes, then this would make Vega +prep easier + + +15/ Best in Genome matches + +jla1 also said she would like "best in genome" displayed. jgrg said this +is not easy as Otterlace works on a clone by clone basis. It was agreed +that it would be worthwhile to show at least "best in clone" or better to do +a crude "best in genome".
+ + + + + +Low priority +------------ + + +1/ loutre schema and data representation changes + +We need some way to make sure st3 knows about changes anacode are making. +jgrg and st3 to communicate more over this. + + +2/ Alternative alignment programs + +There has been some discussion about using splice aware alignment programs. +jgrg is waiting for a fix to exonerate to support the new pipeline mustapha +has written. + +edgrif and jgrg both commented that some changes to acedb data structures +would be needed to represent both HSP's that are "joined up" but also +protein matches that start part of the way through a peptide. BUT one +possibility would be for zmap to access this data directly from a mysql +database thus sidestepping the need to put it in acedb first. gffv3 will also +be needed to represent this kind of joined up HSP data in a natural and +robust way. + +Changes will also be required to represent codons that are spliced across +introns as perhaps surprisingly none of the acedb programs can cope with +this currently (and neither can zmap). + + +------------------------------------------------------------------------------ +Futures + + +1/ Spell checker + +jla1 reported a problem that free text fields and some fixed text fields +have misspellings (is that a mis-spelling ?) and it would be good to have +some autocorrection facility. The ideal would be to have some widget that +allowed other dictionaries (e.g. science) to be attached to it and could thus +be used as a general text entry tool. + + + +2/ Blixem enhancements + +two areas: + +- display multiple overlapping transcripts better (includes removing the many +yellow lines introduced by this...clarify this point), have a scrolled window +of the transcripts. jgrg said that perhaps only the transcripts made by havana +should be displayed. jla1 said she would like to be able to dynamically update +the transcripts displayed. + +- better interaction with zmap, e.g. 
click on things in zmap and see them +highlighted in blixem and vice versa.... + +we had better have a more generalised protocol for communicating with external +programs.... + +- blixem: dna searching is NOT DONE, edgrif to expedite. Also protein searches +will be added. + +Perhaps one way to get this done would be employ a good C programmer on a +short contract. + + + +3/ Future stuff + +edgrif said he would like annotators to start thinking/reporting two things: + +- repetitive tasks that could be automated. + +- new ways to highlight/select data to help build/annotate transcripts. + + +We will revisit this after the annotation test in the May Havana meeting. + + +4/ acedb server performance + +edgrif investigating two possibilities for improving performance: + + - make sgifaceserver stream data rather than batch it up, would + save a lot of memory. + + - deferred loading, only load features when needed and load in + zone requested by user....design done...now need to implement. + + + +5/ A new canvas + +rds has been looking at alternative canvas implementations which offer an MVC +model. He has managed to get goocanvas developers to fix some bugs and make +some changes to support our needs. + +the goocanvas MVC model will mean we do not have to copy data to split windows +meaning greatly reduced memory usage. + +the goocanvas will cope automatically with the X Windows window size limit, this +combined with changes in the gtk scrolling model means we will be able to do away +with having two scroll bars. + +We will introduce the new canvas this year. + + + + +6/ Comparison of annotation viewers + +jla1 made the excellent suggestion that we could organise a 2 day meeting +of gbrowse, apollo, artemis, zmap/otterlace to compare and contrast. The +aim being to assess the state of the art and pick up tips. + +jla1 said there was money available for this so we should target a date +later this year. 
+ +We should tag a meeting on Genome Informatics which will be in Hinxton +from September 10 - 14. An alternative would be to do it at BOSC. + + + +7/ Sequence exceptions + +kj2 raised the subject of how to indicate sequence exceptions, +e.g. when bases are skipped in translations. kj2 wondered if alternative +translations could be registered as sequence exceptions, edgrif said he would +prefer a separate mechanism as much of the code is already done for this. +We should therefore include a mechanism in zmap for sequence exceptions, +this would require a similar mechanism in acedb. This is yet another reason +for GFF 3 which has standards for frame shifts and other things. + +There should be a way of tagging transcripts where there are sequence +exceptions. + + + + +------------------------------------------------------------------------------ +Other matters + + + +- jla1 said that the anacode RT queue is becoming unusable because so +many tickets remain unresolved. jgrg agreed and they will meet to clean +up the queue. edgrif commented that it should be possible to import +any useful custom fields from the zmap/acedb queues as necessary. +There is also an issue with tickets going missing, this is being +investigated. + +- jla1 asked about doing pfam analysis on the fly, Rob Finn was to contact +her and James....as it happened edgrif saw Rob later in the day and he is +on the case. He had an issue with how to produce a meaningful score which +he has pretty much sorted out now. jgrg will set this up, he has example +code. + +- there was some discussion about Leo leaving the anacode group for the EBI, +this now leaves the annotation software groups (zmap/acedb & anacode) somewhat +understrength.
+ + + +------------------------------------------------------------------------------ +Next Meeting + +Will be at 2pm, 7th August 2008 + + +============================================================================== diff --git a/ZMAP_LACE_PROJECT/2008/zmap_lace.2008_09_25 b/ZMAP_LACE_PROJECT/2008/zmap_lace.2008_09_25 new file mode 100755 index 000000000..7d03b58a9 --- /dev/null +++ b/ZMAP_LACE_PROJECT/2008/zmap_lace.2008_09_25 @@ -0,0 +1,450 @@ +============================================================================== +ZMap/Otterlace Development + + +Date: Thursday 25th Sept 2008 + +Attendees: jla1, jgrg, lw2, st3, kj2, edgrif + + + +------------------------------------------------------------------------------ +Current Issues + + +High priority +------------- + + +1/ DAS tracks + +jla1 would like some of the DAS tracks currently available to be put into +lace and hence zmap. jgrg said that this is not immediately straight forward +as they don't all say which assembly they are based on but some can be done +fairly soon. e.g. comparacon ? + + +2/ Tick boxed for controlled vocabulary + +jla1 said there is an urgent need to add "tick boxes" to the lace interface to +ensure that certain properties of annotated features can only be chosen from +a controlled vocabulary. + + +3/ URL display + +edgrif said that some code is needed in zmap to display urls from the feature display +window. Needed for pfam/ensembl entries/display. + + +4/ Clone summary info/Automating DE line creation / Quality Control + +There is a script for automating this which kj2 wrote for zebrafish, +currently it must be run from the command line but jgrg is integrating +into the clone editing window in lace. + +Following on from 7/ jla1 also suggested that it would be good to have +automated QC scripts trawling through the database regularly looking for +duff data. Tina Eyre wrote one that could be co-opted and st3 also has +some. 
This is becoming an important issue for Havana to ensure really +good quality data. + + +7/ Styles + +Doing the builds today that will support this....finally.... + + +8/ Locus Finished button + +st3 asked if there could be a tag on a Locus to say it was Finished, +implemented via a button so that the correct tag(s) were automatically +entered. jgrg to implement. + + +9/ Clone finished button + +st3 would like a "Clone finished" button with same function as Locus Finished +button. jgrg to implement. + + +10/ kj2 and jla1 would like to get Solexa reads into pipeline but this is a lot +of data and will require zmap to be able to do dynamic fetches of subranges of +data otherwise we will be swamped by it. rds is to do some design work on the +dynamic loading. Initially we could only load those alignments within a marked +range. + +As an addition to this edgrif and rds will think about how we might give some +kind of "overview" for alignment columns that could show where the aligns are +without drawing them all. + + + + +Medium priority +--------------- + +0/ new column bump to show inconsistent matches + +Often annotator has many matches that fit against an existing transcript, be good +to have a mode that hid these and only showed the ones inconsistent with the +transcripts splices. + + +1/ alternative translations: edgrif about half way through code to do this. + +edgrif is doing this as part of the protein search code since this code +does translations itself. edgrif will talk to jgrg about how alternative +genetic codes can be specified with acedb. + +We need a test database for this. jgrg said this would come soon. + +edgrif will add field to transcript feature to hold alternative translation +table. + + +3/ jgrg suggested that short cuts should be given on menus/mouse-over popups +to remind the user that they exist... + + + +2/ Locus list + +jgrg to provide a list of loci as another tab window. + searching on ensembl ids. 
+ + +3/ Display of multiple compara alignments + +multiple alignments: edgrif is about a third of the way through implementing a +more general way of displaying arbitrary blocks. This will become a high +priority item as we move to haplotypes etc. + +th said this would be needed soon so it should be moved up the priority list. +jgrg said they have mappings in lace that could be passed on to zmap easily +and also said that annotators can already annotate assemblies from variants +and different species alongside each other as needed. + +We need to decide on the format for specifying the alignments. + + +4/ removing evidence already used ************* + +annotators would like to be able to remove from display homologies that +have already been used to annotate variants etc. Does this need to be +persistent in the database in some way ?? edgrif & jgrg will get +together to arrange this via styles so it can persist in a natural way +in the database. + +**24526: Showing which evidence has been used +Differential coloring of matches that have been used already as evidence +for a transcript + +mainly requires jgrg to mark features and then tell zmap to move the features +to a new column or repaint them with a new style. + + +5/ Interface issues: + + +jla1 and lw2 said they would like the marked area to be less obvious an also to +be a "greying" out rather than blue and with less dense dots. edgrif to implement. + + +jla1 said she would like to be able to click on an exon and see evidence (and +transcripts ?) with the same splice be highlighted. laurens also wants this +as it would often avoid having to open dotter to check. + + +Zooming set up: +It would be very useful for large genes if the evidence, ensembl objects etc. did not +disappear when you zoom out. This happens faster in Zmap than in Fmap, but the havana +objects do not disappear in either case. (MMS) + +edgrif explained that we need the new styles to fix this. 
+ + +Master Region + +need zmap to display what this is and information about it...we currently have a sliding window +that shows position etc. but it's not very good. + + +6/ blixem launch + +edgrif to add blixem launch to column menu (only in feature menu currently). + + +7/ clone path: +lw2 would like the full clone extents displayed with the non-golden sections displayed. +Do we need the clone ends information for this, edgrif to check ?? Check with Leo's +smapped example with several sections of a single clone... + +Would like this info. in navigator panel + navigator panel needs to display both +the foocanvas scrolled window area _and_ the actual area on the screen, and both +should be draggable... + +rds will work on this. + + + +8/ checking multiple genes + +kj2 would like a way to check positioning of multiple genes, currently would require +multiple lace sessions, requires more discussion. + + + +9/ User configuration + +This came up once more, edgrif has implemented better column control, +but lw2 to check with Havana and report what is required in an RT ticket. + +kj2 would like to be able to specify initial columns, needs jgrg to modify +styles and all will be well. + + +10/ 5'and 3' EST read pairs + +we need these to be marked in zmap as in acedb, requires new tags in database in +the same way as in worm database. + +edgrif explained that acedb loses the match strand information which will be +to implement this cleanly. edgrif is changing acedb code so it holds this +information and also dumps it in gff v2 and v3 (it is required for the latter). + +kj2 would also like DITAG information displayed. + +We can use worm tags but adjust to be more generic, e.g. "Read_pairs" + + + +11/ Dumping features + +kj2 asked if zmap could dump features, edgrif said that it could dump in GFFv2 +format but that some work was needed to dump subsets of features (e.g. dump +all the features from a search results window), this should +be an easy extension. 
+ +edgrif also said that he was extending the acedb dumper to dump gffv3 and would +make zmap dump gffv3 too. + + +12/ bug in acedb server + +jgrg raised a bug in the server which was causing it to run out of memory, edgrif +to investigate. There is a ticket for this: 51894 + +edgrif to make sure jgrg has up to date binaries for dotter etc. + + +13/ popups/labels for transcripts + +jla1 said that apollo had a neat way of showing a label for a transcript +that remained in one place on the screen as the window was scrolled. edgrif +to investigate + look at "tool tips" for transcripts....especially with +locus information. + + +14/ Naming of Alternative Alleles + +-st3 asked about naming of alternative alleles in different mouse strains / human +haplotypes. For loci that don't have HGNC/MGI names, these are incorrectly named after +the clones on the reference sequence. jla1 suggested correctly naming them after the +clones they are on, but making sure that the annotators can see the associated +'reference assembly' gene. st3 said this could be done via the alt_allele table, and +if it were done across the board, i.e. including KNOWN genes, then this would make Vega +prep easier. + + +15/ Best in Genome matches + +jla1 also said she would like to see "best in genome" matches displayed. jgrg said this +is not easy as Otterlace works on a clone by clone basis. It was agreed +that it would be worthwhile to show at least "best in clone" or better to do +a crude "best in genome". + + + +5/ Build script + +edgrif said that both he/rds and jgrg had spent a lot of time on this and +that it was time to hand it over to systems. edgrif has spoken to Guy Coates +about this and he is keen to pick up the scripts, edgrif has contacted +Tim Cutts who now looks after the Mac stuff. edgrif is working on this at home +to test the script out on his home machine. rds and edgrif to do final testing.
+ + + + + + +Low priority +------------ + + +1/ loutre schema and data representation changes + +We need some way to make sure st3 knows about changes anacode are making. +jgrg and st3 to communicate more over this. + + +2/ Alternative alignment programs + +There has been some discussion about using splice aware alignment programs. +jgrg is waiting for a fix to exonerate to support the new pipeline mustapha +has written. + +edgrif and jgrg both commented that some changes to acedb data structures +would be needed to represent both HSP's that are "joined up" but also +protein matches that start part of the way through a peptide. BUT one +possibility would be for zmap to access this data directly from a mysql +database thus sidestepping the need to put it in acedb first. gffv3 will also +be needed to represent this kind of joined up HSP data in a natural and +robust way. + +Changes will also be required to represent codons that are spliced across +introns as perhaps surprisingly none of the acedb programs can cope with +this currently (and neither can zmap). + + +------------------------------------------------------------------------------ +Futures + + +1/ Spell checker + +jla1 reported a problem that free text fields and some fixed text fields +have misspellings (is that a mis-spelling ?) and it would be good to have +some autocorrection facility. The ideal would be to have some widget that +allowed other dictionaries (e.g. science) to be attached to it and could thus +be used as a general text entry tool. + + + +2/ Blixem enhancements + +two areas: + +- display multiple overlapping transcripts better (includes removing the many +yellow lines introduced by this...clarify this point), have a scrolled window +of the transcripts. jgrg said that perhaps only the transcripts made by havana +should be displayed. jla1 said she would like to be able to dynamically update +the transcripts displayed. + +- better interaction with zmap, e.g. 
click on things in zmap and see them +highlighted in blixem and vice versa.... + +we had better have a more generalised protocol for communicating with external +programs.... + +- blixem: dna searching is NOT DONE, edgrif to expedite. Also protein searches +will be added. + +Perhaps one way to get this done would be employ a good C programmer on a +short contract. + + + +3/ Future stuff + +edgrif said he would like annotators to start thinking/reporting two things: + +- repetitive tasks that could be automated. + +- new ways to highlight/select data to help build/annotate transcripts. + + +We will revisit this after the annotation test in the May Havana meeting. + + +4/ acedb server performance + +edgrif investigating two possibilities for improving performance: + + - make sgifaceserver stream data rather than batch it up, would + save a lot of memory. + + - deferred loading, only load features when needed and load in + zone requested by user....design done...now need to implement. + + + +5/ A new canvas + +rds has been looking at alternative canvas implementations which offer an MVC +model. He has managed to get goocanvas developers to fix some bugs and make +some changes to support our needs. + +the goocanvas MVC model will mean we do not have to copy data to split windows +meaning greatly reduced memory usage. + +the goocanvas will cope automatically with the X Windows window size limit, this +combined with changes in the gtk scrolling model means we will be able to do away +with having two scroll bars. + +We will introduce the new canvas this year. + + + + +6/ Comparison of annotation viewers + +jla1 made the excellent suggestion that we could organise a 2 day meeting +of gbrowse, apollo, artemis, zmap/otterlace to compare and contrast. The +aim being to assess the state of the art and pick up tips. + +jla1 said there was money available for this so we should target a date +later this year. 
+ +We should tag a meeting onto Genome Informatics which will be in Hinxton +from September 10 - 14. An alternative would be to do it at BOSC. + + + +7/ Sequence exceptions + +kj2 raised the subject of how to indicate sequence exceptions, +e.g. when bases are skipped in translations. kj2 wondered if alternative +translations could be registered as sequence exceptions, edgrif said he would +prefer a separate mechanism as much of the code is already done for this. +We should therefore include a mechanism in zmap for sequence exceptions, +this would require a similar mechanism in acedb. This is yet another reason +for GFF 3 which has standards for frame shifts and other things. + +There should be a way of tagging transcripts where there are sequence +exceptions. + + + + +------------------------------------------------------------------------------ +Other matters + + + +- jla1 said that the anacode RT queue is becoming unusable because so +many tickets remain unresolved. jgrg agreed and they will meet to clean +up the queue. edgrif commented that it should be possible to import +any useful custom fields from the zmap/acedb queues as necessary. +There is also an issue with tickets going missing, this is being +investigated. + +- jla1 asked about doing pfam analysis on the fly, Rob Finn was to contact +her and James....as it happened edgrif saw Rob later in the day and he is +on the case. He had an issue with how to produce a meaningful score which +he has pretty much sorted out now. jgrg will set this up, he has example +code. + +- there was some discussion about Leo leaving the anacode group for the EBI, +this now leaves the annotation software groups (zmap/acedb & anacode) somewhat +understrength.
+ + + +------------------------------------------------------------------------------ +Next Meeting + +Will be at 2pm, 7th August 2008 + + +============================================================================== diff --git a/ZMAP_LACE_PROJECT/2008/zmap_lace.2008_10_09 b/ZMAP_LACE_PROJECT/2008/zmap_lace.2008_10_09 new file mode 100755 index 000000000..8890227e0 --- /dev/null +++ b/ZMAP_LACE_PROJECT/2008/zmap_lace.2008_10_09 @@ -0,0 +1,417 @@ +============================================================================== +ZMap/Otterlace Development + + +Date: Thursday 9th Oct 2008 + +Attendees: jla1, jgrg, lw2, kj2, edgrif + + +NOTE: We have all found that we are spending too much time in meetings going +through low priority or "in the future" items so we made the decision today +to reformat this document into two main sections: + + - a current items (high/medium priority) list + + - a back-burner section. + +the intention is to spend nearly all the time on the current list and to +promote items from the back-burner section as appropriate. + +Jen also asked about priorities/timings, I will look to see if there is +some easy to use tool that could be used to produce summaries of work to be +done and priorities. edgrif to look for a "timeline" application to help. + + + +------------------------------------------------------------------------------ +CURRENT ITEMS + + +High priority +------------- + +0/ ZMap launch bug + +There is a problem with the button that does "launch in a zmap", rds working +on this now. + + +1/ DAS tracks + +jla1 would like some of the DAS tracks currently available to be put into +lace and hence zmap. jgrg said that this is not immediately straight forward +as they don't all say which assembly they are based on but some can be done +fairly soon. e.g. comparacon ? 
+ + +2/ Tick boxed for controlled vocabulary + +jla1 said there is an urgent need to add "tick boxes" to the lace interface to +ensure that certain properties of annotated features can only be chosen from +a controlled vocabulary. + + +3/ URL display + +edgrif said that some code is needed in zmap to display urls from the feature display +window. Needed for pfam/ensembl entries/display. + + +4/ Clone summary info/Automating DE line creation / Quality Control + +There is a script for automating this which kj2 wrote for zebrafish, +currently it must be run from the command line but jgrg is integrating +into the clone editing window in lace. + +Following on jla1 also suggested that it would be good to have +automated QC scripts trawling through the database regularly looking for +duff data. Tina Eyre wrote one that could be co-opted and st3 also has +some. This is becoming an important issue for Havana to ensure really +good quality data. + + +5/ clicking on a feature in the feature list window should scroll to and +highlight the item but _not_ zoom to it as its annoying. + + +6/ Styles + +Doing the builds today that will support this, then jgrg can do his bit. +edgrif has produced a guide to style usage to help. + + +7/ Locus Finished button + +st3 asked if there could be a tag on a Locus to say it was Finished, +implemented via a button so that the correct tag(s) were automatically +entered. jgrg to implement. + + +8/ Clone finished button + +st3 would like a "Clone finished" button with same function as Locus Finished +button. jgrg to implement. + + +9/ kj2 and jla1 would like to get Solexa reads into pipeline but this is a lot +of data and will require zmap to be able to do dynamic fetches of subranges of +data otherwise we will be swamped by it. rds is to do some design work on the +dynamic loading. Initially we could only load those alignments within a marked +range. 
+ +As an addition to this edgrif and rds will think about how we might give some +kind of "overview" for alignment columns that could show where the aligns are +without drawing them all. + + +10/ Aliass/renaming of Loci + +Disable aliasing after manual locus change as it's creating incorrect aliases. + + +11/ jgrg suggested that short cuts should be given on menus/mouse-over popups +to remind the user that they exist...the help docs and zmap in general needs +to go on the external website. + + +12/ pfetch of full embl entries for variants. + +kj2 is taking this up with Kristian Gray. + + +13/ Pfam on the fly. + +- jla1 asked about doing pfam analysis on the fly, Rob Finn and James have +spoken about this, now being implemented ? + + + + +Medium priority +--------------- + +0/ new column bump to show inconsistent matches + +Often annotator has many matches that fit against an existing transcript, be good +to have a mode that hid these and only showed the ones inconsistent with the +transcripts splices. + + +- removing evidence already used ************* + +annotators would like to be able to remove from display homologies that +have already been used to annotate variants etc. Does this need to be +persistent in the database in some way ?? edgrif & jgrg will get +together to arrange this via styles so it can persist in a natural way +in the database. + +**24526: Showing which evidence has been used +Differential coloring of matches that have been used already as evidence +for a transcript + +mainly requires jgrg to mark features and then tell zmap to move the features +to a new column or repaint them with a new style. + + +1/ Locus list + +jgrg to provide a list of loci as another tab window. + searching on ensembl ids. + + +2/ blixem launch + +edgrif to add blixem launch to column menu (only in feature menu currently). + + +3/ clone path: +lw2 would like the full clone extents displayed with the non-golden sections displayed. 
+Do we need the clone ends information for this, edgrif to check ?? Check with Leo's +smapped example with several sections of a single clone... + +Would like this info. in navigator panel + navigator panel needs to display both +the foocanvas scrolled window area _and_ the actual area on the screen, and both +should be draggable... + +rds will work on this. + + + +4/ checking multiple genes + +kj2 would like a way to check positioning of multiple genes, currently would require +multiple lace sessions, requires more discussion. + + + +5/ User configuration + +This came up once more, edgrif has implemented better column control, +but lw2 to check with Havana and report what is required in an RT ticket. + +kj2 would like to be able to specify initial columns, needs jgrg to modify +styles and all will be well. + +rds has written code based on new Glib config package which gives read/write, +working on "save". + + + +6/ 5'and 3' EST read pairs + +we need these to be marked in zmap as in acedb, requires new tags in database in +the same way as in worm database. + +edgrif explained that acedb loses the match strand information which will be +to implement this cleanly. edgrif is changing acedb code so it holds this +information and also dumps it in gff v2 and v3 (it is required for the latter). + +kj2 would also like DITAG information displayed. + +We can use worm tags but adjust to be more generic, e.g. "Read_pairs" + + + +7/ Dumping features + +kj2 asked if zmap could dump features, edgrif said that it could dump in GFFv2 +format but that some work was needed to dump subsets of features (e.g. dump +all the features from a search results window), this should +be an easy extension. + +edgrif also said that was extending the acedb dumper to dump gffv3 and would +make zmap dump gffv3 too. + + +8/ bug in acedb server + +jgrg raised a bug in the server which was causing it run out of memory, edgrif +to investigate. 
There is a ticket for this: 51894 + +edgrif to make sure jgrg has up to date binaries for dotter etc. + + +9/ popups/labels for transcripts + +jla1 said that apollo had a neat way of showing a label for a transcript +that remained in one place on the screen as the window was scrolled. edgrif +to investigate + look at "tool tips" for transcripts....especially with +locus information. + + +10/ Naming of Alternative Alleles + +-st3 asked about naming of alternative alleles in different mouse strains / human +haplotypes. For loci that don't have HGNC/MGI names, these are incorrectly named after +the clones on the reference sequence. jla1 suggested correctly naming them after the +clones they are on, but making sure that the annotators can see the associated +'reference assembly' gene. st3 said this could be done via the alt_allele table, and +if it were done across the board, i.e. including KNOWN genes, then this would make Vega +prep easier. + + +11/ Best in Genome matches + +jla1 also said she would like to see "best in genome" matches displayed. jgrg said this +is not easy as Otterlace works on a clone by clone basis. It was agreed +that it would be worthwhile to show at least "best in clone" or better to do +a crude "best in genome". + + + + +------------------------------------------------------------------------------ +BACK-BURNER ITEMS + + + +ZMap/acedb +---------- + +1/ Interface issues: + + +jla1 and lw2 said they would like the marked area to be less obvious and also to +be a "greying" out rather than blue and with less dense dots. edgrif to implement. + + +jla1 said she would like to be able to click on an exon and see evidence (and +transcripts ?) with the same splice highlighted. laurens also wants this +as it would often avoid having to open dotter to check. + + + +2/ Display of multiple compara alignments + +multiple alignments: edgrif is about a third of the way through implementing a +more general way of displaying arbitrary blocks.
This will become a high +priority item as we move to haplotypes etc. + +th said this would be needed soon so it should be moved up the priority list. +jgrg said they have mappings in lace that could be passed on to zmap easily +and also said that annotators can already annotate assemblies from variants +and different species alongside each other as needed. + +We need to decide on the format for specifying the alignments. + + + +3/ alternative translations: edgrif about half way through code to do this. + +edgrif is doing this as part of the protein search code since this code +does translations itself. edgrif will talk to jgrg about how alternative +genetic codes can be specified with acedb. + +We need a test database for this. jgrg said this would come soon. + +edgrif will add field to transcript feature to hold alternative translation +table. + + +4/ Blixem enhancements + +two areas: + +- display multiple overlapping transcripts better (includes removing the many +yellow lines introduced by this...clarify this point), have a scrolled window +of the transcripts. jgrg said that perhaps only the transcripts made by havana +should be displayed. jla1 said she would like to be able to dynamically update +the transcripts displayed. + +- better interaction with zmap, e.g. click on things in zmap and see them +highlighted in blixem and vice versa.... + +we had better have a more generalised protocol for communicating with external +programs.... + +- blixem: dna searching is NOT DONE, edgrif to expedite. Also protein searches +will be added. + +Perhaps one way to get this done would be employ a good C programmer on a +short contract. + + +5/ acedb server performance + +edgrif investigating two possibilities for improving performance: + + - make sgifaceserver stream data rather than batch it up, would + save a lot of memory. + + - deferred loading, only load features when needed and load in + zone requested by user....design done...now need to implement. 
+ + +6/ A new canvas + +rds has been looking at alternative canvas implementations which offer an MVC +model. He has managed to get goocanvas developers to fix some bugs and make +some changes to support our needs. + +the goocanvas MVC model will mean we do not have to copy data to split windows +meaning greatly reduced memory usage. + +the goocanvas will cope automatically with the X Windows window size limit, this +combined with changes in the gtk scrolling model means we will be able to do away +with having two scroll bars. + +We will introduce the new canvas this year. + + + + +Otterlace +--------- + +1/ Alternative alignment programs + +There has been some discussion about using splice aware alignment programs. +jgrg is waiting for a fix to exonerate to support the new pipeline mustapha +has written. + +edgrif and jgrg both commented that some changes to acedb data structures +would be needed to represent both HSP's that are "joined up" but also +protein matches that start part of the way through a peptide. BUT one +possibility would be for zmap to access this data directly from a mysql +database thus sidestepping the need to put it in acedb first. gffv3 will also +be needed to represent this kind of joined up HSP data in a natural and +robust way. + +Changes will also be required to represent codons that are spliced across +introns as perhaps surprisingly none of the acedb programs can cope with +this currently (and neither can zmap). + + +2/ Spell checker + +jla1 reported a problem that free text fields and some fixed text fields +have misspellings (is that a mis-spelling ?) and it would be good to have +some autocorrection facility. The ideal would be to have some widget that +allowed other dictionaries (e.g. science) to be attached to it and could thus +be used as a general text entry tool. + + + +3/ Sequence exceptions + +kj2 raised the subject of how to indicate sequence exceptions, +e.g. when bases are skipped in translations. 
kj2 wondered if alternative +translations could be registered as sequence exceptions, edgrif said he +prefer a separate mechanism as much of the code is already done for this. +We should therefore include a mechanism in zmap for sequence exceptions, +this would require a similar mechanism in acedb. This is yet another reason +for GFF 3 which has standards for frame shifts and other things. + +There should be a way of tagging transcripts where there are sequence +exceptions. + + + + +------------------------------------------------------------------------------ +Next Meeting + +Will be at 2pm, 23rd October 2008 + + +============================================================================== diff --git a/ZMAP_LACE_PROJECT/2008/zmap_lace.2008_10_23 b/ZMAP_LACE_PROJECT/2008/zmap_lace.2008_10_23 new file mode 100755 index 000000000..8f9ed4e24 --- /dev/null +++ b/ZMAP_LACE_PROJECT/2008/zmap_lace.2008_10_23 @@ -0,0 +1,377 @@ +============================================================================== +ZMap/Otterlace Development + + +Date: Thursday 23rd Oct 2008 + +Attendees: jgrg, lw2, kj2, edgrif + + +------------------------------------------------------------------------------ +CURRENT ITEMS + + +Jen asked about a tool to do priorities etc, edgrif has found "Planner", a +GNU free tool for priorities, GANT charts etc. edgrif to try out setting up +with data to see if we find it useful. + + +High priority +------------- + +1/ SNP tracks + +jla1 would like some of the DAS tracks currently available to be put into +lace and hence zmap. jgrg said that this is not immediately straight forward +as they don't all say which assembly they are based on but some can be done +fairly soon. e.g. comparacon ? + + +2/ Tick boxed for controlled vocabulary + +jla1 said there is an urgent need to add "tick boxes" to the lace interface to +ensure that certain properties of annotated features can only be chosen from +a controlled vocabulary. 
+ +Locus Finished button + +st3 asked if there could be a tag on a Locus to say it was Finished, +implemented via a button so that the correct tag(s) were automatically +entered. jgrg to implement. + + +st3 would like a "Clone finished" button with same function as Locus Finished +button. jgrg to implement. + + +3/ Clone summary info/Automating DE line creation / Quality Control + +There is a script for automating this which kj2 wrote for zebrafish, +currently it must be run from the command line but jgrg is integrating +into the clone editing window in lace. + +Following on jla1 also suggested that it would be good to have +automated QC scripts trawling through the database regularly looking for +duff data. Tina Eyre wrote one that could be co-opted and st3 also has +some. This is becoming an important issue for Havana to ensure really +good quality data. + + +4/ Styles + +Where are we with this ? + + +5/ kj2 and jla1 would like to get Solexa reads into pipeline but this is a lot +of data and will require zmap to be able to do dynamic fetches of subranges of +data otherwise we will be swamped by it. rds is to do some design work on the +dynamic loading. Initially we could only load those alignments within a marked +range. Mustapha is working on this. + +As an addition to this edgrif and rds will think about how we might give some +kind of "overview" for alignment columns that could show where the aligns are +without drawing them all. + + +6/ Aliass/renaming of Loci + +Disable aliasing after manual locus change as it's creating incorrect aliases. + + +7/ pfetch of full embl entries for variants. + +Roy has talked to Kristian Gray about this, still unresolved. + + +8/ Pfam on the fly. + +- jla1 asked about doing pfam analysis on the fly, Rob Finn and James have +spoken about this, Mustapha is working on this. 
+ + + + +Medium priority +--------------- + +0/ new column bump to show inconsistent matches + +Often an annotator has many matches that fit against an existing transcript, it would be good +to have a mode that hid these and only showed the ones inconsistent with the +transcript's splices. + + +- removing evidence already used ************* + +annotators would like to be able to remove from display homologies that +have already been used to annotate variants etc. Does this need to be +persistent in the database in some way ?? edgrif & jgrg will get +together to arrange this via styles so it can persist in a natural way +in the database. + +**24526: Showing which evidence has been used +Differential coloring of matches that have been used already as evidence +for a transcript + +mainly requires jgrg to mark features and then tell zmap to move the features +to a new column or repaint them with a new style. + + +1/ Locus list + +jgrg to provide a list of loci as another tab window. + searching on ensembl ids. + + +2/ clone path: +lw2 would like the full clone extents displayed with the non-golden sections displayed. +Do we need the clone ends information for this, edgrif to check ?? Check with Leo's +smapped example with several sections of a single clone... + +Would like this info. in navigator panel + navigator panel needs to display both +the foocanvas scrolled window area _and_ the actual area on the screen, and both +should be draggable... + +rds will work on this. + + + +3/ checking multiple genes + +kj2 would like a way to check positioning of multiple genes, currently would require +multiple lace sessions, requires more discussion. + + + +4/ User configuration + +**this is written and being tested now** + +This came up once more, edgrif has implemented better column control, +but lw2 to check with Havana and report what is required in an RT ticket. + +kj2 would like to be able to specify initial columns, needs jgrg to modify +styles and all will be well.
+ +rds has written code based on new Glib config package which gives read/write, +working on "save". + + + +5/ 5' and 3' EST read pairs + +we need these to be marked in zmap as in acedb, requires new tags in database in +the same way as in worm database. + +edgrif explained that acedb loses the match strand information which will be needed +to implement this cleanly. edgrif is changing acedb code so it holds this +information and also dumps it in gff v2 and v3 (it is required for the latter). + +kj2 would also like DITAG information displayed. + +We can use worm tags but adjust to be more generic, e.g. "Read_pairs" + + + +6/ Dumping features + +kj2 asked if zmap could dump features, edgrif said that it could dump in GFFv2 +format but that some work was needed to dump subsets of features (e.g. dump +all the features from a search results window), this should +be an easy extension. + +edgrif also said that he was extending the acedb dumper to dump gffv3 and would +make zmap dump gffv3 too. + + +7/ bug in acedb server + +jgrg raised a bug in the server which was causing it to run out of memory, edgrif +to investigate. There is a ticket for this: 51894 + +edgrif to make sure jgrg has up to date binaries for dotter etc. + + +8/ popups/labels for transcripts + +jla1 said that apollo had a neat way of showing a label for a transcript +that remained in one place on the screen as the window was scrolled. edgrif +to investigate + look at "tool tips" for transcripts....especially with +locus information. + + +9/ Naming of Alternative Alleles + +-st3 asked about naming of alternative alleles in different mouse strains / human +haplotypes. For loci that don't have HGNC/MGI names, these are incorrectly named after +the clones on the reference sequence. jla1 suggested correctly naming them after the +clones they are on, but making sure that the annotators can see the associated +'reference assembly' gene.
st3 said this could be done via the alt_allele table, and +if it were done across the board, i.e. including KNOWN genes, then this would make Vega +prep easier. + + +10/ Best in Genome matches + +jla1 also said she would like to see "best in genome" displayed. jgrg said this +is not easy as Otterlace works on a clone by clone basis. It was agreed +that it would be worthwhile to show at least "best in clone" or better to do +a crude "best in genome". + + + + +------------------------------------------------------------------------------ +BACK-BURNER ITEMS + + + +ZMap/acedb +---------- + +1/ Interface issues: + + +jla1 and lw2 said they would like the marked area to be less obvious and also to +be a "greying" out rather than blue and with less dense dots. edgrif to implement. + + +jla1 said she would like to be able to click on an exon and see evidence (and +transcripts ?) with the same splice highlighted. laurens also wants this +as it would often avoid having to open dotter to check. + + + +2/ Display of multiple compara alignments + +multiple alignments: edgrif is about a third of the way through implementing a +more general way of displaying arbitrary blocks. This will become a high +priority item as we move to haplotypes etc. + +th said this would be needed soon so it should be moved up the priority list. +jgrg said they have mappings in lace that could be passed on to zmap easily +and also said that annotators can already annotate assemblies from variants +and different species alongside each other as needed. + +We need to decide on the format for specifying the alignments. + + + +3/ alternative translations: edgrif about half way through code to do this. + +edgrif is doing this as part of the protein search code since this code +does translations itself. edgrif will talk to jgrg about how alternative +genetic codes can be specified with acedb. + +We need a test database for this. jgrg said this would come soon.
+ +edgrif will add a field to transcript feature to hold alternative translation +table. + + +4/ Blixem enhancements + +two areas: + +- display multiple overlapping transcripts better (includes removing the many +yellow lines introduced by this...clarify this point), have a scrolled window +of the transcripts. jgrg said that perhaps only the transcripts made by havana +should be displayed. jla1 said she would like to be able to dynamically update +the transcripts displayed. + +- better interaction with zmap, e.g. click on things in zmap and see them +highlighted in blixem and vice versa.... + +we had better have a more generalised protocol for communicating with external +programs.... + +- blixem: dna searching is NOT DONE, edgrif to expedite. Also protein searches +will be added. + +Perhaps one way to get this done would be to employ a good C programmer on a +short contract. + + +5/ acedb server performance + +edgrif investigating two possibilities for improving performance: + + - make sgifaceserver stream data rather than batch it up, would + save a lot of memory. + + - deferred loading, only load features when needed and load in + zone requested by user....design done...now need to implement. + + +6/ A new canvas + +rds has been looking at alternative canvas implementations which offer an MVC +model. He has managed to get goocanvas developers to fix some bugs and make +some changes to support our needs. + +the goocanvas MVC model will mean we do not have to copy data to split windows +meaning greatly reduced memory usage. + +the goocanvas will cope automatically with the X Windows window size limit, this +combined with changes in the gtk scrolling model means we will be able to do away +with having two scroll bars. + +We will introduce the new canvas this year. + + + + +Otterlace +--------- + +1/ Alternative alignment programs + +There has been some discussion about using splice aware alignment programs.
+jgrg is waiting for a fix to exonerate to support the new pipeline Mustapha +has written. + +edgrif and jgrg both commented that some changes to acedb data structures +would be needed to represent both HSPs that are "joined up" and also +protein matches that start part of the way through a peptide. BUT one +possibility would be for zmap to access this data directly from a mysql +database thus sidestepping the need to put it in acedb first. gffv3 will also +be needed to represent this kind of joined up HSP data in a natural and +robust way. + +Changes will also be required to represent codons that are spliced across +introns as perhaps surprisingly none of the acedb programs can cope with +this currently (and neither can zmap). + + +2/ Spell checker + +jla1 reported a problem that free text fields and some fixed text fields +have misspellings (is that a mis-spelling ?) and it would be good to have +some autocorrection facility. The ideal would be to have some widget that +allowed other dictionaries (e.g. science) to be attached to it and could thus +be used as a general text entry tool. + + + +3/ Sequence exceptions + +kj2 raised the subject of how to indicate sequence exceptions, +e.g. when bases are skipped in translations. kj2 wondered if alternative +translations could be registered as sequence exceptions, edgrif said he would +prefer a separate mechanism as much of the code is already done for this. +We should therefore include a mechanism in zmap for sequence exceptions, +this would require a similar mechanism in acedb. This is yet another reason +for GFF 3 which has standards for frame shifts and other things. + +There should be a way of tagging transcripts where there are sequence +exceptions.
+ + + + +------------------------------------------------------------------------------ +Next Meeting + +Will be at 2pm, 23rd October 2008 + + +============================================================================== diff --git a/ZMAP_LACE_PROJECT/2008/zmap_lace.2008_11_20 b/ZMAP_LACE_PROJECT/2008/zmap_lace.2008_11_20 new file mode 100755 index 000000000..215d4b7ec --- /dev/null +++ b/ZMAP_LACE_PROJECT/2008/zmap_lace.2008_11_20 @@ -0,0 +1,376 @@ +============================================================================== +ZMap/Otterlace Development + + +Date: Thursday 20th Nov 2008 + +Attendees: jgrg, jla1, lw2, kj2, edgrif + + +------------------------------------------------------------------------------ +CURRENT ITEMS + + +Items Completed +--------------- + + +1/ User configuration + +rds has written code based on new Glib config package which gives read/write, +working on "save". There will need to be additional work as users try this out +and ask for things. + +2/ Dumping features + +kj2 asked if zmap could dump features, edgrif said that it could dump in GFFv2 +format but that some work was needed to dump subsets of features (e.g. dump +all the features from a search results window), this should +be an easy extension. + +gffv2 possible now, gffv3 on the way. + + +3/ pfetch of full embl entries for variants. + +Roy has talked to Kristian Gray about this, still unresolved but Roy has +changed zmap so sequence is returned for all matches and full entry for +original gene but not variants. Kristian is not able to give the ideal +fix just now. + + + + + +High priority +------------- + +1/ SNP tracks + +jla1 would like some of the DAS tracks currently available to be put into +lace and hence zmap. jgrg said that this is not immediately straightforward +as they don't all say which assembly they are based on but some can be done +fairly soon. e.g. comparacon ?
+ + +2/ Tick boxes for controlled vocabulary + +jla1 said there is an urgent need to add "tick boxes" to the lace interface to +ensure that certain properties of annotated features can only be chosen from +a controlled vocabulary. + +Locus Finished button + +st3 asked if there could be a tag on a Locus to say it was Finished, +implemented via a button so that the correct tag(s) were automatically +entered. jgrg to implement. + + +st3 would like a "Clone finished" button with the same function as Locus Finished +button. jgrg to implement. + + +3/ Clone summary info/Automating DE line creation / Quality Control + +There is a script for automating this which kj2 wrote for zebrafish, +currently it must be run from the command line but jgrg is integrating it +into the clone editing window in lace. + +Following on jla1 also suggested that it would be good to have +automated QC scripts trawling through the database regularly looking for +duff data. Tina Eyre wrote one that could be co-opted and st3 also has +some. This is becoming an important issue for Havana to ensure really +good quality data. + + +4/ Styles + +James working to introduce this now, zmap code is all there. + + +5/ kj2 and jla1 would like to get Solexa reads into pipeline but this is a lot +of data and will require zmap to be able to do dynamic fetches of subranges of +data otherwise we will be swamped by it. rds is to do some design work on the +dynamic loading. Initially we could only load those alignments within a marked +range. Mustapha is working on this. + +As an addition to this edgrif and rds will think about how we might give some +kind of "overview" for alignment columns that could show where the aligns are +without drawing them all. + + +6/ Aliases/renaming of Loci + +HUGO old data was overwriting deliberate manual changes to locus by annotators. +Fixed now ? + + +7/ Pfam on the fly.
+ +- jla1 asked about doing pfam analysis on the fly, Rob Finn and James have +spoken about this, Mustapha is working on this and there is a prototype in +test_otterlace. + + + + +Medium priority +--------------- + +0/ new column bump to show inconsistent matches + +Often an annotator has many matches that fit against an existing transcript, it would be good +to have a mode that hid these and only showed the ones inconsistent with the +transcript's splices. + + +- removing evidence already used ************* + +annotators would like to be able to remove from display homologies that +have already been used to annotate variants etc. Does this need to be +persistent in the database in some way ?? edgrif & jgrg will get +together to arrange this via styles so it can persist in a natural way +in the database. + +**24526: Showing which evidence has been used +Differential coloring of matches that have been used already as evidence +for a transcript + +mainly requires jgrg to mark features and then tell zmap to move the features +to a new column or repaint them with a new style. + + +1/ Locus list + +jgrg to provide a list of loci as another tab window. + searching on ensembl ids. + + +2/ clone path: +lw2 would like the full clone extents displayed with the non-golden sections displayed. +Do we need the clone ends information for this, edgrif to check ?? Check with Leo's +smapped example with several sections of a single clone... + +Would like this info. in navigator panel + navigator panel needs to display both +the foocanvas scrolled window area _and_ the actual area on the screen, and both +should be draggable... + +We need to extract this information from the smap information, edgrif to do this part. + +rds will work on this. + + + +3/ checking multiple genes + +kj2 would like a way to check positioning of multiple genes, currently would require +multiple lace sessions, requires more discussion.
+ + + +4/ 5' and 3' EST read pairs + +we need these to be marked in zmap as in acedb, requires new tags in database in +the same way as in worm database. + +edgrif explained that acedb loses the match strand information which will be needed +to implement this cleanly. edgrif is changing acedb code so it holds this +information and also dumps it in gff v2 and v3 (it is required for the latter). + +kj2 would also like DITAG information displayed. + +We can use worm tags but adjust to be more generic, e.g. "Read_pairs" + + + +5/ bug in acedb server + +jgrg raised a bug in the server which was causing it to run out of memory, edgrif +to investigate. There is a ticket for this: 51894 + +edgrif to make sure jgrg has up to date binaries for dotter etc. + + +6/ popups/labels for transcripts + +jla1 said that apollo had a neat way of showing a label for a transcript +that remained in one place on the screen as the window was scrolled. edgrif +to investigate + look at "tool tips" for transcripts....especially with +locus information. + + +7/ Naming of Alternative Alleles + +-st3 asked about naming of alternative alleles in different mouse strains / human +haplotypes. For loci that don't have HGNC/MGI names, these are incorrectly named after +the clones on the reference sequence. jla1 suggested correctly naming them after the +clones they are on, but making sure that the annotators can see the associated +'reference assembly' gene. st3 said this could be done via the alt_allele table, and +if it were done across the board, i.e. including KNOWN genes, then this would make Vega +prep easier. + + +8/ Best in Genome matches + +jla1 also said she would like to see "best in genome" displayed. jgrg said this +is not easy as Otterlace works on a clone by clone basis. It was agreed +that it would be worthwhile to show at least "best in clone" or better to do +a crude "best in genome".
+ + + + +------------------------------------------------------------------------------ +BACK-BURNER ITEMS + + + +ZMap/acedb +---------- + +1/ Interface issues: + + +jla1 and lw2 said they would like the marked area to be less obvious and also to +be a "greying" out rather than blue and with less dense dots. edgrif to implement. + + +jla1 said she would like to be able to click on an exon and see evidence (and +transcripts ?) with the same splice highlighted. laurens also wants this +as it would often avoid having to open dotter to check. + + + +2/ Display of multiple compara alignments + +multiple alignments: edgrif is about a third of the way through implementing a +more general way of displaying arbitrary blocks. This will become a high +priority item as we move to haplotypes etc. + +th said this would be needed soon so it should be moved up the priority list. +jgrg said they have mappings in lace that could be passed on to zmap easily +and also said that annotators can already annotate assemblies from variants +and different species alongside each other as needed. + +We need to decide on the format for specifying the alignments. + + + +3/ alternative translations: edgrif about half way through code to do this. + +edgrif is doing this as part of the protein search code since this code +does translations itself. edgrif will talk to jgrg about how alternative +genetic codes can be specified with acedb. + +We need a test database for this. jgrg said this would come soon. + +edgrif will add a field to transcript feature to hold alternative translation +table. + + +4/ Blixem enhancements + +two areas: + +- display multiple overlapping transcripts better (includes removing the many +yellow lines introduced by this...clarify this point), have a scrolled window +of the transcripts. jgrg said that perhaps only the transcripts made by havana +should be displayed. jla1 said she would like to be able to dynamically update +the transcripts displayed.
+ +- better interaction with zmap, e.g. click on things in zmap and see them +highlighted in blixem and vice versa.... + +we had better have a more generalised protocol for communicating with external +programs.... + +- blixem: dna searching is NOT DONE, edgrif to expedite. Also protein searches +will be added. + +Perhaps one way to get this done would be to employ a good C programmer on a +short contract. + + +5/ acedb server performance + +edgrif investigating two possibilities for improving performance: + + - make sgifaceserver stream data rather than batch it up, would + save a lot of memory. + + - deferred loading, only load features when needed and load in + zone requested by user....design done...now need to implement. + + +6/ A new canvas + +rds has been looking at alternative canvas implementations which offer an MVC +model. He has managed to get goocanvas developers to fix some bugs and make +some changes to support our needs. + +the goocanvas MVC model will mean we do not have to copy data to split windows +meaning greatly reduced memory usage. + +the goocanvas will cope automatically with the X Windows window size limit, this +combined with changes in the gtk scrolling model means we will be able to do away +with having two scroll bars. + +We will introduce the new canvas this year. + + + + +Otterlace +--------- + +1/ Alternative alignment programs + +There has been some discussion about using splice aware alignment programs. +jgrg is waiting for a fix to exonerate to support the new pipeline Mustapha +has written. + +edgrif and jgrg both commented that some changes to acedb data structures +would be needed to represent both HSPs that are "joined up" and also +protein matches that start part of the way through a peptide. BUT one +possibility would be for zmap to access this data directly from a mysql +database thus sidestepping the need to put it in acedb first.
gffv3 will also +be needed to represent this kind of joined up HSP data in a natural and +robust way. + +Changes will also be required to represent codons that are spliced across +introns as perhaps surprisingly none of the acedb programs can cope with +this currently (and neither can zmap). + + +2/ Spell checker + +jla1 reported a problem that free text fields and some fixed text fields +have misspellings (is that a mis-spelling ?) and it would be good to have +some autocorrection facility. The ideal would be to have some widget that +allowed other dictionaries (e.g. science) to be attached to it and could thus +be used as a general text entry tool. + + + +3/ Sequence exceptions + +kj2 raised the subject of how to indicate sequence exceptions, +e.g. when bases are skipped in translations. kj2 wondered if alternative +translations could be registered as sequence exceptions, edgrif said he would +prefer a separate mechanism as much of the code is already done for this. +We should therefore include a mechanism in zmap for sequence exceptions, +this would require a similar mechanism in acedb. This is yet another reason +for GFF 3 which has standards for frame shifts and other things. + +There should be a way of tagging transcripts where there are sequence +exceptions.
+ + + + +------------------------------------------------------------------------------ +Next Meeting + +Will be at 2pm, 23rd October 2008 + + +============================================================================== diff --git a/ZMAP_LACE_PROJECT/2008/zmap_lace.2008_12_04 b/ZMAP_LACE_PROJECT/2008/zmap_lace.2008_12_04 new file mode 100755 index 000000000..9dc2ebbfd --- /dev/null +++ b/ZMAP_LACE_PROJECT/2008/zmap_lace.2008_12_04 @@ -0,0 +1,376 @@ +============================================================================== +ZMap/Otterlace Development + + +Date: Thursday 4th Dec 2008 + +Attendees: jgrg, lw2, edgrif, st3 + + +------------------------------------------------------------------------------ +CURRENT ITEMS + + +Items Completed +--------------- + + +1/ User configuration + +rds has written code based on new Glib config package which gives read/write, +working on "save". There will need to be additional work as users try this out +and ask for things. + +2/ Dumping features + +kj2 asked if zmap could dump features, edgrif said that it could dump in GFFv2 +format but that some work was needed to dump subsets of features (e.g. dump +all the features from a search results window), this should +be an easy extension. + +gffv2 possible now, gffv3 on the way. + + +3/ pfetch of full embl entries for variants. + +Roy has talked to Kristian Gray about this, still unresolved but Roy has +changed zmap so sequence is returned for all matches and full entry for +original gene but not variants. Kristian is not able to give the ideal +fix just now. + + + + + +High priority +------------- + +1/ SNP tracks + +jla1 would like some of the DAS tracks currently available to be put into +lace and hence zmap. jgrg said that this is not immediately straightforward +as they don't all say which assembly they are based on but some can be done +fairly soon. e.g. comparacon ?
+ + +2/ Tick boxes for controlled vocabulary + +jla1 said there is an urgent need to add "tick boxes" to the lace interface to +ensure that certain properties of annotated features can only be chosen from +a controlled vocabulary. + +Locus Finished button + +st3 asked if there could be a tag on a Locus to say it was Finished, +implemented via a button so that the correct tag(s) were automatically +entered. jgrg to implement. + + +st3 would like a "Clone finished" button with the same function as Locus Finished +button. jgrg to implement. + + +3/ Clone summary info/Automating DE line creation / Quality Control + +There is a script for automating this which kj2 wrote for zebrafish, +currently it must be run from the command line but jgrg is integrating it +into the clone editing window in lace. + +Following on jla1 also suggested that it would be good to have +automated QC scripts trawling through the database regularly looking for +duff data. Tina Eyre wrote one that could be co-opted and st3 also has +some. This is becoming an important issue for Havana to ensure really +good quality data. + + +4/ Styles + +James working to introduce this now, zmap code is all there. + + +5/ kj2 and jla1 would like to get Solexa reads into pipeline but this is a lot +of data and will require zmap to be able to do dynamic fetches of subranges of +data otherwise we will be swamped by it. rds is to do some design work on the +dynamic loading. Initially we could only load those alignments within a marked +range. Mustapha is working on this. + +As an addition to this edgrif and rds will think about how we might give some +kind of "overview" for alignment columns that could show where the aligns are +without drawing them all. + + +6/ Aliases/renaming of Loci + +HUGO old data was overwriting deliberate manual changes to locus by annotators. +Fixed now ? + + +7/ Pfam on the fly.
+ +- jla1 asked about doing pfam analysis on the fly, Rob Finn and James have +spoken about this, Mustapha is working on this and there is a prototype in +test_otterlace. + + +8/ clone path: +lw2 would like the full clone extents displayed with the non-golden sections displayed. +Do we need the clone ends information for this, edgrif to check ?? Check with Leo's +smapped example with several sections of a single clone... + +Would like this info. in navigator panel + navigator panel needs to display both +the foocanvas scrolled window area _and_ the actual area on the screen, and both +should be draggable... + +We need to extract this information from the smap information, edgrif to do this part. + +rds will work on this. + + + + +Medium priority +--------------- + +0/ new column bump to show inconsistent matches + +Often an annotator has many matches that fit against an existing transcript, it would be good +to have a mode that hid these and only showed the ones inconsistent with the +transcript's splices. + + +- removing evidence already used ************* + +annotators would like to be able to remove from display homologies that +have already been used to annotate variants etc. Does this need to be +persistent in the database in some way ?? edgrif & jgrg will get +together to arrange this via styles so it can persist in a natural way +in the database. + +**24526: Showing which evidence has been used +Differential coloring of matches that have been used already as evidence +for a transcript + +mainly requires jgrg to mark features and then tell zmap to move the features +to a new column or repaint them with a new style. + + +1/ Locus list + +jgrg to provide a list of loci as another tab window. + searching on ensembl ids. + + + +2/ checking multiple genes + +kj2 would like a way to check positioning of multiple genes, currently would require +multiple lace sessions, requires more discussion.
+ + + +3/ 5' and 3' EST read pairs + +we need these to be marked in zmap as in acedb, requires new tags in database in +the same way as in worm database. + +edgrif explained that acedb loses the match strand information which will be needed +to implement this cleanly. edgrif is changing acedb code so it holds this +information and also dumps it in gff v2 and v3 (it is required for the latter). + +kj2 would also like DITAG information displayed. + +We can use worm tags but adjust to be more generic, e.g. "Read_pairs" + + + +4/ bug in acedb server + +jgrg raised a bug in the server which was causing it to run out of memory, edgrif +to investigate. There is a ticket for this: 51894 + +edgrif to make sure jgrg has up to date binaries for dotter etc. + + +5/ popups/labels for transcripts + +jla1 said that apollo had a neat way of showing a label for a transcript +that remained in one place on the screen as the window was scrolled. edgrif +to investigate + look at "tool tips" for transcripts....especially with +locus information. + + +6/ Naming of Alternative Alleles + +-st3 asked about naming of alternative alleles in different mouse strains / human +haplotypes. For loci that don't have HGNC/MGI names, these are incorrectly named after +the clones on the reference sequence. jla1 suggested correctly naming them after the +clones they are on, but making sure that the annotators can see the associated +'reference assembly' gene. st3 said this could be done via the alt_allele table, and +if it were done across the board, i.e. including KNOWN genes, then this would make Vega +prep easier. + + +7/ Best in Genome matches + +jla1 also said she would like to see "best in genome" displayed. jgrg said this +is not easy as Otterlace works on a clone by clone basis. It was agreed +that it would be worthwhile to show at least "best in clone" or better to do +a crude "best in genome".
+ + + + +------------------------------------------------------------------------------ +BACK-BURNER ITEMS + + + +ZMap/acedb +---------- + +1/ Interface issues: + + +jla1 and lw2 said they would like the marked area to be less obvious and also to +be a "greying" out rather than blue and with less dense dots. edgrif to implement. + + +jla1 said she would like to be able to click on an exon and see evidence (and +transcripts ?) with the same splice highlighted. laurens also wants this +as it would often avoid having to open dotter to check. + + + +2/ Display of multiple compara alignments + +multiple alignments: edgrif is about a third of the way through implementing a +more general way of displaying arbitrary blocks. This will become a high +priority item as we move to haplotypes etc. + +th said this would be needed soon so it should be moved up the priority list. +jgrg said they have mappings in lace that could be passed on to zmap easily +and also said that annotators can already annotate assemblies from variants +and different species alongside each other as needed. + +We need to decide on the format for specifying the alignments. + + + +3/ alternative translations: edgrif about half way through code to do this. + +edgrif is doing this as part of the protein search code since this code +does translations itself. edgrif will talk to jgrg about how alternative +genetic codes can be specified with acedb. + +We need a test database for this. jgrg said this would come soon. + +edgrif will add a field to transcript feature to hold alternative translation +table. + + +4/ Blixem enhancements + +two areas: + +- display multiple overlapping transcripts better (includes removing the many +yellow lines introduced by this...clarify this point), have a scrolled window +of the transcripts. jgrg said that perhaps only the transcripts made by havana +should be displayed. jla1 said she would like to be able to dynamically update +the transcripts displayed.
+ +- better interaction with zmap, e.g. click on things in zmap and see them +highlighted in blixem and vice versa.... + +we had better have a more generalised protocol for communicating with external +programs.... + +- blixem: dna searching is NOT DONE, edgrif to expedite. Also protein searches +will be added. + +Perhaps one way to get this done would be to employ a good C programmer on a +short contract. + + +5/ acedb server performance + +edgrif investigating two possibilities for improving performance: + + - make sgifaceserver stream data rather than batch it up, would + save a lot of memory. + + - deferred loading, only load features when needed and load in + zone requested by user....design done...now need to implement. + + +6/ A new canvas + +rds has been looking at alternative canvas implementations which offer an MVC +model. He has managed to get goocanvas developers to fix some bugs and make +some changes to support our needs. + +the goocanvas MVC model will mean we do not have to copy data to split windows +meaning greatly reduced memory usage. + +the goocanvas will cope automatically with the X Windows window size limit, this +combined with changes in the gtk scrolling model means we will be able to do away +with having two scroll bars. + +We will introduce the new canvas this year. + + + + +Otterlace +--------- + +1/ Alternative alignment programs + +There has been some discussion about using splice aware alignment programs. +jgrg is waiting for a fix to exonerate to support the new pipeline Mustapha +has written. + +edgrif and jgrg both commented that some changes to acedb data structures +would be needed to represent both HSPs that are "joined up" and also +protein matches that start part of the way through a peptide. BUT one +possibility would be for zmap to access this data directly from a mysql +database thus sidestepping the need to put it in acedb first.
gffv3 will also +be needed to represent this kind of joined up HSP data in a natural and +robust way. + +Changes will also be required to represent codons that are spliced across +introns as perhaps surprisingly none of the acedb programs can cope with +this currently (and neither can zmap). + + +2/ Spell checker + +jla1 reported a problem that free text fields and some fixed text fields +have misspellings (is that a mis-spelling ?) and it would be good to have +some autocorrection facility. The ideal would be to have some widget that +allowed other dictionaries (e.g. science) to be attached to it and could thus +be used as a general text entry tool. + + + +3/ Sequence exceptions + +kj2 raised the subject of how to indicate sequence exceptions, +e.g. when bases are skipped in translations. kj2 wondered if alternative +translations could be registered as sequence exceptions, edgrif said he +would prefer a separate mechanism as much of the code is already done for this. +We should therefore include a mechanism in zmap for sequence exceptions, +this would require a similar mechanism in acedb. This is yet another reason +for GFF 3 which has standards for frame shifts and other things. + +There should be a way of tagging transcripts where there are sequence +exceptions. 
+ + + + +------------------------------------------------------------------------------ +Next Meeting + +Will be at 2pm, 23rd October 2008 + + +============================================================================== diff --git a/ZMAP_LACE_PROJECT/2009/zmap_lace.2009_01_15 b/ZMAP_LACE_PROJECT/2009/zmap_lace.2009_01_15 new file mode 100755 index 000000000..873e59bee --- /dev/null +++ b/ZMAP_LACE_PROJECT/2009/zmap_lace.2009_01_15 @@ -0,0 +1,373 @@ +============================================================================== +ZMap/Otterlace Development + + +Date: Thursday 4th Dec 2008 + +Attendees: jgrg, jla1, lw2, kj2, edgrif, st3 + + +------------------------------------------------------------------------------ +CURRENT ITEMS + + +Items Completed +--------------- + + +1/ Dumping features + +kj2 asked if zmap could dump features, edgrif said that it could dump in GFFv2 +format but that some work was needed to dump subsets of features (e.g. dump +all the features from a search results window), this is done via some testing +and more importantly a tidy up of SO terms. + +2/ Pfam on the fly. + +- jla1 asked about doing pfam analysis on the fly, Rob Finn and James have +spoken about this, Mustapha is working on this and there is a prototype in +test_otterlace. DONE...needs testing. + + + + + +High priority +------------- + +1/ Tick boxed for controlled vocabulary + +jla1 said there is an urgent need to add "tick boxes" to the lace interface to +ensure that certain properties of annotated features can only be chosen from +a controlled vocabulary. + +1a/ Locus Finished button + +st3 asked if there could be a tag on a Locus to say it was Finished, +implemented via a button so that the correct tag(s) were automatically +entered. jgrg to implement. + +1b/ Clone Finished button + +st3 would like a "Clone finished" button with same function as Locus Finished +button. jgrg to implement. 
+ + +2/ Clone summary info/Automating DE line creation / Quality Control + +There is a script for automating this which kj2 wrote for zebrafish, +currently it must be run from the command line but jgrg is integrating +into the clone editing window in lace. + +Following on jla1 also suggested that it would be good to have +automated QC scripts trawling through the database regularly looking for +duff data. Tina Eyre wrote one that could be co-opted and st3 also has +some. This is becoming an important issue for Havana to ensure really +good quality data. Add automated checking against SwissProt for CDS. + +jgrg said that much of the checking was done for annotation and he will circulate +an email summarising this. QC for save to data back to Otterlace need doing though. + + +3/ Data for zebra fish DAS tracks needs mapping between assemblies, jgrg +said Mustapha has done this but he is not sure for which assemblies. + + +4/ SNP tracks + +jla1 would like some of the DAS tracks currently available to be put into +lace and hence zmap. jgrg said that this is not immediately straight forward +as they don't all say which assembly they are based on but some can be done +fairly soon. e.g. comparacon ? jgrg to investigate. + + +5/ Styles + +James working to introduce this now, zmap code is all there. + + +6/ Solexa reads + +kj2 and jla1 would like to get Solexa reads into pipeline but this is a lot +of data and will require zmap to be able to do dynamic fetches of subranges of +data otherwise we will be swamped by it. rds is to do some design work on the +dynamic loading. Initially we could only load those alignments within a marked +range. Mustapha is working on this. + +As an addition to this edgrif and rds will think about how we might give some +kind of "overview" for alignment columns that could show where the aligns are +without drawing them all. + +In fact John Collins has initial data for gene models and confirmed introns +that can be added now without code changes. 
+ + + +7/ Alias/renaming of Loci + +HUGO old data was overwriting deliberate manual changes to locus by annotators. +Fixed now ? lw2 to contact MGI as there are problems with IDs from them. + + +8/ clone path + +lw2 would like the full clone extents displayed with the non-golden sections displayed. +Do we need the clone ends information for this, edgrif to check ?? Check with Leo's +smapped example with several sections of a single clone... + +Would like this info. in navigator panel + navigator panel needs to display both +the foocanvas scrolled window area _and_ the actual area on the screen, and both +should be draggable... + +edgrif will do tile path information/display, rds will do navigator bit. + + +9/ multi-view interactions + +kj2 would like a way to check positioning of multiple genes, currently would require +multiple lace sessions, requires more discussion. Possible now ?? + +kj2 would like to click on a feature in one view and see it highlighted in another +so that she can look for genes present in more than one clone. + + + +Medium priority +--------------- + +0/ new column bump to show inconsistent matches + +Often annotator has many matches that fit against an existing transcript, be good +to have a mode that hid these and only showed the ones inconsistent with the +transcript's splices. + + +- removing evidence already used ************* + +annotators would like to be able to remove from display homologies that +have already been used to annotate variants etc. Does this need to be +persistent in the database in some way ?? edgrif & jgrg will get +together to arrange this via styles so it can persist in a natural way +in the database. + +**24526: Showing which evidence has been used +Differential coloring of matches that have been used already as evidence +for a transcript + +mainly requires jgrg to mark features and then tell zmap to move the features +to a new column or repaint them with a new style. 
+ + +1/ Locus list + +jgrg to provide a list of loci as another tab window. + searching on ensembl ids. + + + +2/ 5'and 3' EST read pairs + +we need these to be marked in zmap as in acedb, requires new tags in database in +the same way as in worm database. + +edgrif explained that acedb loses the match strand information which will be +needed to implement this cleanly. edgrif is changing acedb code so it holds this +information and also dumps it in gff v2 and v3 (it is required for the latter). + +kj2 would also like DITAG information displayed. + +We can use worm tags but adjust to be more generic, e.g. "Read_pairs" + + + +3/ bug in acedb server + +jgrg raised a bug in the server which was causing it to run out of memory, edgrif +to investigate. There is a ticket for this: 51894 + +edgrif to make sure jgrg has up to date binaries for dotter etc. + + +4/ popups/labels for transcripts + +jla1 said that apollo had a neat way of showing a label for a transcript +that remained in one place on the screen as the window was scrolled. edgrif +to investigate + look at "tool tips" for transcripts....especially with +locus information. + + +5/ Naming of Alternative Alleles + +-st3 asked about naming of alternative alleles in different mouse strains / human +haplotypes. For loci that don't have HGNC/MGI names, these are incorrectly named after +the clones on the reference sequence. jla1 suggested correctly naming them after the +clones they are on, but making sure that the annotators can see the associated +'reference assembly' gene. st3 said this could be done via the alt_allele table, and +if it were done across the board, ie including KNOWN genes, then this would make Vega +prep easier. + + +6/ Best in Genome matches + +jla1 also said she would like to see "best in genome" displayed. jgrg said this +is not easy as Otterlace works on a clone by clone basis. It was agreed +that it would be worthwhile to show at least "best in clone" or better to do +a crude "best in genome". 
+ + + + +------------------------------------------------------------------------------ +BACK-BURNER ITEMS + + + +ZMap/acedb +---------- + +1/ Interface issues: + + +jla1 and lw2 said they would like the marked area to be less obvious an also to +be a "greying" out rather than blue and with less dense dots. edgrif to implement. + + +jla1 said she would like to be able to click on an exon and see evidence (and +transcripts ?) with the same splice be highlighted. laurens also wants this +as it would often avoid having to open dotter to check. + + + +2/ Display of multiple compara alignments + +multiple alignments: edgrif is about a third of the way through implementing a +more general way of displaying arbitrary blocks. This will become a high +priority item as we move to haplotypes etc. + +th said this would be needed soon so it should be moved up the priority list. +jgrg said they have mappings in lace that could be passed on to zmap easily +and also said that annotators can already annotate assemblies from variants +and different species alongside each other as needed. + +We need to decide on the format for specifying the alignments. + + + +3/ alternative translations: edgrif about half way through code to do this. + +edgrif is doing this as part of the protein search code since this code +does translations itself. edgrif will talk to jgrg about how alternative +genetic codes can be specified with acedb. + +We need a test database for this. jgrg said this would come soon. + +edgrif will add field to transcript feature to hold alternative translation +table. + + +4/ Blixem enhancements + +two areas: + +- display multiple overlapping transcripts better (includes removing the many +yellow lines introduced by this...clarify this point), have a scrolled window +of the transcripts. jgrg said that perhaps only the transcripts made by havana +should be displayed. jla1 said she would like to be able to dynamically update +the transcripts displayed. 
+ +- better interaction with zmap, e.g. click on things in zmap and see them +highlighted in blixem and vice versa.... + +we had better have a more generalised protocol for communicating with external +programs.... + +- blixem: dna searching is NOT DONE, edgrif to expedite. Also protein searches +will be added. + +Perhaps one way to get this done would be employ a good C programmer on a +short contract. + + +5/ acedb server performance + +edgrif investigating two possibilities for improving performance: + + - make sgifaceserver stream data rather than batch it up, would + save a lot of memory. + + - deferred loading, only load features when needed and load in + zone requested by user....design done...now need to implement. + + +6/ A new canvas + +rds has been looking at alternative canvas implementations which offer an MVC +model. He has managed to get goocanvas developers to fix some bugs and make +some changes to support our needs. + +the goocanvas MVC model will mean we do not have to copy data to split windows +meaning greatly reduced memory usage. + +the goocanvas will cope automatically with the X Windows window size limit, this +combined with changes in the gtk scrolling model means we will be able to do away +with having two scroll bars. + +We will introduce the new canvas this year. + + + + +Otterlace +--------- + +1/ Alternative alignment programs + +There has been some discussion about using splice aware alignment programs. +jgrg is waiting for a fix to exonerate to support the new pipeline mustapha +has written. + +edgrif and jgrg both commented that some changes to acedb data structures +would be needed to represent both HSP's that are "joined up" but also +protein matches that start part of the way through a peptide. BUT one +possibility would be for zmap to access this data directly from a mysql +database thus sidestepping the need to put it in acedb first. 
gffv3 will also +be needed to represent this kind of joined up HSP data in a natural and +robust way. + +Changes will also be required to represent codons that are spliced across +introns as perhaps surprisingly none of the acedb programs can cope with +this currently (and neither can zmap). + + +2/ Spell checker + +jla1 reported a problem that free text fields and some fixed text fields +have misspellings (is that a mis-spelling ?) and it would be good to have +some autocorrection facility. The ideal would be to have some widget that +allowed other dictionaries (e.g. science) to be attached to it and could thus +be used as a general text entry tool. + + + +3/ Sequence exceptions + +kj2 raised the subject of how to indicate sequence exceptions, +e.g. when bases are skipped in translations. kj2 wondered if alternative +translations could be registered as sequence exceptions, edgrif said he +would prefer a separate mechanism as much of the code is already done for this. +We should therefore include a mechanism in zmap for sequence exceptions, +this would require a similar mechanism in acedb. This is yet another reason +for GFF 3 which has standards for frame shifts and other things. + +There should be a way of tagging transcripts where there are sequence +exceptions. 
+ + + + +------------------------------------------------------------------------------ +Next Meeting + +Will be at 2pm, 29th January 2009 + + +============================================================================== diff --git a/ZMAP_LACE_PROJECT/2009/zmap_lace.2009_01_29 b/ZMAP_LACE_PROJECT/2009/zmap_lace.2009_01_29 new file mode 100755 index 000000000..a0b55a24e --- /dev/null +++ b/ZMAP_LACE_PROJECT/2009/zmap_lace.2009_01_29 @@ -0,0 +1,369 @@ +============================================================================== +ZMap/Otterlace Development + + +Date: Thursday 29th Jan 2009 + +Attendees: jgrg, jla1, lw2, kj2, edgrif, st3 + + +------------------------------------------------------------------------------ +CURRENT ITEMS + + +Items Completed +--------------- + +<NONE ?> + + + +High priority +------------- + +1/ Tick boxed for controlled vocabulary + +jla1 said there is an urgent need to add "tick boxes" to the lace interface to +ensure that certain properties of annotated features can only be chosen from +a controlled vocabulary. lw2 to check whether "fragmented_loci" is included +in the tags. lw2 said all other tags are in the RT ticket: NNNNNNNNN + +1a/ Locus Finished button + +st3 asked if there could be a tag on a Locus to say it was Finished, +implemented via a button so that the correct tag(s) were automatically +entered. jgrg to implement. + +1b/ Clone Finished button + +st3 would like a "Clone finished" button with same function as Locus Finished +button. jgrg to implement. + + +2/ Clone summary info/Automating DE line creation / Quality Control + +jgrg has done lots of work on this and summarised his progress in an +email circulated to us all. + +There is a script for automating this which kj2 wrote for zebrafish, +currently it must be run from the command line but jgrg is integrating +into the clone editing window in lace. 
+ +Following on jla1 also suggested that it would be good to have +automated QC scripts trawling through the database regularly looking for +duff data. Tina Eyre wrote one that could be co-opted and st3 also has +some. This is becoming an important issue for Havana to ensure really +good quality data. Add automated checking against SwissProt for CDS. + +jgrg said that much of the checking was done for annotation and he will circulate +an email summarising this. QC for save to data back to Otterlace need doing though. + + +3/ Data for zebra fish DAS tracks needs mapping between assemblies, jgrg +said Mustapha has done this but he is not sure for which assemblies. + +Not currently possible because there is no currently finished assembly. + + +4/ SNP tracks + +jla1 would like some of the DAS tracks & other data sources currently available +to be put into lace and hence zmap. jgrg said that this is not immediately +straight forward as they don't all say which assembly they are based on but some +can be done fairly soon. e.g. comparacon ? jgrg to investigate. + + +5/ Styles + +James working to introduce this now, zmap code is all there. jgrg is to set a week +when he can work on this and edgrif will set aside the week also. + + +6/ Solexa reads + +kj2 and br2 would like to get Solexa reads into pipeline but this is a lot +of data and will require zmap to be able to do dynamic fetches of subranges of +data otherwise we will be swamped by it. rds is to do some design work on the +dynamic loading. Initially we could only load those alignments within a marked +range. Mustapha is working on this. + +As an addition to this edgrif and rds will think about how we might give some +kind of "overview" for alignment columns that could show where the aligns are +without drawing them all. + +In fact John Collins has initial data for gene models and confirmed introns +that can be added now without code changes. This data is coming from Simon +Whitehead. 
+ + +7/ Alias/renaming of Loci + +lw2 to contact MGI as there are problems with IDs from them. + + +8/ clone path + +lw2 would like the full clone extents displayed with the non-golden sections displayed. +Do we need the clone ends information for this, edgrif to check ?? Check with Leo's +smapped example with several sections of a single clone... + +Would like this info. in navigator panel + navigator panel needs to display both +the foocanvas scrolled window area _and_ the actual area on the screen, and both +should be draggable... + +edgrif will do tile path information/display, rds will do navigator bit. + + +9/ multi-view interactions + +kj2 would like to click on a feature in one view and see it highlighted in another +so that she can look for genes present in more than one clone. edgrif to do this. + + +10/ RT numbers + +It was agreed that where possible RT ticket numbers would be included in the +meetings notes. lw2, edgrif, jgrg to look up numbers. + + +11/ feature grouping tags (e.g. for 5'and 3' EST read pairs) + +wormdb uses paired tags specific to EST read pairs but we need a more flexible +generalisation of this to handle multiple features and different types of +feature. jgrg's group have been working on filtering hits in a better way and +so have more information about grouping for display. + + + +Medium priority +--------------- + +0/ new column bump to show inconsistent matches + +Often annotator has many matches that fit against an existing transcript, be good +to have a mode that hid these and only showed the ones inconsistent with the +transcripts splices. + + +1/ dotter error messages + +lw2 said that sometimes dotter just does not appear. edgrif to check that dotter +is reporting errors properly and to make sure they show in dialog windows not on +the terminal which is often not available to the annotator. 
+ + +2/ removing evidence already used ************* + +annotators would like to be able to remove from display homologies that +have already been used to annotate variants etc. Does this need to be +persistent in the database in some way ?? edgrif & jgrg will get +together to arrange this via styles so it can persist in a natural way +in the database. + +**24526: Showing which evidence has been used +Differential coloring of matches that have been used already as evidence +for a transcript + +mainly requires jgrg to mark features and then tell zmap to move the features +to a new column or repaint them with a new style. + + +3/ Locus list + +jgrg to provide a list of loci as another tab window. + searching on ensembl ids. + + + + +5/ bug in acedb server + +jgrg raised a bug in the server which was causing it to run out of memory, edgrif +to investigate. There is a ticket for this: 51894 + +edgrif to make sure jgrg has up to date binaries for dotter etc. + + +6/ popups/labels for transcripts + +jla1 said that apollo had a neat way of showing a label for a transcript +that remained in one place on the screen as the window was scrolled. edgrif +to investigate + look at "tool tips" for transcripts....especially with +locus information. + + +7/ Naming of Alternative Alleles + +-st3 asked about naming of alternative alleles in different mouse strains / human +haplotypes. For loci that don't have HGNC/MGI names, these are incorrectly named after +the clones on the reference sequence. jla1 suggested correctly naming them after the +clones they are on, but making sure that the annotators can see the associated +'reference assembly' gene. st3 said this could be done via the alt_allele table, and +if it were done across the board, ie including KNOWN genes, then this would make Vega +prep easier. + + +8/ Best in Genome matches + +jla1 also said she would like to see "best in genome" displayed. jgrg said this +is not easy as Otterlace works on a clone by clone basis.
It was agreed +that would be worthwhile to show at least "best in clone" or better to do +a crude "best in genome". + + + + +------------------------------------------------------------------------------ +BACK-BURNER ITEMS + + + +ZMap/acedb +---------- + +1/ Interface issues: + + +jla1 and lw2 said they would like the marked area to be less obvious an also to +be a "greying" out rather than blue and with less dense dots. edgrif to implement. + + +jla1 said she would like to be able to click on an exon and see evidence (and +transcripts ?) with the same splice be highlighted. laurens also wants this +as it would often avoid having to open dotter to check. + + + +2/ Display of multiple compara alignments + +multiple alignments: edgrif is about a third of the way through implementing a +more general way of displaying arbitrary blocks. This will become a high +priority item as we move to haplotypes etc. + +th said this would be needed soon so it should be moved up the priority list. +jgrg said they have mappings in lace that could be passed on to zmap easily +and also said that annotators can already annotate assemblies from variants +and different species alongside each other as needed. + +We need to decide on the format for specifying the alignments. + + + +3/ alternative translations: edgrif about half way through code to do this. + +edgrif is doing this as part of the protein search code since this code +does translations itself. edgrif will talk to jgrg about how alternative +genetic codes can be specified with acedb. + +We need a test database for this. jgrg said this would come soon. + +edgrif will add field to transcript feature to hold alternative translation +table. + + +4/ Blixem enhancements + +two areas: + +- display multiple overlapping transcripts better (includes removing the many +yellow lines introduced by this...clarify this point), have a scrolled window +of the transcripts. 
jgrg said that perhaps only the transcripts made by havana +should be displayed. jla1 said she would like to be able to dynamically update +the transcripts displayed. + +- better interaction with zmap, e.g. click on things in zmap and see them +highlighted in blixem and vice versa.... + +we had better have a more generalised protocol for communicating with external +programs.... + +- blixem: dna searching is NOT DONE, edgrif to expedite. Also protein searches +will be added. + +Perhaps one way to get this done would be employ a good C programmer on a +short contract. + + +5/ acedb server performance + +edgrif investigating two possibilities for improving performance: + + - make sgifaceserver stream data rather than batch it up, would + save a lot of memory. + + - deferred loading, only load features when needed and load in + zone requested by user....design done...now need to implement. + + +6/ A new canvas + +rds has been looking at alternative canvas implementations which offer an MVC +model. He has managed to get goocanvas developers to fix some bugs and make +some changes to support our needs. + +the goocanvas MVC model will mean we do not have to copy data to split windows +meaning greatly reduced memory usage. + +the goocanvas will cope automatically with the X Windows window size limit, this +combined with changes in the gtk scrolling model means we will be able to do away +with having two scroll bars. + +We will introduce the new canvas this year. + + + + +Otterlace +--------- + +1/ Alternative alignment programs + +There has been some discussion about using splice aware alignment programs. +jgrg is waiting for a fix to exonerate to support the new pipeline mustapha +has written. + +edgrif and jgrg both commented that some changes to acedb data structures +would be needed to represent both HSP's that are "joined up" but also +protein matches that start part of the way through a peptide. 
BUT one +possibility would be for zmap to access this data directly from a mysql +database thus sidestepping the need to put it in acedb first. gffv3 will also +be needed to represent this kind of joined up HSP data in a natural and +robust way. + +Changes will also be required to represent codons that are spliced across +introns as perhaps surprisingly none of the acedb programs can cope with +this currently (and neither can zmap). + + +2/ Spell checker + +jla1 reported a problem that free text fields and some fixed text fields +have misspellings (is that a mis-spelling ?) and it would be good to have +some autocorrection facility. The ideal would be to have some widget that +allowed other dictionaries (e.g. science) to be attached to it and could thus +be used as a general text entry tool. + + + +3/ Sequence exceptions + +kj2 raised the subject of how to indicate sequence exceptions, +e.g. when bases are skipped in translations. kj2 wondered if alternative +translations could be registered as sequence exceptions, edgrif said he +would prefer a separate mechanism as much of the code is already done for this. +We should therefore include a mechanism in zmap for sequence exceptions, +this would require a similar mechanism in acedb. This is yet another reason +for GFF 3 which has standards for frame shifts and other things. + +There should be a way of tagging transcripts where there are sequence +exceptions. + + + + +------------------------------------------------------------------------------ +Next Meeting + +Will be at 2pm, 12th February 2009 + + +============================================================================== -- GitLab