initial version

Commit 627bff13 authored 17 years ago by edgrif
--- a/ZMAP_LACE_PROJECT/zmap_lace.2008_02_21
+++ b/ZMAP_LACE_PROJECT/zmap_lace.2008_02_21
+==============================================================================
+ZMap/Otterlace Development
+
+
+Date:  Thursday 21st February 2008
+
+Attendees: jgrg, st3, jla1, lw2, edgrif
+
+
+1) otterlace + zmap progress
+============================
+
+
+High priority
+-------------
+
+
+0/ ZMap performance - contd.....
+
+edgrif investigating two possibilities for improving performance:
+
+	- make sgifaceserver stream data rather than batch it up, would
+	  save a lot of memory.
+
+	- deferred loading, only load features when needed and load in
+	  zone requested by user....design done...now need to implement.
+
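The two possibilities above (streaming rather than batching, and loading only the zone the user asks for) can be illustrated with a minimal Python sketch. This is not the sgifaceserver or zmap code; `fetch_batched`, `fetch_streamed` and the `(name, start, end)` record layout are hypothetical names invented for illustration.

```python
from typing import Iterable, Iterator, List, Tuple

Feature = Tuple[str, int, int]  # hypothetical record layout: (name, start, end)


def fetch_batched(source: Iterable[Feature]) -> List[Feature]:
    """Batch approach: the whole result is built in memory before returning."""
    return list(source)


def fetch_streamed(source: Iterable[Feature], start: int, end: int) -> Iterator[Feature]:
    """Streaming approach: hand over one feature at a time, and only those
    overlapping the requested zone (a stand-in for deferred loading)."""
    for name, f_start, f_end in source:
        if f_start <= end and f_end >= start:
            yield (name, f_start, f_end)


# Hypothetical features; only the ones overlapping 800..1000 are kept.
features = [("est1", 100, 200), ("est2", 500, 900), ("mrna1", 850, 1200)]
in_zone = list(fetch_streamed(features, 800, 1000))
```

With a generator the consumer never holds more than one record at a time unless it chooses to, which is where the memory saving would come from.
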
+We have a performance script that we can now use to measure X Windows
+performance on a machine as compared to edgrif's machine. The script also
+records the salient parts of the machine's configuration.
+
+Need to check script works on Mac machines. edgrif to follow this up.
+
+systems have found a fix for the poor display performance and it should
+all be ok now.
+
+edgrif to email havana about performance measuring script.
+
+
+
+1/ Consistency Test using ZMap
+
+A number of items are required for the annotation test; they are identified with
+"*************" and must be in place _prior_ to the start of the test.
+
+jla1 said there will be a test in February using zmap, not xace. edgrif said he
+and Roy will make sure there is a stable version for then.
+
+jla1 said this would be only possible when points 1, 2 & 4 have been actioned.
+
+
+
+2/ Display of otter information *************
+
+zmap feature display is done but needs further details to be passed from
+otterlace:
+
+- lw2 also asked about DE line information display, edgrif asked lw2 to collect
+together requirements for information to be displayed. jla1 to chase up.
+
+- Several users want species info. for matches.
+
+- 39329:  PFAM info
+Is it possible to show a description associated with Halfwise (Pfam) objects in ZMap?
+Currently in lace we see the domain description as well as the Pfam accession number
+(EAH). jgrg will pass this information over to zmap; in particular, Ensembl URLs will
+be passed over.
+
+- BLAST evidence info:
+Would it be possible to see what organism a piece of evidence belongs to, without
+having to pfetch each one individually? Also there is no info (field Title in Fmap)
+on EST/mRNA hits unless you pfetch (RJK, AF2, MT4). We need species and taxon data
+but some of this needs new database fields (jgrg to implement).
+
+
+3/ Clone summary info/Automating DE line creation.
+
+eah raised  the point that  certain summary information for  clones is
+not available when using zmap.  e.g. the number of CpG islands.  These
+are  used for  the  authoring of  DE  lines.  There  is  a script  for
+automating this  which kj2 wrote  for zebrafish, but it's  specific to
+zebrafish and  requires running  on the command  line. jgrg  said this
+could be  integrated into  the clone editing  window in lace.  jgrg to
+action.
+
+
+
+
+4/ CDS translation *************
+
+From Jane
+
+Showing the  peptide translation  of an object  next to the  object in
+zmap with the residue numbers next to the exon boundaries is still not
+working.  It took me about 5  times as long to check my object without
+this functionality.
+
+Will be  in next  build which  is early next  week. Actually  it's not
+quite finished as it's currently quite unstable.
+
+rds has nearly finished this and it will be in the build for the test
+set up.
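
The residue numbering at exon boundaries requested above comes down to simple arithmetic, sketched here in Python. This is an illustration only, not the zmap implementation: `residue_at_exon_ends` is a hypothetical name, and the sketch assumes forward-strand coding exons given in transcript order with translation starting at the first base of the first exon.

```python
from typing import List, Tuple


def residue_at_exon_ends(exons: List[Tuple[int, int]]) -> List[int]:
    """For coding exons given as inclusive (start, end) pairs in transcript
    order, return the 1-based peptide residue number that contains the last
    base of each exon."""
    numbers = []
    coding_bases = 0
    for start, end in exons:
        coding_bases += end - start + 1
        # Base n (1-based) of the CDS falls in residue ceil(n / 3).
        numbers.append((coding_bases + 2) // 3)
    return numbers


# Three exons of 10, 5 and 9 coding bases (hypothetical coordinates).
boundaries = residue_at_exon_ends([(101, 110), (201, 205), (301, 309)])
```

A real display would also have to handle reverse-strand transcripts and a CDS that starts part of the way into an exon.
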
+
+
+
+5/ removing evidence already used *************
+
+annotators would like to be able to remove from display homologies that
+have already been used to annotate variants etc. Does this need to be
+persistent in the database in some way ?? edgrif & jgrg will get
+together to arrange this via styles so it can persist in a natural way
+in the database.
+
+**24526: Showing which evidence has been used
+Differential coloring of matches that have been used already as evidence
+for a transcript
+
+mainly requires jgrg to mark features and then tell zmap to move the features
+to a new column or repaint them with a new style.
+
+
+
+
+
+Medium priority
+---------------
+
+
+0/ alternative translations: edgrif about half way through code to do this.
+
+edgrif is doing this as part of the protein search code since this code
+does translations itself. edgrif will talk to jgrg about how alternative
+genetic codes can be specified with acedb.
+
+We need a test database for this. jgrg said this would come soon.
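
As a rough illustration of what specifying an alternative genetic code involves, the Python sketch below translates DNA with the standard code plus a table of overrides. It is not the zmap or acedb mechanism (how acedb would specify the code is still to be discussed); `translate` and the variable names are hypothetical. The override table shown is the vertebrate mitochondrial code.

```python
from itertools import product
from typing import Dict, Optional

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
STANDARD_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD_CODE: Dict[str, str] = {
    "".join(codon): amino
    for codon, amino in zip(product(BASES, repeat=3), STANDARD_AMINO)
}

# Differences between the vertebrate mitochondrial code and the standard code.
VERTEBRATE_MITO_OVERRIDES = {"AGA": "*", "AGG": "*", "ATA": "M", "TGA": "W"}


def translate(dna: str, overrides: Optional[Dict[str, str]] = None) -> str:
    """Translate DNA in frame 0, applying any codon overrides on top of the
    standard code. Unknown codons (e.g. containing N) become 'X'."""
    code = {**STANDARD_CODE, **(overrides or {})}
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    return "".join(code.get(dna[i:i + 3], "X") for i in range(0, usable, 3))


standard_peptide = translate("ATGTGAATA")
mito_peptide = translate("ATGTGAATA", VERTEBRATE_MITO_OVERRIDES)
```

Keeping the alternative code as a small set of overrides means only the differing codons need to be stored alongside the sequence.
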
+
+
+1/ locator column: zmap needs a locator column as per fmap which could be
+used to display dna and peptide search results and other information.
+
+
+2/ Annotation needs to have a history across assembly changes.
+
+kj2  requested that  when an  assembly changes  the objects  which get
+transferred  should have a  history as  to what  they were  (otter id)
+previously.  lw2 remarked that it's  possible to search for the old id
+in the lace  interface, but impossible to retrieve  the old object and
+display it.  jgrg  acknowledged there is a bug  in the lace searching,
+but  that   providing  the   history  would  be   difficult.   Further
+discussion/thought on how to implement this is needed.
+
+jgrg said Mustapha is working on this now.
+
+
+
+3/ clone overlap display
+
+Discussions with Mindi showed that the main requirement is for annotators to be
+able to see what features are in the overlap region of the section of clone
+_not_ mapped. This can be done currently using the ZMap -> File -> New Sequence
+and specifying the clone for the new sequence. The resulting display shows
+the "non-golden" column which marks which section(s) of the clone were not
+mapped allowing the annotator to identify which features lie in that zone
+and hence are not mapped themselves.
+
+rds has mailed round about how to do this. jgrg said he would like to add
+this facility to otterlace but the zmap route is ok for now.
+
+
+
+4/ Multiple alignments
+
+multiple alignments: edgrif is about a third of the way through implementing a
+more general way of displaying arbitrary blocks.  This will become a high
+priority item as we move to haplotypes etc.
+
+th said this would be needed soon so it should be moved up the priority list.
+
+
+5/ pfetch proxy
+
+jgrg has provided pseudo code, rds to implement.
+
+
+6/ Spell checker
+
+jla1 reported a problem that free text fields and some fixed text fields
+have misspellings (is that a mis-spelling ?) and it would be good to have
+some autocorrection facility. The ideal would be to have some widget that
+allowed other dictionaries (e.g. science) to be attached to it and could thus
+be used as a general text entry tool.
+
+
+7/ Quality Control
+
+Following on from 6/ jla1 also suggested that it would be good to have
+automated QC scripts trawling through the database regularly looking for
+duff data. Tina Eyre wrote one that could be co-opted and st3 also has
+some. This is becoming an important issue for Havana to ensure really
+good quality data.
+
+
+8/ Interface issues:
+
+
+extending marked region:
+
+jla1,  eah and lw2  said users  would like  some way  of interactively
+extending the marked area. edgrif  to look at this perhaps using mouse
+action  over the  marked area.   eah also  requested that  this should
+alter the bump column so that evidence that has been hidden, as it did
+not overlap, gets shown.
+
+
+jla1 and lw2 said they would like the marked area to be less obvious and also to
+be a "greying" out rather than blue. edgrif to implement.
+
+jla1 said she would like to be able to click on an exon and see evidence (and
+transcripts ?) with the same splice be highlighted.
+
+
+Zooming set up:
+It would be very useful for large genes if the evidence, ensembl objects etc. did not
+disappear when you zoom out.  This happens faster in Zmap than in Fmap, but the havana
+objects do not disappear in either case. (MMS)
+
+edgrif explained that we need the new styles to fix this.
+
+
+
+
+9/ There was a discussion about how much a user should be able to
+configure. It was agreed that they should be able to configure which columns
+are initially hidden. They should also be able to "save" the current settings
+to set this up, and to restore either the system defaults or their currently
+saved defaults. edgrif to implement and improve column turning on/off.
+
+edgrif has implemented code for this but now needs eah and lw2 to report back with
+a list of what users would like to be able to configure.
+
+jla1 said that configuration is required more at the group or DB level; we can
+already do this via lace setting up zmap's configuration files.
+
+
+
+10/ Future stuff
+
+edgrif said he would like annotators to start thinking/reporting two things:
+
+- repetitive tasks that could be automated.
+
+- new ways to highlight/select data to help build/annotate transcripts.
+
+
+We will revisit this after the annotation test.
+
+
+5' and 3' EST read pairs and Ditags
+
+we need these to be marked in zmap as in acedb, requires new tags in database in
+the same way as in worm database.
+
+
+
+11/ Dumping features
+
+kj2 asked if zmap could dump features, edgrif said that it could dump in GFFv2
+format but that some work was needed to dump subsets of features (e.g. dump
+all the features from a search results window), this should
+be an easy extension.
+
+edgrif also said that he was extending the acedb dumper to dump GFFv3 and would
+make zmap dump GFFv3 too.
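
Dumping a subset of features, as kj2 asked about, might look roughly like the Python sketch below. It is a sketch only and not zmap's dumper: the `Feature` record, `dump_gff2`, the example values and the choice of `Sequence` as the attribute tag are all assumptions made for illustration. The nine tab-separated fields follow the GFF version 2 layout.

```python
from typing import Iterable, List, NamedTuple, Set


class Feature(NamedTuple):
    seqname: str
    source: str
    feature: str
    start: int
    end: int
    score: str   # "." when there is no score
    strand: str  # "+", "-" or "."
    frame: str   # "0", "1", "2" or "."
    name: str


def dump_gff2(features: Iterable[Feature], selected_names: Set[str]) -> List[str]:
    """Return GFFv2 lines for only those features whose name is selected,
    e.g. the contents of a search results window."""
    lines = []
    for f in features:
        if f.name not in selected_names:
            continue
        attributes = f'Sequence "{f.name}"'  # attribute tag is an assumption
        lines.append("\t".join([
            f.seqname, f.source, f.feature, str(f.start), str(f.end),
            f.score, f.strand, f.frame, attributes,
        ]))
    return lines


# Hypothetical features; only the first is selected for dumping.
all_features = [
    Feature("clone_a", "vertebrate_mRNA", "similarity", 100, 200, "98.5", "+", ".", "hit_1"),
    Feature("clone_a", "EST_human", "similarity", 300, 450, "87.0", "-", ".", "hit_2"),
]
gff_lines = dump_gff2(all_features, {"hit_1"})
```
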
+
+
+12/ bug in acedb server
+
+jgrg raised a bug in the server which was causing it to run out of memory, edgrif
+to investigate. There is a ticket for this: 51894
+
+
+
+Low priority
+------------
+
+
+1/ jgrg suggested that short cuts should be given on menus/mouse-over popups
+to remind the user that they exist...edgrif to do this.
+
+
+2/ loutre schema and data representation changes
+
+We need some way to make sure st3 knows about changes anacode are making.
+jgrg and st3 to communicate more over this.
+
+
+3/ Alternative alignment programs
+
+There has been some discussion about using splice aware alignment programs.
+jgrg is waiting for a fix to exonerate to support the new pipeline mustapha
+has written.
+
+edgrif and jgrg both commented that some changes to acedb data structures
+would be needed to represent both HSPs that are "joined up" and
+protein matches that start part of the way through a peptide. BUT one
+possibility would be for zmap to access this data directly from a mysql
+database thus sidestepping the need to put it in acedb first.
+
+
+
+
+2) Back To The Future
+=====================
+
+
+Leo's transcript display
+------------------------
+
+A discussion about this showed that most of it was not used but some features
+such as highlighting all matches that exactly align to an exon would be very
+useful. These should be imported to zmap (need to think about how to do this
+in terms of short cuts and colour used for highlighting...should it be a mask ?).
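
Selecting the matches that exactly align to an exon could be sketched in Python as below. This is only an illustration of the selection logic, not code from Leo's display or zmap; `matches_aligned_to_exon` and the data layout (match name mapped to a list of aligned blocks) are hypothetical.

```python
from typing import Dict, List, Tuple

Span = Tuple[int, int]  # inclusive (start, end) in sequence coordinates


def matches_aligned_to_exon(exon: Span, matches: Dict[str, List[Span]]) -> List[str]:
    """Return the names of matches having at least one aligned block whose
    start and end coincide exactly with the exon's start and end."""
    return [name for name, blocks in matches.items() if exon in blocks]


# Hypothetical evidence: est_a and mrna_c share the exon 500..620 exactly.
evidence = {
    "est_a": [(100, 180), (500, 620)],
    "est_b": [(498, 620)],
    "mrna_c": [(500, 620), (900, 1010)],
}
to_highlight = matches_aligned_to_exon((500, 620), evidence)
```
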
+
+
+The discussion went on to talk about enhancements to blixem in two areas:
+
+- display multiple overlapping transcripts better (includes removing the many
+yellow lines introduced by this...clarify this point), have a scrolled window
+of the transcripts. jgrg said that perhaps only the transcripts made by havana
+should be displayed. jla1 said she would like to be able to dynamically update
+the transcripts displayed.
+
+- better interaction with zmap, e.g. click on things in zmap and see them 
+highlighted in blixem and vice versa....
+
+we had better have a more generalised protocol for communicating with external
+programs....
+
+- blixem: dna searching is NOT DONE, edgrif to expedite. Also protein searches
+will be added.
+
+
+
+
+
+A new canvas
+------------
+
+rds has been looking at alternative canvas implementations which offer an MVC
+model. He has managed to get goocanvas developers to fix some bugs and make
+some changes to support our needs.
+
+the goocanvas MVC model will mean we do not have to copy data to split windows
+meaning greatly reduced memory usage.
+
+the goocanvas will cope automatically with the X Windows window size limit, this
+combined with changes in the gtk scrolling model means we will be able to do away
+with having two scroll bars.
+
+We will introduce the new canvas this year.
+
+
+
+Builds
+------
+
+rds has done the universal binary builds. So we can provide hand builds of them
+but still need to incorporate the system into our overnight build procedure.
+A labour of love not helped by fairly poor docs from Apple. But it is now
+working.
+
+update has been incorporated into overnight builds, but we still have
+library clashes with those in the current otterlace (xace only)
+distribution.  James and I are working to resolve these.
+
+jgrg working on mac distribution to incorporate zmap.
+
+edgrif to start getting systems to take this on.
+
+jgrg and rds seem to have got things to a stage where we can reliably
+build for local installs and for James's install package. It would be good
+to pass some of this on to systems.
+
+
+
+Turning off xace
+----------------
+
+The  plan is  to turn  off xace  in the  otterlace client  as  soon as
+possible.  This  requires actioning of  1, 2 and 4.   Inevitably there
+will be a  need to either quickly switch back or  have a version which
+still incorporates xace. jgrg to think how to do this.
+
+How many external users is this going to affect?  Do they need training?
+
+jgrg said he would make xace readonly and turn it off by default in the
+next release.
+
+
+Comparison of annotation viewers
+--------------------------------
+
+jla1 made the excellent suggestion that we could organise a 2 day meeting
+of gbrowse, apollo, artemis, zmap/otterlace to compare and contrast. The
+aim being to assess the state of the art and pick up tips.
+
+jla1 said there was money available for this so we should target a date
+later this year.
+
+
+3) Other matters
+================
+
+
+- jgrg,  st3  &  jla1  discussed  the upcoming  Vega  (mouse)  release
+  including  which schema version  and assembly  version it  should be
+  built with. jla1 asked about updating to assembly NCBI37.
+
+
+- jla1 said that the anacode RT queue is becoming unusable because so
+many tickets remain unresolved. jgrg agreed and they will meet to clean
+up the queue. edgrif commented that it should be possible to import
+any useful custom fields from the zmap/acedb queues as necessary.
+There is also an issue with tickets going missing, this is being
+investigated.
+
+
+- jla1 requested that redundant external annotation should not be shown
+in Ensembl as it was out of date and inaccurate and gave Vega a bad name.
+This annotation will be removed in the future.
+
+
+- jla1 asked about doing pfam analysis on the fly, Rob Finn was to contact
+her and James....as it happened edgrif saw Rob later in the day and he is
+on the case. He had an issue with how to produce a meaningful score which
+he has pretty much sorted out now.
+
+
+4) Next Meeting
+===============
+
+Will be at 2pm, 21st February 2008
+
+
+==============================================================================