<!--#set var="banner" value="ZMap/Acedb Development Plans for Autumn 2009/Spring 2010"-->
<!--#include virtual="/perl/header"-->
<h2>ZMap/Acedb Development Plans for Autumn 2009/Spring 2010</h2>
<br />
<fieldset>
<legend>Introduction</legend>
<p>The current state of play is:</p>
<ul>
<li><p>ZMap has now replaced xace as the annotation tool in havana. The code is largely stable
and performance is mostly acceptable. Most functions that annotators used in xace
have been replicated in ZMap, and new functions have been added, along with
code to support much better interactivity between Otterlace and ZMap.</p>
<li><p>Acedb had been maintained at an acceptable level for the Worm group and agreement was
reached to stop all development on xace. Development continues on the
command-line/server code because this is needed to support Otterlace and ZMap.</p>
</ul>
<p>The next phase for ZMap is to build on the existing code to add completely new
facilities for annotation as described in this document.</p>
</fieldset>
<br />
<fieldset>
<legend>Staffing</legend>
<p>This summer saw the departure of Roy Storey, who had worked on both Otterlace and ZMap for some
years; he will be a hard act to follow. He will be replaced as soon as possible and in addition
there is funding for another person for 2 years. This is good news as it will mean that several
pieces of work that have remained as prototypes until now will be able to be finished.</p>
<p>It is anticipated that both positions will be filled by this Autumn, which would mean that
the period up to Christmas will largely be one of learning the system, with proper development
beginning in Spring 2010.</p>
</fieldset>
<br />
<fieldset>
<legend>Variation Display</legend>
<p>Annotation of "canonical" organism DNA is now more than a decade old and is being superseded by
the need to annotate inter- and intra-species variation. The display of data for single sequence
annotation is increasingly challenging, and extending it to variation data is not simple.
Whether in ZMap, blixem or elsewhere, we do not currently have good ways to present this
information.</p>
<p>The major challenges are:</p>
<ul>
<li>Dealing with the huge volumes of data, in particular alignments.
<li>Handling screen real estate in a way that is useful to the annotator.
<li>Displaying the different types of variation data in an informative way.
</ul>
<p>Variation can be taken to include:</p>
<ul>
<li>SNPs
<li>Alleles
<li>CNVs
<li>Haplotypes
<li>Chromosomal rearrangements
</ul>
<p>Clearly several different types of display will be required for these quite different types of
data. Currently our toolkit has two major display components:</p>
<ul>
<li>ZMap for features display
<li>Blixem for DNA or peptide sequence comparisons
</ul>
<p>ZMap requires enhancements to display some of this data while blixem will require some parts
to be completely rewritten.</p>
</fieldset>
<br />
<fieldset>
<legend>Improving the "Annotation Suite"</legend>
<p>Experience with Acedb in particular made it clear that the annotation "viewer" and
the annotation "database" should be separate programs. Each database has its own
semantics, and it is vital to keep these out of the viewer program if the latter is to be a
more general tool for annotators. This is the approach taken with the Otterlace/ZMap system (OZ)
and is one being considered by other developers (e.g. Apollo, Suzy Lewis pers. comm.). OZ has a
number of component programs that must communicate with each other to give as seamless a system as
possible and this is leading to the development of protocols for annotation program
inter-communication. Currently we have three major components that must communicate together:</p>
<ul>
<li>Otterlace editing/DB system
<li>ZMap display system
<li>Helper programs: Blixem, dotter, belvu and others
</ul>
<p>Communication between these components is rudimentary at the moment and the ease of use of OZ
could be considerably improved with enhancements to the current "protocols".</p>
<p>ZMap and blixem need to be very tightly linked, and a better alternative to running them as
separate programs would be to incorporate blixem's functionality into ZMap in the form of a new
ZMap window. This would allow for much more sophisticated interaction. The overview panel in
blixem might not even be needed, since it duplicates some ZMap functions. The blixem code is
poorly organised, which prevents further major development.</p>
</fieldset>
<br />
<fieldset>
<legend>Data Source/Format Support</legend>
<p>Slowly but surely a few data sources/formats are becoming "standards" for bioinformatics,
e.g. GFFv3. In particular the use of ontologies (e.g. SOFA) is becoming obligatory to ensure data
integrity and interchange. Annotation at Sanger needs to change to make active use of more of these
formats, which requires a number of components to be augmented to support these standards.</p>
<p>Most immediately:</p>
<ul>
<li>Add GFFv3 export to acedb
<li>Add GFFv3 parsing/export to ZMap
<li>Add Ensembl interface support to ZMap
</ul>
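<p>As a rough illustration of what GFFv3 parsing and export involve, the sketch below reads one
feature line into a dictionary and writes it back out. It is written in Python purely for brevity
and is not taken from ZMap or acedb. The function names (<code>parse_gff3_line</code>,
<code>format_gff3_line</code>) and the sample feature are hypothetical. It assumes the standard
nine tab-separated columns, with column nine holding <code>key=value</code> pairs separated by
semicolons, comma-separated multiple values and percent-encoded reserved characters.</p>

```python
from urllib.parse import unquote

GFF3_COLUMNS = ("seqid", "source", "type", "start", "end",
                "score", "strand", "phase", "attributes")

# Characters that must be percent-encoded inside column nine.
_RESERVED = {c: "%%%02X" % ord(c) for c in "%;=&,\t\n\r"}


def _escape(text):
    """Percent-encode the characters that are reserved in the attributes column."""
    return "".join(_RESERVED.get(ch, ch) for ch in text)


def parse_gff3_line(line):
    """Parse one GFFv3 feature line; return None for comments and blank lines."""
    line = line.rstrip("\n")
    if not line or line.startswith("#"):
        return None
    fields = line.split("\t")
    if len(fields) != 9:
        raise ValueError("expected 9 tab-separated columns, got %d" % len(fields))
    record = dict(zip(GFF3_COLUMNS, fields))
    record["start"] = int(record["start"])
    record["end"] = int(record["end"])
    attributes = {}
    if record["attributes"] != ".":
        for pair in record["attributes"].split(";"):
            if not pair:
                continue
            key, _, value = pair.partition("=")
            # Every attribute is stored as a list because values may be multiple.
            attributes[unquote(key)] = [unquote(v) for v in value.split(",")]
    record["attributes"] = attributes
    return record


def format_gff3_line(record):
    """Write a record produced by parse_gff3_line back out as a GFFv3 line."""
    attributes = record["attributes"]
    if attributes:
        column9 = ";".join(
            "%s=%s" % (_escape(key), ",".join(_escape(v) for v in values))
            for key, values in attributes.items())
    else:
        column9 = "."
    fields = [str(record[name]) for name in GFF3_COLUMNS[:8]] + [column9]
    return "\t".join(fields)


# Hypothetical feature line, for illustration only.
SAMPLE = "chr1\thavana\texon\t100\t200\t.\t+\t.\tID=exon1;Parent=tr1,tr2;Note=a%3Bb"
```

<p>Calling <code>parse_gff3_line(SAMPLE)</code> and passing the result to
<code>format_gff3_line</code> should reproduce the input line. A real implementation would also
need to handle directives such as <code>##gff-version</code>, validate feature types against
SOFA, and check the strand and phase columns, none of which is attempted here.</p>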
<p>Reuse of Sanger software by external groups is not widespread and adopting some of these common
standards and formats would help to change that. Adoption of our software by external users should
be an important goal for us.</p>
</fieldset>
<br />
<fieldset>
<legend>Strategic Software Decisions</legend>
<p>The OZ system relies on several major external graphical components:</p>
<ul>
<li>Otterlace: Tk graphics package
<li>ZMap: Gtk graphics package, foocanvas canvas
</ul>
<p>While Gtk is likely to be long-lived, both Tk and foocanvas seem to be reaching the end of their
active development lives. This is a concern because these components are unlikely to be developed
further and replacing them will be very time-consuming, as they are integral to our systems.</p>
<p>There is a plan for ZMap to replace foocanvas with goocanvas, its more powerful successor, but
this is on hold until the Gtk consortium decides whether to adopt goocanvas as its official
canvas widget.</p>
</fieldset>
<br />
<fieldset>
<legend>Improvements to Blixem</legend>
<p>It's my impression that this tool is important to havana and that we could enhance it
in a number of ways:</p>
<ul>
<li>make it deal with all combinations of strands, nucleotide and peptide
alignments correctly (it does not do this currently)
<li>make it more informative (show more information visually), about
splice sites, gaps etc
<li>make it able to interact with other programs (e.g. ZMap) to provide
better navigation etc.
<li>improve general display features such as the transcript display.
<li>make blixem able to take data in CIGAR and exonerate formats; the
latter could be used to give more information about matches.
</ul>
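<p>To make the CIGAR item above more concrete, here is a minimal sketch of turning a CIGAR string
into the ungapped blocks that a display would draw, with the gaps falling between blocks. It is
in Python for brevity and is not blixem code. It assumes the compact form in which each operation
is written as a length followed by a letter (for example <code>8M2I4M</code>), with
<code>M</code> for aligned bases, <code>I</code> for bases present only in the match sequence and
<code>D</code> for bases present only in the reference. Other tools may write the operations
differently, in which case the tokeniser would need adapting. The function name
<code>cigar_to_blocks</code> is hypothetical.</p>

```python
import re

# One token is a length followed by a single operation letter.
_CIGAR_TOKEN = re.compile(r"(\d+)([MID])")


def cigar_to_blocks(cigar, ref_start=1, query_start=1):
    """Return ungapped blocks as (ref_start, ref_end, query_start, query_end).

    Coordinates are 1-based and inclusive. Assumed semantics: M advances both
    sequences, I advances only the query, D advances only the reference.
    """
    tokens = _CIGAR_TOKEN.findall(cigar)
    # Reject strings containing anything other than the supported tokens.
    if "".join(count + op for count, op in tokens) != cigar:
        raise ValueError("unsupported CIGAR string: %r" % cigar)
    blocks = []
    ref_pos, query_pos = ref_start, query_start
    for count, op in tokens:
        length = int(count)
        if op == "M":
            blocks.append((ref_pos, ref_pos + length - 1,
                           query_pos, query_pos + length - 1))
            ref_pos += length
            query_pos += length
        elif op == "D":
            ref_pos += length
        else:  # op == "I"
            query_pos += length
    return blocks
```

<p>For example, <code>cigar_to_blocks("8M2I4M", ref_start=100)</code> yields two blocks separated
by a two-base insertion in the match sequence. The sketch covers nucleotide-to-nucleotide
alignments on one strand only; peptide alignments and reverse-strand matches would need
additional coordinate handling.</p>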
<p>All of this can be done without a rewrite, which I think is preferable; otherwise there
is a danger that the 2 years could be absorbed by just reimplementing rather than
extending.</p>
<p>Along the same lines it may be that there are enhancements to dotter that would also
help you.</p>
</fieldset>
<!--#include virtual="/perl/footer"-->