zmapView is quite a large module! When adding a topic to this file please include a link here.
Historically, when ZMap creates a view it goes through a process of 'data loading' during which it requests all the data implied in its source stanzas. This has typically been from ACEDB, and the connection is kept open for the life of the view to allow further requests by the user. Until this initial phase has completed the user may not interact with the view. With the advent of pipeServers this model can no longer cope with incremental/optimised loading of data: we need to be able to specify sources as not active on startup, and to be able to start previously unknown sources on request from Otterlace.
A source stanza in ZMap's configuration file specifies the URL of the data source and some other options, including the featuresets supported. A new option delayed=true/false sets whether or not a data source is activated automatically on startup. A pipeServer source will default to delayed=true and other types to false. NOTE: this proves to be quite fiddly. In the short term (to allow testing of the real code) we will require delayed=true to be set where needed. It may be easier to specify active and delayed sources in [ZMap].
Any request for data after the initial 'data loading' phase will be activated immediately regardless of the setting of 'delayed'.
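The rules for 'delayed' described above can be sketched in C as follows. This is only an illustration under stated assumptions: the SourceConfig struct, the function names, and the idea that a pipeServer source is recognised by a "pipe://" URL prefix are all hypothetical and are not taken from the ZMap source. It also models the intended default (pipeServer implies delayed=true), whereas in the short term delayed=true must be set explicitly.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical tri-state for the 'delayed' option of a source stanza. */
typedef enum { DELAYED_UNSET, DELAYED_FALSE, DELAYED_TRUE } DelayedOption;

/* Hypothetical summary of one source stanza; field names are illustrative. */
typedef struct {
    const char   *url;
    DelayedOption delayed;
} SourceConfig;

/* Assumption: a pipeServer source is identified by a "pipe://" URL prefix. */
static bool source_is_pipe(const SourceConfig *src)
{
    assert(src != NULL && src->url != NULL);
    return strncmp(src->url, "pipe://", 7) == 0;
}

/* True if the source should be connected during the initial 'data loading'. */
bool source_active_on_startup(const SourceConfig *src)
{
    if (src->delayed == DELAYED_UNSET)
        return !source_is_pipe(src);  /* pipeServer defaults to delayed=true */
    return src->delayed == DELAYED_FALSE;
}

/* True if a request to this source should be started now. */
bool source_should_start(const SourceConfig *src, bool initial_load_done)
{
    /* After the initial phase the 'delayed' setting is ignored. */
    return initial_load_done || source_active_on_startup(src);
}
```

Keeping "unset" distinct from "false" lets the default depend on the source type without overriding an explicit setting in the stanza.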
Feature columns may be populated by data from several featuresets, and this mapping was originally specified by ACEDB. A new stanza [featuresets] is provided to allow this to be defined statically in the main ZMap configuration file. (NB: fix this link; it should point to the config user guide.)
An initial request for data must be configured, as ZMap will not run without data; this prevents the situation where we have a blank window and no user control. It also provides an opportunity to add in predefined data. This can probably be tidied up with little mishap, such that predefined styles etc. are added automatically and no initial data requests are required. However, it may be necessary to include all the navigator featuresets in the initial load.
On startup the config file will be scanned and the list of data servers extracted. Those that are not configured as 'delayed' will be connected to and all the featuresets they support requested. As at present, if they support many featuresets then all will be requested together; if it is desired that these servers supply each featureset concurrently then separate source stanzas must be defined for each one. (This should not be important, as we expect a minimal amount of data to be requested in this way, and the primary mechanism will be via X-Remote commands.)
This can only occur after the initial 'data loading' phase. If triggered by a 'load column' request from the user this must be so, as they do not have keyboard control until then. However, if commands are received on X-Remote during the initial load phase then this may simply take longer - extra featuresets will be subsumed into the initial set of requests.
The existing X-Remote protocol is sufficient for our purposes and will be used unchanged. This means that data will be requested by featureset name; a sample data request looks like this:
<zmap>
  <request action="load_features">
    <align>
      <block>
        <featureset name="EST_Human"> </featureset>
        <featureset name="Saturated_EST_Human"> </featureset>
      </block>
    </align>
  </request>
</zmap>
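For illustration, a client could assemble a request of this shape with a small C helper like the one below. The function name and signature are hypothetical, and the sketch assumes featureset names contain no characters that need XML escaping; it is not code from ZMap or Otterlace.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: writes a load_features request for 'count' featuresets
 * into buf. Returns the length written, or -1 if buf is too small.
 * Assumption: names need no XML escaping. Empty or NULL names are skipped. */
int build_load_features_request(char *buf, size_t size,
                                const char *const *featuresets, size_t count)
{
    size_t used;
    int n;

    assert(buf != NULL && size > 0);

    n = snprintf(buf, size,
                 "<zmap><request action=\"load_features\"><align><block>");
    if (n < 0 || (size_t)n >= size)
        return -1;
    used = (size_t)n;

    for (size_t i = 0; i < count; i++) {
        if (featuresets[i] == NULL || strlen(featuresets[i]) == 0)
            continue;
        n = snprintf(buf + used, size - used,
                     "<featureset name=\"%s\"></featureset>", featuresets[i]);
        if (n < 0 || (size_t)n >= size - used)
            return -1;
        used += (size_t)n;
    }

    n = snprintf(buf + used, size - used, "</block></align></request></zmap>");
    if (n < 0 || (size_t)n >= size - used)
        return -1;
    used += (size_t)n;

    return (int)used;
}
```

Called with the names EST_Human and Saturated_EST_Human, this produces the same elements as the sample request, without the whitespace between tags.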
On receipt of a request ZMap will re-read its config file and scan the list of source stanzas for the requested featuresets - this defines the data sources to request the data from (each source stanza lists the featuresets it supports). A new connection will be started for each request, so that servers can process multiple requests without the delays implied by queuing. Existing (i.e. non-delayed) connections may be used if not currently active.
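The lookup from a requested featureset to the source stanza that supports it might be sketched like this. The SourceStanza struct, the fixed-size array and the exact, case-sensitive name match are assumptions made for the example; the real configuration code may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_FEATURESETS 8

/* Hypothetical in-memory form of one source stanza. */
typedef struct {
    const char *url;
    const char *featuresets[MAX_FEATURESETS];  /* unused slots are NULL */
} SourceStanza;

/* Returns the first source that lists 'featureset', or NULL if none does.
 * Assumption: names are compared exactly (case-sensitive). */
const SourceStanza *find_source_for_featureset(const SourceStanza *sources,
                                               size_t n_sources,
                                               const char *featureset)
{
    assert(featureset != NULL);

    for (size_t i = 0; i < n_sources; i++) {
        for (size_t j = 0;
             j < MAX_FEATURESETS && sources[i].featuresets[j] != NULL;
             j++) {
            if (strcmp(sources[i].featuresets[j], featureset) == 0)
                return &sources[i];
        }
    }
    return NULL;
}
```

A caller would run this once per featureset named in the request and start a fresh connection for each match; a NULL result means no configured source offers that featureset.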
Styles are defined in a (large) file and it may be beneficial for Otterlace to be able to specify whether or not to re-read this file, or for individual sources to have their own styles files, or to be able to request a styles refresh without re-requesting data.
The pipeServer interface has a requirement to supply styles despite these not being handled by GFF (due to legacy issues). It would be good to remove this requirement at some point - styles can be defined at startup and there is little sense in merging one set of styles with an identical one. However, these can be changed at runtime, so it's not so clear cut. The GFF parser requires styles, so fixing this would require a deeper modification than might first appear necessary.
A View has a set of data structures that implement a StepList (see zmapView_P.h), which allows a sequence of actions to be created and executed in turn, and also specifies the actions to take in case of error. Currently (10 Mar 2010) the View has one of these lists, and it is used to control all the data servers together. To handle concurrent requests from different servers this will be changed to a list of StepLists, each operating independently.
The request process is modelled on the ACEDB interface, consisting of a number of steps defined as ZMapServerReqType in include/ZMap/zmapServerProtocol.h. Old code used the first connection in the view's list and assumed that it was already active. For requests to pipeServers this is not correct - the connection must be started fresh each time and closed when finished.
Code that handles this is found in zmapView.c/zmapViewLoadFeatures(), zmapViewUtils.c/loadFeatures(), and zmapView.c/commandCB() (to request a DNA sequence). The last of these uses View->sequence_server to find the right connection.
The function zmapView.c/zmapViewConnect() is used for the initial 'data loading' phase on startup and uses similar code.
This will involve the following steps (any legacy code left over by mistake can be removed):
Note that a separate request will be allocated for each featureset, even if several are included in the same external request and are supported by the same data server. This is to allow maximum concurrency in the hope that data will be loaded faster. For example, EST_Human and Saturated_EST_Human typically get requested together and form a logical category. This will also apply to ACEDB requests if more than one request is active.
The view has a list of active connections; originally none of these ever died. There is also a step list that refers to lists of connections. The connection_list will be modified so that each connection has its own step list, and the step list will no longer have a list of connections. The step list structures will not refer to their connection - the code is only called from a few places and the view/connection is available at all of them.
The step list poll function (zmapView.c/checkStateConnections()) will poll the entire connection list and inspect the step list for each connection. As the step list has a 'current' pointer this will be efficient, and servers that complete and are terminated can be removed from the connection list.
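A minimal sketch of this polling scheme is given below, assuming each connection owns a small fixed-size step list. All type and function names here are hypothetical stand-ins; the real structures are in zmapView_P.h and are considerably richer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_STEPS 8

typedef enum { STEP_PENDING, STEP_DONE, STEP_FAILED } StepState;

/* Hypothetical, simplified step list: one state per step plus a cursor. */
typedef struct {
    StepState steps[MAX_STEPS];
    size_t    n_steps;   /* must not exceed MAX_STEPS */
    size_t    current;   /* index of the first step not yet known to be done */
} StepList;

/* Hypothetical connection that owns its step list. */
typedef struct {
    const char *url;
    StepList    step_list;
} Connection;

/* Advances 'current' past completed steps. Returns true when the list is
 * finished: every step is done, or the current step has failed. */
static bool step_list_finished(StepList *sl)
{
    while (sl->current < sl->n_steps && sl->steps[sl->current] == STEP_DONE)
        sl->current++;
    return sl->current == sl->n_steps ||
           sl->steps[sl->current] == STEP_FAILED;
}

/* Polls every connection, drops the finished ones by compacting the array
 * in place, and returns the number of connections still active. */
size_t poll_connections(Connection *conns, size_t n)
{
    size_t kept = 0;

    assert(conns != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        if (!step_list_finished(&conns[i].step_list))
            conns[kept++] = conns[i];
    }
    return kept;
}
```

Because each poll resumes from 'current' rather than rescanning from the first step, the cost per call stays proportional to the number of connections plus the steps newly completed.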
On the initial load the View will be set as 'data loading' and a busy cursor displayed until all loading has completed.
Loading data after startup sets the View to 'columns loading' and a busy cursor is displayed until all requested columns have completed. It may be possible to delay the busy cursor until the first requested column is ready for display, but initially we will not implement this as it may allow race conditions to occur due to user activity. The view state 'columns loading' will revert to 'data loaded' on completion.
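The state changes described in the last two paragraphs can be summarised as a small state machine. The sketch below uses invented names (ViewState, View, view_request_started and so on) and a simple count of outstanding requests; it illustrates the intended behaviour and is not the actual ZMapView code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical names for the three view states mentioned in the text. */
typedef enum {
    VIEW_DATA_LOADING,     /* initial load on startup */
    VIEW_DATA_LOADED,      /* idle, user has control */
    VIEW_COLUMNS_LOADING   /* extra columns requested after startup */
} ViewState;

typedef struct {
    ViewState state;
    size_t    pending;     /* requests started but not yet finished */
} View;

/* The busy cursor is shown in both loading states. */
bool view_is_busy(const View *v)
{
    return v->state != VIEW_DATA_LOADED;
}

void view_request_started(View *v)
{
    assert(v != NULL);
    v->pending++;
    /* A request made during the initial load is subsumed into it;
     * only a request made after startup enters 'columns loading'. */
    if (v->state == VIEW_DATA_LOADED)
        v->state = VIEW_COLUMNS_LOADING;
}

void view_request_finished(View *v)
{
    assert(v != NULL && v->pending > 0);
    /* Both loading states revert to 'data loaded' when nothing is pending. */
    if (--v->pending == 0)
        v->state = VIEW_DATA_LOADED;
}
```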
Initially plain GFF format (version 2, replaced by version 3) will be used, but to reduce network bandwidth this may be GZipped - GFF data should compress very well. It may be advantageous to GZip the data in smaller chunks, which will require less memory (important if we load 100 featuresets at once).
Currently Otterlace shows a progress bar for data loading, and to retain this while using pipeServers, X-Remote messages will be added to drive it. Initially this will consist of reporting 'completed', but if GZip chunks are implemented this can be broken down somewhat. The X-Remote message format will be like:
hello: ref to x-remote doc mentioned above