From cd443bf9fad9cdf24e416702aa90091ad11a9410 Mon Sep 17 00:00:00 2001 From: mh17 <mh17> Date: Tue, 1 Jun 2010 09:23:01 +0000 Subject: [PATCH] updates --- doc/Design_notes/index.html | 3 +++ doc/Design_notes/notes/col_order.html | 18 +++++++++++++++++- doc/Design_notes/notes/performance.html | 14 ++++++-------- 3 files changed, 26 insertions(+), 9 deletions(-) diff --git a/doc/Design_notes/index.html b/doc/Design_notes/index.html index f9be5e385..d118d1388 100644 --- a/doc/Design_notes/index.html +++ b/doc/Design_notes/index.html @@ -24,6 +24,8 @@ Coordinate systems<br /> <br /> <a href="Design_notes/notes/performance.shtml">Performance</a> <a href="Design_notes/notes/profile.shtml">Profiling</a> +<a href="Design_notes/notes/optimise.shtml">Optimising</a> + <br /> <a href="Design_notes/modules/pipeServer.shtml">pipeServer Configuration</a><br /> <a href="Design_notes/modules/zmapFeature.shtml">Styles</a> <a href="Design_notes/notes/glyph_style.shtml">and Glyphs</a><br> @@ -54,6 +56,7 @@ Coordinate systems<br /> <li>Styles have 3 fields that appear to be misplaced as they refer to column data: loaded, deferred, and cur_bump_mode. Most importantly <b>deferred styles</b> need to be looked at esp as commented around <b>zmapViewRemoteReceive.c #1242</b> <li><a href="Design_notes/modules/zmapView.shtml#DNA_memory">Scope and memory allocation issues</a> with DNA sequence data <li>zMapFeaturesetCreateID() and zmapStyleCreateId() and ZMapGFFSet and ZMapGFFSource could benefit from a review. The problem is that sometimes we want names normalised and sometimes we want them capitalised as given. Perhaps the best way woule be to have all key values as normalised and duplicate quarks used for display names. There are 86 calls to these functions, and this in not all the code that would have to be checked. +<li><b>NC-Splice</b> markers - need to verify if correct or not, need some test data. 
Currently coded as previously which does display, James' recommendations also available (iffed out) but do not produce any output. </ul> </p> diff --git a/doc/Design_notes/notes/col_order.html b/doc/Design_notes/notes/col_order.html index fa60aa761..00bb3dac0 100644 --- a/doc/Design_notes/notes/col_order.html +++ b/doc/Design_notes/notes/col_order.html @@ -33,10 +33,26 @@ The '3 Frame' column is treated specially as a placeholder. Columns are grouped </fieldset> <fieldset><legend>Reverse Complement and 3-Frame</legend> -<p>3 Frame mode will display frame sensitive colums in three groups, one for each frame, and these columns are sorted into configured order per group. 3Frame columns only appear on the right hand side of the display and contain features from the relevant strand ie normally the forward strand and the reverse strand if reverse complemented, if the style used is strand specific; otherwise they may include both strands. +<p>3 Frame mode will display frame sensitive columns in three groups, one for each frame, and these columns are sorted into configured order per group. 3Frame columns only appear on the right hand side of the display and contain features from the relevant strand ie normally the forward strand and the reverse strand if reverse complemented if the style used is strand specific, otherwise they may include both strands. </p> <p>GF-splice features are a special case as they are frame and strand sensitive and only include features from the forward strand. It is meaningless to display this data for the reverse strand. </p> </fieldset> +<fieldset><legend>Implementation</legend> +<p> +All possible columns are created for featuresets when they arrive from various database servers via a few functions in <b>zmapWindow/zmapWindowDrawFeatures.c</b> that end up in <b>createColumnFull()</b>. 
+</p> +<p> +There is an execute function <b>windowDrawContextCB</b> that displays a feature context and calls +<b>set_name_create_set_columns()</b> at the block level to create columns per strand and <b>feature_set_matches_frame_drawing_mode()</b> at the featureset level to decide whether or not to include each one. +<i>Note that this second function also attempts to optimise the display of columns by preventing the drawing of columns that are already there.</i> +</p> +<p><b>windowDrawContextCB()</b> is also called from <b>windowDrawContext()</b> for the normal display of features. +</p> +<p>Similar actions are taken by <b>zmapWindowDraw.c/zMapWindowToggle3Frame()</b> which calls <b>zmapWindowDrawFeatures.c/zmapWindowDraw3FrameFeatures()</b> and <b>zmapWindowDrawRemove3FrameFeatures()</b>. Note that when turning off 3 Frame mode we may have to re-draw single columns for those features defined as 'always display'. +<b>purge_hide_frame_specific_columns()</b> handles removing unwanted columns for these types. +</p> + +</fieldset> diff --git a/doc/Design_notes/notes/performance.html b/doc/Design_notes/notes/performance.html index fd585b133..fb705c790 100644 --- a/doc/Design_notes/notes/performance.html +++ b/doc/Design_notes/notes/performance.html @@ -1,4 +1,4 @@ -<!-- $Id: performance.html,v 1.3 2010-05-18 09:46:14 mh17 Exp $ --> +<!-- $Id: performance.html,v 1.4 2010-06-01 09:23:01 mh17 Exp $ --> <h2>Performance: Making ZMap and Otterlace run faster</h2> <fieldset><legend>Ideas for speeding things up</legend> <p> @@ -11,13 +11,7 @@ This will cut network delays by half, but note that there is a memory problem to </p> </fieldset> <fieldset><legend>Profiling ZMap</legend> -<p><b>gprof</b> is available and does all the obvious stuff. A new build directory can be created with the necessary gcc options (-pg) and run in parallel with existing builds. 
To set this up it is necessary to checkout a new version of ZMap and edit <b>scripts/build_config.sh</b> to set USE_GPROF=yes. This is better than editing your development version as it does not risk forgetting to remove the option for a live build. -</p> -<p>The man page for gprof does not mention whether or not it copes with threaded programs. -</p> -<p>Initial experiments show confusing numbers: the basic flat format output gives foo_canvas etc dominatiion the figures, cumultaive totals give a function somewhere in the middle (processfeature()) with no mention of appMain(). -</p> -<p>Click <a href="Design_notes/notes/profile.shtml">here</a> for some ideas on DIY profiling. +<p>Click <a href="Design_notes/notes/profile.shtml">here</a> for some ideas on profiling. </p> </fieldset> @@ -83,4 +77,8 @@ Which all implies an increase in peak network traffic by a factor of 90 - 150x. </p> <p>One obvious remark is that not every will press the start button at the same instant, and also that not all users will request huge amounts of data, so we can divide the worse case by (eg) a factor of 10. However, considering that to achieve a given performance target it is necessary to design in spare capacity that still leaves us facing in rough terms a network requirement not far removed from GB per second if we just program it blindly. </p> +<h3>Feedback from anacode</h3> +<p>Experience with pipe servers is that moderately parallel use of server scripts soon reaches a performance bottleneck on the web servers. Distant sources (eg DAS in Washington) can time out. +Data is cached by the server scripts and this is expected to resolve this kind of problem after an initial loading of data. +</p> </fieldset> \ No newline at end of file -- GitLab