ZMap Feature Sets and Styles


Preface

Configuration examples for Styles and other components of ZMap are given in GLib keyword value file format. If you are unfamiliar with this format, read the Keyword Value Format section first.

Note that ZMap can also support Styles in other formats as shown in the section on acedb.

NB: This file has been copied from the old web directory and may need updating. (mh17 09 Mar 2010)


Overview

ZMap displays features using "Feature Sets" and "Styles", the distinction between these two is:

"Feature Sets":

Are the fundamental groups of features that ZMap displays. Each feature set has a unique name which can be used to query, find and retrieve the features within that set.

Which feature sets are displayed is controlled by the "featuresets" keyword in the "source" stanza in the ZMap config file:

[source]
url = acedb://localhost:12345?use_methods=true
featuresets = Genomic_canonical;ORF;CDS;Locus;Restriction map;External_transcript;etc...

Normally the names given are the names of individual feature sets, but ZMap also allows feature sets to be grouped into column groups (see Column tags). In this case you only need to put the "column" names in this list; you do not need to include all the column children, as ZMap works these out for you.

"Styles":

Control the appearance and processing of features within ZMap. They control not only aspects such as the colour, width and style of feature display but also how those features are processed during user interaction, e.g. by clicking on them to display details or perhaps dumping them to files. These properties are specified in keyword value format in configuration files. The following example defines a Style called "Allele" with a column width of 2 and coloured red with a black border:


[Allele]
mode = basic
width = 2
colours = normal fill red ; normal border black

The following sections describe the data types used in specifying the properties and the properties themselves.


Data Types used in Style Properties
Special Value Types

Type         Value range                          Description
boolean      true | false                         Be sure to use lower case.
int          min_int <= value <= max_int          Min and Max values for int are system dependent and usually much more than required by ZMap.
double       min_double <= value <= max_double    Min and Max values for double are system dependent and usually much more than required by ZMap.
one-of       < value | value | value >            Several properties can take only one of several values, these are explained in the property table that follows.
colour_type  "Type Target Colour" triplet(s)      The triplet specifies colours for "normal" or "selected" feature display with "fill", "draw" and "border" colours:
  • Type = < normal | selected >
  • Target = < fill | draw | border >
  • Colour = a colour from the X11 rgb.txt list or in hex format: '#rgb', '#rrggbb', '#rrrgggbbb' or '#rrrrggggbbbb'.
  e.g. "normal fill light blue".
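
The colour_type format described above can be illustrated with a small Python sketch. This is not ZMap code: the function name parse_colours is hypothetical, and the sketch assumes that triplets are separated by ";" (as in the "Allele" example) and that everything after the target is the colour, so that names containing spaces such as "light blue" survive.

```python
VALID_TYPES = {"normal", "selected"}
VALID_TARGETS = {"fill", "draw", "border"}


def parse_colours(value):
    """Split a colour_type value into (type, target, colour) triplets.

    Hypothetical helper: assumes ';' separates triplets and that the
    colour is whatever follows the first two words.
    """
    triplets = []
    for part in value.split(";"):
        part = part.strip()
        if not part:
            continue  # tolerate a trailing ';'
        fields = part.split(None, 2)  # colour may itself contain spaces
        if len(fields) != 3:
            raise ValueError(f"expected 'type target colour', got {part!r}")
        colour_type, target, colour = fields
        if colour_type not in VALID_TYPES:
            raise ValueError(f"unknown colour type {colour_type!r}")
        if target not in VALID_TARGETS:
            raise ValueError(f"unknown colour target {target!r}")
        # Collapse runs of whitespace inside the colour name.
        triplets.append((colour_type, target, " ".join(colour.split())))
    return triplets


print(parse_colours("normal fill red ; normal border black"))
```

The check is deliberately case sensitive, matching the advice to write values in lower case; it does not check that the colour itself is a valid X11 name or hex string.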

Style Properties

The following table gives a definitive list of Styles properties:

Styles Properties
Keyword Value Type Description
parent-style string Name of a style from which to inherit properties.
description string Text describing the style perhaps with comments about inheritance etc.
mode < basic | alignment | transcript | dna | peptide | text | graph | glyph | assembly-path > Identifies how a feature should be displayed and processed, e.g. a transcript will be displayed using the accepted block/angle line style.
colours colour_type Feature colours for normal display.
frame0-colours colour_type Feature colours for frame zero in 3 frame display.
frame1-colours colour_type Feature colours for frame one in 3 frame display.
frame2-colours colour_type Feature colours for frame two in 3 frame display.
rev-colours colour_type Feature colours for display on reverse strand.
display-mode < hide | show_hide | show > Controls when features are displayed:
  • hide: never show the features.
  • show_hide: show/hide according to zoom, min/max mag.
  • show: always show features.
bump-mode < unbump | overlap | start-position | alternating | all | name | name-interleave | name-no-interleave | name-colinear | name-best-ends > Controls how features are arranged within their column:
  • unbump: No bumping (default)
  • overlap: Bump any features overlapping each other.
  • start-position: Bump if features have same start coord.
  • alternating: Alternate features between two sub_columns, e.g. to display assemblies.
  • all: A sub-column for every feature.
  • name: A sub-column for features with the same name.
  • name-interleave: All features with same name in a single sub-column but several names interleaved in each sub-column, the most compact display.
  • name-no-interleave: Display as for Interleave but no interleaving of different names.
  • name-colinear: As for No Interleave but for alignments only colinear shown.
  • name-best-ends: As for No Interleave but for alignments sorted by 5' and 3' best/biggest matches, one sub_column per match.
default-bump-mode (as for bump-mode) (as for bump-mode)
bump-spacing double Specifies space to leave between sub-columns of features.
frame-mode < never | always | only-3 | only-1 > Controls how a column is processed for 3 Frame display:
  • never: column is not 3 frame sensitive, display as normal.
  • always: in 3 frame mode, display column features in their 3 columns, as single column otherwise.
  • only-3: display as 3 frame columns only in 3 frame mode.
  • only-1: display as single column only in 3 frame mode.
min-mag double Do not display features when zoom is below this magnification.
max-mag double Do not display features when zoom is above this magnification.
width double Gives a default width to the feature, some features will have different widths, e.g. if they are scaled by score.
score-mode < width | offset | histogram | percent > For features with scores, controls how they are displayed:
  • width: features displayed as blocks, higher the score, wider the block.
  • offset: features displayed as blocks, higher the score, bigger the offset.
  • histogram: features displayed as a graph/histogram.
  • percent: features displayed as graph within certain percentage range.
min-score double Features with a score below this are not displayed.
max-score double Features with a score above this are not displayed.
gff-source string Alternative GFF "source" name to output in GFF dump, default is style name.
gff-feature string Alternative GFF "feature" type to output in GFF dump, default is taken from the feature's type.
displayable boolean If false, never display column.
show-when-empty boolean If true, show column even if it has no features.
show-text boolean If true, display any remarks/text relating to the feature.
strand-specific boolean If true, feature is strand specific meaning that in normal display only forward strand features get displayed (see show-reverse-strand for reverse strand features).
show-reverse-strand boolean If true, for strand specific features, features on reverse strand are displayed in the reverse strand area.
show-only-in-separator boolean If true, features are displayed in the strand separator bar.
directional-ends boolean If true, features which are directional (transcripts, matches etc) are drawn with arrow ends to show direction.
deferred boolean If true, features are not loaded at start up but later by user request.
loaded boolean If true, indicates that a set of features has been loaded.
graph-mode < line | histogram > Controls drawing style for a graph column:
  • line: a line graph.
  • histogram: a histogram.
graph-baseline double Sets a baseline for drawing, graphs are drawn up and downwards from this value.
glyph-mode < splice > Controls which glyph is used to draw a column:
  • splice: walking stick shapes indicate splice sites.
alignment-parse-gaps boolean If true, gapped alignment data is parsed out of the incoming data stream.
alignment-align-gaps boolean If true, gapped alignment data is drawn.
alignment-within-error int Specifies allowable alignment error in bases between sub-blocks within a single gapped alignment.
alignment-between-error int Specifies allowable alignment error in bases between separate alignments.
alignment-allow-misalign boolean If true then match and reference do not have to be exactly the same length, reference coords will be used for display.
alignment-blixem < blixem-n | blixem-x > If specified then the sequences of the features will be passed to blixem for:
  • blixem-n: nucleotide alignment.
  • blixem-x: peptide alignment.
alignment-pfetchable boolean If true, the sequence for an alignment feature can be retrieved using the pfetch server.
alignment-perfect-colours colour_type Specifies the colour for the bar joining perfectly colinear alignments.
alignment-colinear-colours colour_type Specifies the colour for the bar joining colinear alignments.
alignment-noncolinear-colours colour_type Specifies the colour for the bar joining non-colinear alignments.
transcript-cds-colours colour_type Specifies the colour for the CDS part of a transcript.
non-assembly-colours colour_type Colours for the non-assembly section of a clone.
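
As a reading aid for the display-mode, min-mag and max-mag rows of the table, the following Python sketch shows one way the three properties could combine. It is an interpretation of the table text only, not ZMap's implementation: the function name should_display is hypothetical, and it assumes the current zoom is expressed in the same units as min-mag and max-mag.

```python
def should_display(display_mode, zoom, min_mag=None, max_mag=None):
    """Decide whether a column's features are shown at the current zoom.

    Hypothetical helper following the property table:
      hide      -> never shown
      show      -> always shown
      show_hide -> shown unless zoom is below min-mag or above max-mag
    """
    if display_mode == "hide":
        return False
    if display_mode == "show":
        return True
    if display_mode != "show_hide":
        raise ValueError(f"unknown display-mode {display_mode!r}")
    if min_mag is not None and zoom < min_mag:
        return False
    if max_mag is not None and zoom > max_mag:
        return False
    return True


print(should_display("show_hide", 5.0, min_mag=1.0, max_mag=10.0))
```

Unset limits are treated as "no limit", which is an assumption of the sketch rather than something the table states.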

    Inheritance

    There is only one mandatory property for a style and that is its name; the name is used to make a unique id for the style. This very flexible policy is what allows inheritance to work, but has the disadvantage that the user must set certain properties explicitly for there to be enough information for ZMap to display the set of features referencing that style.

    The implementation allows any number of styles inheriting settings from parent styles to any number of levels. This facility can be used to give common properties to sets of features that are common in some way (e.g. all wublast hits).

    If you have several styles that are similar you can specify all the common properties in one parent style and then in the child styles just specify those properties that are different. You can have any level of inheritance you like BUT the inheritance must be a DAG, no circularities or multiple inheritance are allowed:

    # parent-style gives the name of a style from which to inherit properties
    # which can be overridden in this style.
    parent-style = < name of parent style >
    
    

    Here's an example in keyword-value format showing a base parent for all box like features, a subparent for all BLASTN data sets and then some styles for different BLASTN columns:

    [Basic_Feature_Parent]
    description = The base style for all box like features.
    colours = normal border black


    [BLASTN_Parent]
    mode = alignment
    parent-style = basic_feature_parent
    colours = normal fill brown
    width = 15.000000
    bump-mode = complete
    score-mode = width
    min-score = 100.000000
    max-score = 400.000000
    gff-feature = transcription


    [blastn_est_briggsae]
    parent-style = blastn_parent
    strand-specific = true
    gff-source = blastn_est_briggsae


    [blastn_est_elegans]
    parent-style = blastn_parent
    strand-specific = true
    gff-source = blastn_est_elegans


    [blastn_tc1]
    description = this method was used to map flanking sequences of tc1 insertions etc etc
    parent-style = blastn_parent
    gff-source = blastn_tc1
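
The inheritance rules above (single parent, any depth, no cycles) can be sketched in Python as follows. This is an illustration under stated assumptions, not ZMap's code: resolve_style is a hypothetical name, style names are matched case insensitively (the example refers to [Basic_Feature_Parent] as basic_feature_parent), and a child overrides a parent's property as a whole key. How ZMap combines individual colour triplets from parent and child is not modelled.

```python
def resolve_style(name, styles):
    """Return the effective properties of a style after inheritance.

    styles maps style name -> dict of properties, where "parent-style"
    optionally names the parent. Hypothetical helper: child values
    replace parent values key by key.
    """
    by_id = {style.lower(): props for style, props in styles.items()}
    chain = []
    current = name.lower()
    while current is not None:
        if current in chain:
            raise ValueError(f"circular inheritance at style {current!r}")
        if current not in by_id:
            raise KeyError(f"unknown style {current!r}")
        chain.append(current)
        parent = by_id[current].get("parent-style")
        current = parent.lower() if parent else None

    merged = {}
    for style_id in reversed(chain):  # most distant ancestor first
        merged.update(by_id[style_id])
    merged.pop("parent-style", None)
    return merged


styles = {
    "Basic_Feature_Parent": {"colours": "normal border black"},
    "BLASTN_Parent": {
        "mode": "alignment",
        "parent-style": "basic_feature_parent",
        "width": "15.000000",
    },
    "blastn_tc1": {"parent-style": "blastn_parent", "gff-source": "blastn_tc1"},
}
print(resolve_style("blastn_tc1", styles))
```

Walking the chain with an explicit list makes the cycle check straightforward: a style name that appears twice means the inheritance is not a DAG.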
    

    Minimum Drawable Style

    In order to support inheritance (see the Inheritance section) there are _no_ defaults in styles. This means that the user needs to specify a minimum subset of properties in order for a style to be "drawable". Note that ZMap does not have display defaults embedded in the code because past experience has shown that this requires a large number of defaults that are at best "guesses" as to what the user wants to see. This leads to confusion and uncertainty about how feature display is controlled.

    The following properties must be specified if the features referencing the style are to be displayed:

    Minimum Displayable Style

    Property                                     Description
    "mode"                                       How the feature should be displayed/processed.
    "bump-mode"                                  How the feature should be bumped.
    "width"                                      Basic display width of feature.
    "border" and/or "fill" and/or "draw" colour  Basic, Transcript and Alignment features require fill and/or border colours, all text (e.g. dna) requires fill and/or draw colours.

    Thus the minimum specification of a style is:

    # Min drawable class.
    [min_drawable_style]
    mode = xxxxx
    width = nnn.nnn
    colours = normal border some colour
    bump-mode = xxxx
    

    Here are some examples:

    [allele]
    mode = basic
    colours = normal fill orange
    width = 1.100000
    bump-mode = complete
    
    
    [blastn_est_briggsae]
    mode = alignment
    colours = normal fill brown
    width = 15.000000
    bump-mode = complete
    
    
    [curated]
    mode = transcript
    colours = normal border darkblue
    width = 15.000000
    bump-mode = compact_cluster
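
The minimum requirements listed above can be checked mechanically. The Python sketch below is a simplified illustration, not part of ZMap: missing_for_display is a hypothetical name, and it only checks that mode, bump-mode and width are present and that "colours" contains at least one fill, draw or border triplet. It does not check the mode-specific rule about which colour targets each kind of feature needs.

```python
REQUIRED_KEYS = ("mode", "bump-mode", "width")
COLOUR_TARGETS = {"fill", "draw", "border"}


def missing_for_display(props):
    """Return the names of required properties absent from a style.

    props is a dict of key -> string value for one style, ideally after
    any inherited properties have been merged in. An empty result means
    the style meets the minimum described in the table.
    """
    missing = [key for key in REQUIRED_KEYS if not props.get(key)]

    has_colour = False
    for part in props.get("colours", "").split(";"):
        fields = part.split()
        # A usable triplet is "type target colour" with a known target.
        if len(fields) >= 3 and fields[1] in COLOUR_TARGETS:
            has_colour = True
    if not has_colour:
        missing.append("colours")
    return missing


curated = {
    "mode": "transcript",
    "colours": "normal border darkblue",
    "width": "15.000000",
    "bump-mode": "compact_cluster",
}
print(missing_for_display(curated))
print(missing_for_display({"mode": "basic"}))
```

Because styles have no defaults, a check like this is best run on the fully inherited style rather than on each stanza in isolation.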
    

    Colours

    There is no automatic border colour so it needs to be specified for most features, e.g.

    colours = normal fill blue ; normal border black
    

    The following colours are used by Otterlace but do not exist in the X11 rgb.txt file and so produce errors in the colour conversion call in the style code:

    paleviolet
    paleorange
    palegray
    cerise
    

    Strand stuff

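    In acedb "frame_sensitive" had more than one meaning, and which of these applied to a given method was hard coded in acedb. With styles you must specify which behaviour you want:

    frame-mode = always        in 3 frame mode the column's features are displayed in their 3 frame columns, otherwise as a single column

    frame-mode = only-3        the column is displayed as 3 frame columns, and only in 3 frame mode

    frame-mode = only-1        the column is displayed as a single column, and only in 3 frame mode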
    

    Bump stuff

    Bump modes now need to be set explicitly or nothing will happen to the column on bumping.

    bump-mode = XXXXXXX      sets the initial bump mode.
    
    default-bump-mode = XXXXXX    sets the mode that the column will be bumped with by default.


    If you want the column to be initially "unbumped" but bumped with colinear lines etc. when it is bumped, you just need to set default-bump-mode; bump-mode will be unbumped by default, e.g.

    default-bump-mode = range_colinear
    

    Alignments

    All alignments, e.g. EST_human, should be given the mode "alignment", not "basic". They also need the "Internal" and "External" tags set if internal and external (between HSP) gaps for all matches for a given sequence should be joined up with lines:

    [WABA_parent_all]
    description = Inherited method for all WABA hits.
    mode = alignment
    colours = normal fill darkgreen ; normal border black
    width = 6.000000
    bump-mode = complete
    default-bump-mode = range_colinear
    score-mode = width
    min-score = 40.000000
    max-score = 120.000000
    alignment-align-gaps = true
    

    Predefined Styles

    There are a number of predefined styles for common features such as DNA sequence display. These features have reserved names which should not be used for other sorts of features. The feature set and the style for these features have the same name.

    Some of these styles are "meta" styles which control the action of a column rather than specific features, e.g. "3 Frame" controls whether and where the 3 frame translation columns are displayed but the individual 3 Frame columns display is controlled by the style for each column. Others are normal styles and some of their settings can be overridden by the user. The following table summarises the predefined styles:

    Predefined Styles

    Style                Style Type  User Modifiable  Description
    3 Frame              Meta        No               controlling 3 frame display
    3 Frame Translation  Normal      Yes              3 frame protein translation display
    dna                  Normal      Yes              dna sequence display
    Locus                Normal      Yes              display of a column of locus names + display of locus names in navigator
    GeneFinderFeatures   Meta        No               splice sites from the Gene Finder program
    Show Translation     Normal      Yes              Show peptide translation column
    Assembly Path        Normal      Yes              Show reference sequence assembly path

    The rules for using these predefined styles are:


    ZMap Styles and Acedb

    ZMap will read either acedb ?Method or ?ZMap_style objects for style information. The default is to read ?ZMap_style objects and you should convert to using these instead of methods as many features are not available with methods.

    The relationship between features and ?Method objects is different according to whether you are using ?ZMap_style objects or ?Method objects to specify styles. The following sections describe how to use each one. Support for ?Method object styles will be maintained to allow immediate usage of acedb databases without the need to convert to styles.

    Using ?ZMap_style objects for Styles

    Using ?ZMap_style objects means that all feature display is completely controlled by the ZMap_style object and all retrieval of features from the acedb database is completely controlled by the acedb ?Methods. This echoes the separation adopted by DAS and other feature display protocols.

    This is where we always wanted to get to, the style is now completely independent of the feature set. BUT note the corollary of this: each feature set must now have both a method object AND a style.

    A number of classes must be added to wspec/models.wrm to fully support ZMap Styles from an acedb database and several tags must be added to the ?Method class. The following is a definitive, annotated list of the new classes and tags which can be copied and pasted direct into a models.wrm file.

    The ?ZMap_style Class

    
    

    Expressing Inheritance using the ZMap_style class

    The principles of inheritance have already been covered, this section gives details for the acedb ?ZMap_style class specification.

    The inheritance tags in the ?ZMap_style class are:

                // Parent points to a parent style from which attributes can be inherited,
                // there can be an arbitrary depth of parents/children but they must form
                // a DAG, cycles and multiple inheritance are _not_ permitted.
                Parent Style_parent UNIQUE ?ZMap_Style XREF Style_child
                       Style_child         ?ZMap_Style XREF Style_parent
    

    Note that you need only specify the Style_parent in your ace file; the XREFs will take care of all the Style_child entries. Note also that the tags attempt to enforce a DAG by allowing only one parent but multiple children.

    Here's an example in ace format showing a base parent for all box like features, a subparent for all BLASTN data sets and then some styles for different BLASTN columns:

    ZMap_style : "Basic_Feature_Parent"
    Remark   "The base style for all box like features."
    Colours	 Normal Border "black"
    
    
    ZMap_style : "BLASTN_Parent"
    Alignment
    Style_Parent	 "Basic_Feature_Parent"
    Colours	 Normal Fill "brown"
    Width	 15.000000
    Unbumped
    Score_by_width
    Score_bounds	 100.000000 400.000000
    GFF	 Feature "transcription"
    
    
    ZMap_style : "BLASTN_EST_briggsae"
    Style_Parent	 "BLASTN_Parent"
    Strand_sensitive
    GFF	 Source "BLASTN_EST_briggsae"
    
    
    ZMap_style : "BLASTN_EST_elegans"
    Style_Parent	 "BLASTN_Parent"
    Strand_sensitive
    GFF	 Source "BLASTN_EST_elegans"
    
    
    ZMap_style : "BLASTN_TC1"
    Remark	 "This method was used to map flanking sequences of Tc1 insertions etc etc"
    Style_Parent	 "BLASTN_Parent"
    GFF	 Source "BLASTN_TC1"
    

    The ?Method Class

    The ?Method class has existed in Acedb since the first display code was written, but when using ?ZMap_style objects its usage changes. ZMap only uses the tags outlined in this section.

    //=========================================================================================
    // Method:
    //         With styles the method class will have a different meaning, it will
    // be used to represent different sets of features and the Column_group tag
    // can be used to clump sets of features into one common set.
    //
    //
    ?Method	Remark ?Text #Evidence
            // ZMap related tags.
            //
            // ZMap_style points to a style specifying how to display/process this feature set.
            //
            // Column_parent specifies the parent "feature set" for features that reference
            // this method, this is the "column" in zmap.
            // Column_child specifies child feature sets that exist within this parent
            // feature set or column. Note that this is a one level thing: children do not
            // have their own children, parents do not have parents. N.B. the XREF will fill in
            // this tag.
            //
            Feature_set Style UNIQUE ?ZMap_style
                        Group UNIQUE Column_child ?Method XREF Column_parent
                                     Column_parent UNIQUE ?Method XREF Column_child
            //
            // All other method tags are as before and will be ignored by ZMap.
            //
    //=========================================================================================
    

    The Style tag _must_ be populated for a feature set to be displayed by zmap.

    The Column_group tag is now defunct. Instead there are parent/child tags which take advantage of acedb's UNIQUE and XREF mechanisms to ensure that there is only one level of parent/child, i.e. that methods are either parents or children. When using the Group tags you only need to specify the Column_parent tag; the XREF will fill in the Column_child tag.

    The name of the column is the name of the method unless there is a Column_parent method in which case it is the name of the Column_parent method.

    The trailing Float for the Column_parent tag specifies a priority for ordering the column children features into "sub_columns". Low to high priorities go left to right on the forward strand, right to left on the reverse strand.

    An example of column group usage:

    //------------------------------------------------------------------------------------
    
    // A feature set with no parent which will display in the column "Eds_alignments":
    
    Method : "Eds_alignments"
    Style "My favourite lovely style"
    
    
    // A feature set with child feature sets which will all be displayed in the column "Parent column".
    
    Method : "Parent column"
    Style "overall column style"
    
    
    Method : "First Child feature set"
    Column_parent "Parent column"
    Style "my style"
    
    Method : "Second Child feature set"
    Column_parent "Parent column"
    Style "my style"
    
    //------------------------------------------------------------------------------------
    
    The "featuresets" entry in the "source" stanza of the ZMap config file would be:
    [source]
    
    featuresets = Eds_alignments ; Parent column
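
The column naming rule above (a method's column is its Column_parent if it has one, otherwise the method itself) can be sketched in Python. The names column_name and columns_for are hypothetical, and the dictionary of parents stands in for the Column_parent tags in the ace example; this is an illustration of the rule, not of how ZMap reads acedb.

```python
def column_name(method, column_parents):
    """Return the column a method's features are displayed in."""
    return column_parents.get(method, method)


def columns_for(methods, column_parents):
    """Group methods by the column they are displayed in."""
    columns = {}
    for method in methods:
        column = column_name(method, column_parents)
        columns.setdefault(column, []).append(method)
    return columns


# Column_parent tags from the example above.
column_parents = {
    "First Child feature set": "Parent column",
    "Second Child feature set": "Parent column",
}
methods = [
    "Eds_alignments",
    "Parent column",
    "First Child feature set",
    "Second Child feature set",
]
columns = columns_for(methods, column_parents)
print(" ; ".join(columns))
```

The keys of the resulting dictionary are the names that belong in the "featuresets" list; the child feature sets do not need to be listed separately.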
    

    Genefinder and Auto-created Method objects

    The acedb code creates some method objects itself, this is generally for display of features calculated by the code, e.g. gene finder features. (In fact I think this is the only case we need to be aware of...famous last words....)

    To avoid zmap having to second guess acedb's self-generated methods, the database administrator must add these methods in advance. A simple way to do this is as follows:

    1. Run the genefinder from within fmap and then "Save" the database before exiting; the auto-created methods will then be saved in the database.

    2. Add a "Style" tag to each genefinder method:

            //------------------------------------------------------------------------------------
            Method : "ATG"
            Style "ATG"
      
            Method : "hexExon"
            Style "hexExon"
      
            Method : "hexIntron"
            Style "hexIntron"
      
            Method : "GF_coding_seg"
            Style "GF_coding_seg"
      
            Method : "GF_ATG"
            Style "GF_ATG"
      
            Method : "GF_splice"
            Style "GF_splice"
            //------------------------------------------------------------------------------------
            
    3. Add the Style for each method:

            //------------------------------------------------------------------------------------
            ZMap_Style : "ATG"
            Remark	 "This method is used by acedb to display potential methionine initiation codons"
            Basic
            Colours	 Normal Fill "yellow"
            Width	 15.000000
            Bump_mode	 Complete
            Strand_sensitive
            Show_only_as_3_frame
            GFF	 Source "ATG"
            GFF	 Feature "translation"
      
            ZMap_Style : "hexExon"
            Remark	 "GeneFinder method"
            Basic
            Colours	 Normal Fill "orange"
            Colours	 Normal Border "black"
            Width	 15.000000
            Bump_mode	 Complete
            Strand_sensitive
            Show_only_as_3_frame
            Score_by_width
            Score_bounds	 10.000000 100.000000
      
            ZMap_Style : "hexIntron"
            Remark	 "GeneFinder method"
            Basic
            Colours	 Normal Fill "orange"
            Colours	 Normal Border "black"
            Width	 15.000000
            Bump_mode	 Complete
            Strand_sensitive
            Show_only_as_3_frame
            Score_by_width
            Score_bounds	 10.000000 100.000000
      
            ZMap_Style : "GF_coding_seg"
            Remark	 "GeneFinder method"
            Basic
            Colours	 Normal Fill "gray"
            Colours	 Normal Border "black"
            Width	 15.000000
            Bump_mode	 Complete
            Strand_sensitive
            Show_only_as_3_frame
            Score_by_width
            Score_bounds	 2.0 8.0
      
            ZMap_Style : "GF_ATG"
            Remark	 "GeneFinder method"
            Basic
            Colours	 Normal Fill "orange"
            Colours	 Normal Border "black"
            Width	 15.000000
            Bump_mode	 Complete
            Strand_sensitive
            Show_only_as_3_frame
            Score_by_width
            Score_bounds	 0.000000 3.000000
      
            ZMap_Style : "GF_splice"
            Remark	 "GeneFinder method"
            Glyph	 Splice
            Frame_0	 Normal Fill "red"
            Frame_1	 Normal Fill "blue"
            Frame_2	 Normal Fill "green"
            Width	 10.000000
            Bump_mode	 Complete
            Strand_sensitive
            Show_only_as_3_frame
            Show_only_as_1_column
            Score_by_width
            Score_bounds	 -2.000000 4.000000
      
            //------------------------------------------------------------------------------------
            

    Using ?Method objects for styles (deprecated)

    Rules for using ?Method objects:

    Using ?Method objects means that both feature display and feature retrieval are completely controlled by the ?Method object. There is no separation of these two: the ?Method is both the feature set and its display/processing. The ?Method object tags will not be extended so only a subset of ZMap display features can be used. Use of ?Method objects should be seen as a "quick and easy" way of display which will duplicate most of the acedb FMap display but that is all. The following extract from the ?Method class shows which tags ZMap reads.

    // Method:
    //
    // These are the only tags read by ZMap.
    //
    //
    ?Method	Remark ?Text #Evidence
            //
            // the Display information controls how the column looks.
    	Display Control UNIQUE No_display                   // never display features
                                   Init_hidden          // initially hide features
     		Colour #Colour
     		CDS_colour #Colour
    		Frame_sensitive             // Is feature read frame dependent ?
    		Strand_sensitive Show_up_strand #Colour
    		Score	Score_by_offset	// has priority over width, for Jean
    			Score_by_histogram UNIQUE Float	// baseline value
    			Score_bounds UNIQUE Float UNIQUE Float
    		Overlap_mode UNIQUE	Overlap		  // draw on top - default
    					Bumpable	  // bump to avoid overlap
    			     		Cluster		  // one column per homol target
    		Width UNIQUE Float
    		Max_mag UNIQUE Float	// don't show if more bases per line
    		Min_mag UNIQUE Float	// don't show if fewer bases per line
    		Gapped         // draw sequences or homols with gaps
    		Join_blocks    // link up all blocks of a single feature with lines
    	GFF	GFF_source UNIQUE Text
    		GFF_feature UNIQUE Text
    

    Converting from ?Method to ?ZMap_style objects

    ZMap includes a small utility script, methods2style.pl, to do a first pass conversion. The result will require hand editing as some fields cannot be determined from the ?Method object, e.g. Mode.

    SYNOPSIS: methods2style.pl -file  -help
    
    e.g.      methods2style.pl -file ./my/methods/file > ./my_new/styles/file
    

    Keyword Value Format

    "Key Files" are derived from MS Windows ".ini" files although the standard they follow (the Desktop Entry Specification) is different in places. The basic format is:

    # this is an example copied from the GLib documentation of these files.
    
    [First Group]
    
    Name=Key File Example\tthis value shows\nescaping
    
    Welcome = Hello
    
    
    [Another Group]
    
    Numbers=2;20;-200;0
    
    Booleans=true;false;true;true


    "Case Sensitivity:" key files are case sensitive and you should be careful to specify all information in lower case.

    A restriction of Key Files is that because they support merging of groups within a file you cannot use multiple groups with the same name to produce separate groups. Each separate group must have a unique name. What this means for Styles is that the Style name is the group name:

    
    [Curated]
    width = 2
    
    [Allele]
    width = 1
    
    etc.
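
The effect of group merging can be tried out with Python's standard configparser module, used here only as a stand-in for GLib's key file parser (the two are similar but not identical). The function name load_styles is hypothetical. With strict=False, configparser merges repeated groups of the same name, which mirrors the restriction described above: two [Curated] groups become one style, not two.

```python
import configparser


def load_styles(text):
    """Read key file text and return {style name: {key: value}}.

    Sketch only: Python's configparser approximates the key file format.
    Keys keep their case, '=' is the only delimiter, and repeated groups
    with the same name are merged into one.
    """
    parser = configparser.ConfigParser(
        delimiters=("=",),
        comment_prefixes=("#",),
        interpolation=None,
        strict=False,
    )
    parser.optionxform = str  # do not lower-case keys
    parser.read_string(text)
    return {name: dict(parser[name]) for name in parser.sections()}


sample = """\
[Curated]
width = 2

[Allele]
width = 1

[Curated]
colours = normal border darkblue
"""
styles = load_styles(sample)
for name, props in styles.items():
    print(name, props)
```

Group names are case sensitive in this sketch, so [Curated] and [curated] would be read as two different styles.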