Skip to content
Snippets Groups Projects
Commit 232701f1 authored by Andreas Kusalananda Kähäri's avatar Andreas Kusalananda Kähäri
Browse files

Guzzle, a stand-alone Perl-based DAS client.

parent c3087343
No related branches found
No related tags found
No related merge requests found
Guzzle
A STAND-ALONE PERL-BASED WEB DAS-CLIENT
Author:
Andreas Kähäri (EMBL-EBI/ensembl),
andreas.kahari@ebi.ac.uk
ABOUT
Guzzle consists of the "guzzle.pl" Perl script. It is a web based
client program that may be used to interface a distributed annotation
server (DAS). The name means "to eat or drink quickly, eagerly and
usually in large amounts" (Cambridge International Dictionary of
English). More information about DAS may be found at [1].
This particular client was developed at the EBI to be used with a
Dazzle server [2] or a Lightweight Distributed Annotation Server
(LDAS) [3] serving protein features, but there is really nothing
stopping it from being used as a more generic DAS client for LDAS or
Dazzle or any other server that serves data in a way that is
consistent with version 1.5 of the DAS protocol.
USAGE
So, what does Guzzle do then? When invoked as a CGI script by a web
browser, it will present a request page consisting of a number of
predefined DAS server sources (with the possibility of adding any
number of user-defined sources). Each source has a name and an URL
associated with it. The URL is equivalent to the DSN, the DAS Source
Name, of the source. In addition to this, there is also the
possibility to choose a colour to identify each source or to specify
that the colour should be taken from the DAS stylesheet of the source,
if it has one.
Once the DAS sources have been chosen, the ID of the sequence that
interests you may be entered together with an optional range. The
type of ID that should be used (SwissProt, Ensembl, or what have you)
depends entirely on the DAS sources that are being queried.
Once done, press the "GO" button.
The result page will contain a PNG image showing the tracks found in
the specified region on the sequence as colored boxes, followed by a
table containing the equivalent information. If link information was
retrieved from the DAS source, the table will contain the appropriate
links.
The descriptive labels on the features in the tracks may be turned off
by unchecking "Show descriptions on tracks". You are also be able
to sort the information in the table in various ways by using
the controls at the bottom of the page and clicking the "Update
display"-button.
CODE DEPENDENCIES
Although it was developed within the Ensembl project at the EBI and
Sanger institutes [4], the code does not depend on any pre-existing
Ensembl code, which hopefully makes it easy to install, to use, and to
modify.
The client does have dependencies on non-standard Perl modules. This
is a list of all modules that it depends on:
Bio::Das required
CGI required
Bio::Graphics required for graphics (optional)
Bio::SeqFeature::Generic required for graphics (optional)
File::Temp required for graphics (optional)
LWP::UserAgent required for faking
stylesheets (optional)
Data::Serializer required for being nice
to DAS servers (optional)
Most of these Perl modules have additional dependencies in themselves
(e.g. the Bio::Graphics module depends on the Perl GD module etc.),
but they are all available through CPAN.
INSTALLATION
Make sure that all dependencies are met (see above) and copy the
guzzle.pl script into the cgi-bin directory of a web server. Make
sure the script has the right permissions (executable). Configure the
client before running for the first time (see below).
CONFIGURATION
Guzzle is configured by changing the definition of a set of variables
in the head of the guzzle.pl script file itself. The comments in the
script should be descriptive enough for most configurational things.
You have the option to turn certain features on or off. These
features are the graphics, the "being nice to DAS servers" feature,
and the stylesheet feature. Turning features off removes dependencies
on some Perl modules.
Turning the graphics feature off will inhibit the creation of
temporary PNG files (which otherwise would need to be cleaned out with
a cron-job or something similar). There will be no graphical view of
the tracks on the result page.
The "being nice to DAS servers" feature is a feature that ensures that
the DAS servers will only be queried once. Without it, the servers
will be queried for features and stylesheets whenever the user presses
the "GO" or "Update display" buttons. With the feature, the responses
from the DAS servers are serialized, compressed, and stored as a
string in a hidden field on the result page.
Turning off the stylesheet feature will prevent the client from ever
asking the DAS server for stylesheet information.
BUGS
The DAS stylesheet support is a hack. The Bio::Das modules from CPAN
does not contain the implementation of the Bio::Das::Stylesheet
module. Rather than building the client around a version of the
Bio::Das modules that can only be found on a CVS server, I decided to
fake the stylesheet support of the client using LWP::UserAgent. This
hack simply picks up the *first* encountered foreground color from the
reply to a stylesheet request. I do not currently use glyphs or other
things specified in the stylesheet of a DAS source.
The generated graphics isn't beautiful.
Better suggestion for names are welcomed.
TO DO
Sequence ID translation and annotation mapping. Watch this space, I'm
working on it.
REFERENCES
[1] http://www.biodas/org/
[2] http://www.biojava.org/dazzle/
[3] http://www.biodas.org/servers/
[4] http://www.ensembl.org/
#!/usr/bin/perl
# $Id$
#
# GUZZLE
#
# A STAND-ALONE PERL-BASED WEB DAS-CLIENT
#
#
# Author: Andreas Khri (EMBL-EBI/ensembl),
# andreas.kahari@ebi.ac.uk
use strict;
use warnings;
# Whereever you keep non-standard modules
use lib qw(/opt/local/libdata/perl5/i386-openbsd/5.8.0
/opt/local/libdata/perl5
/opt/local/libdata/perl5/site_perl/i386-openbsd
/opt/local/libdata/perl5/site_perl);
use Bio::Das;
use CGI::Pretty qw(:standard -compile);
#use Data::Dumper;
#---------------------------------------------------------------
# Configurable things (change these):
# $contact:
# Who's responsible for this thing (webmaster or similar).
#
my $contact = 'Yours Truly<br>yt@some.place';
# $htdocs:
# Where the web server looks when someone requests the web
# server root ('/') URL.
#
my $htdocs = '/var/www/htdocs';
# $tmpurl:
# The URL to the directory which hold the generated images.
#
my $tmpurl = '/guzzle_tmp';
# $tmpdir:
# Where the directory of $tmpurl is located on the system.
# MAKE SURE THIS DIRECTORY EXISTS AND THAT IT IS WRITABLE AND
# EXECUTABLE BY THE WEB SERVER USER!
#
my $tmpdir = $htdocs . $tmpurl;
# $query_page_title and $result_page_title:
# These holds the title stings for the two pages.
#
my $query_page_title = 'Guzzle query page';
my $result_page_title = 'Guzzle result page';
# @presets:
# The DSN's to present by default on the query page. A DSN with
# a non-zero CHECKED field will be checked (used/queried) by
# default.
#
my @presets = (
{ NAME => 'Pfam-A (test)',
DSN => 'http://uhuru/cgi-bin/das/pfam',
CHECKED => 1 },
{ NAME => 'Sprot (test)',
DSN => 'http://uhuru/cgi-bin/das/sprot',
CHECKED => 1 } );
# $nblanks:
# The number of blank/empty fields to present on the query page
# by default.
#
my $nblanks = 1;
# @colours:
# The colours that the user should be able to
# choose from. Must be "named colors", see e.g.
# "http://webdesign.about.com/library/bl_namedcolors.htm".
#
my @colours = qw( red green blue magenta cyan yellow gray black );
# $stylesheet:
# A CSS stylesheet that will be included in every generated web
# page.
#
my $stylesheet = <<EOT;
body {
color: #000;
background: #6a6;
margin: 5%;
font-family: helvetica, arial, sans-serif;
}
td {
font-size: x-small;
}
h1 {
font-family: helvetica, arial, sans-serif;
}
address {
font-family: times, serif;
color: #666;
font-size: small;
}
tt {
font-family: courier, monospace;
}
.thetable {
background: #9c9;
}
EOT
#
# OPTIONAL FEATURES
#
# $use_stylesheets
# Whether to try to make use of the DAS stylesheets or not
# (non-zero is "yes").
#
my $use_stylesheets = 1;
# $be_nice
# Whether to be nice to DAS servers or not.
#
my $be_nice = 1;
# $use_graphics
# Whether to generate and use graphics ro not.
#
my $use_graphics = 1;
# No servicable parts inside...
#---------------------------------------------------------------
if ($use_stylesheets) {
require LWP::UserAgent;
}
if ($be_nice) {
require Data::Serializer;
}
if ($use_graphics) {
require Bio::Graphics;
require Bio::SeqFeature::Generic;
require File::Temp;
import File::Temp 'tempfile';
}
#---------------------------------------------------------------
# page_start_and_head():
# Will create the head of a HTML page with the CSS stylesheet
# from above ($stylesheet) and a H1 header with the page title.
#
sub page_start_and_head
{
my $cgi = shift;
my $title = shift;
print $cgi->header( -expires => 'now' );
print $cgi->start_html(
-title => $title,
-style => $stylesheet);
print $cgi->h1({ style => 'text-align:center' }, $title);
}
# page_foot_and_end():
# Will put the contact address in the page footer and then
# close/finish the page.
#
sub page_foot_and_end
{
my $cgi = shift;
print $cgi->hr, $cgi->address($contact), $cgi->end_html;
}
#---------------------------------------------------------------
# create_graphics():
# Given the array of structures created in result_page(), will
# create a PNG image and return its URL.
#
sub create_graphics
{
my $table = shift; # "Table" since it's the data that goes
# into the table on the result page.
my $do_descr = shift; # Whether to include the descriptions
# on each feature in the graph or not.
my $minpos = $table->[0]{FEATURE}->start;
my $maxpos = $table->[0]{FEATURE}->stop;
my $full_seq;
my %all;
foreach my $table_row (@{ $table }) {
my $feature = $table_row->{FEATURE};
if ($minpos > $feature->start) {
$minpos = $feature->start;
}
if ($maxpos < $feature->stop) {
$maxpos = $feature->stop;
}
my $tag = $table_row->{COLUMNS}[0];
my $source = $table_row->{COLUMNS}[1];
if ($feature->group =~ /^Sequence:/) {
if (!defined $full_seq) {
$full_seq = new Bio::SeqFeature::Generic(
-start => $feature->start,
-end => $feature->end );
}
} else {
push(@{ $all{$source}->{$tag} },
new Bio::SeqFeature::Generic(
-start => $feature->start,
-end => $feature->end,
-primary => $feature->group,
-source_tag => $feature->type->label,
-tag => {
colour => $table_row->{COLOUR}
} ));
}
}
my $panel = new Bio::Graphics::Panel(
-start => $minpos,
-stop => $maxpos,
-grid => 1,
-key_style => 'between',
-pad_left => 10,
-pad_right => 100,
-pad_top => 10,
-pad_bottom => 10);
my $full_length = new Bio::SeqFeature::Generic(
-start => $minpos,
-end => $maxpos);
# Add a nice arrow at the top of the image.
$panel->add_track($full_length,
-glyph => 'arrow',
-tick => 2,
-fgcolor => 'black',
-double => 1);
# (don't show the track for the whole sequence, it's not
# interesting)
#if (defined $full_seq) {
#$panel->add_track($full_seq);
#}
# Populate the panel.
foreach my $source (values %all) {
foreach my $tag (keys %{ $source }) {
if (defined $do_descr) {
$panel->add_track($source->{$tag},
-key => $tag,
-description => sub { $_[0]->source_tag },
-bgcolor => $source->{$tag}[0]->_tag_value('colour'),
-font2color => 'black' );
} else {
$panel->add_track($source->{$tag},
-key => $tag,
-bgcolor => $source->{$tag}[0]->_tag_value('colour'));
}
}
}
# Get a handle of a temporary (persistent) file.
my ($tmpfh, $tmpname) = tempfile('pdas_XXXXX',
UNLINK => 0,
SUFFIX => '.png',
DIR => $tmpdir);
# Dump the image data into the file.
syswrite($tmpfh, $panel->png);
# Make the image file readable for web clients.
chmod(0644, $tmpname);
$tmpname =~ s#.*/##; # Does a 'basename' operation.
return $tmpurl . '/' . $tmpname;
}
#---------------------------------------------------------------
# query_page():
# Creates the page with the query form.
#
sub query_page
{
my $cgi = shift;
if (defined $cgi->param('COUNT')) {
# We've been here before, so figure out how many blank
# lines we should display.
$nblanks = $cgi->param('COUNT') - scalar(@presets);
}
if (defined $cgi->param('INCR')) {
# User wants to insert more blank lines.
++$nblanks;
} elsif (defined $cgi->param('DECR')) {
# User wants to remove blank lines.
--$nblanks;
}
$nblanks = 0 if ($nblanks < 0);
page_start_and_head($cgi, $query_page_title);
$cgi->delete_all;
print $cgi->start_form;
print $cgi->start_table({
-class => 'thetable',
-border => '0',
-cellpadding => '2',
-cellspacing => '2',
-align => 'center' });
if ($use_stylesheets) {
print $cgi->Tr($cgi->th( [ qw( Use Name URL Colour ),
'... or Stylesheet' ] ));
}
my $count = 0;
foreach my $preset (@presets) {
print $cgi->Tr($cgi->td( { -align => 'center' },
$cgi->checkbox(
-name => 'USE',
-value => $count,
-checked => $preset->{CHECKED},
-label => '')),
$cgi->td( [
$preset->{NAME},
$preset->{DSN},
$cgi->popup_menu(
-name => 'COLOUR',
-values => \@colours,
-default => $colours[$count % scalar @colours])
]),
($use_stylesheets ?
$cgi->td( { -align => 'center' },
$cgi->checkbox(
-name => 'STYLE',
-value => $count,
-checked => '1',
-label => '' )) : '' ) ); # That wasn't pretty...
print $cgi->hidden(
-name => 'NAME',
-value => $preset->{NAME});
print $cgi->hidden(
-name => 'DSN',
-value => $preset->{DSN});
++$count;
}
for (my $i = 0; $i < $nblanks; ++$i) {
print $cgi->Tr($cgi->td( { -align => 'center' },
$cgi->checkbox(
-name => 'USE',
-value => $count,
-checked => '0',
-label => '')),
$cgi->td( [
$cgi->textfield(
-name => 'NAME',
-size => '10',
-default => 'Source #' . (1 + $count)),
$cgi->textfield(
-name => 'DSN',
-size => '50'),
$cgi->popup_menu(
-name => 'COLOUR',
-values => \@colours,
-default => $colours[$count % scalar @colours])
] ),
($use_stylesheets ?
$cgi->td( { -align => 'center' },
$cgi->checkbox(
-name => 'STYLE',
-value => $count,
-checked => '0',
-label => '' )) : '' ) ); # Stylish? No...
++$count;
}
print $cgi->Tr($cgi->td( [ '&nbsp;', '&nbsp;',
$cgi->submit(
-name => 'INCR',
-value => 'More blank fields') . '&nbsp;' .
$cgi->submit(
-name => 'DECR',
-value => 'Fewer blank fields'), '&nbsp;',
($use_stylesheets ? '&nbsp;' : '') ] ));
print $cgi->hidden(
-name => 'COUNT',
-value => $count);
print $cgi->hidden(
-name => 'DESCR',
-value => 1);
print $cgi->Tr($cgi->th(
{ -colspan => '2', -align => 'right' }, 'Sequence ID'),
$cgi->td({ -colspan => '2' }, $cgi->textfield(
-name => 'ID',
-size => '25',
-default => 'FLNA_HUMAN') ),
($use_stylesheets ? $cgi->td('&nbsp;') : ''));
print $cgi->Tr($cgi->th(
{ -colspan => '2', -align => 'right' }, 'Range'),
$cgi->td($cgi->textfield(
-name => 'START',
-size => '5') . ' to ' .
$cgi->textfield(
-name => 'STOP',
-size => '5')),
$cgi->td({ -align => 'center',
-colspan => ($use_stylesheets ? '2' : '1') },
$cgi->submit( -name => 'GO', -value => ' Go ' ) ));
print $cgi->end_table, $cgi->end_form;
page_foot_and_end($cgi);
}
#---------------------------------------------------------------
# result_page():
# Queries the DAS server, generates the graphics, and creates
# the result page.
#
sub result_page
{
my $cgi = shift;
my $cereal;
if ($be_nice) {
# Create a serialization/deserialization object
# ($cereal) that we use to embed the results from the
# query in a variable (RESULT) in the form on the result
# page. This means that if the user wants to re-sort
# the result, we won't need to query the DAS server
# again.
$cereal = new Data::Serializer(
serializer => 'Storable',
compress => 1,
portable => 1);
}
page_start_and_head($cgi, $result_page_title);
my $table = [ ];
if (!defined $cgi->param('RESULTS') || !$be_nice) {
# The results are fetched from the DAS server.
my @sources;
foreach my $idx ($cgi->param('USE')) {
my %source = (
NAME => [ $cgi->param('NAME') ]->[$idx],
DSN => [ $cgi->param('DSN') ]->[$idx],
COLOUR => [ $cgi->param('COLOUR') ]->[$idx],
USESTYLE => [ $cgi->param('STYLE') ]->[$idx] );
$source{DSN} =~ s/\s//g;
next unless (length $source{DSN} > 0);
push(@sources, \%source);
}
my $start = $cgi->param('START') || undef;
my $stop = $cgi->param('STOP') || undef;
my $segment;
if (defined $start && defined $stop) {
$segment = $cgi->param('ID') . ':' . $start . ',' . $stop;
} else {
$segment = $cgi->param('ID');
}
my @query;
foreach my $source (@sources) {
push(@query, $source->{DSN} . '/features?segment=' . $segment);
}
my $das = new Bio::Das(15);
my @replies = $das->features(
-dsn => \@query,
-segment => $segment);
foreach my $reply (@replies) {
next unless ($reply->is_success && defined $reply->results);
my $source_url = $reply->dsn->base;
if (defined $reply->dsn->id) {
$source_url .= '/' . $reply->dsn->id;
}
$source_url =~ s/\/features\?.*//;
foreach my $source (@sources) {
next if ($source_url ne $source->{DSN});
# FIXME: This is a quick'n dirty hack to work
# around the non-existant Bio::Das::Stylesheet
# module (in the CPAN distribution of the
# Bio::Das modules, as of 2003-05-08). It will
# pick up and use the *first* FGCOLOR from the
# stylesheet, if one exists.
if ($use_stylesheets && defined $source->{USESTYLE}) {
my $ua = new LWP::UserAgent;
my $response = $ua->get($source_url . '/stylesheet');
if ($response->is_success) {
my $headers = $response->headers;
if ($headers->header('x-das-status') == 200) {
$response->content =~ m#<FGCOLOR>(\w+)</FGCOLOR>#;
$source->{COLOUR} = $1 if (defined $1);
}
}
}
foreach my $feature ($reply->results) {
my %table_row = (
COLUMNS => [
$feature->group, # Label
$source->{NAME}, # Source
$feature->type->label, # Description
$feature->start, # Start
$feature->stop ], # Stop
FEATURE => $feature, # Yes, this will duplicate
# some of the data...
COLOUR => $source->{COLOUR} );
push(@{ $table }, \%table_row);
}
last;
}
}
} else {
# The results are picked up from the encoded string,
# not from any DAS server. Hopefully this works
# everywhere...
$table = $cereal->deserialize($cgi->param('RESULTS'));
}
print $cgi->start_form;
my $sort1;
my $sort2;
if (defined $cgi->param('SORT1')) {
$sort1 = $cgi->param('SORT1');
$sort2 = $cgi->param('SORT2');
} else {
$sort1 = 1;
$sort2 = 3;
}
# Sorting. Must distinguish between the numerical and the
# non-numerical columns (the last few ones and the first few
# ones, respectively).
if ($sort1 <= 2) {
if ($sort2 <= 2) {
$table = [ sort {
$a->{COLUMNS}[$sort1] cmp
$b->{COLUMNS}[$sort1] ||
$a->{COLUMNS}[$sort2] cmp
$b->{COLUMNS}[$sort2] } @{ $table } ];
} else {
$table = [ sort {
$a->{COLUMNS}[$sort1] cmp
$b->{COLUMNS}[$sort1] ||
$a->{COLUMNS}[$sort2] <=>
$b->{COLUMNS}[$sort2] } @{ $table } ];
}
} else {
if ($sort2 <= 2) {
$table = [ sort {
$a->{COLUMNS}[$sort1] <=>
$b->{COLUMNS}[$sort1] ||
$a->{COLUMNS}[$sort2] cmp
$b->{COLUMNS}[$sort2] } @{ $table } ];
} else {
$table = [ sort {
$a->{COLUMNS}[$sort1] <=>
$b->{COLUMNS}[$sort1] ||
$a->{COLUMNS}[$sort2] <=>
$b->{COLUMNS}[$sort2] } @{ $table } ];
}
}
if ($be_nice) {
# Embed the results for fast re-sort.
$cgi->autoEscape(0);
print $cgi->hidden(
-name => 'RESULTS',
-value => $cereal->serialize($table));
$cgi->autoEscape(1);
} else {
# Save the state from the query page.
print $cgi->hidden(
-name => 'USE',
-value => $cgi->param('USE') );
print $cgi->hidden(
-name => 'NAME',
-value => $cgi->param('NAME') );
print $cgi->hidden(
-name => 'DSN',
-value => $cgi->param('DSN') );
print $cgi->hidden(
-name => 'COLOUR',
-value => $cgi->param('COLOUR') );
print $cgi->hidden(
-name => 'STYLE',
-value => $cgi->param('STYLE') );
print $cgi->hidden(
-name => 'START',
-value => $cgi->param('START') );
print $cgi->hidden(
-name => 'STOP',
-value => $cgi->param('STOP') );
print $cgi->hidden(
-name => 'ID',
-value => $cgi->param('ID') );
}
print $cgi->start_table({
-class => 'thetable',
-border => '0',
-cellpadding => '2',
-cellspacing => '2',
-align => 'center' });
if ($use_graphics) {
my $imgsrc = create_graphics($table, $cgi->param('DESCR'));
print $cgi->Tr($cgi->td( { -colspan => '6', -align => 'center' },
$cgi->img({ -src => $imgsrc, -alt => 'Result (PNG)' }),
$cgi->p({ -style => 'text-align:right' },
$cgi->checkbox(
-name => 'DESCR',
-value => 1,
-checked => 1,
-label => ' Show descriptions on tracks' ),
$cgi->br,
$cgi->submit( -name => 'GO', -value => 'Update display' )),
$cgi->p('&nbsp;')
));
}
print $cgi->Tr($cgi->th( [
qw( &nbsp; Label Source Description Start Stop ) ] ));
foreach my $table_row (@{ $table }) {
# Don't display the full sequence reply (good/bad?)
next if ($table_row->{COLUMNS}[0] =~ /^Sequence:/);
# Make the first column the link.
$table_row->{COLUMNS}[0] = $cgi->a({
-href => $table_row->{FEATURE}->link },
$table_row->{COLUMNS}[0]);
print $cgi->Tr($cgi->td( {
-style => 'background:' . $table_row->{COLOUR} }, '&nbsp;' ),
$cgi->td($table_row->{COLUMNS}));
}
print $cgi->Tr($cgi->td({ -colspan => 6 }, '&nbsp;' ));
print $cgi->Tr($cgi->th({ -colspan => 6 }, 'Sorting' ));
$cgi->autoEscape(0);
print $cgi->Tr($cgi->td({ -colspan => 2, -align => 'right' },
'First sort on '),
$cgi->td({ -colspan => 2 },
$cgi->radio_group(
-name => 'SORT1',
-values => [ 0, 1, 2, 3, 4 ],
-labels => {
0 => ' Label',
1 => ' Source',
2 => ' Description',
3 => ' Start',
4 => ' Stop' },
-default => 1 ) ),
$cgi->td({
-rowspan => 2,
-colspan => 2,
-align => 'center',
-valign => 'center' },
$cgi->submit( -name => 'GO', -value => 'Update display' ) ));
print $cgi->Tr($cgi->td({ -colspan => 2, -align => 'right' },
'Then sort on '),
$cgi->td({ -colspan => 2 },
$cgi->radio_group(
-name => 'SORT2',
-values => [ 0, 1, 2, 3, 4 ],
-labels => {
0 => ' Label',
1 => ' Source',
2 => ' Description',
3 => ' Start',
4 => ' Stop' },
-default => 3 ) ));
$cgi->autoEscape(1);
print $cgi->end_table, $cgi->end_form;
page_foot_and_end($cgi);
}
#---------------------------------------------------------------
my $cgi = new CGI;
if (defined $cgi->param('GO')) {
result_page($cgi);
} else {
query_page($cgi);
}
0% or .
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment