Commit 10ae3ea1, authored 14 years ago by Graham Ritchie
added peek and append methods, changed dies to throws, updated documentation
Parent: a7cbc9e8
modules/Bio/EnsEMBL/Utils/Iterator.pm (130 additions, 33 deletions)
package Bio::EnsEMBL::Utils::Iterator;
=head1 LICENSE
Copyright (c) 1999-2011 The European Bioinformatics Institute and
...
...
@@ -25,46 +27,60 @@
=head1 SYNOPSIS
my $variation_iterator =
-   $variation_adaptor->fetch_iterator_by_VariationSet($1kg_set);
+   $variation_adaptor->fetch_Iterator_by_VariationSet($1kg_set);
while ( my $variation = $variation_iterator->next ) {
# operate on variation object
print $variation->name, "\n";
}
=head1 DESCRIPTION
Some adaptor methods may return more objects than can fit in memory at once, in these cases
- you can fetch an iterator object instead of the usual list reference. The iterator object
+ you can fetch an iterator object instead of the usual array reference. The iterator object
allows you to iterate over the set of objects (using the next() method) without loading the
entire set into memory at once. You can tell if an iterator is exhausted with the has_next()
- method.
- You can map and grep an iterator in an analogous way to using map and grep on arrays using
- the provided map and grep methods. These methods return another iterator, and only perform
- the filtering and transformation on each element as it is requested, so again these can be
- used without loading the entire set into memory.
+ method. The peek() method allows you to fetch the next object from the iterator without
+ advancing the iterator - this is useful if you want to check some property of en element in
+ the set while leaving the iterator unchanged.
+ You can filter and transform an iterator in an analogous way to using map and grep on arrays
+ using the provided map() and grep() methods. These methods return another iterator, and only
+ perform the filtering and transformation on each element as it is requested, so again these
+ can be used without loading the entire set into memory.
+ Iterators can be combined together with the append() method which merges together the
+ iterator it is called on with the list of iterators passed in as arguments. This is
+ somewhat analogous to concatenating arrays with the push function. append() returns a new
+ iterator which iterates over each component iterator until it is exhausted before moving
+ on to the next iterator, in the order in which they are supplied to the method.
+ An iterator can be converted to an array (reference) containing all the elements in the
+ set with the to_arrayref() method, but note that this array may consume a lot of memory if
+ the set the iterator is iterating over is large and it is recommended that you do not call
+ this method unless there is no way of working with each element at a time.
=head1 METHODS
=cut
  package Bio::EnsEMBL::Utils::Iterator;
  use strict;
  use warnings;
+ use Bio::EnsEMBL::Utils::Exception qw(throw);
=head2 new
Argument : a coderef representing the iterator, this anonymous subroutine
-              is assumed to return the next object in the set when called,
-              and to return undef when the set is exhausted. If the argument
-              is not defined then we return an 'empty' iterator that immediately
-              returns undef
+              is assumed to return the next object in the set when called,
+              and to return undef when the set is exhausted. If the argument
+              is not defined then we return an 'empty' iterator that immediately
+              returns undef
Example :
my @dbIDs = fetch_relevant_dbIDs();
my $iterator = Bio::EnsEMBL::Utils::Iterator->new(
sub { return $self->fetch_by_dbID(shift @dbIDs) }
);
...
...
@@ -76,7 +92,7 @@ use warnings;
Description: Constructor, creates a new iterator object
Returntype : Bio::EnsEMBL::Utils::Iterator instance
- Exceptions : dies if the supplied argument is not a coderef
+ Exceptions : thrown if the supplied argument is not a coderef
Caller : general
Status : Experimental
...
...
@@ -92,9 +108,8 @@ sub new {
  if (not defined $coderef) {
      $coderef = sub {return undef};
  }
- else {
-     die "The supplied argument does not look like an coderef" unless ref $coderef eq 'CODE';
+ elsif (ref $coderef ne 'CODE'){
+     throw("The supplied argument does not look like an coderef")
  }
  my $self = {sub => $coderef};
...
...
@@ -113,18 +128,13 @@ sub new {
=cut
  sub next {
      my $self = shift;
-     # if someone has called has_next, there might be a cached value we can return
-     if ($self->{next}) {
-         return delete $self->{next};
-     }
-     else {
-         return $self->{sub}->();
-     }
+     # if someone has called has_next or peek, there might be a cached value we can return
+     $self->{next} ||= $self->{sub}->();
+     return delete $self->{next};
  }
=head2 has_next
...
...
@@ -141,11 +151,32 @@ sub next {
  sub has_next {
      my $self = shift;
-     $self->{next} = $self->{sub}->();
+     $self->{next} ||= $self->{sub}->();
      return defined $self->{next};
  }
+ =head2 peek
+   Example    : $obj = $iterator->peek()
+   Description: returns the next object from this iterator, or undef if the iterator is exhausted,
+                much like next() but does not advance the iterator (so the same object will be
+                returned on the following call to next() or peek())
+   Returntype : object reference (the type will depend on what this iterator is iterating over)
+   Exceptions : none
+   Caller     : general
+   Status     : Experimental
+ =cut
+ sub peek {
+     my $self = shift;
+     $self->{next} ||= $self->{sub}->();
+     return $self->{next};
+ }
=head2 grep
Example : my $filtered_iterator = $original_iterator->grep(sub {$_->name =~ /^rs/});
...
...
@@ -157,7 +188,7 @@ sub has_next {
preceded with the sub keyword). Otherwise you can pass in a reference to an
existing subroutine with the same behaviour.
Returntype : Bio::EnsEMBL::Utils::Iterator
- Exceptions : dies if the argument is not a coderef
+ Exceptions : thrown if the argument is not a coderef
Caller : general
Status : Experimental
...
...
@@ -166,7 +197,7 @@ sub has_next {
  sub grep {
      my ($self, $coderef) = @_;
-     die "Argument should be a coderef" unless ref $coderef eq 'CODE';
+     throw('Argument should be a coderef') unless ref $coderef eq 'CODE';
      return Bio::EnsEMBL::Utils::Iterator->new(sub {
          while ($self->has_next) {
...
...
@@ -188,7 +219,7 @@ sub grep {
Otherwise you can pass in a reference to an existing subroutine with
the same behaviour.
Returntype : Bio::EnsEMBL::Utils::Iterator
- Exceptions : dies if the argument is not a coderef
+ Exceptions : thrown if the argument is not a coderef
Caller : general
Status : Experimental
...
...
@@ -197,7 +228,7 @@ sub grep {
  sub map {
      my ($self, $coderef) = @_;
-     die "Argument should be a coderef" unless ref $coderef eq 'CODE';
+     throw('Argument should be a coderef') unless ref $coderef eq 'CODE';
      return Bio::EnsEMBL::Utils::Iterator->new(sub {
          local $_ = $self->next;
...
...
@@ -205,4 +236,70 @@ sub map {
});
}
+ =head2 to_arrayref
+   Example    : my $arrayref = $iterator->to_arrayref;
+   Description: return a reference to an array containing all elements from the
+                iterator. This is created by simply iterating over the iterator
+                until it is exhausted and adding each element in turn to an array.
+                Note that this may consume a lot of memory for iterators over
+                large collections
+   Returntype : arrayref
+   Exceptions : none
+   Caller     : general
+   Status     : Experimental
+ =cut
+ sub to_arrayref {
+     my ($self) = @_;
+     my @array;
+     while (my $obj = $self->next) {
+         push @array, $obj;
+     }
+     return \@array;
+ }
+ =head2 append
+   Example    : my $combined_iterator = $iterator1->append($iterator2, $iterator3);
+   Description: return a new iterator that combines this iterator with the others
+                passed as arguments, this new iterator will iterate over each
+                component iterator (in the order supplied here) until it is
+                exhausted and then move on to the next iterator until all are
+                exhausted
+   Argument   : an array of Bio::EnsEMBL::Utils::Iterator objects
+   Returntype : Bio::EnsEMBL::Utils::Iterator
+   Exceptions : thrown if any of the arguments are not iterators
+   Caller     : general
+   Status     : Experimental
+ =cut
+ sub append {
+     my ($self, @rest) = @_;
+     for my $iterator (@rest) {
+         throw("Argument to append doesn't look like an iterator")
+             unless UNIVERSAL::can($iterator, 'has_next');
+     }
+     # push ourselves onto the front of the queue
+     unshift @rest, $self;
+     return Bio::EnsEMBL::Utils::Iterator->new(sub {
+         # shift off any exhausted iterators
+         while (@rest && not $rest[0]->has_next) {
+             shift @rest;
+         }
+         # and return the next object from the iterator at the
+         # head of the queue, or undef if the queue is empty
+         return @rest ? $rest[0]->next : undef;
+     });
+ }
  1;
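A minimal, standalone sketch of how the new peek() and append() methods behave, for readers who want to try the semantics without an Ensembl installation. MiniIterator and from_list are hypothetical names introduced only for this illustration; the sketch copies the caching logic shown in the diff but uses plain die() instead of Bio::EnsEMBL::Utils::Exception::throw, so it is not the module itself.

```perl
use strict;
use warnings;

# Hypothetical stand-in for Bio::EnsEMBL::Utils::Iterator, reduced to the
# methods touched by this commit. Uses die() rather than throw().
package MiniIterator;

sub new {
    my ($class, $coderef) = @_;
    # An undefined argument yields an 'empty' iterator.
    $coderef = sub { return undef } unless defined $coderef;
    die "The supplied argument does not look like a coderef"
        unless ref $coderef eq 'CODE';
    return bless { sub => $coderef }, $class;
}

# Fill the one-element cache if it is empty, then hand the element over.
sub next {
    my $self = shift;
    $self->{next} ||= $self->{sub}->();
    return delete $self->{next};
}

sub has_next {
    my $self = shift;
    $self->{next} ||= $self->{sub}->();
    return defined $self->{next};
}

# Like next(), but the element stays in the cache, so the iterator
# does not advance.
sub peek {
    my $self = shift;
    $self->{next} ||= $self->{sub}->();
    return $self->{next};
}

sub append {
    my ($self, @rest) = @_;
    for my $iterator (@rest) {
        die "Argument to append doesn't look like an iterator"
            unless UNIVERSAL::can($iterator, 'has_next');
    }
    unshift @rest, $self;    # this iterator goes first in the queue
    return MiniIterator->new(sub {
        # Drop exhausted iterators, then read from the head of the queue.
        shift @rest while @rest && !$rest[0]->has_next;
        return @rest ? $rest[0]->next : undef;
    });
}

package main;

# Hypothetical helper: wrap a list of values in an iterator.
sub from_list {
    my @items = @_;
    return MiniIterator->new(sub { return shift @items });
}

my $combined = from_list('rs1', 'rs2')->append(from_list('rs3'));

my $peeked = $combined->peek;    # look at the first element without advancing
my $first  = $combined->next;    # the same element, now consumed

my @remaining;
while (my $name = $combined->next) {
    push @remaining, $name;
}

print "$peeked $first @remaining\n";    # prints "rs1 rs1 rs2 rs3"
```

One consequence of the `||=` caching, in this sketch and in the diffed code alike, is that the cache is tested for truth rather than definedness. That suits iterators over object references, but a defined-but-false element such as 0 or the empty string would be dropped by a peek() followed by next(), and would also end the while loop above early.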