<?xml version="1.0" ?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8" />
<link rev="made" href="mailto:vxd@glow.apple.com" />
</head>

<body style="background-color: white">


<h1 id="NAME">NAME</h1>

<pre><code>    load_resource_usage.pl</code></pre>


<pre><code>    This script obtains resource usage data for your pipeline from the Meadow and stores it in the &#39;worker_resource_usage&#39; table.
    Your Meadow class/plugin has to support offline examination of resources in order for this script to work.

    Based on the start time of the first Worker and the end time of the last Worker (as recorded in the pipeline DB),
    it pulls the relevant data out of your Meadow (it runs &#39;bacct&#39; in the case of LSF), parses the report and stores the result in the &#39;worker_resource_usage&#39; table.
    You can join this table to the &#39;worker&#39; table USING(meadow_name,process_id) in the usual MySQL way
    to filter by analysis_id, compute various statistics, etc.

    You can optionally provide an external filename or a command as the data source (for a command, don&#39;t forget to append a &#39;|&#39; to the end!);
    the data will then be read from that source and parsed.</code></pre>
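<p>The join described above can be tried out with a small sketch. It uses Python with an in-memory SQLite database purely for illustration. Only the join columns (meadow_name, process_id) and the analysis_id filter come from the text; the table layouts, the mem_megs column and the sample rows are assumptions and are not the real eHive schema.</p>

```python
import sqlite3

# Hypothetical, heavily reduced schemas. Only meadow_name, process_id and
# analysis_id are named in the documentation; mem_megs is an assumed column.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE worker (
        worker_id   INTEGER PRIMARY KEY,
        meadow_name TEXT,
        process_id  TEXT,
        analysis_id INTEGER
    );
    CREATE TABLE worker_resource_usage (
        meadow_name TEXT,
        process_id  TEXT,
        mem_megs    REAL
    );
    INSERT INTO worker VALUES
        (1, 'LSF', '1001', 1),
        (2, 'LSF', '1002', 1),
        (3, 'LSF', '1003', 2);
    INSERT INTO worker_resource_usage VALUES
        ('LSF', '1001', 100.0),
        ('LSF', '1002', 300.0),
        ('LSF', '1003', 50.0);
""")


def mean_memory_by_analysis(connection):
    """Average the assumed mem_megs column per analysis_id via the USING join."""
    rows = connection.execute("""
        SELECT analysis_id, AVG(mem_megs)
        FROM worker
        JOIN worker_resource_usage USING (meadow_name, process_id)
        GROUP BY analysis_id
        ORDER BY analysis_id
    """).fetchall()
    return dict(rows)


usage = mean_memory_by_analysis(conn)
print(usage)
```

<p>The same SELECT statement should carry over to MySQL once the column names are adjusted to whatever your pipeline database actually contains.</p>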


<pre><code>        # Just run it the usual way: query the relevant data and store it in the &#39;worker_resource_usage&#39; table:
    load_resource_usage.pl -url mysql://username:secret@hostname:port/long_mult_test

        # The same, but assuming another user &#39;someone_else&#39; ran the pipeline:
    load_resource_usage.pl -url mysql://username:secret@hostname:port/long_mult_test -username someone_else

        # Assuming the dump file exists, load the dumped bacct data into the &#39;worker_resource_usage&#39; table:
    load_resource_usage.pl -url mysql://username:secret@hostname:port/long_mult_test -source long_mult.bacct

        # Provide your own command to fetch and parse the worker_resource_usage data from:
    load_resource_usage.pl -url mysql://username:secret@hostname:port/long_mult_test -source &quot;bacct -l -C 2012/01/25/13:33,2012/01/25/14:44 |&quot;</code></pre>
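<p>If you want to assemble such a pipe-from command yourself, the following Python sketch formats a time window in the same way as the last example above. The function name bacct_source is hypothetical, and this is not a claim about how load_resource_usage.pl builds its own command internally; it only reproduces the format shown in the example.</p>

```python
from datetime import datetime


def bacct_source(first_start, last_end):
    """Build a pipe-from command string in the format of the example above.

    first_start / last_end stand for the start time of the first Worker and
    the end time of the last Worker. The trailing '|' marks the string as a
    command rather than a filename.
    """
    fmt = "%Y/%m/%d/%H:%M"
    window = f"{first_start.strftime(fmt)},{last_end.strftime(fmt)}"
    return f"bacct -l -C {window} |"


source = bacct_source(datetime(2012, 1, 25, 13, 33), datetime(2012, 1, 25, 14, 44))
print(source)
```

<p>The resulting string can be passed, quoted, as the value of the -source option.</p>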

<h1 id="OPTIONS">OPTIONS</h1>

<pre><code>    -help                   : print this help
    -url &lt;url string&gt;       : URL defining where the hive database is located
    -username &lt;username&gt;    : if it wasn&#39;t you who ran the pipeline, the name of that user can be provided
    -source &lt;filename&gt;      : alternative source of worker_resource_usage data. Can be a filename or a pipe-from command.</code></pre>
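<p>The filename versus pipe-from distinction for -source can be pictured with the short Python sketch below. It is an illustration only: the helper parse_source is hypothetical, and the assumption that a trailing &#39;|&#39; alone decides between the two cases is inferred from the description above, not taken from the script&#39;s source.</p>

```python
def parse_source(source):
    """Classify a -source value as a pipe-from command or a plain filename.

    Assumption: a value ending in '|' is a command whose output is read;
    anything else is treated as a file path.
    """
    stripped = source.strip()
    if stripped.endswith("|"):
        return ("command", stripped[:-1].strip())
    return ("file", stripped)


print(parse_source("long_mult.bacct"))
print(parse_source("bacct -l -C 2012/01/25/13:33,2012/01/25/14:44 |"))
```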

<h1 id="LICENSE">LICENSE</h1>

<pre><code>    Copyright [1999-2015] Wellcome Trust Sanger Institute and the EMBL-European Bioinformatics Institute
    Copyright [2016-2018] EMBL-European Bioinformatics Institute

    Licensed under the Apache License, Version 2.0 (the &quot;License&quot;); you may not use this file except in compliance with the License.
    You may obtain a copy of the License at

         http://www.apache.org/licenses/LICENSE-2.0

    Unless required by applicable law or agreed to in writing, software distributed under the License
    is distributed on an &quot;AS IS&quot; BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    See the License for the specific language governing permissions and limitations under the License.</code></pre>

<h1 id="CONTACT">CONTACT</h1>

<pre><code>    Please subscribe to the Hive mailing list:  http://listserver.ebi.ac.uk/mailman/listinfo/ehive-users  to discuss Hive-related questions or to be notified of our updates.</code></pre>

</body>

</html>