bugfix: "returncode" is never set, so get the exit status from `waitpid`
Created by: muffato
(Reported by @tweep on the eHive-users mailing list.) When an eHive command is run through Docker, the container returns exit code 0 even if the command fails.
$ docker run -it ensemblorg/ensembl-hive seed_pipeline.pl --url mysql://user@blst.abc
Use of uninitialized value in concatenation (.) or string at /repo/ensembl-hive/modules/Bio/EnsEMBL/Hive/DBSQL/CoreDBConnection.pm line 324.
Could not connect to database as user user using [DBI:mysql:host=blst.abc;port=3306] as a locator:
DBI connect('host=blst.abc;port=3306','user',...) failed: Unknown MySQL server host 'blst.abc' (0) at /repo/ensembl-hive/modules/Bio/EnsEMBL/Hive/DBSQL/CoreDBConnection.pm line 317.
Use of uninitialized value in concatenation (.) or string at /repo/ensembl-hive/modules/Bio/EnsEMBL/Hive/DBSQL/CoreDBConnection.pm line 333.
DB(mysql://user@blst.abc:3306/) Could not connect to database as user user using [DBI:mysql:host=blst.abc;port=3306] as a locator:
DBI connect('host=blst.abc;port=3306','user',...) failed: Unknown MySQL server host 'blst.abc' (0) at /repo/ensembl-hive/modules/Bio/EnsEMBL/Hive/DBSQL/CoreDBConnection.pm line 317.
at /repo/ensembl-hive/modules/Bio/EnsEMBL/Hive/DBSQL/CoreDBConnection.pm line 333.
Bio::EnsEMBL::Hive::DBSQL::CoreDBConnection::connect(Bio::EnsEMBL::Hive::DBSQL::DBConnection=HASH(0x1a10780)) called at /repo/ensembl-hive/modules/Bio/EnsEMBL/Hive/DBSQL/DBConnection.pm line 139
eval {...} called at /repo/ensembl-hive/modules/Bio/EnsEMBL/Hive/DBSQL/DBConnection.pm line 141
Bio::EnsEMBL::Hive::DBSQL::DBConnection::connect(Bio::EnsEMBL::Hive::DBSQL::DBConnection=HASH(0x1a10780)) called at /repo/ensembl-hive/modules/Bio/EnsEMBL/Hive/DBSQL/CoreDBConnection.pm line 736
Bio::EnsEMBL::Hive::DBSQL::CoreDBConnection::db_handle(Bio::EnsEMBL::Hive::DBSQL::DBConnection=HASH(0x1a10780)) called at /repo/ensembl-hive/modules/Bio/EnsEMBL/Hive/DBSQL/CoreDBConnection.pm line 976
Bio::EnsEMBL::Hive::DBSQL::CoreDBConnection::__ANON__(Bio::EnsEMBL::Hive::DBSQL::DBConnection=HASH(0x1a10780), undef, undef, "hive_meta") called at /repo/ensembl-hive/modules/Bio/EnsEMBL/Hive/DBSQL/BaseAdaptor.pm line 243
Bio::EnsEMBL::Hive::DBSQL::BaseAdaptor::_table_info_loader(Bio::EnsEMBL::Hive::DBSQL::MetaAdaptor=HASH(0xa5e830)) called at /repo/ensembl-hive/modules/Bio/EnsEMBL/Hive/DBSQL/BaseAdaptor.pm line 187
Bio::EnsEMBL::Hive::DBSQL::BaseAdaptor::column_set(Bio::EnsEMBL::Hive::DBSQL::MetaAdaptor=HASH(0xa5e830)) called at /repo/ensembl-hive/modules/Bio/EnsEMBL/Hive/DBSQL/BaseAdaptor.pm line 628
Bio::EnsEMBL::Hive::DBSQL::BaseAdaptor::AUTOLOAD(Bio::EnsEMBL::Hive::DBSQL::MetaAdaptor=HASH(0xa5e830), "hive_sql_schema_version") called at /repo/ensembl-hive/modules/Bio/EnsEMBL/Hive/DBSQL/DBAdaptor.pm line 128
eval {...} called at /repo/ensembl-hive/modules/Bio/EnsEMBL/Hive/DBSQL/DBAdaptor.pm line 128
Bio::EnsEMBL::Hive::DBSQL::DBAdaptor::new("Bio::EnsEMBL::Hive::DBSQL::DBAdaptor", "-reg_alias", undef, "-reg_type", undef, "-url", "mysql://user\@blst.abc", "-no_sql_schema_version_check", undef, ...) called at /repo/ensembl-hive/modules/Bio/EnsEMBL/Hive/HivePipeline.pm line 181
Bio::EnsEMBL::Hive::HivePipeline::new("Bio::EnsEMBL::Hive::HivePipeline", "-url", "mysql://user\@blst.abc", "-reg_conf", undef, "-reg_type", undef, "-reg_alias", undef, ...) called at /repo/ensembl-hive/scripts/seed_pipeline.pl line 82
main::main() called at /repo/ensembl-hive/scripts/seed_pipeline.pl line 152
$ echo $?
0
It seems that main_cmd.returncode is never set (it is still None). According to the Python docs, calling main_cmd.wait() should set it, but that didn't help. I suppose that's because the process has already been reaped (see the waitpid call above).
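To illustrate the symptom, here is a minimal sketch (not the actual entrypoint code) in which a child started with `subprocess.Popen` is reaped directly with `os.waitpid`. The helper name `reap_and_inspect` and the exit code 3 are made up for the example. Since `Popen` never gets to collect the status itself, its `returncode` attribute stays `None`, while the real exit code is only available from the status that `waitpid` returned.

```python
import os
import subprocess
import sys


def reap_and_inspect():
    """Start a child that exits with 3, reap it with os.waitpid, and
    report what Popen knows versus what waitpid reported."""
    # Hypothetical stand-in for the main command: a child that fails with 3.
    child = subprocess.Popen([sys.executable, "-c", "import sys; sys.exit(3)"])

    # Reap the child behind Popen's back, as an init-like wrapper would.
    _pid, status = os.waitpid(child.pid, 0)

    # Decode the raw status; only meaningful if the child exited normally.
    exit_code = os.WEXITSTATUS(status) if os.WIFEXITED(status) else None

    # Popen has not called poll()/wait() itself, so returncode is unset.
    return child.returncode, exit_code
```

Calling `reap_and_inspect()` should show an unset `returncode` next to the exit code decoded from the `waitpid` status, which is the mismatch described above.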
So my fix is to capture the main process's return code in wait_for_all_processes as well (alongside the other children's return codes) and return the first failure, with the main process taking priority.
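The idea can be sketched as follows. This is an illustration under assumptions, not the code in this PR: `decode_status`, `reap_children` and `pick_exit_code` are hypothetical names, and mapping a signal death to `128 + signal` is a shell-style convention chosen for the example.

```python
import os


def decode_status(status):
    """Turn a raw wait status into a shell-style exit code."""
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    if os.WIFSIGNALED(status):
        # Assumption: mimic the shell convention for signal deaths.
        return 128 + os.WTERMSIG(status)
    return 1


def reap_children(main_pid):
    """Reap every child of this process and record its exit code.

    Returns (main_code, other_codes); main_code is None if the main
    process was not among the reaped children.
    """
    main_code = None
    other_codes = []
    while True:
        try:
            pid, status = os.wait()
        except ChildProcessError:
            break  # no children left to wait for
        code = decode_status(status)
        if pid == main_pid:
            main_code = code
        else:
            other_codes.append(code)
    return main_code, other_codes


def pick_exit_code(main_code, other_codes):
    """Return the first failure, giving the main process priority."""
    if main_code:
        return main_code
    for code in other_codes:
        if code:
            return code
    return 0
```

A wrapper would then finish with something like `sys.exit(pick_exit_code(*reap_children(main_cmd.pid)))`, so a failing main command is no longer reported as 0.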
Some applications may have come to rely on eHive containers always returning 0 ...
Have you added/modified unit tests to test the changes?
We don't have any tests for the Docker image, so I've tested it locally:
docker run -it -v $PWD/scripts/dev/:/repo/ensembl-hive/scripts/dev/ ensemblorg/ensembl-hive seed_pipeline.pl --url mysql://user@blst.abc; echo $?
Have you run the entire test suite and no regression was detected?
The rest of the source code is not impacted
Created by: muffato
:man_facepalming: I shouldn't have uncommented this line. I guess it's not critical, but it would be good to comment it out again in the next PR on version/2.5.