Commit 80b4e6be authored by Andreas Kusalananda Kähäri

Remove files no longer used.

parent 61515878
1	Marker matches multiple times	Marker aligns to the genome > 3 times
2	Marker does not align	Unable to align to the genome
3	Failed to find Stable ID	Stable ID that this xref was linked to no longer exists
4	No mapping done	No mapping done for this type of xref
5	Failed to match	Unable to match to any ensembl entity at all
6	Failed to match at thresholds	Unable to match at the thresholds of 90% for the query or 90% for the target
7	No Master	The dependent xref was not matched due to there being no master xref
8	Master failed	The dependent xref was not matched due to the master xref not being mapped
66	Did not meet threshold	Match score for transcript lower than threshold (0.75)
65	Was not best match	Did not top best transcript match score (1.00)
64	Was not best match	Did not top best transcript match score (0.99)
63	Was not best match	Did not top best transcript match score (0.98)
62	Was not best match	Did not top best transcript match score (0.97)
61	Was not best match	Did not top best transcript match score (0.96)
60	Was not best match	Did not top best transcript match score (0.95)
59	Was not best match	Did not top best transcript match score (0.94)
58	Was not best match	Did not top best transcript match score (0.93)
57	Was not best match	Did not top best transcript match score (0.92)
56	Was not best match	Did not top best transcript match score (0.91)
55	Was not best match	Did not top best transcript match score (0.90)
54	Was not best match	Did not top best transcript match score (0.89)
53	Was not best match	Did not top best transcript match score (0.88)
52	Was not best match	Did not top best transcript match score (0.87)
51	Was not best match	Did not top best transcript match score (0.86)
50	Was not best match	Did not top best transcript match score (0.85)
49	Was not best match	Did not top best transcript match score (0.84)
48	Was not best match	Did not top best transcript match score (0.83)
47	Was not best match	Did not top best transcript match score (0.82)
46	Was not best match	Did not top best transcript match score (0.81)
45	Was not best match	Did not top best transcript match score (0.80)
44	Was not best match	Did not top best transcript match score (0.79)
43	Was not best match	Did not top best transcript match score (0.78)
42	Was not best match	Did not top best transcript match score (0.77)
41	Was not best match	Did not top best transcript match score (0.76)
40	Was not best match	Did not top best transcript match score (0.75)
67	No overlap	No coordinate overlap with any Ensembl transcript
96	Failed to match at thresholds	Unable to match at the thresholds of 100% for the query or 100% for the target
125	Failed to match at thresholds	Unable to match at the thresholds of 55% for the query or 55% for the target
126	>10% N-strings	More than 10% of the sequence consists of strings of Ns. Sequences are not rejected for this reason but this may explain a low coverage hit
127	All long introns	Every intron in these hits is of length 250000-400000bp, we require at least one intron to be shorter than 250000bp
128	GSS sequence	This cDNA has been excluded from the analysis because it is in the GSS (Genome Survey Sequence) division of GenBank
129	Low coverage	Coverage of the best alignment is less than 90% - see query_score for coverage
130	Low coverage with long intron	Hits containing introns longer than 250000bp are rejected if coverage is less than 98% - see query_score for coverage
131	Low percent_id with long intron	Hits containing introns longer than 250000bp are rejected if percentage identity is less than 98% - see query_score for percent_id
132	Low percent_id	Percentage identity of the best alignment is less than 97% - see query_score for percent_id
133	No output from Exonerate	Exonerate returned no hits using standard parameters plus options --maxintron 400000 and --softmasktarget FALSE
134	Parent xref failed to match	Unable to match as parent xref was not mapped
135	Processed pseudogene	Rejected as a processed pseudogene because there are multiple-exon hits with the same coverage which have been rejected for other reasons
136	See kill-list database	This sequence has been excluded from the analysis - see the kill-list database for further details
137	Failed to match at thresholds	Unable to match at the thresholds of 99% for the query or 99% for the target
138	Marker matches multiple times	Marker aligns to the genome > 5 times
#
# updates the unmapped_reason tables on all of the core databases on a given host
#
use strict;
use Getopt::Long;
use DBI;
use IO::File;
my ( $host, $user, $pass, $port, @dbnames, $file, $release_num );
GetOptions( "dbhost|host=s", \$host,
            "dbuser|user=s", \$user,
            "dbpass|pass=s", \$pass,
            "dbport|port=i", \$port,
            "file=s",        \$file,
            "dbnames=s@",    \@dbnames,    # provide either -dbnames or -release_num
            "release_num=i", \$release_num
          );
# both host and file are required
usage() if(!$host || !$file);
# exactly one of release_num and dbnames is required (XOR)
if(($release_num && @dbnames) || (!$release_num && !@dbnames)) {
  print STDERR "\nSpecify exactly one of -dbnames <> or -release_num <> (not both, not neither)\n";
  sleep(3);
  usage();
}
$port ||= 3306;
my $dsn = "DBI:mysql:host=$host;port=$port";
my $db = DBI->connect( $dsn, $user, $pass, {RaiseError => 1} );
if($release_num) {
  @dbnames = map { $_->[0] } @{ $db->selectall_arrayref( "show databases" ) };
  #
  # filter out all non-core databases
  #
  @dbnames = grep {/^[a-zA-Z]+\_[a-zA-Z]+\_(core|est|estgene|vega)\_${release_num}\_\d+[A-Za-z]?$/} @dbnames;
}
#
# make sure the user wishes to continue
#
# all prompt output goes to STDERR so that it cannot be reordered by buffering
print STDERR "The unmapped_reason table will be updated in the following databases:\n  ";
print STDERR join("\n  ", @dbnames);
print STDERR "\ncontinue with update (yes/no)> ";
my $input = lc(<STDIN>);
chomp($input);
if($input ne 'yes') {
  print STDERR "unmapped_reason update aborted\n";
  exit();
}
#
# read all of the new unmapped_reason entries from the file
# (one entry per line, three tab-separated fields)
#
my $fh = IO::File->new();
$fh->open($file) or die("could not open input file $file: $!");
my @rows;
my $row;
while($row = <$fh>) {
  chomp($row);
  my @a = split(/\t/, $row);
  push @rows, {'unmapped_reason_id'  => $a[0],
               'summary_description' => $a[1],
               'full_description'    => $a[2]};
}
$fh->close();
foreach my $dbname (@dbnames) {
  print STDERR "updating $dbname\n";
  $db->do("use $dbname");
  # empty the table, then reload it from the rows read from the file
  my $sth = $db->prepare('DELETE FROM unmapped_reason');
  $sth->execute();
  $sth->finish();
  $sth = $db->prepare('INSERT INTO unmapped_reason (unmapped_reason_id, summary_description, full_description)
                       VALUES (?,?,?)');
  foreach my $row (@rows) {
    print $row->{'unmapped_reason_id'}."\n";
    $sth->execute($row->{'unmapped_reason_id'},
                  $row->{'summary_description'},
                  $row->{'full_description'});
  }
  $sth->finish();
}
print STDERR "updates complete\n";
sub usage {
  print STDERR <<EOF;
Usage: update_unmapped_reasons.pl options
 Where options are: -host hostname
                    -user username
                    -pass password
                    -port port_of_server (optional, defaults to 3306)
                    -release_num the release of the databases to update, used to
                             match database names. e.g. 13
                    -file the path of the tab-separated file containing the
                          entries for the unmapped_reason table
                    -dbnames db1
                          the names of the databases to update. If not provided,
                          all of the core databases matching the release_num arg
                          will be updated. Either -dbnames or -release_num must
                          be specified, but not both. Multiple dbnames can
                          be provided.
 E.g.:
  #update 2 databases
  perl update_unmapped_reasons.pl -host ecs1c -file unmapped_reason.txt -user ensadmin -pass secret -dbnames homo_sapiens_core_14_33 -dbnames mus_musculus_core_14_30
  #update all core databases for release 14
  perl update_unmapped_reasons.pl -host ecs2d -file unmapped_reason.txt -user ensadmin -pass secret -release_num 14
EOF
  exit;
}
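
The script above pushes every input line into @rows without checking it, so a line that has lost its tabs would be inserted with empty descriptions. A minimal sketch of a per-line check follows, assuming each line must hold exactly three tab-separated fields with a numeric ID. The helper name parse_unmapped_reason_line is hypothetical and not part of the original script.

```perl
use strict;
use warnings;

# Hypothetical helper (an assumption, not in the original script): parse one
# line of the tab-separated input. Returns a hash reference with the same
# keys the script uses, or undef if the line is malformed.
sub parse_unmapped_reason_line {
  my ($line) = @_;
  chomp($line);
  # A limit of -1 keeps trailing empty fields so that they are counted.
  my @fields = split(/\t/, $line, -1);
  return undef unless @fields == 3;            # id, summary, full description
  return undef unless $fields[0] =~ /^\d+$/;   # the ID must be an integer
  return { 'unmapped_reason_id'  => $fields[0],
           'summary_description' => $fields[1],
           'full_description'    => $fields[2] };
}

# One well-formed line and one line whose tabs have been replaced by spaces.
my $ok  = parse_unmapped_reason_line("67\tNo overlap\tNo coordinate overlap with any Ensembl transcript\n");
my $bad = parse_unmapped_reason_line("67 No overlap No coordinate overlap\n");
print defined($ok)  ? "parsed id $ok->{'unmapped_reason_id'}\n" : "rejected\n";
print defined($bad) ? "parsed\n" : "rejected\n";
```

In the read loop, such a check could be used to skip or report malformed lines before the DELETE is issued, so that a damaged input file does not empty the table and then fill it with incomplete rows.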