Commit 0364683f authored by Abel Ureta-Vidal

small update in doc and regexps

parent 5c0a5204
# regexps used to filter out useless descriptions for Homo sapiens
# add more as appropriate; lines beginning with # are treated as comments
^LOC\d+\s*(PROTEIN)?\.?
^ORF.*
^PROTEIN C\d+ORF\d+\.*
\(CLONE \S+\)\s+
^BC\d+\_\d+\.?
^CGI\-\d+ PROTEIN\.?\;?
......@@ -28,8 +31,9 @@ RIKEN CDNA [0-9A-Z]{10}[ \.]
^SIMILAR TO (KIAA|LOC).*
^SIMILAR TO\s+$
^WUGSC:H_.*
^\s*\(FRAGMENT\)\.?\s*$
^\s*\(GENE\)\.?\s*$
^\s*\(?PROTEIN\)?\.?\s*$
^\s*\(?FRAGMENT\)?\.?\s*$
^\s*\(?GENE\)?\.?\s*$
^\s*\(\s*\)\s*$
^\s*\(\d*\)\s*[ \.]$
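
A minimal sketch of how a pattern file like the one above might be applied, written in Python for illustration rather than in the Perl of the surrounding script. It assumes that descriptions are upper-cased before matching and that every match is simply deleted. The diff does not show how the original script uses the patterns, so `load_patterns` and `clean_description` are hypothetical names.

```python
import re

# A few patterns copied from the list above; lines starting with '#' are comments.
PATTERN_FILE = r"""
# regexps used to filter out useless descriptions
^LOC\d+\s*(PROTEIN)?\.?
^SIMILAR TO\s+$
^\s*\(?PROTEIN\)?\.?\s*$
^\s*\(?FRAGMENT\)?\.?\s*$
^\s*\(\s*\)\s*$
"""


def load_patterns(text):
    """Compile one regexp per line, skipping blank lines and '#' comments."""
    patterns = []
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        patterns.append(re.compile(line))
    return patterns


def clean_description(description, patterns):
    """Delete every pattern match from the upper-cased description.

    Upper-casing is an assumption, made because the patterns are upper case.
    An empty result means the description carried no useful information.
    """
    text = description.upper()
    for pattern in patterns:
        text = pattern.sub("", text)
    return text.strip()


if __name__ == "__main__":
    patterns = load_patterns(PATTERN_FILE)
    for raw in ["LOC12345 protein.", "(Fragment)", "Hemoglobin subunit alpha"]:
        print(repr(raw), "->", repr(clean_description(raw, patterns)))
```

Under these assumptions, a description that reduces to an empty string would be treated as useless, and anything else would be kept.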
......
......@@ -50,7 +50,7 @@ my $usage = "
RefSeq entries in gnp-like format.
and/or
\"consortium\" description file, which format should
match this regexp /^(\S+)\\t(.*)\$/, making sure that
match this regexp /^(\\S+)\\t(.*)\$/, making sure that
the mapping includes populating identity_xref table.
OR to load the data from gene-descriptions.tab file to \'gene-description\' table
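
The line format quoted in the usage text, /^(\S+)\t(.*)$/, can be checked with a short sketch. This is a Python illustration, not the script's own Perl. The function name `parse_description_lines` and the choice to skip non-matching lines are assumptions.

```python
import re

# Same pattern as in the usage text: an identifier without whitespace,
# a tab, then the free-text description (which may be empty).
LINE_RE = re.compile(r"^(\S+)\t(.*)$")


def parse_description_lines(lines):
    """Return {identifier: description} for lines that match the format.

    Lines that do not match are skipped here; a real loader might
    prefer to report them instead.
    """
    descriptions = {}
    for line in lines:
        match = LINE_RE.match(line.rstrip("\n"))
        if match is None:
            continue
        identifier, description = match.groups()
        descriptions[identifier] = description
    return descriptions


if __name__ == "__main__":
    sample = ["ID1\tsome description\n", "no tab here\n", "ID2\t\n"]
    print(parse_description_lines(sample))
```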
......
......@@ -967,7 +967,7 @@ sub get_Chromosome {
Arg [2] : int $default_masking_type (optional, default is 0)
0 hard mask, repeats are replaced by Ns
1 soft mask, repeats are converted to lower case
Arg [3] : hash reference $not_default_masking_cases (optional)
Arg [3] : hash reference $not_default_masking_cases (optional, default is {})
The values are 0 or 1 with same definition as in Arg [2]
The keys of the hash should be of 2 forms
"repeat_class_" . $repeat_consensus->repeat_class,
......
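
The two masking types documented above, and the per-class override hash, can be illustrated with a small sketch. It is written in Python, not the module's Perl, and is not the module's implementation. The function `mask_sequence` and the `(start, end, repeat_class)` tuples are hypothetical. Only the "repeat_class_" key form visible in the documentation is used, because the second key form is elided in the diff.

```python
def mask_sequence(seq, repeats, default_masking_type=0,
                  not_default_masking_cases=None):
    """Mask repeat regions of seq.

    repeats: list of (start, end, repeat_class), 0-based, end exclusive
             (an assumed representation, for illustration only).
    Masking type 0 = hard mask (replace with N), 1 = soft mask (lower case).
    """
    overrides = not_default_masking_cases or {}
    masked = list(seq)
    for start, end, repeat_class in repeats:
        # A per-class override takes precedence over the default type.
        mode = overrides.get("repeat_class_" + repeat_class,
                             default_masking_type)
        for i in range(start, end):
            masked[i] = masked[i].lower() if mode == 1 else "N"
    return "".join(masked)


if __name__ == "__main__":
    repeats = [(0, 2, "SINE/Alu"), (2, 4, "LINE/L1")]
    # Hard mask by default, but soft mask the LINE/L1 class.
    print(mask_sequence("ACGTACGT", repeats, 0, {"repeat_class_LINE/L1": 1}))
```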