Questions
---------
1) What software do I need to run the external database cross-reference
mapping?
2) What is the recommended way to run the external database cross-references
for an already entered species?
3) How do I add a new species?
4) How do I add a new external database source?
5) How do I track my progress?
6) I have mapping errors; how do I fix them?
7) How do I start again from the parsing_finished stage?
8) How do I start again from the mapping_finished stage?
9) What is fullmode and partupdate?
10) How do I run my external database references without a compute farm?
11) How do I use a different list of external database sources for my
display_xrefs (names)?
12) How do I use a different list of external database sources for my gene
descriptions?
Answers
-------
1) What software do I need to run the external database cross-reference mapping?
You will need a copy of exonerate and the Ensembl API code.
Exonerate installation instructions can be found at
http://www.ebi.ac.uk/~guy/exonerate/
To install the Ensembl API see
http://www.ensembl.org/info/docs/api/api_installation.html
2) What is the recommended way to run the xrefs for an already entered species?
The xref system comes in two parts: first, parsing the external database sources
into a temporary xref database, and then mapping these to the core database.
a) To parse the data into the xref database you should use the script
xref_parser.pl, which can be found in the ensembl/misc-scripts/xref_mapping
directory.
xref_parser.pl -user rwuser -pass XXX -host host1 -species human
-dbname human_xref -stats -create >& PARSER.OUT
Check the file PARSER.OUT to make sure everything is okay. It could be that
the script was unable to connect to an external site and so did not load
everything.
If there was a problem with the connections, try again but this time use the
option -checkdownload: this will not download data you already have but
will try to get the data you are missing, saving time.
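For example, a rerun after a failed download might look like this (reusing
the host and database names from the example above; note that -create is
dropped so the existing database is not recreated):
xref_parser.pl -user rwuser -pass XXX -host host1 -species human
-dbname human_xref -stats -checkdownload >& PARSER2.OUT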
The xref_parser.pl script may wait for you to answer a couple of questions
about overwriting the database or redoing the configuration, so you will also
have to keep an eye on the output file; keeping this file is usually worth
doing to have a record of what the parser did.
At the end of the parsing you should get a summary which should look
something like:-
============================================================================
Summary of status
============================================================================
EntrezGene EntrezGeneParser OKAY
GO GOParser OKAY
GO InterproGoParser OKAY
Interpro InterproParser OKAY
RefSeq_dna RefSeqParser OKAY
RefSeq_peptide RefSeqGPFFParser OKAY
UniGene UniGeneParser OKAY
Uniprot/SPTREMBL UniProtParser OKAY
Uniprot/SWISSPROT UniProtParser OKAY
ncRNA ncRNA_DBParser OKAY
If any of these are not OKAY then there has been a problem, so look further
up in the file to find out why it failed.
b) Map the external database entries to the core database.
First you need to create a configuration file.
Below is an example of a configuration file
####################################################
xref
host=host1
port=3306
dbname=macaca_xref
user=user1
password=pass1
dir=./xref_dir
species=macaca_mulatta
host=host2
port=3306
dbname=macaca_core
user=user2
password=pass2
dir=./ensembl_dir
farm
queue=long
exonerate=/software/ensembl/bin/exonerate-1.4.0
####################################################
Note that the directories specified must exist when the mapping is done.
The farm options are optional and can be left out, but may be needed
if you have different queue names or have exonerate installed somewhere
other than the default place.
Now we can do the mapping.
Ideally this should be done in two steps so that after the first step you
can check the output to make sure you are happy with everything before
loading into the core database.
i) Map the entities in the xref database and do some checks etc.
xref_mapper.pl -file xref_config >& MAPPER1.OUT
If you have no compute farm then add the -nofarm option.
Check the output file. Do not worry about warnings that the number of
xrefs has increased; the main thing to be concerned about is a reduction
in numbers, i.e. xrefs that are in the core database but are not in the
xref database.
If you get errors about the mapping files then a couple of things could
have gone wrong; the first and usual culprit is that the system ran out of
disk space, or else the compute farm job got lost.
In this case you have two options
1) reset the database to the parsing stage and rerun all the mappings
To reset the database use the option -reset_to_parsing_finished
xref_mapper.pl -file xref_config -reset_to_parsing_finished
then redo the mapping
xref_mapper.pl -file xref_config -dumpcheck >& MAPPER.OUT
Note here we use -dumpcheck to make sure the program does not dump the
fasta files if they are already there, as this process can take
a long time and the fasta files will not have changed.
2) just redo those jobs that failed.
Run the mapper with the -resubmit_failed_jobs flag
xref_mapper.pl -file xref_config -resubmit_failed_jobs
Option 2 will be much faster as it will only redo the jobs that failed.
ii) Load the data into the core database and calculate the display_xrefs etc
xref_mapper.pl -file xref_config -upload >& MAPPER2.OUT
3) How do I add a new species?
Edit the file xref_config.ini and add a new entry in the species section.
Here is an example:-
[species macaca_mulatta]
taxonomy_id = 9544
aliases = macaque, rhesus, rhesus macaque, rmacaque
source = EntrezGene::MULTI
source = GO::MULTI
source = InterproGO::MULTI
source = Interpro::MULTI
source = RefSeq_dna::MULTI-vertebrate_mammalian
source = RefSeq_peptide::MULTI-vertebrate_mammalian
source = Uniprot/SPTREMBL::MULTI
source = Uniprot/SWISSPROT::MULTI
source = UniGene::macaca_mulatta
source = ncRNA::MULTI
[species xxxx] and taxonomy_id must be present.
It is usually best just to cut and paste an already existing, similar
species entry and start from that.
4) How do I add a new external database source?
Edit the file xref_config.ini and add a new entry in the sources section.
Here is an example:-
[source Fantom::mus_musculus]
# Used by mus_musculus
name = Fantom
download = Y
order = 100
priority = 1
prio_descr =
parser = FantomParser
release_uri =
data_uri = ftp://fantom.gsc.riken.jp/DDBJ_fantom3_HTC_accession.txt.gz
name: The name you want to call the external database.
You must also add this to the core databases.
download: Y if the data needs to be obtained online (i.e. not a local file),
N if you are getting the data from a file.
order: The order in which the source should be parsed, 1 being the first.
priority: This is for sources where we get the data from multiple places,
e.g. HGNC. For most sources just set this to 1.
prio_descr: Only used for priority sources; sets a description to give
a way to differentiate them and track which is which.
parser: Which parser to use. If this is a new source then you will probably
need a new parser. Find a parser that is similar and start from this.
Parsers must be in the ensembl/misc-scripts/xref_mapping/XrefParser
directory.
release_uri: a uri to get the release information from. The parser should
handle this.
data_uri: Explains how and where to get the data from. There can be multiple
lines of this.
The uri can get data via several methods; here is the list with a brief
explanation of each.
ftp: Get the file via FTP.
script: Passes arguments to the parser. These might be things like a
database to connect to, or some SQL to run to get the data.
file: The name, with full path, of the file to be parsed.
http: Get data via an external webpage/cgi script.
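For example, a hypothetical source parsed from a local file rather than
downloaded (the name, parser and path below are purely illustrative) might
be configured as:
[source MyLocalSource::homo_sapiens]
name = MyLocalSource
download = N
order = 100
priority = 1
prio_descr =
parser = MyLocalSourceParser
data_uri = file:MyLocalSource/my_local_source.txt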
5) How do I track my progress?
If you did not use -noverbose then the output file should give you a general
idea of what stage you are at. By directly examining the xref database you
can see the last stage that was completed by viewing the entries in the
process_status table.
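For example, using the connection details from the configuration file above,
the last completed stage can be checked with the standard mysql client:
mysql -h host1 -P 3306 -u user1 -ppass1 macaca_xref -e 'SELECT * FROM process_status'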
Another option is to use the script xref_tracker.pl, which will give you some
information about the status. The script is run similarly to the xref_mapper.pl
code in that it needs a config file.
xref_tracker.pl -file xref_config
This script gives more information when the xref_mapper is running the
mapping jobs or processing the mapping files as it will tell you how many
have finished and how many are left to run etc. These are the longer stages
of the process.
6) I have mapping errors; how do I fix them?
If for some reason a mapping job failed (this tends to be things like running
out of disk space, the compute farm losing a job, etc.) then you have a couple
of options.
i) reset the database to the parsing stage and rerun all the mappings
To reset the database use the option -reset_to_parsing_finished
xref_mapper.pl -file xref_config -reset_to_parsing_finished
then redo the mapping
xref_mapper.pl -file xref_config -dumpcheck
Note here we use -dumpcheck to make sure the program does not dump the fasta
files if they are already there, as this process can take a long time and the
fasta files will not have changed.
ii) just redo those jobs that failed.
Run the mapper with the -resubmit_failed_jobs flag
xref_mapper.pl -file xref_config -resubmit_failed_jobs
7) How do I start again from the parsing_finished stage?
To reset the database use the option -reset_to_parsing_finished
xref_mapper.pl -file xref_config -reset_to_parsing_finished
8) How do I start again from the mapping_finished stage?
To reset the database use the option -reset_to_mapping_finished
xref_mapper.pl -file xref_config -reset_to_mapping_finished
Remember to use -dumpcheck when you run xref_mapper.pl the next
time to save time.
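For example, the next run after the reset might look like this (whether you
include -upload depends on whether you are ready to load into the core
database; the output file name is just illustrative):
xref_mapper.pl -file xref_config -dumpcheck -upload >& MAPPER3.OUT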
9) What is fullmode and partupdate?
Fullmode means that all the xrefs are being updated, not just a few specific
external database sources. This is important as it affects the way the
display_xrefs and descriptions are calculated at the end. The user can
override this by setting the -partupdate option in the mapper options, or by
changing the entry in the meta table (the key is "fullmode").
If we are doing all the xref sources then we know that all the data is local,
and hence we can use some simple SQL on the xref database to get the
display_xrefs etc. But if this is not the case then the core database will
contain extra information that is not in the xref database and that may be
needed, so we have to query the core database instead, going through each
gene and then each transcript etc. using the API, which is a lot slower.
In summary, only alter the mode here if you know what you are doing and what
the consequences are.
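For example, a run that updates only the sources you have re-parsed, rather
than recalculating everything, might be invoked as (the output file name is
just illustrative):
xref_mapper.pl -file xref_config -upload -partupdate >& MAPPER_PART.OUT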
10) How do i run my external database references without a compute farm?
Simply use the -nofarm option with the xref_mapper.pl script.
This will run the exonerate jobs locally.
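For example:
xref_mapper.pl -file xref_config -nofarm >& MAPPER1.OUT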
11) How do I use a different list of external database sources for my
display_xrefs (names)?
The external databases to be used for the display_xrefs are taken either from
the BasicMapper.pm subroutine transcript_display_xref_sources, i.e.
sub transcript_display_xref_sources {
  my @list = qw(miRBase
                RFAM
                HGNC_curated_gene
                HGNC_automatic_gene
                MGI_curated_gene
                MGI_automatic_gene
                Clone_based_vega_gene
                Clone_based_ensembl_gene
                HGNC_curated_transcript
                HGNC_automatic_transcript
                MGI_curated_transcript
                MGI_automatic_transcript
                Clone_based_vega_transcript
                Clone_based_ensembl_transcript
                IMGT/GENE_DB
                HGNC
                SGD
                MGI
                flybase_symbol
                Anopheles_symbol
                Genoscope_annotated_gene
                Uniprot/SWISSPROT
                Uniprot/Varsplic
                RefSeq_peptide
                RefSeq_dna
                Uniprot/SPTREMBL
                EntrezGene
                IPI);

  my %ignore;
  $ignore{"EntrezGene"} = 'FROM:RefSeq_[pd][en][pa].*_predicted';

  return [\@list, \%ignore];
}
or, if you want to create your own list, you need to create a <species>.pm
file and override the subroutine there. An example here is for
drosophila_melanogaster.
So in the file drosophila_melanogaster.pm
(found in the directory ensembl/misc-scripts/xref_mapping/XrefMapper)
we have:-
sub transcript_display_xref_sources {
  my @list = qw(FlyBaseName_transcript
                FlyBaseCGID_transcript
                flybase_annotation_id);

  my %ignore;
  $ignore{"EntrezGene"} = 'FROM:RefSeq_[pd][en][pa].*_predicted';

  return [\@list, \%ignore];
}
12) How do I use a different list of external database sources for my gene
descriptions?
As above, but this time override the subroutine gene_description_sources.
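As a rough sketch, assuming your species module follows the same pattern as
the existing ones in the XrefMapper directory (the package name and source
names below are purely illustrative):
package XrefMapper::my_species;

use strict;
use base qw( XrefMapper::BasicMapper );

# Return the external database sources, in order of preference, to be
# used when choosing gene descriptions.
sub gene_description_sources {
  return ("MySourceA", "MySourceB");
}

1;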
UniProt/Swissprot - UniProt/Trembl (UNIversal PROTein resource)
---------------------------------------------------------------
The files can come in two types:
1) Contains data for all species
ftp://ftp.ebi.ac.uk/pub/databases/uniprot/knowledgebase/uniprot_sprot.dat.gz
or
ftp://ftp.ebi.ac.uk/pub/databases/uniprot/knowledgebase/uniprot_trembl.dat.gz
This is the normal case.
2) Contains data for one species only
ftp://ftp.ebi.ac.uk/pub/databases/integr8/uniprot/proteomes/17.D_melanogaster.dat.gz
These are primary Xrefs in that they contain sequence and hence can be
mapped to the Ensembl entities via normal alignment methods (we use
Exonerate).
This is a list of dependent Xrefs that might be added:
EMBL
PDB
protein_id
Note: For human, mouse and rat we also take the direct mappings from UniProt
for the SWISSPROT entries. Those not mapped by UniProt are then processed in
the normal way.
RefSeq_peptide
--------------
The files come in two types: those for specific species, e.g.
ftp://ftp.ncbi.nih.gov/genomes/Canis_familiaris/protein/protein.gbk.gz
or a series of numbered files that are not species specific, e.g.
ftp://ftp.ncbi.nih.gov/refseq/release/vertebrate_other/vertebrate_other3.protein.gpff.gz
These files are parsed by the parser RefSeqGPFFParser.pm
These are primary Xrefs in that they contain sequence and hence can be
mapped to the Ensembl entities via normal alignment methods (we use
Exonerate).
Below is a list of dependent Xrefs that might be added:
EntrezGene
RefSeq_dna
----------
The files come in two types: those for specific species, e.g.
ftp://ftp.ncbi.nih.gov/genomes/Gallus_gallus/RNA/rna.gbk.gz
or a series of numbered files that are not species specific, e.g.
ftp://ftp.ncbi.nih.gov/refseq/release/vertebrate_mammalian/vertebrate_mammalian46.rna.fna.gz
These files are parsed by the parser RefSeqParser.pm
These are primary Xrefs in that they contain sequence and hence can be
mapped to the Ensembl entities via normal alignment methods (we use
Exonerate).
IPI (International Protein Index)
---------------------------------
Comes as a species-specific file, e.g.
ftp://ftp.ebi.ac.uk/pub/databases/IPI/current/ipi.HUMAN.fasta.gz
The file headers look something like
>IPI:IPI00000005.1|SWISS-PROT:P01111|TREMBL:Q5U091|ENSEMBL:ENSP00000261444;ENSP00000358548|REFSEQ:NP_002515|VEGA:OTTHUMP00000013879 Tax_Id=9606 GTPase NRas precursor
sequence..................
Most of the header information is ignored, except for the description
and the IPI accession. The sequence is used to position the IPI Xref.
These are primary Xrefs in that they contain sequence and hence can be
mapped to the Ensembl entities via normal alignment methods (we use
Exonerate).
Has no dependent Xrefs.
UniGene
-------
Comes as species-specific files, e.g.
ftp://ftp.ncbi.nih.gov/repository/UniGene/Bos_taurus/Bt.seq.uniq.gz
ftp://ftp.ncbi.nih.gov/repository/UniGene/Bos_taurus/Bt.data.gz
These are primary Xrefs in that they contain sequence and hence can be
mapped to the Ensembl entities via normal alignment methods (we use
Exonerate). No longer loaded via UniProt.
Has no dependent Xrefs.
EMBL
----
These are dependent Xrefs and are linked to Ensembl via the UniProt
entries.
PDB
---
Protein Data Bank entries are dependent Xrefs and are linked to Ensembl
via the UniProt entries.
protein_id
----------
These are dependent Xrefs and are linked to Ensembl via the UniProt
entries.
PUBMED + Medline
----------------
These are no longer stored due to their large numbers. If you
want to add them then see the UniProtParser and RefSeqParser for more
details.
GO
--
Can come in a species specific file or can contain all species.
ftp://ftp.ebi.ac.uk/pub/databases/GO/goa/UNIPROT/gene_association.goa_uniprot.gz
ftp://ftp.ebi.ac.uk/pub/databases/GO/goa/HUMAN/gene_association.goa_human.gz
GO information in the UniProt and RefSeq files is ignored and just the
information from the above files is used. The files have references to
UniProt and RefSeq entries, and so the GO entries are set to be dependent
Xrefs on these.
EntrezGene
----------
Gene-centred information at NCBI. These are stored as dependent Xrefs
obtained from the RefSeq entries.
InterPro
--------
InterPro is a database of protein families, domains and functional sites,
and gets its data from the file
ftp://ftp.ebi.ac.uk/pub/databases/interpro/interpro.xml.gz
NOTE: InterPro has its own table, so the Xrefs are stored but are not
linked to the Ensembl entities directly; instead a list of InterPro
accessions and their member identifiers is stored. The identifiers stored
are of the types PROSITE, PFAM, PRINTS, PREFILE, PROFILE and TIGRFAMs.
ncRNA, RFAM, miRNA_Registry
---------------------------
This is a local file and is not downloaded automatically via FTP, so you
must put this file in place before running the parser.
file:ncRNA/ncRNA.txt
These are direct Xrefs, so the file contains data on what each Xref is and
which Ensembl entity it matches to.
SPECIES SPECIFIC ENTRIES
------------------------
------------------------
Human
-----
MIM - Online Mendelian Inheritance in Man
-----------------------------------------
Descriptions and types are obtained from the file
ftp://grcf.jhmi.edu/OMIM/omim.txt.Z
This creates two sets of Xrefs:
1) MIM_GENE (disease genes and other expressed genes)
2) MIM_MORBID (the disease genes)
Note that those in set 2 will also be in set 1.
These MIM Xrefs are linked to UniProt/SwissProt entries by the
UniProtParser.pm, creating dependent Xrefs. Note that if the SwissProt
entry does not specify whether the MIM entry is a phenotype or a gene then
it is ignored. For the same reason, MIM dependent Xrefs are NOT obtained
from the RefSeq entries.
So when the SwissProt entries are matched to Ensembl, the MIM entries
will also be matched.
HGNC
----
The HUGO Gene Nomenclature Committee Xrefs are obtained from various sources:-
1) HGNC (ensembl_mapped)
HGNC has direct mappings to Ensembl which have been manually curated.
This information is obtained from the script http://www.genenames.org/cgi-bin/hgnc_downloads.cgi
2) CCDS
The HGNCs are connected to the same Ensembl objects that the CCDS entries
are linked to. We connect to the CCDS database to get this information.
3) Vega
This is made from the Havana manually curated database.
4) HGNC
HGNC has links to other databases like UniProt, RefSeq etc. and these can
be used to link to Ensembl.
Which of these is chosen at the mapping stage is based on the priorities of
the sources; they are listed in priority order above.
This is known as a priority Xref, as the mapping with the best priority is
chosen.
CCDS
----
The CCDS database identifies a core set of human protein coding regions
that are consistently annotated by multiple public resources and pass
quality tests.
A local file is used here:
file:CCDS/CCDS.txt
The file contains a list of CCDS identifiers and the Ensembl entities
they match to. So direct Xrefs are created for these.
Mouse
-----
MGI
---
Previously known as 'MarkerSymbol'.
ftp://ftp.informatics.jax.org/pub/reports/MRK_SwissProt_TrEMBL.rpt
ftp://ftp.informatics.jax.org/pub/reports/MRK_Synonym.sql.rpt
This is a mouse-specific Xref source, being the Mouse Genome Informatics
data. The files have references to UniProt entries, and so the MGI entries
are set to be dependent Xrefs on these.
Rat
---
RGD
---
Rat Genome Database entries are populated by using the file
ftp://rgd.mcw.edu/pub/data_release/GENES
The RGD Xrefs are dependent Xrefs on the Refseq entries.
Zebrafish
---------
ZFIN_ID
-------
The two files
http://zfin.org/data_transfer/Downloads/refseq.txt
http://zfin.org/data_transfer/Downloads/swissprot.txt
contain lists of ZFIN identifiers paired with RefSeq or SwissProt
identifiers, depending on the file.
This creates a set of dependent Xrefs on the RefSeq and UniProt entries.
C. elegans
----------
wormpep_id, wormbase_locus, wormbase_gene, wormbase_transcript
--------------------------------------------------------------
Uses the file
ftp://ftp.sanger.ac.uk/pub/databases/wormpep/wormpep180/wormpep.table180
and the database (the latest release should do)
mysql:ensembldb.ensembl.org:3306:caenorhabditis_elegans_core_46_170b:anonymous
This creates direct Xrefs for all these.
The Xref System
========================================================================
The external database references (Xrefs) are added to the Ensembl
databases using the code found in this directory. The process consists
of two parts: the first part is parsing the data into a temporary database
(the Xref database); the second part is mapping the new Xrefs to the
Ensembl database.
Parsing the external database references
------------------------------------------------------------------------
In this directory you will find an ini-file called 'xref_config.ini'.
This file contains two types of configuration sections: source sections
and species sections. A source section defines Xref priority, order
etc. (as key-value pairs, see the comment at the top of the source
sections for a fuller explanation of these keys) for the source and
also the URIs pointing to the data files that the source should use.
The source label will only be used to refer to the source within the
ini-file (from a species section), so this can be any text string whose
meaning is easy to understand.
A species section contains information about species aliases, the
numerical taxonomy ID(s) and what sources to use for that species. If
a species has more than one taxonomy ID (in the case where there are
multiple strains or subspecies, for example), there can be more than one
'taxonomy_id' key. The name of the species is defined by the section
label and will be stored in the Xref database.
For now, the script 'xref_config2sql.pl' (also found in this directory)
should be used to convert the ini-file into an SQL file, with which you
should replace the file 'sql/populate_metadata.sql'. The
'xref_config2sql.pl' script expects to find 'xref_config.ini' in the
current directory, but you may specify an alternative file as the first
command line argument to the script if you have moved or renamed the
ini-file. When 'xref_parser.pl' is run it will load the generated SQL
file into the database and will then download and parse all external
data files for one or several specified species.
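A typical invocation (assuming the script writes the generated SQL to its
standard output) would be:
perl xref_config2sql.pl xref_config.ini > sql/populate_metadata.sql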
If you want to add a new source you will have to add a new source
section, following the pattern used by the other source sections. You
will then have to add it to the species that require the data.
If the new data comes in files not previously handled by the Xref
system, you will now also have to write the parser NewSourceParser.pm
(the parser name may be arbitrarily chosen) in the XrefParser directory.
You can find lots of examples of parsers in this directory.
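As a rough sketch only (the arguments passed to run() have varied between
releases, so copy the exact signature from an existing parser rather than
from here):
package XrefParser::NewSourceParser;

use strict;
use base qw( XrefParser::BaseParser );

# Called by the parsing framework: read the data file(s) and store the
# xrefs using the methods inherited from BaseParser.
sub run {
  my ( $self, @args ) = @_;

  # ... parse the file and store the xrefs here ...

  return 0;    # by convention in the existing parsers, 0 means success
}

1;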
Before running the Xref parser, make sure that the environment
variable 'http_proxy' is set to point to the local HTTP proxy to get
outside the firewall. For Sanger, the value of the variable should be
"http://cache.internal.sanger.ac.uk:3128", i.e. for tcsh shells you
should have
setenv http_proxy http://cache.internal.sanger.ac.uk:3128
in your ~/.tcshrc file, while for bash-like shell you should have
export http_proxy=http://cache.internal.sanger.ac.uk:3128
in your ~/.profile or ~/.bashrc file.
When you run the script 'xref_parser.pl' to do the Xrefs you must pass
it several options, but for most runs all you need to specify is the
user (user name on the database), pass (password), host (database host),
dbname, and species, i.e.
perl xref_parser.pl -host mymachine -user admin -pass XXXX \
-dbname new_human_xref -species human
Please keep the output from this script and check it later. At the end
of the output there will be a summary of what was successful and what
failed to run. This is important.
The parsing can create three types of Xrefs; these are:
1) Primary (These have sequence and are mapped via exonerate)
2) Dependent (Have no sequence but are dependent on the Primary ones)
3) Direct (These are directly linked to the Ensembl entities, so the
mapping is already done)
Some sources will have more than one set of files associated with them;
in these cases they have the same source name but different source IDs.
These are known as "priority Xrefs", as the Xrefs are mapped according to
the priority of the source. An example of this is HGNC.
For more information on what data can be parsed see the
'parsing_information.txt' file.
Mapping the external database references to the Ensembl core database
------------------------------------------------------------------------
This is an overview of what goes on in the script 'xref_mapper.pl' .
Primary Xrefs are dumped out to two Fasta files, one for peptides and
the other for DNA. Ensembl Transcripts and Translations are then dumped
out to two files in Fasta format.
Exonerate is then used to find the best matches for the Xrefs.
If there is more than one best match then the Xref is mapped to
more than one Ensembl entity. A cutoff is used to filter the best
matches to make sure they pass certain criteria. By default this
is that the query identity OR the target identity must be over
90%. This can be changed by creating your own '<method>.pm' file
in the directory 'XrefMapper/Methods' and creating subroutines
'query_identity_threshold()' and 'target_identity_threshold()' which
return the new values.
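As a rough sketch (the module name and thresholds here are illustrative; a
real method module would normally be copied from an existing one in the
'XrefMapper/Methods' directory):
package XrefMapper::Methods::MyMethod;

use strict;

# Accept a match if the query (xref) sequence identity is over 70%...
sub query_identity_threshold { return 70; }

# ...or if the target (Ensembl) sequence identity is over 70%.
sub target_identity_threshold { return 70; }

1;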
Exonerate will generate a set of .map files containing the mappings. The
map files are then parsed and any mappings that pass the criteria are
stored in the 'xref', 'object_xref' and 'identity_xref' tables.
All dependent Xrefs are also stored if the parent is mapped.
Direct Xrefs are also stored at this stage, but no mapping is needed here
as we already know what each Xref maps to.
For priority Xrefs (ones that have multiple sources) only the highest
priority one is stored.
Any Xrefs which fail to be mapped are written to the unmapped_object
table with a brief explanation of why they could not be mapped.
Once all the mappings have been stored, the display Xrefs and the
descriptions are generated for the transcripts and genes.
If you want to change any of the default settings you can create a new
'<species>.pm' for your particular species, or '<taxon>.pm', and override
the module 'BasicMapper.pm' (see 'rattus_norvegicus.pm' as an example).
The 'xref_mapper.pl' script needs a configuration file which has
information on the Xref database and the core database and also the
species name. Below is an example of running the mapping.
perl ~/ensembl-live/ensembl/misc-scripts/xref_mapping/xref_mapper.pl \
-file xref_input -upload >&MAPPER.OUT
Here is an example of a configuration file for 'xref_mapper.pl':
------------------------------------------------------------------------
xref
host=ensembl-machine
port=3306
dbname=human_xref_42
user=admin
password=xxxx
dir=./xref
species=homo_sapiens
taxon=mammalia (this is optional - use taxon if you need more than one species to use the same '<taxon>.pm' module)
host=ensembl-machine
port=3306
dbname=homo_sapiens_core_42_36d
user=admin
password=xxxx
dir=./ensembl
farm
queue=long
exonerate=/software/ensembl/bin/exonerate-1.4.0
------------------------------------------------------------------------
Note that it is good practice to use a sub-directory for the Ensembl
dumps, as many files are generated; it is best to keep these all together
and away from everything else, or it will be hard to find things.
The directory can also be tarred and zipped in case you need to check
things later.