pdbe issues
https://gitlab.ebi.ac.uk/groups/pdbe/-/issues
2018-04-18T10:10:49Z

https://gitlab.ebi.ac.uk/pdbe/ccdutils/-/issues/40
Docstring format and automatic python API documentation generation
2018-04-18T10:10:49Z
Oliver Smart
It would be useful to agree on docstring format.
refactor ccd_utils with better class structure.
Lukas Pravda

https://gitlab.ebi.ac.uk/pdbe/ccdutils/-/issues/39
Agree coding standards
2018-06-01T08:36:42Z
Oliver Smart
For the refactoring work it would be a good idea to agree on coding standards.
* Code to be PEP8 compliant.
* Unit tests to use pytest rather than nose, as pytest is much easier to use in practice (and is not out of date).
  Although pytest supports yield in tests (currently used a lot), this is deprecated, so some of these tests will be replaced by parameterised tests.
refactor ccd_utils with better class structure.

https://gitlab.ebi.ac.uk/pdbe/ccdutils/-/issues/38
Add process_components_cif script to back refactoring branch in stub form
2018-03-31T14:35:20Z
Oliver Smart
The script `process_components_cif` was the major product of the initial ccd_utils project milestone \#1 %1 that was completed in September 2017.
Project milestone \#3 %3 is to completely replace the code in the initial project, but `process_components_cif` is a well-considered script that achieved its aims:
```
Script for PDBeChem backend infrastructure.
Processes the wwPDB Chemical Components Dictionary file components.cif
producing files for
http://ftp.ebi.ac.uk/pub/databases/msd/pdbechem/
To do this components.cif is split into individual PDB chemical component
definitions cif files, sdf files, pdb files and image files.
In addition creates chem_comp.xml and chem_comp.list for all components.
```
The unit tests [test_process_components_cif_cli.py](pdbeccdutils/tests/test_process_components_cif_cli.py) are worth updating and preserving.
refactor ccd_utils with better class structure.
Oliver Smart

https://gitlab.ebi.ac.uk/pdbe/ccdutils/-/issues/26
create svg with "image not available" for ligand where image cannot be generated
2018-10-13T09:15:40Z
Oliver Smart
* For example [10R](http://www.ebi.ac.uk/pdbe-srv/pdbechem/chemicalCompound/show/10R) where rdkit cannot handle the carborane
* Sameer says it is easier to always have a file.
* So create a placeholder "image not available" svg for these cases.
PDBeChem Backend Processing: get into preproduction

https://gitlab.ebi.ac.uk/pdbe/ccdutils/-/issues/25
Rethink on 2D chemical diagram images
2017-09-20T07:56:31Z
Oliver Smart
* original aim was to replace images in the ftp area (3 gifs: a `large`, a `small` and `hydrogen` for each ccd).
* in addition produce labelled and unlabelled svg images
* but on reconsideration this is not wise - as the small and hydrogen images are not very good and the ftp area images are not used on the PDBe page
* just produce the svg images.
Improve PDBe Chemical Components backend infrastructure using RDKit: beta test version
Oliver Smart

https://gitlab.ebi.ac.uk/pdbe/ccdutils/-/issues/24
improve fragment searching method & library
2017-10-02T14:38:24Z
Oliver Smart
* issue #11 developed fragment matching code.
* accepted the fragment file from the original prototype.
* there are some issues with this file:
- [x] file format - switch to using a normal multimolecule SMILES .smi file format.
- [x] it would be useful to generate pictures of fragment library molecules
- [x] the single cif command line script would be very useful to test what fragments are in an individual ccd cif. Need issue #19 to be done!
- [x] The tools need to be usable by other people, so #18 needs to be done.
- [x] needs to be improved to support SMARTS as well as SMILES: alter the fragment library file to tsv with columns `name, smarts?, query, comment`.
- [ ] *peptide* fragment is not a peptide bond - is it an attempt to search for amino acid? replace with Daylight tutorial *amide*
- [x] *steroid* needs to pick up all steroids.
- [x] *deoxyribose* fragment needs to not pick up ribose.
- [ ] *pyranose* fragment is wrong
- [ ] A.M. wants to add additional fragments - provide tools for him to be able to take over the work.
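
The tab-separated fragment library proposed in the task list (columns `name, smarts?, query, comment`) could be read along the following lines. This is only a minimal sketch: the `Fragment` class, the `load_fragment_library` function, the `Y`/`N` encoding of the `smarts?` column and the sample rows are assumptions made for illustration, not the format the project settled on.

```python
import csv
import io
from dataclasses import dataclass


@dataclass
class Fragment:
    name: str
    is_smarts: bool  # True if `query` is SMARTS, False if it is SMILES
    query: str
    comment: str


def load_fragment_library(handle):
    """Read fragments from a tab-separated file that has a header row."""
    reader = csv.DictReader(handle, delimiter="\t")
    fragments = []
    for row in reader:
        fragments.append(
            Fragment(
                name=row["name"].strip(),
                # Assumed encoding: 'Y' marks a SMARTS query, anything else SMILES.
                is_smarts=row["smarts?"].strip().upper() == "Y",
                query=row["query"].strip(),
                comment=(row.get("comment") or "").strip(),
            )
        )
    return fragments


# Hypothetical sample content; the rows are illustrative only.
SAMPLE = (
    "name\tsmarts?\tquery\tcomment\n"
    "cyclopentane\tN\tC1CCCC1\tSMILES entry\n"
    "amide\tY\t[NX3][CX3](=[OX1])[#6]\tSMARTS entry\n"
)

fragments = load_fragment_library(io.StringIO(SAMPLE))
for fragment in fragments:
    kind = "SMARTS" if fragment.is_smarts else "SMILES"
    print(f"{fragment.name}: {kind} {fragment.query}")
```

Keeping the SMILES/SMARTS flag as an explicit column means the search code can choose the right parser per row without guessing from the query string.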
## it would be useful to run a search for a SMILES or SMARTS fragment against the complete CCD.
* This would help in developing and checking the fragment library, but is a reasonably big task.
* thinking of a simple interactive command line 'server'
* that processes the components.cif holding all PDBCCD in memory
* then waits for user to enter a SMILES or SMARTS string (or fragment name) on keyboard
* does a substructure search against the PDBCCD
* then waits for the next string or to switch mode - S=SMILES T=SMARTS F=fragment
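
The control flow of such a command line 'server' could look roughly like the sketch below. It is an assumption-laden outline: `run_session` and `fake_search` are hypothetical names, and the real substructure search (for example with RDKit over the in-memory components) is replaced here by a toy lookup so that only the mode switching is shown.

```python
def run_session(lines, search):
    """Yield (mode, query, hits) for each query line read from `lines`.

    `search(mode, query)` is a caller-supplied function (hypothetical here)
    that runs the substructure search over the components held in memory.
    """
    modes = {"S": "SMILES", "T": "SMARTS", "F": "fragment"}
    mode = "SMILES"
    for raw in lines:
        text = raw.strip()
        if not text:
            continue
        if text.upper() in modes:
            # A single letter S, T or F switches mode instead of searching.
            mode = modes[text.upper()]
            continue
        yield mode, text, search(mode, text)


def fake_search(mode, query):
    """Stand-in for a real substructure search; returns made-up ids."""
    toy_index = {"C1CC1": ["ABC"], "[#6]~[#7]": ["DEF", "GHI"]}
    return toy_index.get(query, [])


session = list(
    run_session(["C1CC1", "T", "[#6]~[#7]", "F", "cyclopropane"], fake_search)
)
for mode, query, hits in session:
    print(mode, query, hits)
```

For interactive use the same generator could be fed from `sys.stdin`. One caveat with single-letter mode switches is that `S` and `F` are themselves valid one-atom SMILES, so a real tool would probably need an unambiguous prefix for commands.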
## Question: do we really want SMILES substrings to define fragments?
* currently the deoxyribose fragment matches every ribose because deoxyribose is a substructure of ribose. Can SMILES be used to say that C2' is CH2? Does it matter? SMARTS could be used. [Daylight>SMARTS Examples](http://www.daylight.com/dayhtml_tutorials/languages/smarts/smarts_examples.html) **27-Sept-2017**
PDBeChem Backend Processing: get into preproduction
Oliver Smart

https://gitlab.ebi.ac.uk/pdbe/ccdutils/-/issues/23
PDBeChem ftp Output: Supply divided directory
2018-10-13T09:14:48Z
Oliver Smart
* From issue #13
* Currently it is difficult to navigate the ftp site because it takes around 40 seconds to see http://ftp.ebi.ac.uk/pub/databases/msd/pdbechem/images/large/ as it contains 24936 files.
* Instead provide a divided directory with all the files for an individual chemical component.
* so for ATP the directory `divided/A/ATP` would contain:
```
divided/A/ATP/ATP.cif
divided/A/ATP/coordinates/ideal/ATP.sdf
divided/A/ATP/coordinates/ideal/ATP.pdb
divided/A/ATP/coordinates/ideal/ATP_no_hydrogen.sdf
divided/A/ATP/coordinates/ideal/ATP.xml
divided/A/ATP/coordinates/model/ATP.sdf
divided/A/ATP/coordinates/model/ATP.pdb
divided/A/ATP/coordinates/model/ATP_no_hydrogen.sdf
divided/A/ATP/2Dimages/with_labels/ATP.xml
divided/A/ATP/2Dimages/with_labels/ATP.png
divided/A/ATP/2Dimages/without_labels/ATP.xml
divided/A/ATP/2Dimages/without_labels/ATP.png
```
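
The layout listed above follows a simple rule (first character of the id, then the id itself), which could be generated as in this sketch. The function names `divided_dir` and `divided_files` are hypothetical, and upper-casing the id is an assumption.

```python
from pathlib import PurePosixPath


def divided_dir(ccd_id, root="divided"):
    """Return the per-component directory, e.g. divided/A/ATP for ATP."""
    ccd_id = ccd_id.strip().upper()  # assumption: ids are stored upper case
    if not ccd_id:
        raise ValueError("empty chemical component id")
    return PurePosixPath(root) / ccd_id[0] / ccd_id


def divided_files(ccd_id, root="divided"):
    """List the files proposed for one component, in the order shown above."""
    base = divided_dir(ccd_id, root)
    name = base.name
    paths = [base / f"{name}.cif"]
    for suffix in (".sdf", ".pdb", "_no_hydrogen.sdf", ".xml"):
        paths.append(base / "coordinates" / "ideal" / f"{name}{suffix}")
    for suffix in (".sdf", ".pdb", "_no_hydrogen.sdf"):
        paths.append(base / "coordinates" / "model" / f"{name}{suffix}")
    for labels in ("with_labels", "without_labels"):
        for suffix in (".xml", ".png"):
            paths.append(base / "2Dimages" / labels / f"{name}{suffix}")
    return [str(path) for path in paths]


atp_files = divided_files("ATP")
print("\n".join(atp_files))
```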
* this will enable users to quickly find what they want (hopefully?)
PDBeChem Backend Processing: get into preproduction

https://gitlab.ebi.ac.uk/pdbe/ccdutils/-/issues/21
write xyz format file for PDB-CCD
2017-08-25T15:59:49Z
Oliver Smart
PDBeChem currently provides xyz format files
https://en.wikipedia.org/wiki/XYZ_file_format
not sure how useful they are, but they are easy to write so just implement.
# current files:
* ideal coordinates
```
cat /nfs/ftp/pub/databases/msd/pdbechem/files/xyz/EOH.xyz
9
EOH
C 0.0070 -0.5690 0.0000
C -1.2850 0.2500 -0.0000
O 1.1300 0.3150 -0.0000
H 0.0390 -1.1970 0.8900
H 0.0390 -1.1970 -0.8900
H -1.3170 0.8780 0.8900
H -1.3170 0.8780 -0.8900
H -2.1420 -0.4240 0.0000
H 1.9860 -0.1370 0.0000
```
* model coordinates:
```
cat /nfs/ftp/pub/databases/msd/pdbechem/files/xyz_r/EOH.xyz
9
EOH
C 15.2120 49.1980 7.4910
C 16.0690 50.3860 7.1040
O 15.8610 48.1850 8.2560
H 14.3750 49.5790 8.0940
H 14.8580 48.7310 6.5600
H 15.4670 51.0980 6.5200
H 16.4420 50.8800 8.0130
H 16.9200 50.0420 6.4980
H 15.2440 47.4880 8.4470
```
* Currently elements are upper case:
```
cat /nfs/ftp/pub/databases/msd/pdbechem/files/xyz/FES.xyz
4
FES
FE 0.0000 -0.2130 -1.5310
FE 0.0000 -0.2130 1.5310
S 1.4610 0.3720 0.0000
S -1.4610 0.3720 0.0000
```
but better if the iron atoms are written as Fe.
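
A writer that normalises element symbols in this way could be as small as the sketch below. `element_symbol` and `format_xyz` are hypothetical names, and the column widths are an assumption rather than a copy of the layout of the existing files.

```python
def element_symbol(raw):
    """Normalise an element symbol, e.g. 'FE' -> 'Fe' and 's' -> 'S'."""
    raw = raw.strip()
    return raw[:1].upper() + raw[1:].lower()


def format_xyz(title, atoms):
    """Return XYZ-format text for a list of (element, x, y, z) tuples."""
    lines = [str(len(atoms)), title]
    for element, x, y, z in atoms:
        lines.append(
            f"{element_symbol(element):<2s} {x:10.4f} {y:10.4f} {z:10.4f}"
        )
    return "\n".join(lines) + "\n"


# Ideal coordinates for FES as listed in this issue.
fes_atoms = [
    ("FE", 0.0, -0.213, -1.531),
    ("FE", 0.0, -0.213, 1.531),
    ("S", 1.461, 0.372, 0.0),
    ("S", -1.461, 0.372, 0.0),
]
xyz_text = format_xyz("FES", fes_atoms)
print(xyz_text, end="")
```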
Improve PDBe Chemical Components backend infrastructure using RDKit: beta test version
Oliver Smart

https://gitlab.ebi.ac.uk/pdbe/ccdutils/-/issues/18
Package code so it can be installed with pip
2017-09-19T19:06:24Z
Oliver Smart
* https://python-packaging.readthedocs.io/en/latest/index.html
* https://stackoverflow.com/questions/8247605/configuring-so-that-pip-install-can-work-from-github
Improve PDBe Chemical Components backend infrastructure using RDKit: beta test version
Oliver Smart

https://gitlab.ebi.ac.uk/pdbe/ccdutils/-/issues/17
process_components_cif script to read complete components.cif and produce PDBeChem ftp area
2017-09-27T08:53:24Z
Oliver Smart
Issue #13 got the basic functioning of process_components_cif_cli.py working with a simple test of 4 chemical components.
Now exercise the script on the real thing. We have not worried about edge cases so far, for example what happens if there are no ideal coordinates? Do so now:
* make sure the script does not fall over on problems but logs the error and continues
* for each problem deal with it (adding a unit test where possible).
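
The "log the error and continue" behaviour could be structured as in the following sketch, which also keeps a per-output failure count for the final log. The names `process_all`, `write_sdf` and `write_pdb` are hypothetical, and the failure for the made-up id `XXX` is simulated.

```python
import logging
from collections import Counter

logger = logging.getLogger("process_components_cif")


def process_all(ccd_ids, writers):
    """Run every writer for every component, logging failures and continuing.

    `writers` maps an output name (e.g. 'sdf') to a callable taking a CCD id.
    Returns a Counter with the number of failures per output name.
    """
    failures = Counter()
    for ccd_id in ccd_ids:
        for output_name, writer in writers.items():
            try:
                writer(ccd_id)
            except Exception:
                # Record the problem with its traceback, then carry on.
                failures[output_name] += 1
                logger.exception("%s: failed to write %s", ccd_id, output_name)
    for output_name in writers:
        logger.info(
            "%s: %d of %d components failed",
            output_name, failures[output_name], len(ccd_ids),
        )
    return failures


def write_sdf(ccd_id):
    if ccd_id == "XXX":  # simulated edge case for a made-up component
        raise ValueError("no ideal coordinates")


def write_pdb(ccd_id):
    pass  # pretend this always succeeds


failures = process_all(
    ["ATP", "XXX", "EOH"], {"sdf": write_sdf, "pdb": write_pdb}
)
```

Catching the exception per output rather than per component means one bad sdf does not prevent the pdb or image for the same component from being written.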
-------------------
12 September 2017
# Summary of progress and outstanding issues.
* Have got script that produces required output in a reasonable way.
* needs some clean up and further work of:
- [x] need to look into inchi mismatch observation.
- [x] improve command line options.
- [ ] logging output - would be good to list number of unsuccessful sdf, pdb, images etc.
- [x] how gif images are produced (avoid svg conversion)
- [x] need to look into RDKit Invariant Violation
Improve PDBe Chemical Components backend infrastructure using RDKit: beta test version
Oliver Smart

https://gitlab.ebi.ac.uk/pdbe/ccdutils/-/issues/16
Test fails in directories other than ccd_utils
2017-09-21T14:18:05Z
Ijaz Ahmad
The test works in ccd_utils, but if we go up a directory, then we get problems.
```
(my-rdkit-env) [qyuan@ch-qyuan-z440 ccd_utils]$ nosetests test_pdb_chemical_components.py
....................................
----------------------------------------------------------------------
Ran 36 tests in 0.090s
OK
(my-rdkit-env) [qyuan@ch-qyuan-z440 ccd_utils]$ cd ..
(my-rdkit-env) [qyuan@ch-qyuan-z440 pdbe]$ nosetests ccd_utils/test_pdb_chemical_components.py
ERROR: Failure: ValueError (cannot read PDB chemical components from data/pdb_ccd_mmcif_test_files/HEM.cif as file not found)
----------------------------------------------------------------------
Traceback (most recent call last):
File "/home/qyuan/anaconda2/envs/my-rdkit-env/lib/python2.7/site-packages/nose/loader.py", line 251, in generate
for test in g():
File "/home/qyuan/pdbe/ccd_utils/test_pdb_chemical_components.py", line 99, in test_load_hem_from_cif
hem = PdbChemicalComponents(file_name=cif_filename('HEM'), cif_parser=cif_parser)
File "/home/qyuan/pdbe/ccd_utils/pdb_chemical_components.py", line 80, in __init__
self.read_ccd_from_cif_file(file_name)
File "/home/qyuan/pdbe/ccd_utils/pdb_chemical_components.py", line 317, in read_ccd_from_cif_file
raise ValueError('cannot read PDB chemical components from {} as file not found'.format(file_name))
ValueError: cannot read PDB chemical components from data/pdb_ccd_mmcif_test_files/HEM.cif as file not found
```
Improve PDBe Chemical Components backend infrastructure using RDKit: beta test version

https://gitlab.ebi.ac.uk/pdbe/ccdutils/-/issues/14
Document the current installation procedure
2017-09-21T14:18:05Z
Oliver Smart
* Currently the ccd_utils project needs a parallel checkout of the PDBeCIF project. https://github.com/glenveegee/PDBeCIF.git
* How to do this should be explained in the [README.md](README.md) file.
* Add a section *Installation instructions*
Improve PDBe Chemical Components backend infrastructure using RDKit: beta test version

https://gitlab.ebi.ac.uk/pdbe/ccdutils/-/issues/13
Script to read wwPDB chemical component dictionary and split it to produce PDBeChem outputs
2017-09-08T09:57:00Z
Oliver Smart
# What
This is the next step once all components in issues: #3 #4 #6 #8 (?) #9 #10 have been developed as classes with unit tests.
All of this will be put together to write a program to do the complete job described in %1
# How
* Anticipate using a single process initially
* Performance Testing will be important
* how many PDB CCD fail?
* how long does the process take? How can it be parallelized to use more than one processor?
# task list of things to code
* Use description from http://ftp.ebi.ac.uk/pub/databases/msd/pdbechem/readme.htm
* and files in `/nfs/ftp/pub/databases/msd/pdbechem/`
* For each CCD write a file in:
- [x] `files/mmcif/` individual CCD.cif files for each component
- [x] `files/sdf/` Molfile (SDF) with ideal coordinates and hydrogen atoms
- [x] `files/sdf_nh/` Molfile (SDF) with ideal coordinates without hydrogen atoms
- [x] `files/sdf_r/` Molfile (SDF) with representative coordinates and hydrogen atoms
- [x] `files/sdf_r_nh/` Molfile (SDF) with representative coordinates without hydrogen atoms
- [x] `files/pdb/` PDB with ideal coordinates
- [x] `files/pdb_r` PDB with representative coordinates.
- [x] `files/cml` CML format ideal coordinates
- [x] `files/xyz` (not mentioned in `readme.htm`) xyz format ideal (see https://en.wikipedia.org/wiki/XYZ_file_format)
- [x] `files/xyz_r` same for representative coordinates.
- [x] images svg - 3 different images (see below)
- [x] images gif - convert the svg images.
* overall write
- [x] `chem_comp.list` a simple list of the chem_comp_id's one per line
- [x] `chem.xml` an xml file for all chem_comps
- [x] `readme.htm` start with existing
- [x] tar.gz files for each of the subdirectories in `files` and `images` directories.
* new:
- [x] use logging warn for any problematic inchikey like HEM and CDL.
- [ ] `divided` subdirectory where all files for a chem_component are provided in a separate directory for that component. So for ATP the `divided/A/ATP/` directory will contain .cif, four .sdf files, .... Do this using softlinks. *Need a separate issue for this.* issue #23
Improve PDBe Chemical Components backend infrastructure using RDKit: beta test version
Oliver Smart

https://gitlab.ebi.ac.uk/pdbe/ccdutils/-/issues/12
Add DMSO and Sildenafil to list of test CCD cif files
2017-09-21T14:18:05Z
Oliver Smart
* In previous work I found that SOx groups could cause problems.
* Please look up [DMSO](https://en.wikipedia.org/wiki/Dimethyl_sulfoxide) and [Sildenafil (Viagra)](https://en.wikipedia.org/wiki/Sildenafil) in PDBeChem and add to directory with test cif files.
* How do the molecules perform in current tests?
* When committing the files you must include where the files were obtained (exact url) in the commit message.
* Include a reference to this issue number in the commit message. See https://docs.gitlab.com/ee/user/project/issues/automatic_issue_closing.html
Improve PDBe Chemical Components backend infrastructure using RDKit: beta test version

https://gitlab.ebi.ac.uk/pdbe/ccdutils/-/issues/11
Tool for finding "fragments" in each CCD molecule.
2017-09-14T15:35:12Z
Oliver Smart
## What
Requirements From PDBe confluence: **16 June 2016**
SV wants a tool to produce a file that lists the fragments present in each of the chemical compounds:
* read in the chemical component definition cif file.
* read in file smi.txt that contains lines like:
```
cyclopentane:C1CCCC1
cyclopropane:C1CC1
cytosine:C1=CNC(NC1N)O
```
* use rdkit to find which of the fragments is in the ccd.
* write results as a csv format file with the contents:
3 letter code eg. "ATP", fragment name from smi.txt, atom names comma delimited e.g. "C1,C2,C3,C4,O5"
* Producing this tool is a priority.
* name for tool ccd_find_fragments.py (provisional)
Improve PDBe Chemical Components backend infrastructure using RDKit: beta test version
Oliver Smart

https://gitlab.ebi.ac.uk/pdbe/ccdutils/-/issues/10
Create chem.xml for all components
2017-08-29T17:00:52Z
Oliver Smart
# What
Currently http://ftp.ebi.ac.uk/pub/databases/msd/pdbechem/chem.xml lists in xml format information for every PDB chemical component. For instance for ATP and ATQ:
```xml
<chemComp>
<id>ATP</id>
<name>ADENOSINE-5'-TRIPHOSPHATE</name>
<formula>C10 H16 N5 O13 P3</formula>
<systematicName>[[(2R,3S,4R,5R)-5-(6-aminopurin-9-yl)-3,4-dihydroxy-oxolan-2-yl]methoxy-hydroxy-phosphoryl] phosphono hydrogen phosphate</systematicName>
<stereoSmiles>Nc1ncnc2n(cnc12)[C@@H]3O[C@H](CO[P@](O)(=O)O[P@@](O)(=O)O[P](O)(O)=O)[C@@H](O)[C@H]3O</stereoSmiles>
<nonStereoSmiles>Nc1ncnc2n(cnc12)[CH]3O[CH](CO[P](O)(=O)O[P](O)(=O)O[P](O)(O)=O)[CH](O)[CH]3O</nonStereoSmiles>
<InChi>InChI=1S/C10H16N5O13P3/c11-8-5-9(13-2-12-8)15(3-14-5)10-7(17)6(16)4(26-10)1-25-30(21,22)28-31(23,24)27-29(18,19)20/h2-4,6-7,10,16-17H,1H2,(H,21,22)(H,23,24)(H2,11,12,13)(H2,18,19,20)/t4-,6-,7-,10-/m1/s1</InChi>
</chemComp>
<chemComp>
<id>ATQ</id>
<name>2-AMINOTHIAZOLINE</name>
<formula>C3 H6 N2 S</formula>
<systematicName>4,5-dihydro-1,3-thiazol-2-amine</systematicName>
<stereoSmiles>NC1=NCCS1</stereoSmiles>
<nonStereoSmiles>NC1=NCCS1</nonStereoSmiles>
<InChi>InChI=1S/C3H6N2S/c4-3-5-1-2-6-3/h1-2H2,(H2,4,5)</InChi>
</chemComp>
```
the process developed needs to be able to produce this file.
N.B. file starts:
```
<chemCompList>
<chemComp>
<id>000</id>
<name>methyl hydrogen carbonate</name>
<formula>C2 H4 O3</formula>
<systematicName>methyl hydrogen carbonate</systematicName>
<stereoSmiles>COC(O)=O</stereoSmiles>
<nonStereoSmiles>COC(O)=O</nonStereoSmiles>
<InChi>InChI=1S/C2H4O3/c1-5-2(3)4/h1H3,(H,3,4)</InChi>
</chemComp>
```
and ends with ```</chemCompList>```
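
A file with this structure could be assembled with the standard library, as in the sketch below. The field order follows the examples above. The `build_chem_comp_list` function and the dictionary-based input are assumptions, and the sketch does not try to reproduce the exact whitespace of the existing file.

```python
import xml.etree.ElementTree as ET

# Child elements of <chemComp>, in the order used by the examples above.
FIELDS = (
    "id", "name", "formula", "systematicName",
    "stereoSmiles", "nonStereoSmiles", "InChi",
)


def build_chem_comp_list(components):
    """Build a <chemCompList> element from a list of dictionaries."""
    root = ET.Element("chemCompList")
    for component in components:
        node = ET.SubElement(root, "chemComp")
        for field in FIELDS:
            ET.SubElement(node, field).text = component.get(field, "")
    return root


# Values copied from the ATQ example in this issue.
atq = {
    "id": "ATQ",
    "name": "2-AMINOTHIAZOLINE",
    "formula": "C3 H6 N2 S",
    "systematicName": "4,5-dihydro-1,3-thiazol-2-amine",
    "stereoSmiles": "NC1=NCCS1",
    "nonStereoSmiles": "NC1=NCCS1",
    "InChi": "InChI=1S/C3H6N2S/c4-3-5-1-2-6-3/h1-2H2,(H2,4,5)",
}

root = build_chem_comp_list([atq])
xml_text = ET.tostring(root, encoding="unicode")
print(xml_text)
```

Using an XML library rather than string formatting means characters such as `&` or `<` in names are escaped automatically.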
# How
* related to issue #9: if the cif file could be processed and split while also creating `chem.xml`, this would be good.
Improve PDBe Chemical Components backend infrastructure using RDKit: beta test version
Oliver Smart

https://gitlab.ebi.ac.uk/pdbe/ccdutils/-/issues/9
Split wwPDB chemical component dictionary file into separate cif file for each component
2017-09-11T14:57:41Z
Oliver Smart
## What
* The wwPDB chemical component dictionary is available as a single big (around 215MB) file each week.
* download link ftp://ftp.wwpdb.org/pub/pdb/data/monomers/components.cif
* details https://www.wwpdb.org/data/ccd
* this needs to be split into individual CCD files - one for each chemical component.
* these then need to be processed to sdf/pdb and images.
## How
* It might be sensible to do this by parsing the complete file using a cif parser - could then process each component using the tools developed to write sdf, pdb files and images. If so could address issue #10 at same time.
* Or it might be necessary to split the file into individual small files in a separate program without using a cif parser: each component starts with a line `data_ABC` where `ABC` is the chem_comp.id (aka residue name).
  * for instance http://ftp.ebi.ac.uk/pub/databases/msd/pdbechem/files/mmcif/001.cif
```
data_001
#
_chem_comp.id 001
_chem_comp.name "1-[2,2-DIFLUORO-2-(3,4,5-TRIMETHOXY-PHENYL)-ACETYL]-PIPERIDINE-2-CARBOXYLIC ACID 4-PHENYL-1-(3-PYRIDIN-3-YL-PROPYL)-BUTYL ESTER"
_chem_comp.type NON-POLYMER
_chem_comp.pdbx_type HETAIN
_chem_comp.formula "C35 H42 F2 N2 O6"
```
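
The parser-free option discussed in this issue, splitting on lines that start with `data_`, could be sketched as below. `split_components` is a hypothetical name and the two-component sample is made up. The sketch assumes that no line inside a component (for example within a multi-line text value) begins with `data_`.

```python
import io


def split_components(handle):
    """Yield (chem_comp_id, text) for each data_ block read from `handle`."""
    current_id = None
    current_lines = []
    for line in handle:
        if line.startswith("data_"):
            # A new block starts: emit the previous one, if any.
            if current_id is not None:
                yield current_id, "".join(current_lines)
            current_id = line[len("data_"):].strip()
            current_lines = [line]
        elif current_id is not None:
            current_lines.append(line)
    if current_id is not None:
        yield current_id, "".join(current_lines)


# Made-up miniature input with two components.
SAMPLE_CIF = (
    "data_000\n#\n_chem_comp.id 000\n#\n"
    "data_001\n#\n_chem_comp.id 001\n"
)

blocks = dict(split_components(io.StringIO(SAMPLE_CIF)))
for ccd_id, text in blocks.items():
    print(ccd_id, len(text.splitlines()), "lines")
```

Because the function is a generator, the complete file never has to be held in memory; each block could be written to its own `.cif` file as soon as it is yielded.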
* could simply read the file and look for lines starting `data_...`; when such a line is found, close the previous file and open a new one for the code `...`
Improve PDBe Chemical Components backend infrastructure using RDKit: beta test version
Oliver Smart

https://gitlab.ebi.ac.uk/pdbe/ccdutils/-/issues/6
Write 2D images of PDB CCD molecule
2017-10-01T19:13:29Z
Oliver Smart
Improve PDBe Chemical Components backend infrastructure using RDKit: beta test version
Oliver Smart

https://gitlab.ebi.ac.uk/pdbe/ccdutils/-/issues/5
Handling heme
2017-09-29T08:35:50Z
Oliver Smart
There are problems with HEM. See #2 HEM (heme): initial reaction _I think the CCD definition is wrong and pubchem defines it correctly https://pubchem.ncbi.nlm.nih.gov/compound/444098 with Fe2+ and the two nitrogen atoms as N-_
RDKit produces a warning line when parsing HEM:
```
[10:52:01] Explicit valence for atom # 39 N, 4, is greater than permitted
[10:52:01] Explicit valence for atom # 39 N, 4, is greater than permitted
[10:52:01] WARNING: Accepted unusual valence(s): N(4); Metal was disconnected; Proton(s) added/removed
ccd_utils.test_write_pdb.test_inchikey_match_for_all_sample_cifs('FEDYMSUPMFCVOD-UJJXFSCMSA-N', 'KABFMIBPWCXCRK-RGGAHWMASA-L', 'check inchikeys match for HEM') ... FAIL
```
The initial image created by Qi's test is:
![HEM.img_withH.svg](/uploads/e25842c4fce19867ed1765fe4862831f/HEM.img_withH.svg)
The pubchem inchikey is KABFMIBPWCXCRK-UHFFFAOYSA-L, so the initial part of the one from RDKit agrees but the last part does not.
Improve PDBe Chemical Components backend infrastructure using RDKit: beta test version
Oliver Smart

https://gitlab.ebi.ac.uk/pdbe/ccdutils/-/issues/4
Write pdb files for PDB CCD
2017-09-21T14:18:05Z
Oliver Smart
# What
* See #3 for sdf files
* in addition we need a method to write old style PDB format files.
# How
* RDKit includes code to write PDB files
* Note that it is important that the PDB files produced are well formed, with correct atom names and residue names.
* In addition the occupancy and temperature factors should be well formed.
* probably also necessary to include a dummy CRYST1 card
* Testing should include comparison to existing files at PDBeChem as well as loading in coot, pymol, litemol.
Improve PDBe Chemical Components backend infrastructure using RDKit: beta test version
Oliver Smart
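
To make the "well formed" requirements of issue #4 concrete, the sketch below formats a dummy CRYST1 card and a fixed-column HETATM record by hand. It is an illustration of the PDB column layout, not the RDKit-based writer the issue proposes. The function names, the atom names `FE1` and `S1`, the chain id and the default occupancy and temperature factor are all assumptions.

```python
def cryst1_dummy():
    """Return a dummy CRYST1 record (1 x 1 x 1 cell, space group P 1)."""
    return "CRYST1%9.3f%9.3f%9.3f%7.2f%7.2f%7.2f %-11s%4d" % (
        1.0, 1.0, 1.0, 90.0, 90.0, 90.0, "P 1", 1,
    )


def hetatm_record(serial, name, res_name, res_seq, x, y, z, element,
                  chain="A", occupancy=1.0, b_factor=0.0):
    """Return one 80-column HETATM record."""
    element = element.strip().upper()
    # Convention: names of one-letter elements start in column 14 unless
    # the name already fills all four columns (13-16).
    if len(name) < 4 and len(element) == 1:
        name_field = (" " + name).ljust(4)
    else:
        name_field = name.ljust(4)
    return (
        f"HETATM{serial:5d} {name_field} {res_name:>3s} {chain:1s}{res_seq:4d}    "
        f"{x:8.3f}{y:8.3f}{z:8.3f}{occupancy:6.2f}{b_factor:6.2f}"
        f"          {element:>2s}  "
    )


# Coordinates taken from the FES example in issue #21; atom names assumed.
iron_line = hetatm_record(1, "FE1", "FES", 1, 0.0, -0.213, -1.531, "FE")
sulfur_line = hetatm_record(3, "S1", "FES", 1, 1.461, 0.372, 0.0, "S")
print(cryst1_dummy())
print(iron_line)
print(sulfur_line)
```

The atom-name alignment matters in practice: a misplaced name can lead viewers to misread the element, for example confusing calcium `CA` with an alpha carbon.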