ccdutils issues
https://gitlab.ebi.ac.uk/pdbe/ccdutils/-/issues
2017-10-17T10:19:31Z

https://gitlab.ebi.ac.uk/pdbe/ccdutils/-/issues/33
fragment searching library: check current fragments and add any new
2017-10-17T10:19:31Z
Oliver Smart

In previous work: issue #24 improved the fragment searching library to allow use of SMARTS and connectivity only (LIKE) SMILES and got porphin, porphin-like, steroid and amide fragments working.
This issue is to check the fragment library sufficiently to get it ready for initial production.
The current fragment searching library is a tsv file, [fragment_library.tsv](pdbeccdutils/data/fragment_library.tsv). It includes some rather strange entries:
* [ ] `peptide` is not a peptide at all. Do we want an `amino acid`?
* [ ] `prosto` - do not know what this is meant to be. There are no hits with the current PDBeChem fragment search
* [ ] `acridone` - 2 hits with the current PDBeChem search, but it does not work in ccd_utils
* [ ] `pyranose` and `furanose` vs ribose???
* [ ] additional
It would be worth checking that each fragment in the tsv file works. There is a column `comment` that is currently `unchecked`.
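
As a starting point for that check, here is a minimal sketch that lists the rows still marked `unchecked`. Only the `comment` column is known from this issue; the `name` column and the sample rows are assumptions made for illustration, and the real file may be laid out differently.

```python
import csv
import io


def unchecked_fragments(tsv_text, name_column="name", comment_column="comment"):
    """Return the names of rows whose comment column is still 'unchecked'.

    The 'name' column is a hypothetical column name; only 'comment'
    is mentioned in the issue.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    return [
        row[name_column]
        for row in reader
        if (row.get(comment_column) or "").strip().lower() == "unchecked"
    ]


# Hypothetical sample data standing in for
# pdbeccdutils/data/fragment_library.tsv.
sample = (
    "name\tcomment\n"
    "amide\tchecked\n"
    "prosto\tunchecked\n"
    "acridone\tunchecked\n"
)

print(unchecked_fragments(sample))
```

This only reports which rows still need attention. Whether each fragment actually matches anything would need a chemistry toolkit and is not covered here.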
# Questions
* What are the named fragments **for**? Currently they are used in PDBeChem search but they could be used for a number of other things:
  * probes for understanding interactions
  * to produce images naming the different parts of a PDB-CCD
* Can one set of fragments fulfil all the different purposes? For instance the current `amide` SMARTS (see #24) matches carboxamide sidechains like ASN and GLN as well as peptide bonds. If one is interested in probes, do we want `C(=O)[NH2]`?

PDBeChem Backend Processing: get into preproduction

https://gitlab.ebi.ac.uk/pdbe/ccdutils/-/issues/32
write a document about porphyrin remediation
2017-09-29T08:37:43Z
Oliver Smart

Work on heme #5 reveals some inconsistency in how porphyrin-like rings are described in the PDB-CCD. Should work up a document to propose a remediation.
Not a %2 issue but keep on board so as not to forget :elephant:

PDBeChem Backend Processing: get into preproduction

https://gitlab.ebi.ac.uk/pdbe/ccdutils/-/issues/31
Produce images of named fragments
2017-10-01T09:52:34Z
Oliver Smart

* Have got the named fragment searching working but the fragments used have to be improved, see #24
* in the current PDBeChem web interface there is a Fragments `edit` popup:
![pdbe_chem_fragment_search](/uploads/7451398b8da99eae92051496ebf88101/pdbe_chem_fragment_search.png)
* this shows a picture of the fragment (to the right) on a mouse over of the fragment name (here `oxazolidinedione`)
* this issue is to provide nice images of the new fragments to replace the old ones.

PDBeChem Backend Processing: get into preproduction

https://gitlab.ebi.ac.uk/pdbe/ccdutils/-/issues/30
Agree exact aims for the project and timecourse
2017-09-28T11:40:32Z
Oliver Smart

* exactly which processes are to be replaced.
* requires issue #29 to be done first.

PDBeChem Backend Processing: get into preproduction

https://gitlab.ebi.ac.uk/pdbe/ccdutils/-/issues/29
Document the existing chemistry-related release processes
2017-09-28T11:40:32Z
Oliver Smart

With Stephen, prepare a Confluence page documenting the current weekly release process for ligands:
* script(s) that run the process
* what programs are run
* what the outcome is - what files are produced to which directories, what goes into the database.
* timings

PDBeChem Backend Processing: get into preproduction

https://gitlab.ebi.ac.uk/pdbe/ccdutils/-/issues/28
process_components_cif separate out functionality to split into individual cif
2017-09-28T11:40:31Z
Oliver Smart

* from discussion with Sameer.
* for production we want to be able to start processes that depend only on the split PDB-CCD mmcif as soon as possible
* so add command line options to process_components_cif
| option | what |
| -------- | -------- |
| `--just_mmcif` | does the split into individual mmcif files but no further processing. The `--output_dir` must be specified with this option. |
| `--apart_from_mmcif` | used after `--just_mmcif` to do the rest of the processing into coordinate files, images, `chem.xml` etc. The `--output_dir` option must be specified with this option; `--clean` must not be. |
| `--clean` | remove any existing `output_dir` completely |
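
The constraints in the table could be enforced roughly as follows. This is a sketch only, using `argparse`. The function name `parse_args` and the help texts are assumptions, and the sketch checks only the rules stated in the table (it does not decide what should happen if `--just_mmcif` and `--apart_from_mmcif` are given together).

```python
import argparse


def parse_args(argv=None):
    """Parse the proposed options and enforce the rules from the table."""
    parser = argparse.ArgumentParser(prog="process_components_cif")
    parser.add_argument("--output_dir", help="directory for the output files")
    parser.add_argument(
        "--just_mmcif",
        action="store_true",
        help="only split into individual mmcif files",
    )
    parser.add_argument(
        "--apart_from_mmcif",
        action="store_true",
        help="do the rest of the processing after a --just_mmcif run",
    )
    parser.add_argument(
        "--clean",
        action="store_true",
        help="remove any existing output_dir completely",
    )
    args = parser.parse_args(argv)

    # Both split-mode options need an explicit output directory.
    if (args.just_mmcif or args.apart_from_mmcif) and not args.output_dir:
        parser.error(
            "--output_dir must be specified with --just_mmcif or --apart_from_mmcif"
        )
    # Cleaning would delete the mmcif files written by the earlier run.
    if args.apart_from_mmcif and args.clean:
        parser.error("--clean must not be used with --apart_from_mmcif")
    return args
```

`parser.error` prints the message with the usage line and exits with status 2, so a calling process can detect the misuse.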
* not sure whether this is necessary; need to sort out #29 and #30 first.

PDBeChem Backend Processing: get into preproduction

https://gitlab.ebi.ac.uk/pdbe/ccdutils/-/issues/27
process_components_cif improve logging output with a summary at the end of the run.
2017-09-28T10:55:02Z
Oliver Smart

From issue #17 **need to improve logging output - would be good to list number of unsuccessful sdf, pdb, images etc.**

PDBeChem Backend Processing: get into preproduction

https://gitlab.ebi.ac.uk/pdbe/ccdutils/-/issues/19
Develop utility command line script to read single cif and write coordinate files/images etc.
2017-09-27T13:21:16Z
Oliver Smart

## What
* Want a command line script so a user can read in any ccd cif and
  * write an sdf if they want to - with options for ideal/model coordinates, hydrogen, alias on/off
  * write a pdb file with options ...
  * write an image with options ....
  * display properties of the molecule - Lipinski things - number of rings, rotatable bonds etc.
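
To make the discussion concrete, here is a purely illustrative skeleton of such a script. Every option name below is hypothetical, since the actual arguments still have to be proposed and agreed. It only shows the overall shape: an `argparse` parser plus a `main(argv)` function that returns an exit code, which is one way to keep a command line script unit-testable.

```python
import argparse
import os
import sys


def build_parser():
    """Build the parser; all option names here are hypothetical placeholders."""
    parser = argparse.ArgumentParser(
        description="Read a ccd cif and write selected outputs (illustrative only)."
    )
    parser.add_argument("cif", help="input ccd cif file")
    parser.add_argument("--sdf", metavar="FILE", help="write an sdf file")
    parser.add_argument("--pdb", metavar="FILE", help="write a pdb file")
    parser.add_argument("--image", metavar="FILE", help="write an image")
    parser.add_argument(
        "--model",
        action="store_true",
        help="use model rather than ideal coordinates",
    )
    return parser


def main(argv=None):
    """Return 0 on success and 1 on a user-facing error."""
    args = build_parser().parse_args(argv)
    if not os.path.isfile(args.cif):
        print(f"error: cif file not found: {args.cif}", file=sys.stderr)
        return 1
    # Reading the cif and writing the requested outputs would go here.
    return 0
```

A script entry point would call `sys.exit(main())`, so a missing input file leads to exit status 1, while tests can call `main([...])` directly and assert on the returned value.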
## How
* The command line arguments and the help text for each must be proposed as a comment on this page. **The proposal must be agreed to before any coding is done!**
* Script is to use argparse
* All points except 2 in https://ajminich.com/2013/08/01/10-things-i-wish-every-python-script-did/ must be followed.
* Can you unit test a command line script?
* If there are exceptions (file does not exist etc.), catch them, produce a sensible error message for the user and call `sys.exit(1)` to indicate to the calling process that there was a problem.

PDBeChem Backend Processing: get into preproduction
Oliver Smart

https://gitlab.ebi.ac.uk/pdbe/ccdutils/-/issues/8
Write CML files: *needs to be checked*
2017-10-01T19:13:29Z
Oliver Smart

* PDBeChem produces CML files.
* CML http://www.xml-cml.org/ might be a bit unpopular
* Should be fairly easy to write using standard xml library?
* If it proves difficult we could drop CML, but it should be easy.

PDBeChem Backend Processing: get into preproduction
Oliver Smart
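
For the CML issue (#8) above, a minimal sketch of writing a CML-style document with the standard library's `xml.etree.ElementTree` might look like this. The function name `to_cml`, the input tuples and the toy two-atom example are assumptions for illustration. Which elements and attributes the existing PDBeChem CML files actually use would need to be checked against those files.

```python
import xml.etree.ElementTree as ET

CML_NS = "http://www.xml-cml.org/schema"


def to_cml(mol_id, atoms, bonds):
    """Serialise atoms and bonds to a CML string.

    atoms: list of (atom_id, element, x, y, z)
    bonds: list of (atom_id_1, atom_id_2, order)
    """
    molecule = ET.Element("molecule", {"xmlns": CML_NS, "id": mol_id})
    atom_array = ET.SubElement(molecule, "atomArray")
    for atom_id, element, x, y, z in atoms:
        ET.SubElement(
            atom_array,
            "atom",
            {
                "id": atom_id,
                "elementType": element,
                "x3": f"{x:.3f}",
                "y3": f"{y:.3f}",
                "z3": f"{z:.3f}",
            },
        )
    bond_array = ET.SubElement(molecule, "bondArray")
    for first, second, order in bonds:
        ET.SubElement(
            bond_array,
            "bond",
            {"atomRefs2": f"{first} {second}", "order": str(order)},
        )
    return ET.tostring(molecule, encoding="unicode")


# Toy two-atom example with made-up coordinates.
cml = to_cml(
    "EXAMPLE",
    atoms=[("C1", "C", 0.0, 0.0, 0.0), ("O1", "O", 1.2, 0.0, 0.0)],
    bonds=[("C1", "O1", 2)],
)
print(cml)
```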