Fragment searching library: check current fragments and add any new ones
In previous work, issue #24 (closed) improved the fragment searching library to allow the use of SMARTS and connectivity-only (LIKE) SMILES, and got the porphin, porphin-like, steroid and amide fragments working.
This issue is to check the fragment library sufficiently to get it ready for initial production.
The current fragment searching library is a tsv file and can be found:
It includes some rather strange entries:
- `peptide` is not a peptide at all. Do we want an `amino acid`?
- `prosto` - do not know what this is meant to be. There are no hits with the current PDBeChem fragment search.
- `acridone` - 2 hits with the current PDBeChem search; does not work in ccd_utils.
- `pyranose` and `furanose` vs ribose???
- additional
It would be worth checking that each fragment in the tsv file works. There is a `comment` column that is currently unchecked.
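A minimal sketch of how such a check could be scripted, for discussion only. The column names `name` and `smarts` and the sample rows are assumptions (the real tsv layout may differ), and `balanced_brackets` is a crude hypothetical stand-in that only checks bracket balance, not chemical validity.

```python
import csv
import io
from typing import Callable, List, Tuple

# Hypothetical sample data: the real tsv column names may differ.
SAMPLE_TSV = (
    "name\tsmarts\tcomment\n"
    "amide\tC(=O)[NH2]\t\n"
    "broken\tC(=O[NH2]\t\n"
    "empty\t\t\n"
)


def balanced_brackets(pattern: str) -> bool:
    """Crude stand-in validator: only checks that () and [] are balanced."""
    pairs = {")": "(", "]": "["}
    stack = []
    for ch in pattern:
        if ch in "([":
            stack.append(ch)
        elif ch in pairs:
            if not stack or stack.pop() != pairs[ch]:
                return False
    return not stack


def check_fragments(
    tsv_text: str,
    is_valid: Callable[[str], bool],
    name_col: str = "name",       # assumed column name
    pattern_col: str = "smarts",  # assumed column name
) -> List[Tuple[str, str]]:
    """Return (name, pattern) for every row whose pattern is empty or invalid."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    failures = []
    for row in reader:
        pattern = (row.get(pattern_col) or "").strip()
        if not pattern or not is_valid(pattern):
            failures.append((row.get(name_col) or "", pattern))
    return failures


if __name__ == "__main__":
    for name, pattern in check_fragments(SAMPLE_TSV, balanced_brackets):
        print(f"needs checking: {name!r} ({pattern!r})")
```

The validator is passed in as a callable so that a real parser can replace the stand-in. With RDKit, for example, `lambda s: Chem.MolFromSmarts(s) is not None` would be a natural choice, since `MolFromSmarts` returns `None` for patterns it cannot parse. Whether a fragment then produces the expected hits against PDB-CCD entries would still need a separate check.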
Questions
- What are the named fragments for? Currently they are used in the PDBeChem search, but they could be used for a number of other things:
  - probes for understanding interactions
  - producing images naming the different parts of a PDB-CCD
- Can one set of fragments fulfil all the different purposes? For instance, the current `amide` SMARTS (see #24 (closed)) matches both carboxamide side chains, as in ASN and GLN, and peptide bonds. If one is interested in probes, do we want `C(=O)[NH2]`?