fragment searching library: check current fragments and add any new
Previous work in issue #24 (closed) improved the fragment searching library to allow the use of SMARTS and connectivity-only (LIKE) SMILES, and got the porphin, porphin-like, steroid and amide fragments working.
This issue is to check the fragment library sufficiently to get it ready for initial production.
The current fragment searching library is a tsv file and can be found:
It includes some rather strange entries:
- `peptide` is not a peptide at all. Do we want an `amino acid`?
- `prosto` - do not know what this is meant to be. There are no hits with the current PDBeChem fragment search.
- `acridone` - 2 hits with the current PDBeChem search, but it does not work in ccd_utils.
- `pyranose` and `furanose` vs ribose???
- `additional`
It would be worth checking that each fragment in the tsv file works. There is a column `comment` that is currently `unchecked`.
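A per-fragment check could be sketched roughly as below. This is only an illustration under assumptions: the column names `name` and `query`, the `controls` mapping of positive-control molecules, and the `has_match` callable are all hypothetical and not taken from the actual tsv file or from ccd_utils. The matcher is passed in so the same loop could be run with whichever substructure search is being tested.

```python
import csv
import io
from typing import Callable, Dict, List


def load_fragments(tsv_text: str) -> List[Dict[str, str]]:
    """Parse tab-separated fragment rows into dicts keyed by the header line."""
    return list(csv.DictReader(io.StringIO(tsv_text), delimiter="\t"))


def check_fragments(
    fragments: List[Dict[str, str]],
    controls: Dict[str, str],
    has_match: Callable[[str, str], bool],
) -> Dict[str, str]:
    """Report 'ok', 'no hit' or 'no control' for every fragment name.

    `controls` maps a fragment name to a molecule that is expected to
    contain that fragment (hypothetical, chosen by whoever curates the file).
    """
    report = {}
    for row in fragments:
        name = row["name"]          # assumed column name
        control = controls.get(name)
        if control is None:
            report[name] = "no control"
        elif has_match(row["query"], control):   # assumed column name
            report[name] = "ok"
        else:
            report[name] = "no hit"
    return report


# Toy data and a toy matcher (plain substring test), for illustration only.
EXAMPLE_TSV = "name\tquery\tcomment\nfrag_a\tCC\tunchecked\nfrag_b\tNN\tunchecked\n"
example_report = check_fragments(
    load_fragments(EXAMPLE_TSV),
    controls={"frag_a": "CCO", "frag_b": "CCO"},
    has_match=lambda query, molecule: query in molecule,
)
```

Fragments reported as `no hit` or `no control` would be the ones to look at by hand before the `comment` column is updated.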
Questions
- What are the named fragments for? Currently they are used in the PDBeChem search, but they could be used for a number of other things:
  - probes for understanding interactions
  - to produce images naming the different parts of a PDB-CCD
- Can one set of fragments fulfil all the different purposes? For instance, the current `amide` SMARTS (see #24 (closed)) matches carboxamide side chains like those of ASN and GLN as well as peptide bonds. If one is interested in probes, do we want `C(=O)[NH2]`?
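The difference between a broad amide pattern and `C(=O)[NH2]` can be tried out with a small RDKit sketch. The `GENERIC_AMIDE` pattern here is a hypothetical stand-in, not the `amide` SMARTS actually stored in the library, and the two SMILES strings are hand-written examples (asparagine and glycylglycine) rather than PDB-CCD entries.

```python
# Example molecules (hand-written SMILES, stereochemistry omitted).
ASN_SMILES = "NC(CC(N)=O)C(=O)O"      # asparagine: carboxamide side chain
GLY_GLY_SMILES = "NCC(=O)NCC(=O)O"    # glycylglycine: one peptide bond

GENERIC_AMIDE = "C(=O)N"       # hypothetical broad pattern, NOT the library's SMARTS
PRIMARY_AMIDE = "C(=O)[NH2]"   # pattern suggested in this issue


def has_fragment(smarts: str, smiles: str) -> bool:
    """Return True if the SMARTS pattern matches the molecule given as SMILES."""
    # Imported here so the definitions above can be loaded without RDKit.
    from rdkit import Chem

    query = Chem.MolFromSmarts(smarts)
    mol = Chem.MolFromSmiles(smiles)
    if query is None or mol is None:
        raise ValueError("could not parse SMARTS or SMILES")
    return mol.HasSubstructMatch(query)
```

With these inputs, `has_fragment(PRIMARY_AMIDE, ASN_SMILES)` should be true and `has_fragment(PRIMARY_AMIDE, GLY_GLY_SMILES)` false, while `GENERIC_AMIDE` matches both. This suggests a probe-oriented fragment set may need narrower patterns than a search-oriented one.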