fragment searching library: check current fragments and add any new

Previous work in issue #24 (closed) extended the fragment searching library to support SMARTS and connectivity-only (LIKE) SMILES, and got the porphin, porphin-like, steroid and amide fragments working.

This issue is to check the fragment library thoroughly enough to make it ready for initial production.

The current fragment searching library is a TSV file:

fragment_library.tsv

It includes some rather strange entries:

  • peptide is not a peptide at all. Do we want an amino acid instead?
  • prosto - do not know what this is meant to be. There are no hits with the current PDBeChem fragment search.
  • acridone - 2 hits with the current PDBeChem search, but it does not work in ccd_utils.
  • pyranose and furanose vs ribose???
  • additional

It would be worth checking that each fragment in the TSV file works. The file has a "comment" column whose contents are currently unchecked.
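A check of this kind could be scripted. The sketch below is only an illustration: the column names "name" and "query" are hypothetical and would need to be changed to match the real header of fragment_library.tsv, and check_fragments is not a function in ccd_utils. The parser is passed in as an argument so that the same loop can be run with whichever toolkit is in use.

```python
import csv
import io

# Hypothetical column names: adjust to the real header of fragment_library.tsv.
NAME_COL = "name"
QUERY_COL = "query"


def check_fragments(tsv_text, parse):
    """Return a list of (name, query, problem) for rows whose query is unusable.

    `parse` is any callable that takes the query string and returns a parsed
    object, or returns None / raises an exception when the query is invalid.
    """
    problems = []
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    for row in reader:
        name = (row.get(NAME_COL) or "").strip()
        query = (row.get(QUERY_COL) or "").strip()
        if not query:
            problems.append((name, query, "empty query"))
            continue
        try:
            parsed = parse(query)
        except Exception as exc:  # the parser rejected the query by raising
            problems.append((name, query, f"parser raised {exc!r}"))
            continue
        if parsed is None:  # the parser rejected the query by returning None
            problems.append((name, query, "parser returned None"))
    return problems
```

With RDKit, for example, `parse` could be `rdkit.Chem.MolFromSmarts`, which returns None for a SMARTS string it cannot parse. This only confirms that each query parses; whether a fragment actually hits the expected PDB-CCD entries would still need a separate test per fragment.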

Questions

  • What are the named fragments for? Currently they are used in the PDBeChem search, but they could be used for a number of other things:
    • probes for understanding interactions
    • to produce images naming the different parts of a PDB-CCD
  • Can one set of fragments fulfil all the different purposes? For instance, the current amide SMARTS (see #24 (closed)) matches both carboxamide sidechains, as in ASN and GLN, and peptide bonds. If one is interested in probes, do we want C(=O)[NH2]?
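The difference between a broad amide pattern and C(=O)[NH2] can be seen on two small molecules. This is a sketch assuming RDKit is available. The pattern C(=O)N is a generic stand-in, not the actual amide SMARTS from #24, and the SMILES strings for asparagine and glycylglycine are written here by hand for illustration.

```python
def count_matches(smiles, smarts):
    """Count unique substructure matches of a SMARTS pattern in a molecule."""
    from rdkit import Chem  # imported lazily so the module loads without RDKit

    mol = Chem.MolFromSmiles(smiles)
    patt = Chem.MolFromSmarts(smarts)
    if mol is None or patt is None:
        raise ValueError(f"could not parse {smiles!r} or {smarts!r}")
    return len(mol.GetSubstructMatches(patt))


# Hand-written example molecules (assumptions for illustration).
ASN = "NC(CC(N)=O)C(O)=O"    # asparagine: carboxamide sidechain
GLY_GLY = "NCC(=O)NCC(O)=O"  # glycylglycine: one peptide bond

GENERIC_AMIDE = "C(=O)N"      # stand-in pattern, NOT the SMARTS from #24
PRIMARY_AMIDE = "C(=O)[NH2]"  # the probe-style pattern asked about above


def compare():
    """Return match counts per (molecule, pattern) pair."""
    return {
        (mol_name, patt_name): count_matches(smiles, smarts)
        for mol_name, smiles in (("ASN", ASN), ("GLY_GLY", GLY_GLY))
        for patt_name, smarts in (("generic", GENERIC_AMIDE),
                                  ("primary", PRIMARY_AMIDE))
    }
```

On these inputs the generic pattern should hit both the asparagine sidechain and the peptide bond, while C(=O)[NH2] should hit only the sidechain, since the peptide nitrogen carries a single hydrogen. That suggests the probe use case and the search use case may need separate entries in the library.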