1. 03 Oct, 2017 1 commit
  2. 02 Oct, 2017 2 commits
  3. 01 Oct, 2017 2 commits
  4. 30 Sep, 2017 2 commits
    • tsv fragment library: switch to using pandas dataframe · 5e822de5
      Oliver Smart authored
      code up SMARTS and 'LIKE' (connectivity-only) searches
      
      * add "amide" as a SMARTS type search from the Daylight examples
      * alter "steroid" to "steroid-like"
      * add a "porphin-like" connectivity-only search.
      
      For issue #24 improve fragment searching library
      5e822de5
    • tsv fragment library: switch to using pandas dataframe · 9d4096df
      Oliver Smart authored
      hack not yet functional.
      For issue #24 improve fragment searching library
      
      Running fragment_library.py results in a logging message showing that the
      library is successfully read into the dataframe:
          /home/osmart/anaconda/envs/new-rdkit-env3/bin/python /home/osmart/ccd_utils/pdbeccdutils/fragment_library.py
          DEBUG: parse /home/osmart/ccd_utils/pdbeccdutils/data/fragment_library.tsv into panda dataframe:
          DEBUG: dataframe:
                           name    type                                              query                                        description    comment                                    url
          0          acetylurea  SMILES                                    C1C(=O)NC(=O)N1                                                NaN  unchecked                                    NaN
          1            acridine  SMILES                           c1ccc2c(c1)nc1c(c2)cccc1                                                NaN  unchecked                                    NaN
          2
      9d4096df
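A minimal sketch of the dataframe read described in this commit, assuming a tab-separated file with the six columns shown in the log. The two sample rows are copied from that log; the variable names are illustrative and not taken from fragment_library.py.

```python
import io

import pandas as pd

# Two rows copied from the DEBUG output above. The real library lives in
# pdbeccdutils/data/fragment_library.tsv; an in-memory copy keeps this runnable.
SAMPLE_TSV = (
    "name\ttype\tquery\tdescription\tcomment\turl\n"
    "acetylurea\tSMILES\tC1C(=O)NC(=O)N1\t\tunchecked\t\n"
    "acridine\tSMILES\tc1ccc2c(c1)nc1c(c2)cccc1\t\tunchecked\t\n"
)

# sep="\t" selects tab-separated parsing; empty fields are read as NaN,
# which is why description and url show NaN in the log.
library = pd.read_csv(io.StringIO(SAMPLE_TSV), sep="\t")

# Selecting rows by query type, e.g. before handing them to a SMILES parser.
smiles_rows = library[library["type"] == "SMILES"]
print(library)
```

Holding the library in a dataframe makes it straightforward to filter by the type column, which is what the later SMARTS and 'LIKE' search types rely on.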
  5. 29 Sep, 2017 2 commits
    • use tsv file for fragment library including type description and comment... · 1efcd3e6
      Oliver Smart authored
      hack - read the tsv file and use logging to show information read:
      
              /ebi/msd/work2/osmart/conda/envs/new-rdkit-env3/bin/python /homes/osmart/work2/ccd_utils/pdbeccdutils/fragment_library.py
              DEBUG: parse /homes/osmart/work2/ccd_utils/pdbeccdutils/data/fragment_library.tsv with csv.Dictreader:
              DEBUG: {'type': 'SMILES', 'comment': 'unchecked', 'query': 'C1C(=O)NC(=O)N1', 'description': '', 'name': 'acetylurea', 'url': ''}
              DEBUG: {'type': 'SMILES', 'comment': 'unchecked', 'query': 'c1ccc2c(c1)nc1c(c2)cccc1', 'description': '', 'name': 'acridine', 'url': ''}
              DEBUG: {'type': 'SMILES', 'comment': 'unchecked', 'query': 'c1ccc2c(c1)C(c1c(N2)cccc1)O', 'description': '', 'name': 'acridone', 'url': ''}
              DEBUG: {'type': 'SMILES', 'comment': 'unchecked', 'query': 'c1ccc2c(c1)Nc1c(O2)cc(c(c1)N)O', 'description': '', 'name': 'actinophenoxazine', 'url': ''}
      
      For issue #24 improve fragment searching library
      1efcd3e6
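The read-and-log step in this commit can be sketched with the standard library alone. This assumes the column order name, type, query, description, comment, url; the function name read_fragment_rows is hypothetical and the sample rows are copied from the log.

```python
import csv
import io
import logging

logging.basicConfig(level=logging.DEBUG, format="%(levelname)s: %(message)s")

# Two rows copied from the DEBUG output above; the column order is an assumption.
SAMPLE_TSV = (
    "name\ttype\tquery\tdescription\tcomment\turl\n"
    "acetylurea\tSMILES\tC1C(=O)NC(=O)N1\t\tunchecked\t\n"
    "acridine\tSMILES\tc1ccc2c(c1)nc1c(c2)cccc1\t\tunchecked\t\n"
)


def read_fragment_rows(handle):
    """Return one dict per fragment row, logging each row at DEBUG level."""
    rows = []
    # delimiter="\t" makes DictReader treat the input as tab-separated.
    for row in csv.DictReader(handle, delimiter="\t"):
        logging.debug(dict(row))
        rows.append(dict(row))
    return rows


rows = read_fragment_rows(io.StringIO(SAMPLE_TSV))
```

Unlike a dataframe read, csv.DictReader keeps empty fields as empty strings, which matches the `'description': ''` entries in the log.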
    • attempt at tsv file for fragment library including type description and comment... · 33c8fbdd
      Oliver Smart authored
      not yet read.
      
      For issue #24 improve fragment searching library
      33c8fbdd
  6. 26 Sep, 2017 6 commits
    • improve fragment searching - break bonds to metals. · e3f96f04
      Oliver Smart authored
      Improve ccd_utils_cli output to include name.
      Add chlorin and the three pyrrolines to SMILES fragments
      
      This means that HEM is recognized as having a porphin ring:
      
      	(new-rdkit-env3) osmart@fuji01vm:~/temp$ ccd_utils_cli HEM.cif
      	INFO: : chem_comp_id HEM
      	INFO: : chem_comp_name: PROTOPORPHYRIN IX CONTAINING FE
      	INFO: : fragments:
      	INFO: :    porphin occurs 1 times:
      	INFO: :         C2A C3A C4A CHB C1B NB C4B C3B C2B CHC C1C NC C4C CHD C1D ND C4D CHA C1A NA C3D C2D C3C C2C
      	INFO: :    pyrrole occurs 4 times:
      	INFO: :         C1A C2A C3A C4A NA
      	INFO: :         C1B C2B C3B C4B NB
      	INFO: :         C1C C2C C3C C4C NC
      	INFO: :         C1D C2D C3D C4D ND
      
      and CLA is correctly recognized as having a chlorin:
      
      	INFO: : chem_comp_id CLA
      	INFO: : chem_comp_name: CHLOROPHYLL A
      	INFO: : fragments:
      	INFO: :    chlorin occurs 1 times:
      	INFO: :         C4B NB CHC C1C NC C4C C3C C2C CHD C1D ND C4D C3D C2D CHA C3B C2B C1B CHB C4A C3A C2A C1A NA
      	INFO: :    pyrrole occurs 3 times:
      	INFO: :         C1B C2B C3B C4B NB
      	INFO: :         C1C C2C C3C C4C NC
      	INFO: :         C1D C2D C3D C4D ND
      
      Was expecting the A ring to be identified as a pyrroline but it is not?
      
      In any case improvement over PDBeChem fragment that identifies CLA as having chlorin
      e3f96f04
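The idea of breaking bonds to metals before fragment searching can be illustrated without RDKit. This is only a sketch over a plain bond table: the METALS set, the function name break_metal_bonds and the data layout are assumptions, and the commit does not show how the real code decides which atoms are metals.

```python
# Element symbols treated as metals here; a deliberately short, assumed list.
METALS = {"FE", "MG", "ZN", "CU", "CO", "NI", "MN"}


def break_metal_bonds(elements, bonds):
    """Return only the bonds in which neither atom is a metal.

    elements maps atom name -> element symbol; bonds is a list of
    (atom_name_1, atom_name_2) pairs.
    """
    return [
        (a, b)
        for a, b in bonds
        if elements[a].upper() not in METALS and elements[b].upper() not in METALS
    ]


# A small part of HEM: the iron, its four coordinating nitrogens,
# and one ring bond for each of two pyrroles.
elements = {"FE": "FE", "NA": "N", "NB": "N", "NC": "N", "ND": "N", "C1A": "C", "C1B": "C"}
bonds = [("FE", "NA"), ("FE", "NB"), ("FE", "NC"), ("FE", "ND"), ("NA", "C1A"), ("NB", "C1B")]

kept = break_metal_bonds(elements, bonds)
```

The lookup goes through the element symbol rather than the atom name, since a PDB atom name such as NA (a nitrogen here) could otherwise be mistaken for sodium.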
    • heme cleanup: start trying to sort other metals. · 0d837f44
      Oliver Smart authored
      provide template for taxol TA1
      
      for issue #5 handling heme
      0d837f44
    • heme cleanup: adjust charges instead of breaking Fe-N bonds · 7800cb90
      Oliver Smart authored
      where the nitrogen atom has valence 4.
      
      Add a porphin_plus_fe_4_bonds_charged_n so that HEM gets reasonable geometry after this.
      
      for issue #5 handling heme
      7800cb90
    • change porphin to ChEBI_8337 scaled up by 1.7 · 1b5c3fcd
      Oliver Smart authored
      works well but Fe not central.
      
      for issue #5 handling heme
      1b5c3fcd
    • refactor using 2D coords from sdf as starting point for complex rings. · dd440725
      Oliver Smart authored
      Load the templates from an sdf file as this is clearer. Alter names.
      
      Still have the chemspider porphin.
      
      for issue #5 handling heme
      dd440725
    • Alternative approach to handling bond between metal and a 4-valent nitrogen · 001dcddb
      Oliver Smart authored
      instead of breaking it adjust the charges. Currently commented out.
      
      Reasonable approach but would need a template with the iron atom in the porphin ring.
      
      for issue #5 handling heme
      001dcddb
  7. 25 Sep, 2017 3 commits
  8. 24 Sep, 2017 4 commits
    • ccd_utils_cli: process and write out fragments identified. · c4579c82
      Oliver Smart authored
      for issue #5 handling heme
      and issue #24 fragment
      
      Trying out fragments for HEM and DE9:
      
          (new-rdkit-env3) osmart@fuji01vm:~/ccd_utils$ ccd_utils_cli ~/Downloads/DE9.cif
          INFO: : fragments:
          INFO: :    porphin occurs 1 times:
          INFO: :         C2A C3A C4A CHB C1B NB C4B C3B C2B CHC C1C NC C4C CHD C1D ND C4D CHA C1A NA C3D C2D C3C C2C
          INFO: :    pyrrole occurs 4 times:
          INFO: :         C1A C2A C3A C4A NA
          INFO: :         C1B C2B C3B C4B NB
          INFO: :         C1C C2C C3C C4C NC
          INFO: :         C1D C2D C3D C4D ND
          (new-rdkit-env3) osmart@fuji01vm:~/ccd_utils$ ccd_utils_cli pdbeccdutils/tests/ccd_mmcif_test_files/HEM.cif
          INFO: : fragments:
          INFO: :    pyrrole occurs 2 times:
          INFO: :         C1A C2A C3A C4A NA
          INFO: :         C1C C2C C3C C4C NC
          (new-rdkit-env3) osmart@fuji01vm:~/ccd_utils$ ccd_utils_cli HEM_chargeFe.cif
          INFO: : fragments:
          INFO: :    porphin occurs 1 times:
          INFO: :         C2A C3A C4A CHB C1B NB C4B C3B C2B CHC C1C NC C4C CHD C1D ND C4D CHA C1A NA C3D C2D C3C C2C
          INFO: :    pyrrole occurs 4 times:
          INFO: :         C1A C2A C3A C4A NA
          INFO: :         C1B C2B C3B C4B NB
          INFO: :         C1C C2C C3C C4C NC
          INFO: :         C1D C2D C3D C4D ND
      
      DE9 fragments are correctly identified. But in HEM the NB and
      ND pyrroles are not identified - these are the ones that have no bond to the metal?
      
      After manually breaking the metal bond and charging the nitrogen atom,
      the porphin and 4 pyrroles are correctly identified.
      c4579c82
    • large refactoring of PdbChemicalComponents · beec653e
      Oliver Smart authored
      store separate rdkit rwmol's - original and "cleaned"
      
      original has the PDB ccd bonds and is not sanitized whereas the 'cleaned'
      has things like breaking Fe-N bonds where N is valence 4 and sanitization.
      
      With this the HEM picture is improved but still not symmetrical.
      
      Passes nosetests.
      
      Incidentally sorts lots of other issues -
      
      InChIKeys now match for HEM, 08Y and 0OD
      
      can now produce images for 0OD
      
      for issue #5 handling heme
      beec653e
    • c7bf8856
    • revert HEM test file to current downloaded from PDBE · e4fd1204
      Oliver Smart authored
      with the 4 bonds to FE
      
      wget ftp://ftp.ebi.ac.uk/pub/databases/msd/pdbechem/files/mmcif/HEM.cif
      
      for issue #5 handling heme
      e4fd1204
  9. 21 Sep, 2017 6 commits
  10. 20 Sep, 2017 4 commits
    • extend chem_comp_xml to also deal with chem_comp.list - as simple list of ccd's · f4d46ca5
      Oliver Smart authored
      Needed to simplify process_component_cif_cli.py for new command line arguments - particularly no output directory
      
      For issue #17 process_components_cif script to read complete components.cif and produce PDBeChem ftp area
      
       tested by:
      
           (new-rdkit-env3) [osmart@ebi-cli-003 ccd_utils]$ nosetests -v pdbeccdutils/tests/test_chem_comp_xml.py
          ..... edited
          pdbeccdutils.tests.test_chem_comp_xml.test_chem_comp_id_list_to_string('EOH\nGLU\n', 'EOH\nGLU\n', 'test chem_comp_id_list_to_string') ... ok
          pdbeccdutils.tests.test_chem_comp_xml.test_chem_comp_id_list_to_file(True, 'call to cc_xml.chem_comp_id_list_to_file(/nfs/msd/work2/osmart/ccd_utils/pdbeccdutils/tests/out/test_chem_comp.list) must create a non-empty file.') ... ok
      
          ----------------------------------------------------------------------
          Ran 29 tests in 0.197s
      
          OK
      f4d46ca5
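The two tests named in the nosetests output suggest helpers along the following lines. This is a hedged sketch only: the function names come from the test output, but the standalone signatures (a list of ids passed in explicitly) and the boolean return value are assumptions, since the actual methods are called on a cc_xml object whose code is not shown here.

```python
import os
import tempfile


def chem_comp_id_list_to_string(chem_comp_ids):
    """Return the ids one per line, each terminated by a newline."""
    return "".join(f"{ccd_id}\n" for ccd_id in chem_comp_ids)


def chem_comp_id_list_to_file(chem_comp_ids, file_name):
    """Write the id listing to file_name; return True if a non-empty file exists."""
    with open(file_name, "w") as handle:
        handle.write(chem_comp_id_list_to_string(chem_comp_ids))
    return os.path.isfile(file_name) and os.path.getsize(file_name) > 0


listing = chem_comp_id_list_to_string(["EOH", "GLU"])

# Write into a temporary directory so the example leaves nothing behind.
with tempfile.TemporaryDirectory() as tmp_dir:
    wrote_file = chem_comp_id_list_to_file(
        ["EOH", "GLU"], os.path.join(tmp_dir, "test_chem_comp.list")
    )
```

With the ids EOH and GLU the string form is 'EOH\nGLU\n', the value the first test above compares against.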
    • start on new command line arguments - · 95add6ac
      Oliver Smart authored
      First create the arguments. N.B. have not yet implemented.
      
      For issue #17 process_components_cif script to read complete components.cif and produce PDBeChem ftp area
      
              (new-rdkit-env3) [osmart@ebi-cli-003 ccd_utils]$ process_components_cif -h
              usage: process_components_cif [-h] [--output_dir OUTPUT_DIR]
                                            [--chem_comp_xml CHEM_COMP_XML]
                                            [--test_first TEST_FIRST] [--library LIBRARY]
                                            [--debug]
                                            COMPONENTS_CIF
      
              Script for PDBeChem backend infrastructure.
              Processes the wwPDB Chemical Components Dictionary file components.cif
              producing files for
      
              http://ftp.ebi.ac.uk/pub/databases/msd/pdbechem/
      
              To do this components.cif is split into individual PDB chemical component
              definitions cif files, sdf files, pdb files and image files.
              In addition creates chem_comp.xml and chem_comp.list for all components.
      
              positional arguments:
                COMPONENTS_CIF        input PDB-CCD components.cif file (must be specified)
      
              optional arguments:
                -h, --help            show this help message and exit
                --output_dir OUTPUT_DIR, -o OUTPUT_DIR
                                      create an output directory with files suitable for PDBeChem ftp directory
                --chem_comp_xml CHEM_COMP_XML
                                      write chem_comp.xml file to this file.
                --test_first TEST_FIRST
                                      only process the first TEST_FIRST chemical component definitions (for testing).
                --library LIBRARY     use this fragment library in place of the one supplied with the code.
                --debug               turn on debug message logging output
      95add6ac
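An argparse definition that would accept the arguments shown in the help text might look like the sketch below. It is reconstructed from the usage message only: the function name create_parser and the choice of type=int for --test_first are assumptions, and the description is shortened to its first line.

```python
import argparse


def create_parser():
    """Build a parser matching the options listed in the help output above."""
    parser = argparse.ArgumentParser(
        prog="process_components_cif",
        description="Script for PDBeChem backend infrastructure.",
    )
    parser.add_argument(
        "COMPONENTS_CIF",
        help="input PDB-CCD components.cif file (must be specified)",
    )
    parser.add_argument(
        "--output_dir", "-o",
        help="create an output directory with files suitable for PDBeChem ftp directory",
    )
    parser.add_argument("--chem_comp_xml", help="write chem_comp.xml file to this file.")
    parser.add_argument(
        "--test_first",
        type=int,  # assumed: the help text implies a count of components
        help="only process the first TEST_FIRST chemical component definitions (for testing).",
    )
    parser.add_argument(
        "--library",
        help="use this fragment library in place of the one supplied with the code.",
    )
    parser.add_argument(
        "--debug", action="store_true", help="turn on debug message logging output"
    )
    return parser


# Example invocation with an explicit argument list instead of sys.argv.
args = create_parser().parse_args(["--test_first", "10", "-o", "out", "components.cif"])
```

Passing an explicit list to parse_args keeps the parser testable without running the script from a shell.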
    • refactor code separating out different things to clearly labelled methods before doing new work. · aa3345b4
      Oliver Smart authored
       For issue #17 process_components_cif script to read complete components.cif and produce PDBeChem ftp area
      aa3345b4
    • 70M has an inchikey mismatch - so needs to be ignored in test_inchikey_match_for_all_sample_cifs · 922fa614
      Oliver Smart authored
      for issue #17 process_components_cif script to read complete components.cif and produce PDBeChem ftp area
      922fa614
  11. 19 Sep, 2017 8 commits