Create chem.xml for all components
What
Currently, http://ftp.ebi.ac.uk/pub/databases/msd/pdbechem/chem.xml lists information for every PDB chemical component in XML format. For instance, the entries for ATP and ATQ are:
<chemComp>
<id>ATP</id>
<name>ADENOSINE-5'-TRIPHOSPHATE</name>
<formula>C10 H16 N5 O13 P3</formula>
<systematicName>[[(2R,3S,4R,5R)-5-(6-aminopurin-9-yl)-3,4-dihydroxy-oxolan-2-yl]methoxy-hydroxy-phosphoryl] phosphono hydrogen phosphate</systematicName>
<stereoSmiles>Nc1ncnc2n(cnc12)[C@@H]3O[C@H](CO[P@](O)(=O)O[P@@](O)(=O)O[P](O)(O)=O)[C@@H](O)[C@H]3O</stereoSmiles>
<nonStereoSmiles>Nc1ncnc2n(cnc12)[CH]3O[CH](CO[P](O)(=O)O[P](O)(=O)O[P](O)(O)=O)[CH](O)[CH]3O</nonStereoSmiles>
<InChi>InChI=1S/C10H16N5O13P3/c11-8-5-9(13-2-12-8)15(3-14-5)10-7(17)6(16)4(26-10)1-25-30(21,22)28-31(23,24)27-29(18,19)20/h2-4,6-7,10,16-17H,1H2,(H,21,22)(H,23,24)(H2,11,12,13)(H2,18,19,20)/t4-,6-,7-,10-/m1/s1</InChi>
</chemComp>
<chemComp>
<id>ATQ</id>
<name>2-AMINOTHIAZOLINE</name>
<formula>C3 H6 N2 S</formula>
<systematicName>4,5-dihydro-1,3-thiazol-2-amine</systematicName>
<stereoSmiles>NC1=NCCS1</stereoSmiles>
<nonStereoSmiles>NC1=NCCS1</nonStereoSmiles>
<InChi>InChI=1S/C3H6N2S/c4-3-5-1-2-6-3/h1-2H2,(H2,4,5)</InChi>
</chemComp>
The process developed needs to be able to produce this file.
N.B. The file starts with:
<chemCompList>
<chemComp>
<id>000</id>
<name>methyl hydrogen carbonate</name>
<formula>C2 H4 O3</formula>
<systematicName>methyl hydrogen carbonate</systematicName>
<stereoSmiles>COC(O)=O</stereoSmiles>
<nonStereoSmiles>COC(O)=O</nonStereoSmiles>
<InChi>InChI=1S/C2H4O3/c1-5-2(3)4/h1H3,(H,3,4)</InChi>
</chemComp>
and ends with </chemCompList>
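One way to confirm that a generated file has this shape is a small streaming check. The following is only a sketch: the function name `count_components` is hypothetical, and it assumes that every entry has an `id` and uses no child elements other than the seven seen in the samples above, which may not hold for the whole file.

```python
import xml.etree.ElementTree as ET

# Child element names seen in the sample entries (assumed to be the full set).
KNOWN_TAGS = {"id", "name", "formula", "systematicName",
              "stereoSmiles", "nonStereoSmiles", "InChi"}


def count_components(source):
    """Stream through a chem.xml file (path or binary file object).

    Returns the number of <chemComp> entries, or raises ValueError if the
    root is not <chemCompList>, an entry lacks <id>, or an entry contains
    a child element that is not in KNOWN_TAGS.
    """
    root = None
    count = 0
    for event, elem in ET.iterparse(source, events=("start", "end")):
        if root is None:
            # The first event is always the start of the root element.
            root = elem
            if root.tag != "chemCompList":
                raise ValueError(f"unexpected root element: {root.tag}")
        elif event == "end" and elem.tag == "chemComp":
            tags = [child.tag for child in elem]
            if "id" not in tags:
                raise ValueError("chemComp entry without an id")
            unknown = set(tags) - KNOWN_TAGS
            if unknown:
                raise ValueError(f"unexpected elements: {sorted(unknown)}")
            count += 1
            # Drop finished entries so memory use stays flat on a large file.
            root.clear()
    return count
```

Calling `count_components("chem.xml")` on the downloaded file and on the regenerated one would give a quick comparison of entry counts.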
How
- Related to issue #9 (closed): it would be good if the cif file could be processed and split, creating chem.xml.
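The output side of such a process could look roughly like the sketch below. It is not an agreed design: the function name `chem_xml_text` and the dict-based records are hypothetical, and reading the values out of the cif file is deliberately not shown. It only illustrates writing the `chemCompList` layout quoted above with Python's standard library.

```python
import xml.etree.ElementTree as ET

# Child elements in the order used by the existing chem.xml entries.
FIELDS = ("id", "name", "formula", "systematicName",
          "stereoSmiles", "nonStereoSmiles", "InChi")


def chem_xml_text(components):
    """Return chem.xml content as a string.

    `components` is an iterable of dicts keyed by the names in FIELDS
    (a hypothetical intermediate format). Missing values are written
    as empty elements.
    """
    root = ET.Element("chemCompList")
    for comp in components:
        node = ET.SubElement(root, "chemComp")
        for field in FIELDS:
            ET.SubElement(node, field).text = comp.get(field, "")
    # One element per line, without indentation (requires Python 3.9+).
    ET.indent(root, space="")
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    atq = {
        "id": "ATQ",
        "name": "2-AMINOTHIAZOLINE",
        "formula": "C3 H6 N2 S",
        "systematicName": "4,5-dihydro-1,3-thiazol-2-amine",
        "stereoSmiles": "NC1=NCCS1",
        "nonStereoSmiles": "NC1=NCCS1",
        "InChi": "InChI=1S/C3H6N2S/c4-3-5-1-2-6-3/h1-2H2,(H2,4,5)",
    }
    print(chem_xml_text([atq]))
```

Building the whole tree in memory keeps the sketch short; for the full component dictionary it may be preferable to write each `chemComp` as it is produced.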
Edited by Oliver Smart