Document_IDs for SGC datasets
SGC has asked us to harmonise their documents (i.e. standardise and update the titles and abstracts). For FAIRness, documents should remain stable. However, a large number of replicated SGC assays have been provided in different datasets across multiple releases, and these will be merged for v33. Merging the SGC assays means merging multiple assays linked to multiple documents into a single assay linked to a single document. Since some documents will change during the merges anyway, we could consider harmonising the documents at the same time. For now, I will perform the merges without updating the document that remains live, but it would be good to tidy this up at some point.
Example (Incucyte datasets):
The assays from three Incucyte datasets will be merged into a single set of assays. Three documents are currently linked to these assays. After the merges and assay downgrades, only a single document will be linked to the remaining live assay. The document that will remain live (as per the SGC suggested updates) is doc_ID 118026. However, this older document is missing authors, abstract, etc.; these details were provided with the most recent document (doc_ID 126208). It would make sense to copy the author and abstract information over to the original document.
DOC_ID YEAR DOI CHEMBL_ID TITLE AUTHORS ABSTRACT RELEASE
118026 2021 10.6019/CHEMBL4689842 CHEMBL4689842 EUbOPEN Chemogenomics Library wave 1 30
122367 2022 10.6019/CHEMBL5058564 CHEMBL5058564 Tm Shift (DSF) assay results for EUbOPEN Chemogenomis Library 2 (Incucyte) 32
126208 2023 CHEMBL5303304 EUbOPEN Chemogenomics Library - IncuCyte EUbOPEN Cell Viability-IncuCyte assay results for EUbOPEN Chemogenomics Library: The InucyCyte Viability assay is used to investigate cytotoxicity over 24h. This first determinant as an in cell quality control of different compounds is based on confluence analysis by brightfield acquisition. Compounds are classfiied according to their calculated growth rate in healhty, cytostatic or cytotoxic.
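The mapping suggested for the Incucyte example could be sketched roughly as follows. This is only an illustration under assumptions: the dictionaries stand in for the document rows, the field names simply mirror the table columns, the function `backfill_document` is hypothetical, and the author and abstract values are placeholders rather than the real text. The idea is to fill only the fields that are empty on the surviving document, leaving its doc_ID, DOI, CHEMBL_ID and existing title untouched so that the document stays stable.

```python
# Hypothetical sketch: copy missing descriptive fields from the most recent
# document onto the document that stays live. Nothing here talks to a real
# database; the dicts below are stand-ins for the document rows.

def backfill_document(target, source, fields=("authors", "abstract")):
    """Return (updated_copy, filled_fields).

    Only fields that are empty on `target` and non-empty on `source` are
    copied. Identifiers and already-populated fields are never overwritten.
    """
    updated = dict(target)  # work on a copy so the original row is unchanged
    filled = []
    for field in fields:
        if not updated.get(field) and source.get(field):
            updated[field] = source[field]
            filled.append(field)
    return updated, filled


# Document that remains live after the merges (authors/abstract are empty).
surviving = {
    "doc_id": 118026,
    "chembl_id": "CHEMBL4689842",
    "title": "EUbOPEN Chemogenomics Library wave 1",
    "authors": None,
    "abstract": None,
}

# Most recent document; the values are placeholders, not the real text.
latest = {
    "doc_id": 126208,
    "chembl_id": "CHEMBL5303304",
    "title": "EUbOPEN Chemogenomics Library - IncuCyte",
    "authors": "<authors as supplied with doc 126208>",
    "abstract": "<abstract as supplied with doc 126208>",
}

updated, filled = backfill_document(surviving, latest)
print(f"doc {updated['doc_id']}: filled {filled}")
```

Because the title is already populated on doc_ID 118026, this sketch leaves it alone; harmonising titles would be a separate, explicit decision.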