Chebi ETL | One structure has more than 1 wurcs
In the migration process of wurcs
table, we realized that there are 19 structures in CHEZTST database with more than 1 wurcs
name. The idea is to migrate only the wurcs name corresponding to the molfile. For example, this compound has two WURC in the database:
WURCS=2.0/3,6,5/[a1122h-1x_1-5][a1122h-1a_1-5][a1122h-1b_1-5]/1-2-2-2-3-2/a2-b1_b2-c1_c3-d1_c6-f1_d2-e1
WURCS=2.0/2,6,5/[a1122h-1x_1-5][a1122h-1a_1-5]/1-2-2-2-2-2/a2-b1_b2-c1_c3-d1_c6-f1_d2-e1
But checking its molfile, we only get the first one:
WURCS=2.0/3,6,5/[a1122h-1x_1-5][a1122h-1a_1-5][a1122h-1b_1-5]/1-2-2-2-3-2/a2-b1_b2-c1_c3-d1_c6-f1_d2-e1
So, the idea is to migrate only the above. Finally, this was the SQL sentence used to get the structure ids with more than 1 wurcs
-- Gettings the structure ids
SELECT
s.id,
--n.name AS wurcs,
COUNT(*)
FROM names n
INNER JOIN structures s ON s.compound_id = n.compound_id
AND n.status = s.status
WHERE
s.TYPE = 'mol'
AND n.type = 'SYNONYM'
AND n.name LIKE 'WURCS%'
GROUP BY s.id
HAVING COUNT(*) > 1;
-- Checking the wurcs names of the above ids.
SELECT
s.id AS structure_id,
s.COMPOUND_ID,
n.name AS wurcs
FROM names n
INNER JOIN structures s ON s.compound_id = n.compound_id
AND s.status = n.status
WHERE n.type = 'SYNONYM'
AND n.name LIKE 'WURCS%'
AND s.type = 'mol'
AND s.id IN (3151748, 3141532, 3141937, 3150376, 3129897, 3135232, 3151379, 3148262, 3152216, 3130245, 3140566, 3129414, 3134433, 3141118,3151664, 3138920, 3150556, 3155468, 3137171);
See this slack message for additional information