Head's Up! These forums are read-only. All users and content have migrated. Please join us at community.neo4j.com.
08-28-2022 01:57 PM
Hi everyone,
I have a situation as follows: 60,000,000 nodes under the label Service from which I would like to create new nodes and relationships. I have the following query:
// Service
CALL apoc.periodic.iterate(
"
MATCH (s:Service)
RETURN id(s) AS service_id
",
"
MATCH (s:Sevice)
WHERE id(s) = service_id
MERGE (sl:SLA{id: s.id})
ON CREATE SET
sl.respTimePrio4Nonoff = s.respTimePrio4Nonoff,
sl.respTimePrio4Office = s.respTimePrio4Office,
sl.repTimePrio3Office = s.repTimePrio3Office,
sl.respTimePrio1Nonoff = s.respTimePrio1Nonoff,
sl.respTimePrio1Office = s.respTimePrio1Office,
sl.compbaseValue = s.compbaseValue,
sl.repTimePrio2Nonoff = s.repTimePrio2Nonoff,
sl.repTimePrio4Nonoff = s.repTimePrio4Nonoff,
sl.repTimePrio2Office = s.repTimePrio2Office,
sl.respTimePrio2Nonoff = s.respTimePrio2Nonoff,
sl.respTimeOnsPrio1Office = s.respTimeOnsPrio1Office,
sl.respTimePrio3Office = s.respTimePrio3Office,
sl.repTimePrio1Nonoff = s.repTimePrio1Nonoff,
sl.repTimePrio3Nonoff = s.repTimePrio3Nonoff,
sl.respTimePrio3Nonoff = s.respTimePrio3Nonoff,
sl.repTimePrio4Office = s.repTimePrio4Office,
sl.respTimePrio2Office = s.respTimePrio2Office,
sl.repTimePrio1Office = s.repTimePrio1Office
ON MATCH SET
sl.respTimePrio4Nonoff = s.respTimePrio4Nonoff,
sl.respTimePrio4Office = s.respTimePrio4Office,
sl.repTimePrio3Office = s.repTimePrio3Office,
sl.respTimePrio1Nonoff = s.respTimePrio1Nonoff,
sl.respTimePrio1Office = s.respTimePrio1Office,
sl.compbaseValue = s.compbaseValue,
sl.repTimePrio2Nonoff = s.repTimePrio2Nonoff,
sl.repTimePrio4Nonoff = s.repTimePrio4Nonoff,
sl.repTimePrio2Office = s.repTimePrio2Office,
sl.respTimePrio2Nonoff = s.respTimePrio2Nonoff,
sl.respTimeOnsPrio1Office = s.respTimeOnsPrio1Office,
sl.respTimePrio3Office = s.respTimePrio3Office,
sl.repTimePrio1Nonoff = s.repTimePrio1Nonoff,
sl.repTimePrio3Nonoff = s.repTimePrio3Nonoff,
sl.respTimePrio3Nonoff = s.respTimePrio3Nonoff,
sl.repTimePrio4Office = s.repTimePrio4Office,
sl.respTimePrio2Office = s.respTimePrio2Office,
sl.repTimePrio1Office = s.repTimePrio1Office
MERGE (s)-[:FULFILLS]->(sl)
WITH s
CALL{
WITH s
WITH s
WHERE s.billProfileId IS NOT NULL
MERGE (bp:BillProfile{id: s.billProfileId})
MERGE (s)-[:IS_CHARGED_ON]->(bp)
}
CALL{
WITH s
WITH s
WHERE s.contractId IS NOT NULL
MERGE (c:Contract{id: s.contractId})
MERGE (s)-[:IS_INCLUDED_IN]->(c)
}
CALL{
WITH s
WITH s
WHERE s.productId IS NOT NULL
MERGE (p:Product{id: s.productId})
MERGE (s)-[:IS_OFFERED_FOR]->(p)
}
CALL{
WITH s
WITH s
WHERE s.subProductId IS NOT NULL
MERGE (p1:Product{id: s.subProductId})
MERGE (s)-[:IS_OFFERED_FOR]->(p1)
}
CALL{
WITH s
WITH s
WHERE s.agoProductId IS NOT NULL
MERGE (e:ExternalID:AGO{value: row.agoProductId})
ON CREATE SET e.type = 'Product ID'
ON MATCH SET e.type = 'Product ID'
MERGE (s)-[:HAS_EXTERNAL_ID]->(e)
}
CALL{
WITH s
WITH s
WHERE s.cldBusinessGroupId IS NOT NULL
MERGE (e1:ExternalID:CLD{value: s.cldBusinessGroupId})
ON CREATE SET e1.type = 'Business Group ID'
ON MATCH SET e1.type = 'Business Group ID'
MERGE (s)-[:HAS_EXTERNAL_ID]->(e1)
}
CALL{
WITH s
WITH s
WHERE s.ippSpId IS NOT NULL
MERGE (e2:ExternalID:IPP{value: s.ippSpId})
ON CREATE SET e2.type = 'Sp ID'
ON MATCH SET e2.type = 'Sp ID'
MERGE (s)-[:HAS_EXTERNAL_ID]->(e2)
}
CALL{
WITH s
WITH s
WHERE s.mrkEntitlementId IS NOT NULL
MERGE (e3:ExternalID:MRK{value: s.mrkEntitlementId})
ON CREATE SET e3.type = 'Entitlement ID'
ON MATCH SET e3.type = 'Entitlement ID'
MERGE (s)-[:HAS_EXTERNAL_ID]->(e3)
}
CALL{
WITH s
WITH s
WHERE s.samEquipmentId IS NOT NULL
MERGE (e4:ExternalID:SAM{value: s.samEquipmentId})
ON CREATE SET e4.type = 'Equipment ID'
ON MATCH SET e4.type = 'Equipment ID'
MERGE (s)-[:HAS_EXTERNAL_ID]->(e4)
}
CALL{
WITH s
WITH s
WHERE s.symContractId IS NOT NULL
MERGE (e5:ExternalID:SYM{value: s.symContractId})
ON CREATE SET e5.type = 'Contract ID'
ON MATCH SET e5.type = 'Contract ID'
MERGE (s)-[:HAS_EXTERNAL_ID]->(e5)
}
CALL{
WITH s
WITH s
WHERE s.tisKsapId IS NOT NULL
MERGE (e6:ExternalID:TIS{value: s.tisKsapId})
ON CREATE SET e6.type = 'Ksap ID'
ON MATCH SET e6.type = 'Ksap ID'
MERGE (s)-[:HAS_EXTERNAL_ID]->(e6)
}
CALL{
WITH s
WITH s
WHERE s.oitInstanceId IS NOT NULL
MERGE (e7:ExternalID:OIT{value: s.oitInstanceId})
ON CREATE SET e7.type = 'Instance ID'
ON MATCH SET e7.type = 'Instance ID'
MERGE (s)-[:HAS_EXTERNAL_ID]->(e7)
}
REMOVE s.respTimePrio4Nonoff, s.respTimePrio4Office, s.repTimePrio3Office, s.respTimePrio1Nonoff, s.respTimePrio1Office, s.compbaseValue, s.repTimePrio2Nonoff, s.repTimePrio4Nonoff, s.repTimePrio2Office, s.respTimePrio2Nonoff, s.respTimeOnsPrio1Office, s.respTimePrio3Office, s.repTimePrio1Nonoff, s.repTimePrio3Nonoff, s.respTimePrio3Nonoff, s.repTimePrio4Office, s.respTimePrio2Office, s.repTimePrio1Office, s.billProfileId, s.contractId, s.productId, s.subProductId, s.agoProductId, s.cldBusinessGroupId, s.ippSpId, s.mrkEntitlementId, s.samEquipmentId, s.symContractId, s.tisKsapId, s.oitInstanceId
",
{batchSize:10000, parallel:false}
);
This query fails due to an OOM error. I have tried different versions of the query, like separating the queries into two different queries or creating first only the nodes and then the relationships but without achieving a good performance/finish of the queries. I am using Neo4j 4.4.4 with 130GB of RAM in cluster mode. The properties used to Merge are all indexed. Is there any smart way to make this query performant without separating it into 10 different queries which would mean traversing the Service nodes non-efficiently? Thanks!
08-29-2022 12:40 PM
Hello @busymo16 😊
In your case, you should try apoc.periodic.commit() procedure instead of apoc.periodic.iterate() procedure.
Moreover, you don't need to return the internal id, you can directly return the node:
CALL apoc.periodic.iterate("
MATCH (s:Service)
RETURN s
", "
MERGE (sl:SLA {id: s.id})
ON CREATE SET
sl.respTimePrio4Nonoff = s.respTimePrio4Nonoff,
sl.respTimePrio4Office = s.respTimePrio4Office,
sl.repTimePrio3Office = s.repTimePrio3Office,
sl.respTimePrio1Nonoff = s.respTimePrio1Nonoff,
sl.respTimePrio1Office = s.respTimePrio1Office,
sl.compbaseValue = s.compbaseValue,
sl.repTimePrio2Nonoff = s.repTimePrio2Nonoff,
sl.repTimePrio4Nonoff = s.repTimePrio4Nonoff,
sl.repTimePrio2Office = s.repTimePrio2Office,
sl.respTimePrio2Nonoff = s.respTimePrio2Nonoff,
sl.respTimeOnsPrio1Office = s.respTimeOnsPrio1Office,
sl.respTimePrio3Office = s.respTimePrio3Office,
sl.repTimePrio1Nonoff = s.repTimePrio1Nonoff,
sl.repTimePrio3Nonoff = s.repTimePrio3Nonoff,
sl.respTimePrio3Nonoff = s.respTimePrio3Nonoff,
sl.repTimePrio4Office = s.repTimePrio4Office,
sl.respTimePrio2Office = s.respTimePrio2Office,
sl.repTimePrio1Office = s.repTimePrio1Office
ON MATCH SET
sl.respTimePrio4Nonoff = s.respTimePrio4Nonoff,
sl.respTimePrio4Office = s.respTimePrio4Office,
sl.repTimePrio3Office = s.repTimePrio3Office,
sl.respTimePrio1Nonoff = s.respTimePrio1Nonoff,
sl.respTimePrio1Office = s.respTimePrio1Office,
sl.compbaseValue = s.compbaseValue,
sl.repTimePrio2Nonoff = s.repTimePrio2Nonoff,
sl.repTimePrio4Nonoff = s.repTimePrio4Nonoff,
sl.repTimePrio2Office = s.repTimePrio2Office,
sl.respTimePrio2Nonoff = s.respTimePrio2Nonoff,
sl.respTimeOnsPrio1Office = s.respTimeOnsPrio1Office,
sl.respTimePrio3Office = s.respTimePrio3Office,
sl.repTimePrio1Nonoff = s.repTimePrio1Nonoff,
sl.repTimePrio3Nonoff = s.repTimePrio3Nonoff,
sl.respTimePrio3Nonoff = s.respTimePrio3Nonoff,
sl.repTimePrio4Office = s.repTimePrio4Office,
sl.respTimePrio2Office = s.respTimePrio2Office,
sl.repTimePrio1Office = s.repTimePrio1Office
MERGE (s)-[:FULFILLS]->(sl)
WITH s
CALL {
WITH s
WITH s
WHERE s.billProfileId IS NOT NULL
MERGE (bp:BillProfile{id: s.billProfileId})
MERGE (s)-[:IS_CHARGED_ON]->(bp)
}
CALL {
WITH s
WITH s
WHERE s.contractId IS NOT NULL
MERGE (c:Contract{id: s.contractId})
MERGE (s)-[:IS_INCLUDED_IN]->(c)
}
CALL {
WITH s
WITH s
WHERE s.productId IS NOT NULL
MERGE (p:Product{id: s.productId})
MERGE (s)-[:IS_OFFERED_FOR]->(p)
}
CALL {
WITH s
WITH s
WHERE s.subProductId IS NOT NULL
MERGE (p1:Product{id: s.subProductId})
MERGE (s)-[:IS_OFFERED_FOR]->(p1)
}
CALL {
WITH s
WITH s
WHERE s.agoProductId IS NOT NULL
MERGE (e:ExternalID:AGO{value: row.agoProductId})
ON CREATE SET e.type = 'Product ID'
ON MATCH SET e.type = 'Product ID'
MERGE (s)-[:HAS_EXTERNAL_ID]->(e)
}
CALL {
WITH s
WITH s
WHERE s.cldBusinessGroupId IS NOT NULL
MERGE (e1:ExternalID:CLD{value: s.cldBusinessGroupId})
ON CREATE SET e1.type = 'Business Group ID'
ON MATCH SET e1.type = 'Business Group ID'
MERGE (s)-[:HAS_EXTERNAL_ID]->(e1)
}
CALL {
WITH s
WITH s
WHERE s.ippSpId IS NOT NULL
MERGE (e2:ExternalID:IPP{value: s.ippSpId})
ON CREATE SET e2.type = 'Sp ID'
ON MATCH SET e2.type = 'Sp ID'
MERGE (s)-[:HAS_EXTERNAL_ID]->(e2)
}
CALL {
WITH s
WITH s
WHERE s.mrkEntitlementId IS NOT NULL
MERGE (e3:ExternalID:MRK{value: s.mrkEntitlementId})
ON CREATE SET e3.type = 'Entitlement ID'
ON MATCH SET e3.type = 'Entitlement ID'
MERGE (s)-[:HAS_EXTERNAL_ID]->(e3)
}
CALL {
WITH s
WITH s
WHERE s.samEquipmentId IS NOT NULL
MERGE (e4:ExternalID:SAM{value: s.samEquipmentId})
ON CREATE SET e4.type = 'Equipment ID'
ON MATCH SET e4.type = 'Equipment ID'
MERGE (s)-[:HAS_EXTERNAL_ID]->(e4)
}
CALL {
WITH s
WITH s
WHERE s.symContractId IS NOT NULL
MERGE (e5:ExternalID:SYM{value: s.symContractId})
ON CREATE SET e5.type = 'Contract ID'
ON MATCH SET e5.type = 'Contract ID'
MERGE (s)-[:HAS_EXTERNAL_ID]->(e5)
}
CALL {
WITH s
WITH s
WHERE s.tisKsapId IS NOT NULL
MERGE (e6:ExternalID:TIS{value: s.tisKsapId})
ON CREATE SET e6.type = 'Ksap ID'
ON MATCH SET e6.type = 'Ksap ID'
MERGE (s)-[:HAS_EXTERNAL_ID]->(e6)
}
CALL {
WITH s
WITH s
WHERE s.oitInstanceId IS NOT NULL
MERGE (e7:ExternalID:OIT{value: s.oitInstanceId})
ON CREATE SET e7.type = 'Instance ID'
ON MATCH SET e7.type = 'Instance ID'
MERGE (s)-[:HAS_EXTERNAL_ID]->(e7)
}
REMOVE s.respTimePrio4Nonoff, s.respTimePrio4Office, s.repTimePrio3Office, s.respTimePrio1Nonoff, s.respTimePrio1Office, s.compbaseValue, s.repTimePrio2Nonoff, s.repTimePrio4Nonoff, s.repTimePrio2Office, s.respTimePrio2Nonoff, s.respTimeOnsPrio1Office, s.respTimePrio3Office, s.repTimePrio1Nonoff, s.repTimePrio3Nonoff, s.respTimePrio3Nonoff, s.repTimePrio4Office, s.respTimePrio2Office, s.repTimePrio1Office, s.billProfileId, s.contractId, s.productId, s.subProductId, s.agoProductId, s.cldBusinessGroupId, s.ippSpId, s.mrkEntitlementId, s.samEquipmentId, s.symContractId, s.tisKsapId, s.oitInstanceId
",
{batchSize:10000, parallel:false}
);
Regards,
Cobra
08-30-2022 02:19 AM
Hi @Cobra,
Thanks for your response. I am aware of the fact that I should return only node, but I just wanted to not return it from the first query but only from the second. I have tried the following modifications also, which they all led to OOM errors.
As I import using apoc.load.jdbc 60mil records from Oracle as following:
// Service
WITH $hostname as hostname, $username as username, $password as password, $date as date
CALL apoc.periodic.iterate(
"
cypher runtime=slotted
CALL apoc.load.jdbc($hostname, \" SELECT \\\"id\\\", \\\"consumerTag\\\", \\\"serviceNameCustomer\\\", \\\"productNameOffering\\\", \\\"name\\\", \\\"room\\\", \\\"startDate\\\", \\\"endDate\\\", \\\"customerLabel\\\", \\\"phone\\\", \\\"assetId\\\", \\\"state\\\", \\\"type\\\", \\\"billProfileId\\\", \\\"contractId\\\", \\\"productId\\\", \\\"subProductId\\\", \\\"deleted\\\", \\\"respTimePrio4Nonoff\\\", \\\"respTimePrio4Office\\\", \\\"repTimePrio3Office\\\", \\\"respTimePrio1Nonoff\\\", \\\"respTimePrio1Office\\\", \\\"compbaseValue\\\", \\\"repTimePrio2Nonoff\\\", \\\"repTimePrio4Nonoff\\\", \\\"repTimePrio2Office\\\", \\\"respTimePrio2Nonoff\\\", \\\"respTimeOnsPrio1Office\\\", \\\"respTimePrio3Office\\\", \\\"repTimePrio1Nonoff\\\", \\\"repTimePrio3Nonoff\\\", \\\"respTimePrio3Nonoff\\\", \\\"repTimePrio4Office\\\", \\\"respTimePrio2Office\\\", \\\"repTimePrio1Office\\\", \\\"agoProductId\\\", \\\"cldBusinessGroupId\\\", \\\"ippSpId\\\", \\\"mrkEntitlementId\\\", \\\"samEquipmentId\\\", \\\"symContractId\\\", \\\"tisKsapId\\\", \\\"oitInstanceId\\\" FROM cms_data.hv_service WHERE timestamp \" + $date + \" >= \\\"fromDate\\\" AND timestamp \" + $date + \" < \\\"toDate\\\" \", [], {credentials:{user:$username, password: $password}}) YIELD row
",
"
MERGE (s:Service{id: toInteger(row.id)})
ON CREATE SET
s.consumerTag = toInteger(row.consumerTag),
s.serviceNameCustomer = row.serviceNameCustomer,
s.productNameOffering = row.productNameOffering,
s.name = row.name,
s.room = row.room,
s.startDate = date(row.startDate),
s.northName = row.northName,
s.southName = row.southName,
s.endDate = date(row.endDate),
s.customerLabel = row.customerLabel,
s.phone = row.phone,
s.assetId = row.assetId,
s.state = row.state,
s.type = row.type,
s.billProfileId = toInteger(row.billProfileId),
s.contractId = toInteger(row.contractId),
s.productId = toInteger(row.productId),
s.subProductId = toInteger(row.subProductId),
s.deleted = toInteger(row.deleted),
s.incidentInterventionTimePrio4Nonoff = row.respTimePrio4Nonoff,
s.incidentInterventionTimePrio4 = row.respTimePrio4Office,
s.incidentTimeToResolvePrio3 = row.repTimePrio3Office,
s.incidentInterventionTimePrio1Nonoff = row.respTimePrio1Nonoff,
s.incidentInterventionTimePrio1 = row.respTimePrio1Office,
s.compbaseValue = toInteger(row.compbaseValue),
s.incidentTimeToResolvePrio2Nonoff = row.repTimePrio2Nonoff,
s.incidentTimeToResolvePrio4Nonoff = row.repTimePrio4Nonoff,
s.incidentTimeToResolvePrio2 = row.repTimePrio2Office,
s.incidentInterventionTimePrio2Nonoff = row.respTimePrio2Nonoff,
s.respTimeOnsPrio1Office = row.respTimeOnsPrio1Office,
s.incidentInterventionTimePrio3 = row.respTimePrio3Office,
s.incidentTimeToResolvePrio1Nonoff = row.repTimePrio1Nonoff,
s.incidentTimeToResolvePrio3Nonoff = row.repTimePrio3Nonoff,
s.incidentInterventionTimePrio3Nonoff = row.respTimePrio3Nonoff,
s.incidentTimeToResolvePrio4 = row.repTimePrio4Office,
s.incidentInterventionTimePrio2 = row.respTimePrio2Office,
s.incidentTimeToResolvePrio1 = row.repTimePrio1Office,
s.agoProductId = toInteger(row.agoProductId),
s.cldBusinessGroupId = row.cldBusinessGroupId,
s.ippSpId = toInteger(row.ippSpId),
s.mrkEntitlementId = row.mrkEntitlementId,
s.samEquipmentId = toInteger(row.samEquipmentId),
s.symContractId = row.symContractId,
s.tisKsapId = toInteger(row.tisKsapId),
s.oitInstanceId = row.oitInstanceId
ON MATCH SET
s.consumerTag = toInteger(row.consumerTag),
s.serviceNameCustomer = row.serviceNameCustomer,
s.productNameOffering = row.productNameOffering,
s.name = row.name,
s.room = row.room,
s.startDate = date(row.startDate),
s.northName = row.northName,
s.southName = row.southName,
s.endDate = date(row.endDate),
s.customerLabel = row.customerLabel,
s.phone = row.phone,
s.assetId = row.assetId,
s.state = row.state,
s.type = row.type,
s.billProfileId = toInteger(row.billProfileId),
s.contractId = toInteger(row.contractId),
s.productId = toInteger(row.productId),
s.subProductId = toInteger(row.subProductId),
s.deleted = toInteger(row.deleted),
s.incidentInterventionTimePrio4Nonoff = row.respTimePrio4Nonoff,
s.incidentInterventionTimePrio4 = row.respTimePrio4Office,
s.incidentTimeToResolvePrio3 = row.repTimePrio3Office,
s.incidentInterventionTimePrio1Nonoff = row.respTimePrio1Nonoff,
s.incidentInterventionTimePrio1 = row.respTimePrio1Office,
s.compbaseValue = toInteger(row.compbaseValue),
s.incidentTimeToResolvePrio2Nonoff = row.repTimePrio2Nonoff,
s.incidentTimeToResolvePrio4Nonoff = row.repTimePrio4Nonoff,
s.incidentTimeToResolvePrio2 = row.repTimePrio2Office,
s.incidentInterventionTimePrio2Nonoff = row.respTimePrio2Nonoff,
s.respTimeOnsPrio1Office = row.respTimeOnsPrio1Office,
s.incidentInterventionTimePrio3 = row.respTimePrio3Office,
s.incidentTimeToResolvePrio1Nonoff = row.repTimePrio1Nonoff,
s.incidentTimeToResolvePrio3Nonoff = row.repTimePrio3Nonoff,
s.incidentInterventionTimePrio3Nonoff = row.respTimePrio3Nonoff,
s.incidentTimeToResolvePrio4 = row.repTimePrio4Office,
s.incidentInterventionTimePrio2 = row.respTimePrio2Office,
s.incidentTimeToResolvePrio1 = row.repTimePrio1Office,
s.agoProductId = toInteger(row.agoProductId),
s.cldBusinessGroupId = row.cldBusinessGroupId,
s.ippSpId = toInteger(row.ippSpId),
s.mrkEntitlementId = row.mrkEntitlementId,
s.samEquipmentId = toInteger(row.samEquipmentId),
s.symContractId = row.symContractId,
s.tisKsapId = toInteger(row.tisKsapId),
s.oitInstanceId = row.oitInstanceId
",
{batchSize:10000, parallel:true, params: {hostname: hostname, username: username, password: password, date: date}}
)
YIELD batches, total, timeTaken, committedOperations, failedOperations, failedBatches, retries, errorMessages, batch, operations, wasTerminated, failedParams, updateStatistics
RETURN batches, total, timeTaken, committedOperations, failedOperations, failedBatches, retries, errorMessages, batch, operations, wasTerminated, failedParams, updateStatistics;
, which I manage to import for 50 mins and create the big Service nodes. After that, I want to create nodes and relationships, that are spawned from Service nodes, as following:
// Service
CALL apoc.periodic.iterate(
"
MATCH (s1:Service)
RETURN id(s1) AS service_id
",
"
MATCH (s:Sevice)
WHERE id(s) = service_id
MERGE (sl:SLA{id: s.id})
ON CREATE SET
sl.respTimePrio4Nonoff = s.respTimePrio4Nonoff,
sl.respTimePrio4Office = s.respTimePrio4Office,
sl.repTimePrio3Office = s.repTimePrio3Office,
sl.respTimePrio1Nonoff = s.respTimePrio1Nonoff,
sl.respTimePrio1Office = s.respTimePrio1Office,
sl.compbaseValue = s.compbaseValue,
sl.repTimePrio2Nonoff = s.repTimePrio2Nonoff,
sl.repTimePrio4Nonoff = s.repTimePrio4Nonoff,
sl.repTimePrio2Office = s.repTimePrio2Office,
sl.respTimePrio2Nonoff = s.respTimePrio2Nonoff,
sl.respTimeOnsPrio1Office = s.respTimeOnsPrio1Office,
sl.respTimePrio3Office = s.respTimePrio3Office,
sl.repTimePrio1Nonoff = s.repTimePrio1Nonoff,
sl.repTimePrio3Nonoff = s.repTimePrio3Nonoff,
sl.respTimePrio3Nonoff = s.respTimePrio3Nonoff,
sl.repTimePrio4Office = s.repTimePrio4Office,
sl.respTimePrio2Office = s.respTimePrio2Office,
sl.repTimePrio1Office = s.repTimePrio1Office
ON MATCH SET
sl.respTimePrio4Nonoff = s.respTimePrio4Nonoff,
sl.respTimePrio4Office = s.respTimePrio4Office,
sl.repTimePrio3Office = s.repTimePrio3Office,
sl.respTimePrio1Nonoff = s.respTimePrio1Nonoff,
sl.respTimePrio1Office = s.respTimePrio1Office,
sl.compbaseValue = s.compbaseValue,
sl.repTimePrio2Nonoff = s.repTimePrio2Nonoff,
sl.repTimePrio4Nonoff = s.repTimePrio4Nonoff,
sl.repTimePrio2Office = s.repTimePrio2Office,
sl.respTimePrio2Nonoff = s.respTimePrio2Nonoff,
sl.respTimeOnsPrio1Office = s.respTimeOnsPrio1Office,
sl.respTimePrio3Office = s.respTimePrio3Office,
sl.repTimePrio1Nonoff = s.repTimePrio1Nonoff,
sl.repTimePrio3Nonoff = s.repTimePrio3Nonoff,
sl.respTimePrio3Nonoff = s.respTimePrio3Nonoff,
sl.repTimePrio4Office = s.repTimePrio4Office,
sl.respTimePrio2Office = s.respTimePrio2Office,
sl.repTimePrio1Office = s.repTimePrio1Office
MERGE (s)-[:FULFILLS]->(sl)
WITH s
CALL{
WITH s
WITH s
WHERE s.billProfileId IS NOT NULL
MERGE (bp:BillProfile{id: s.billProfileId})
MERGE (s)-[:IS_CHARGED_ON]->(bp)
}
CALL{
WITH s
WITH s
WHERE s.contractId IS NOT NULL
MERGE (c:Contract{id: s.contractId})
MERGE (s)-[:IS_INCLUDED_IN]->(c)
}
CALL{
WITH s
WITH s
WHERE s.productId IS NOT NULL
MERGE (p:Product{id: s.productId})
MERGE (s)-[:IS_OFFERED_FOR]->(p)
}
CALL{
WITH s
WITH s
WHERE s.subProductId IS NOT NULL
MERGE (p1:Product{id: s.subProductId})
MERGE (s)-[:IS_OFFERED_FOR]->(p1)
}
CALL{
WITH s
WITH s
WHERE s.agoProductId IS NOT NULL
MERGE (e:ExternalID:AGO{value: row.agoProductId})
ON CREATE SET e.type = 'Product ID'
ON MATCH SET e.type = 'Product ID'
MERGE (s)-[:HAS_EXTERNAL_ID]->(e)
}
CALL{
WITH s
WITH s
WHERE s.cldBusinessGroupId IS NOT NULL
MERGE (e1:ExternalID:CLD{value: s.cldBusinessGroupId})
ON CREATE SET e1.type = 'Business Group ID'
ON MATCH SET e1.type = 'Business Group ID'
MERGE (s)-[:HAS_EXTERNAL_ID]->(e1)
}
CALL{
WITH s
WITH s
WHERE s.ippSpId IS NOT NULL
MERGE (e2:ExternalID:IPP{value: s.ippSpId})
ON CREATE SET e2.type = 'Sp ID'
ON MATCH SET e2.type = 'Sp ID'
MERGE (s)-[:HAS_EXTERNAL_ID]->(e2)
}
CALL{
WITH s
WITH s
WHERE s.mrkEntitlementId IS NOT NULL
MERGE (e3:ExternalID:MRK{value: s.mrkEntitlementId})
ON CREATE SET e3.type = 'Entitlement ID'
ON MATCH SET e3.type = 'Entitlement ID'
MERGE (s)-[:HAS_EXTERNAL_ID]->(e3)
}
CALL{
WITH s
WITH s
WHERE s.samEquipmentId IS NOT NULL
MERGE (e4:ExternalID:SAM{value: s.samEquipmentId})
ON CREATE SET e4.type = 'Equipment ID'
ON MATCH SET e4.type = 'Equipment ID'
MERGE (s)-[:HAS_EXTERNAL_ID]->(e4)
}
CALL{
WITH s
WITH s
WHERE s.symContractId IS NOT NULL
MERGE (e5:ExternalID:SYM{value: s.symContractId})
ON CREATE SET e5.type = 'Contract ID'
ON MATCH SET e5.type = 'Contract ID'
MERGE (s)-[:HAS_EXTERNAL_ID]->(e5)
}
CALL{
WITH s
WITH s
WHERE s.tisKsapId IS NOT NULL
MERGE (e6:ExternalID:TIS{value: s.tisKsapId})
ON CREATE SET e6.type = 'Ksap ID'
ON MATCH SET e6.type = 'Ksap ID'
MERGE (s)-[:HAS_EXTERNAL_ID]->(e6)
}
CALL{
WITH s
WITH s
WHERE s.oitInstanceId IS NOT NULL
MERGE (e7:ExternalID:OIT{value: s.oitInstanceId})
ON CREATE SET e7.type = 'Instance ID'
ON MATCH SET e7.type = 'Instance ID'
MERGE (s)-[:HAS_EXTERNAL_ID]->(e7)
}
REMOVE s.respTimePrio4Nonoff, s.respTimePrio4Office, s.repTimePrio3Office, s.respTimePrio1Nonoff, s.respTimePrio1Office, s.compbaseValue, s.repTimePrio2Nonoff, s.repTimePrio4Nonoff, s.repTimePrio2Office, s.respTimePrio2Nonoff, s.respTimeOnsPrio1Office, s.respTimePrio3Office, s.repTimePrio1Nonoff, s.repTimePrio3Nonoff, s.respTimePrio3Nonoff, s.repTimePrio4Office, s.respTimePrio2Office, s.repTimePrio1Office, s.billProfileId, s.contractId, s.productId, s.subProductId, s.agoProductId, s.cldBusinessGroupId, s.ippSpId, s.mrkEntitlementId, s.samEquipmentId, s.symContractId, s.tisKsapId, s.oitInstanceId
",
{batchSize:10000, parallel:false}
);
So all nodes and relationships, are spawned from Service nodes. I could break it down, to create ExternalID nodes and relationships in a separate query, and the rest in a separate query. The problem is that while loading the data (without having to separate queries 2-by-2) and creating the nodes and relationships, I end up having OOM error. One solution would be to separate the relationships creation 2-by-2, but like that I would end up having 13 queries, which I want to avoid. Any better idea? If using periodic commit, would it consider the limit (batchSize) as I add there and not load all data upfront? Thanks!
08-30-2022 05:45 AM
Another update, I tried this query with apoc.load.commit but it does not seem to make the job even though I do not see any problem with it:
CALL apoc.periodic.commit(
"
MATCH (s:Service)
WITH s LIMIT $limit
MERGE (sl:SLA{id: s.id})
ON CREATE SET
sl.incidentInterventionTimePrio4Nonoff = s.respTimePrio4Nonoff,
sl.incidentInterventionTimePrio4 = s.respTimePrio4Office,
sl.incidentTimeToResolvePrio3 = s.repTimePrio3Office,
sl.incidentInterventionTimePrio1Nonoff = s.respTimePrio1Nonoff,
sl.incidentInterventionTimePrio1 = s.respTimePrio1Office,
sl.compbaseValue = toInteger(s.compbaseValue),
sl.incidentTimeToResolvePrio2Nonoff = s.repTimePrio2Nonoff,
sl.incidentTimeToResolvePrio4Nonoff = s.repTimePrio4Nonoff,
sl.incidentTimeToResolvePrio2 = s.repTimePrio2Office,
sl.incidentInterventionTimePrio2Nonoff = s.respTimePrio2Nonoff,
sl.respTimeOnsPrio1Office = s.respTimeOnsPrio1Office,
sl.incidentInterventionTimePrio3 = s.respTimePrio3Office,
sl.incidentTimeToResolvePrio1Nonoff = s.repTimePrio1Nonoff,
sl.incidentTimeToResolvePrio3Nonoff = s.repTimePrio3Nonoff,
sl.incidentInterventionTimePrio3Nonoff = s.respTimePrio3Nonoff,
sl.incidentTimeToResolvePrio4 = s.repTimePrio4Office,
sl.incidentInterventionTimePrio2 = s.respTimePrio2Office,
sl.incidentTimeToResolvePrio1 = s.repTimePrio1Office
ON MATCH SET
sl.incidentInterventionTimePrio4Nonoff = s.respTimePrio4Nonoff,
sl.incidentInterventionTimePrio4 = s.respTimePrio4Office,
sl.incidentTimeToResolvePrio3 = s.repTimePrio3Office,
sl.incidentInterventionTimePrio1Nonoff = s.respTimePrio1Nonoff,
sl.incidentInterventionTimePrio1 = s.respTimePrio1Office,
sl.compbaseValue = toInteger(s.compbaseValue),
sl.incidentTimeToResolvePrio2Nonoff = s.repTimePrio2Nonoff,
sl.incidentTimeToResolvePrio4Nonoff = s.repTimePrio4Nonoff,
sl.incidentTimeToResolvePrio2 = s.repTimePrio2Office,
sl.incidentInterventionTimePrio2Nonoff = s.respTimePrio2Nonoff,
sl.respTimeOnsPrio1Office = s.respTimeOnsPrio1Office,
sl.incidentInterventionTimePrio3 = s.respTimePrio3Office,
sl.incidentTimeToResolvePrio1Nonoff = s.repTimePrio1Nonoff,
sl.incidentTimeToResolvePrio3Nonoff = s.repTimePrio3Nonoff,
sl.incidentInterventionTimePrio3Nonoff = s.respTimePrio3Nonoff,
sl.incidentTimeToResolvePrio4 = s.repTimePrio4Office,
sl.incidentInterventionTimePrio2 = s.respTimePrio2Office,
sl.incidentTimeToResolvePrio1 = s.repTimePrio1Office
MERGE (s)-[:FULFILLS]->(sl)
WITH s
CALL{
WITH s
WITH s
WHERE s.billProfileId IS NOT NULL
MERGE (bp:BillProfile{id: s.billProfileId})
MERGE (s)-[:IS_CHARGED_ON]->(bp)
RETURN s AS service
UNION
WITH s
WITH s
WHERE s.contractId IS NOT NULL
MERGE (c:Contract{id: s.contractId})
MERGE (s)-[:IS_INCLUDED_IN]->(c)
RETURN s AS service
UNION
WITH s
WITH s
WHERE s.productId IS NOT NULL
MERGE (p:Product{id: s.productId})
MERGE (s)-[:IS_OFFERED_FOR]->(p)
RETURN s AS service
UNION
WITH s
WITH s
WHERE s.subProductId IS NOT NULL
MERGE (p1:Product{id: s.subProductId})
MERGE (s)-[:IS_OFFERED_FOR]->(p1)
RETURN s AS service
}
RETURN count(s)
", {limit:10000});
08-30-2022 06:20 AM
What do you mean by "it does not seem to make the job"? Is there an error or something?
08-30-2022 06:26 AM
No, there is no error but also no relationships are being created. It is just hanging, inside the apoc.periodic.commit. When I try the query itself, outside of apoc, it works completely fine.
08-30-2022 06:34 AM - edited 08-30-2022 06:35 AM
Try a small limit like 100 just to check.
08-30-2022 06:43 AM
It keeps hanging also with limit 100, like shown below:
while when I give the limit hardcoded and I use not apoc.periodic.commit, the same query, performs correct.
08-30-2022 06:49 AM
Execute SHOW TRANSACTIONS while the query is running.
08-30-2022 06:54 AM
As expected the transaction is running but nothing is being commited.
08-30-2022 06:58 AM
Can you check in debug.log and query.log if you see something weird?
08-30-2022 07:07 AM - edited 08-30-2022 07:09 AM
nothing weird, only the query and the information about it that is being executed
This is how the log looks like:
{"time":"2022-08-30 14:05:04.961+0000","level":"INFO","event":"start","type":"query","id":"5522","elapsedTimeMs":0,"planning":0,"waiting":0,"allocatedBytes":0,"pageHits":0,"pageFaults":0,"source":"bolt-session\tbolt\tneo4j-browser/v4.4.2\t\tclient/xx.xxx.xxx:44920\tserver/xx.xx.xxx:7687>","database":"<none>","username":"XXXXX","executingUser":"XXXX","authenticatedUser":"XXXX","query":"CALL apoc.periodic.commit(\n\t\"\n\t\tMATCH (s:Service)\n\t\tWITH s LIMIT $limit\n\t\tMERGE (sl:SLA{id: s.id})\n\t\tON CREATE SET\n\t\t\t\tsl.incidentInterventionTimePrio4Nonoff = s.respTimePrio4Nonoff,\n\t\t\t\tsl.incidentInterventionTimePrio4 = s.respTimePrio4Office,\n\t\t\t\tsl.incidentTimeToResolvePrio3 = s.repTimePrio3Office,\n\t\t\t\tsl.incidentInterventionTimePrio1Nonoff = s.respTimePrio1Nonoff,\n\t\t\t\tsl.incidentInterventionTimePrio1 = s.respTimePrio1Office,\n\t\t\t\tsl.compbaseValue = toInteger(s.compbaseValue),\n\t\t\t\tsl.incidentTimeToResolvePrio2Nonoff = s.repTimePrio2Nonoff,\n\t\t\t\tsl.incidentTimeToResolvePrio4Nonoff = s.repTimePrio4Nonoff,\n\t\t\t\tsl.incidentTimeToResolvePrio2 = s.repTimePrio2Office,\n\t\t\t\tsl.incidentInterventionTimePrio2Nonoff = s.respTimePrio2Nonoff,\n\t\t\t\tsl.respTimeOnsPrio1Office = s.respTimeOnsPrio1Office,\n\t\t\t\tsl.incidentInterventionTimePrio3 = s.respTimePrio3Office,\n\t\t\t\tsl.incidentTimeToResolvePrio1Nonoff = s.repTimePrio1Nonoff,\n\t\t\t\tsl.incidentTimeToResolvePrio3Nonoff = s.repTimePrio3Nonoff,\n\t\t\t\tsl.incidentInterventionTimePrio3Nonoff = s.respTimePrio3Nonoff,\n\t\t\t\tsl.incidentTimeToResolvePrio4 = s.repTimePrio4Office,\n\t\t\t\tsl.incidentInterventionTimePrio2 = s.respTimePrio2Office,\n\t\t\t\tsl.incidentTimeToResolvePrio1 = s.repTimePrio1Office\n\t\tON MATCH SET\n\t\t\t\tsl.incidentInterventionTimePrio4Nonoff = s.respTimePrio4Nonoff,\n\t\t\t\tsl.incidentInterventionTimePrio4 = s.respTimePrio4Office,\n\t\t\t\tsl.incidentTimeToResolvePrio3 = s.repTimePrio3Office,\n\t\t\t\tsl.incidentInterventionTimePrio1Nonoff = s.respTimePrio1Nonoff,\n\t\t\t\tsl.incidentInterventionTimePrio1 = s.respTimePrio1Office,\n\t\t\t\tsl.compbaseValue = toInteger(s.compbaseValue),\n\t\t\t\tsl.incidentTimeToResolvePrio2Nonoff = s.repTimePrio2Nonoff,\n\t\t\t\tsl.incidentTimeToResolvePrio4Nonoff = s.repTimePrio4Nonoff,\n\t\t\t\tsl.incidentTimeToResolvePrio2 = s.repTimePrio2Office,\n\t\t\t\tsl.incidentInterventionTimePrio2Nonoff = s.respTimePrio2Nonoff,\n\t\t\t\tsl.respTimeOnsPrio1Office = s.respTimeOnsPrio1Office,\n\t\t\t\tsl.incidentInterventionTimePrio3 = s.respTimePrio3Office,\n\t\t\t\tsl.incidentTimeToResolvePrio1Nonoff = s.repTimePrio1Nonoff,\n\t\t\t\tsl.incidentTimeToResolvePrio3Nonoff = s.repTimePrio3Nonoff,\n\t\t\t\tsl.incidentInterventionTimePrio3Nonoff = s.respTimePrio3Nonoff,\n\t\t\t\tsl.incidentTimeToResolvePrio4 = s.repTimePrio4Office,\n\t\t\t\tsl.incidentInterventionTimePrio2 = s.respTimePrio2Office,\n\t\t\t\tsl.incidentTimeToResolvePrio1 = s.repTimePrio1Office\n\t\tMERGE (s)-[:FULFILLS]->(sl)\n\t\tWITH s\n\t\tCALL{\n\t\t\t\t\tWITH s\n\t\t\t\t\tWITH s\n\t\t\t\t\tWHERE s.billProfileId IS NOT NULL\n\t\t\t\t\tMERGE (bp:BillProfile{id: s.billProfileId}) \n\t\t\t\t\tMERGE (s)-[:IS_CHARGED_ON]->(bp)\n\t\t\t\t\tRETURN s AS service\n\t\t\t\t\tUNION\n\t\t\t\t\tWITH s\n\t\t\t\t\tWITH s\n\t\t\t\t\tWHERE s.contractId IS NOT NULL\n\t\t\t\t\tMERGE (c:Contract{id: s.contractId}) \n\t\t\t\t\tMERGE (s)-[:IS_INCLUDED_IN]->(c)\n\t\t\t\t\tRETURN s AS service\n\t\t\t\t\tUNION\n\t\t\t\t\tWITH s\n\t\t\t\t\tWITH s\n\t\t\t\t\tWHERE s.productId IS NOT NULL\n\t\t\t\t\tMERGE (p:Product{id: s.productId}) \n\t\t\t\t\tMERGE (s)-[:IS_OFFERED_FOR]->(p)\n\t\t\t\t\tRETURN s AS service\n\t\t\t\t\tUNION\n\t\t\t\t\tWITH s\n\t\t\t\t\tWITH s\n\t\t\t\t\tWHERE s.subProductId IS NOT NULL\n\t\t\t\t\tMERGE (p1:Product{id: s.subProductId}) \n\t\t\t\t\tMERGE (s)-[:IS_OFFERED_FOR]->(p1)\n\t\t\t\t\tRETURN s AS service\n\t\t\t}\n\t\tRETURN count(s)\n\t\", {limit:100});","runtime":"null","annotationData":"{type: 'user-direct', app: 'neo4j-browser_v4.4.2'}"}
08-30-2022 08:06 AM
Well, I don't know 😕
You can retry to use apoc.periodic.iterate() procedure with the updated query, maybe reduce the batchSize parameter and play with the database configuration.
08-30-2022 09:10 AM
I am now breaking down the query to multiple subqueries until I find a better solution where I can group them again. Maybe I would need to upgrade my Neo4j database version as well, but for now I can live with multiple queries. Thanks for your inputs 😃
All the sessions of the conference are now available online