01-29-2020 08:43 AM
I am looking to test different structures of my knowledge graph and one consistent piece of advice I have received is to turn otherwise categorical node properties into separate nodes.
However, the CSVs used to create my current graph were made by someone else and have since been changed or corrupted, so I can no longer edit them by hand to produce new CSV files with the new nodes, relationships, and headers the import tool needs.
Recreating the CSVs would take an enormous amount of time given the volume of data, so I am wondering whether there is a simple Cypher query that can accomplish this task.
Thanks!
01-29-2020 09:24 AM
APOC to the rescue: use apoc.refactor.categorize. See http://neo4j-contrib.github.io/neo4j-apoc-procedures/3.5/graph-refactoring/categorize/ for details.
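For reference, a minimal sketch of what such a call could look like, going by the documented signature apoc.refactor.categorize(sourceKey, type, outgoing, label, targetKey, copiedKeys, batchSize). The property, relationship, and label names (category, HAS_CATEGORY, Category, name) are hypothetical placeholders, not taken from your graph:

```cypher
// Assumption: a unique constraint on the target label/key. It keeps MERGE from
// creating duplicate category nodes, and some APOC releases may insist on it.
CREATE CONSTRAINT ON (c:Category) ASSERT c.name IS UNIQUE;

// For every node with a `category` property: merge a :Category node keyed by
// `name`, connect it with an outgoing :HAS_CATEGORY relationship, and drop the
// original property. No extra properties are copied ([]); batches of 100.
CALL apoc.refactor.categorize('category', 'HAS_CATEGORY', true, 'Category', 'name', [], 100);
```

Run the two statements separately, and try them on a copy of the database first.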
02-03-2020 05:39 PM
Hello, thank you for the update!
I tried applying the solution but received the following error: "Neo.ClientError.Procedure.ProcedureCallFailed: Failed to invoke procedure apoc.refactor.categorize: Caused by: java.lang.StackOverflowError"
Any advice?
02-04-2020 03:31 AM
What statement did you use exactly? Is there a stack trace in debug.log?
02-04-2020 08:26 AM
Hello,
For some context, I am working with Neo4j Community Edition 3.5.3. To test the APOC command above, I copied my graph and edited my .conf file to point the active database parameter to the copy instead of the original, so I could test the property explosion. (graph_explode.db is the new graph, graph.db is the original graph.)
It looks like I'm getting two different sets of errors, one from this morning and one from yesterday evening.
Yesterday evening I was getting warnings about procedures that failed to load from the plugin, such as:
apoc.util.s3.S3URLConnection from plugin jar /home/ubuntu/neo4j-community-3.5.3/plugins/apoc-3.5.0.2-all.jar: com/amazonaws/ClientConfigurat
apoc.data.email.ExtractEmail from plugin jar /home/ubuntu/neo4j-community-3.5.3/plugins/apoc-3.5.0.2-all.jar: javax/mail/internet/AddressException
com.jayway.jsonpath.spi.json.GsonJsonProvider from plugin jar /home/ubuntu/neo4j-community-3.5.3/plugins/apoc-3.5.0.2-all.jar: com/google/gson/JsonElement
org.neo4j.driver.internal.shaded.io.netty.handler.ssl.JettyAlpnSslEngine$ClientEngine from plugin jar /home/ubuntu/neo4j-community-3.5.3/plugins/apoc-3.5.0.2-all.jar: org/eclipse/jetty/alpn/ALPN$Provider
This morning I was getting the following warnings, with a new APOC failure for the same refactor command:
Neo.ClientError.Procedure.ProcedureCallFailed: Failed to invoke procedure apoc.refactor.categorize: Caused by: org.neo4j.kernel.DeadlockDetectedException: LockClient[1264920] can't wait on resource RWLock[NODE(72798004), hash=1583583385] since => LockClient[1264920] <-[:HELD_BY]- RWLock[NODE(72798452), hash=28924501] <-[:WAITING_FOR]- LockClient[1265128] <-[:HELD_BY]- RWLock[NODE(72798004), hash=1583583385]
RegistryNum {registryNumber: {value}}) MERGE (n)-[:HAS_REGISTRYNUMBER]->(cat) RETURN cat
02-04-2020 08:31 AM
Sorry, forgot to include my actual statement. The statements I was running yesterday and today were:
1. Command: call apoc.refactor.categorize('nameOfSubstance','HAS_NAMEDSUBSTANCE',true,'SubstanceName','nameOfSubstance',,100)
This command gave the following error during execution in the browser:
Neo.ClientError.Procedure.ProcedureCallFailed: Failed to invoke procedure apoc.refactor.categorize: Caused by: java.lang.StackOverflowError
This command gave the following error during execution in the browser:
Neo.ClientError.Procedure.ProcedureCallFailed: Failed to invoke procedure apoc.refactor.categorize: Caused by: org.neo4j.kernel.DeadlockDetectedException: LockClient[1264920] can't wait on resource RWLock[NODE(72798004), hash=1583583385] since => LockClient[1264920] <-[:HELD_BY]- RWLock[NODE(72798452), hash=28924501] <-[:WAITING_FOR]- LockClient[1265128] <-[:HELD_BY]- RWLock[NODE(72798004), hash=1583583385]
Essentially, I just applied the new APOC command to separate properties on two different nodes to test it out and got the above errors.
02-04-2020 12:53 PM
Can you make sure you only run one categorize command at a time? With that limitation you shouldn't see any DeadlockDetectedException.
The "Failed to load" messages are not critical; disregard them.
3.5.3 is a pretty old release, consider an upgrade to 3.5.14 together with the latest apoc 3.5.x.x version.
02-04-2020 01:11 PM
What do you mean by running one categorize command at a time? I ran it as a single query in the Neo4j Browser. Is there another way to perform the command?
Thanks.
02-06-2020 11:15 AM
The locking issues you're seeing in the logs indicate that concurrent modifications are taking place, which should not happen if the apoc.refactor.categorize call is the only query running at that time.
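If the procedure keeps failing, one alternative worth trying (a swapped-in technique, not what categorize does internally) is to do the same refactoring in plain Cypher wrapped in apoc.periodic.iterate with parallel set to false, so batches are committed strictly one after another. A sketch, assuming the source property is registryNumber and reusing the RegistryNum label and HAS_REGISTRYNUMBER type from the error output above; the batch size of 1000 is an arbitrary choice:

```cypher
CALL apoc.periodic.iterate(
  // Outer statement: stream every node that still carries the property.
  "MATCH (n) WHERE exists(n.registryNumber) RETURN n",
  // Inner statement: runs once per node, grouped into batched transactions.
  "MERGE (cat:RegistryNum {registryNumber: n.registryNumber})
   MERGE (n)-[:HAS_REGISTRYNUMBER]->(cat)
   REMOVE n.registryNumber",
  // parallel: false keeps the batches sequential (assumed to avoid lock contention).
  {batchSize: 1000, parallel: false}
)
YIELD batches, total, errorMessages
RETURN batches, total, errorMessages;
```

Because processed nodes lose the property, the statement can be rerun after a failure and will only pick up the nodes that are left. As with categorize, make sure nothing else is writing to the database while it runs.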