07-31-2021 07:57 PM
Hi, all,
I have a MultiDiGraph whose edges have the properties date, frame1, frame2 and frame3, where frame1, frame2 and frame3 are boolean.
Given a specific date, I have to calculate pagerank and betweenness for each frame property. So I will calculate 6 metrics.
Currently I create 3 named graphs for a given date. For example, for 2020-03-15 I create one named graph for 2020-03-15/frame1, one for 2020-03-15/frame2 and one for 2020-03-15/frame3, and calculate the two metrics on each of these 3 named graphs. This is about 2 times faster than calculating the six metrics with 6 anonymous graphs, one for each metric/date/frame.
Building the named graph takes about 90% of the total time to calculate the 2 metrics. I wonder if there is a way to create a single named graph for 2020-03-15 and then create the 3 frame subgraphs as new named graphs from this date graph, which is already in memory, so that the 3 frame graphs are built faster. I know it sounds a little confusing.
Does anyone have a suggestion on this, or does it not make sense?
Thanks in advance, Laufer
08-02-2021 11:41 AM
Hey Laufer,
I think what you're looking for is something like gds.beta.graph.subgraph().
It should allow you to filter your existing "2020-03-15" graph into separate sub-graphs based on your frame1/frame2/frame3 properties, though I'm not sure how much of a performance lift it will provide. But definitely report back with the results.
I hope that offers some help,
Sean
08-02-2021 12:51 PM
Thank you very much, Sean.
I will try it and report back with the results.
Best regards, Laufer
08-03-2021 08:52 AM
Hi, Sean,
Coming back to report the results.
There is an error in the documentation regarding the name of the procedure: the correct name is gds.beta.graph.create.subgraph()
I create the date graph with the query:
CALL gds.graph.create.cypher('date_20200519',
'MATCH (p)-[{data: date("2020-05-19")}]-()
RETURN DISTINCT id(p) as id',
'MATCH (p1)-[r {data: date("2020-05-19")}]->(p2)
RETURN id(p1) as source, id(p2) as target, r.atrib_resp as atrib_resp, r.conflito as conflito, r.moralidade as moralidade, r.conseq_pandemia as conseq_pandemia, r.med_contencao as med_contencao, r.met_tratamento as met_tratamento',
{validateRelationships: True})
Actually, I have six frames: atrib_resp, conflito, moralidade, conseq_pandemia, med_contencao, met_tratamento. The date graph has 222,244 nodes and 444,246 edges. If I run, for example, pageRank over it, it works fine.
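For reference, the idea of one subgraph per frame can be sketched as a small Python helper that only builds the Cypher call strings, one per frame. This is a minimal sketch under assumptions: the helper name subgraph_calls and the subgraph naming pattern are hypothetical, and the filter expression is copied from the call used in this post, which is the one that raises the error, so the generated calls are not verified to work.

```python
# Frame property names as listed in this post.
FRAMES = [
    "atrib_resp", "conflito", "moralidade",
    "conseq_pandemia", "med_contencao", "met_tratamento",
]

def subgraph_calls(date_graph, frames=FRAMES):
    """Build one Cypher CALL string per frame (hypothetical helper).

    The call shape mirrors the gds.beta.graph.create.subgraph call in
    this thread; nothing is sent to a database here.
    """
    calls = []
    for frame in frames:
        sub_name = f"{date_graph}_{frame}_subgraph"  # assumed naming pattern
        calls.append(
            f"CALL gds.beta.graph.create.subgraph('{sub_name}', "
            f"'{date_graph}', '*', 'r.{frame} = 1') "
            "YIELD graphName, fromGraphName, nodeCount, relationshipCount"
        )
    return calls

calls = subgraph_calls("date_20200519")
for call in calls:
    print(call)
```

Each string could then be passed to whatever driver or client is used to run queries; that part is left out because it depends on the setup.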
But when I try to create a subgraph I get an 'out of bounds error':
CALL gds.beta.graph.create.subgraph('date_20200519_atrib_resp_subgraph', 'date_20200519', '*', 'r.atrib_resp = 1')
YIELD graphName, fromGraphName, nodeCount, relationshipCount
ERROR Neo.ClientError.Procedure.ProcedureCallFailed
Failed to invoke procedure gds.beta.graph.create.subgraph
: Caused by: java.lang.ArrayIndexOutOfBoundsException: Index 45647 out of bounds for length 3473
I searched for this error and found a similar closed issue that was resolved: gds.graph.create.cypher ArrayIndexOutOfBoundsException error · Issue #15 · neo4j/graph-data-science ...
It was about edges with nodes not present in the graph. I cannot see where I am making a mistake.
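The condition described in that issue, relationships whose endpoints are missing from the node set, can be illustrated with a small Python check. This is only a sketch on toy data: the function name dangling_relationships and the sample ids are hypothetical, and it does not establish that this is the cause of the error in this post.

```python
def dangling_relationships(node_ids, relationships):
    """Return the (source, target) pairs that reference an unknown node id."""
    known = set(node_ids)
    return [
        (source, target)
        for source, target in relationships
        if source not in known or target not in known
    ]

# Toy data: node 7 is never returned by the node query.
nodes = [0, 1, 2]
rels = [(0, 1), (1, 2), (2, 7)]

bad = dangling_relationships(nodes, rels)
print(bad)
```

On real data, the two inputs would be the ids returned by the node query and the source/target pairs returned by the relationship query of the projection.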
Do you have any idea what could be wrong?
Thank you, Laufer