08-29-2021 04:45 AM
I have imported one RDF file in Neo4J using NeoSemantics.
Total no of nodes : 1071461 (1 Million)
Total Relationships : 2553482 (2.5 Million)
When I run traversal queries against this imported graph, execution time is usually around 1-100 milliseconds, but when I use a list in my queries, execution time is much higher, ranging from 1.5 to 5 seconds.
Example of a list query (response time 1.5-5 seconds):
Match (n:Resource) where n.skos__prefLabel ="Anti-Allergic Agents" with n.skos__notation AS dm_notation match (n) where dm_notation IN n.ns3__PA return n.skos__prefLabel
Example of a traversal query (response time 1-100 ms):
match (n:Resource{skos__prefLabel:'metabolism'})<-[:rdfs__subClassOf]-(p) return p.skos__prefLabel
Is there any way to bring down execution time of such queries?
Any help would be appreciated, Thanks in advance!
P.S. :
I have added single-property indexes for Resource nodes on the skos__prefLabel, skos__notation and ns3__PA properties.
09-01-2021 09:54 PM
I think it makes sense that you have quite a big difference in performance here, for a few reasons:
Property searches are always going to be slower than label / relationship searches.
Your WITH clause passes along only a property of the resource, and anything not listed in a WITH clause won't be carried over to the rest of the query. Since you aren't carrying n over, your MATCH (n) literally does a comparison on every node in your entire graph, as n is undeclared at that point. I am unsure if this is intentional or not.
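One way to check this rather than guess is to prefix the query with PROFILE and look at the operators in the plan. The sketch below is the original query with only two changes: the PROFILE keyword, and the second variable renamed to m to make it obvious that it is a fresh variable. I would expect the second MATCH to show up as an AllNodesScan followed by a Filter, but that is an assumption about your plan, not something I have run against your data.

```cypher
// PROFILE runs the query and reports the operators used and the rows / db hits per operator.
PROFILE
MATCH (n:Resource)
WHERE n.skos__prefLabel = "Anti-Allergic Agents"
WITH n.skos__notation AS dm_notation
// m has no label here, so the planner has nothing to narrow the scan with.
MATCH (m)
WHERE dm_notation IN m.ns3__PA
RETURN m.skos__prefLabel
```

If the first MATCH shows a NodeIndexSeek on skos__prefLabel, that index is doing its job. As far as I know, an index on a list property stores the whole list as one value, so it would not be expected to help an element-membership test such as dm_notation IN m.ns3__PA.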
Also, you're still at a bad spot because you're taking each row's property and doing a comparison against EVERY other row EACH time. You could probably improve it with something like (I'm tired so someone else please help out):
MATCH (n:MyLabel { myFilterProp: "someValue" })
WITH collect(DISTINCT n.myPropToCheck) AS checkList
MATCH (n:MyLabel)
WHERE any(prop IN n.listProp WHERE prop IN checkList)
RETURN n
09-01-2021 11:36 PM
Thank you for the suggestion, it did improve my response time, which came down to 900 ms from 1.5 seconds! But that is still higher than the execution time I expected. I agree with you that this will take longer than relationship searches, and I want to bring the execution time down as much as possible. With RDF4J I get response times for such queries in the range of 30-100 ms.
MATCH (n:Resource { skos__prefLabel : "Anti-Allergic Agents" })
WITH n.skos__notation AS dm_notation
MATCH (m:owl__Class) WHERE any(prop IN m.ns3__PA WHERE prop=dm_notation)
RETURN m.skos__prefLabel
Can this query be optimized further? Also, does response time depend on the memory configuration of Neo4j? I have the maximum heap size set to the default of 512 MB.
09-02-2021 01:01 AM
Well,
How many elements are usually in this list? Is there any way you could turn it into a label? You can always create relationships to a new node that represents this property value for the whole model. If you are aiming for performance, you may need to adjust your model into something more graph-like.
Bennu
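A minimal sketch of the relationship-based remodelling suggested above, assuming ns3__PA is a list of notation values and skos__notation holds a single value per node (which is what the original query implies). The relationship type HAS_PA is a made-up name for illustration and is not part of the imported RDF. On a graph this size the refactoring may need to be run in batches.

```cypher
// One-off refactoring: turn each list entry into a relationship to the node
// whose notation it refers to. HAS_PA is a hypothetical relationship type.
MATCH (m:Resource)
WHERE m.ns3__PA IS NOT NULL
UNWIND m.ns3__PA AS notation
MATCH (target:Resource {skos__notation: notation})
MERGE (m)-[:HAS_PA]->(target);

// The list lookup then becomes a plain traversal from an indexed start node.
MATCH (:Resource {skos__prefLabel: "Anti-Allergic Agents"})<-[:HAS_PA]-(m)
RETURN m.skos__prefLabel;
```

The second query has the same shape as the rdfs__subClassOf traversal from the first post, so it may land in a similar response-time range, though that would need to be measured.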