findById seems to be querying the whole database in one go

Hi There,


I am slowly progressing my migration from SDN 4 to SDN 6. I've completely detached from Neo4j-OGM and I am using Spring Boot and SDN 6.

I am writing my repository classes and, sadly, when I test findById it queries the whole graph database until it throws a StackOverflowError.

Before, when we implemented GraphRepository, the behaviour was depth*2 and relationships were loaded perfectly.


Our modelling is based on a superclass called DatabaseObject, which every other class must extend. Given the Id of a subclass instance, I should get its attribute values. That behaviour is lost when I extend Neo4jRepository instead.

@Repository
public interface DatabaseObjectRepository extends Neo4jRepository<DatabaseObject, Long>{

    //Derived queries
    <T extends DatabaseObject> T findByDbId(Long dbId);
    
    @Query("MATCH (n:DatabaseObject{stId:$stId}) RETURN distinct n")
    <T extends DatabaseObject> T findByStId(@Param("stId") String stId);
}


findByStId retrieves ONLY the attributes and no relationships. This is just a test. In a real scenario, dbId and stId will behave in the same way.


I have to say we heavily rely on the findById on our projects...

Not sure if people will want to test or see the code, so I pushed the code (still in progress) to a different branch and I have my on Google Drive

* Classes are in the [domain.model](https://github.com/reactome/graph-core/tree/feature/major-updates/src/main/java/org/reactome/server/graph/domain/model) package
* [Test class](https://github.com/reactome/graph-core/blob/feature/major-updates/src/test/java/org/reactome/server/graph/repository/DatabaseObjectRepositoryTest.java)
* The given ID:69620 is a Pathway


I've been checking the Query section of the Documentation but I don't really know what to use.
Any help, comment or insight will be appreciated!


Thanks!
Guilherme


anthapu
Graph Fellow

By default, SDN tries to load all connected entities as well, up to the maximum depth it can traverse. Most likely your relationships are set up in a way that traverses the whole database. In that case it tries to load all the data into the SDN cache, which causes the OOM issue.

The second repository method will not do that because it uses a Cypher query. In that case it only loads the objects you return from the Cypher query.
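
To illustrate why an unbounded load can end up touching every connected node, here is a minimal plain-Java sketch. It is not SDN's actual implementation: the `Node` class and the `reachable` and `ring` helpers are hypothetical names used only for this example. It walks a small cyclic in-memory graph breadth-first, with and without a depth limit.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class TraversalSketch {

    // Hypothetical in-memory node; stands in for a mapped entity.
    static final class Node {
        final String name;
        final List<Node> related = new ArrayList<>();

        Node(String name) {
            this.name = name;
        }
    }

    // Returns the names of all nodes reachable from start within maxDepth hops.
    // A negative maxDepth means "no limit" (assumption made for this sketch).
    static Set<String> reachable(Node start, int maxDepth) {
        Set<String> visited = new LinkedHashSet<>();
        Deque<Node> frontier = new ArrayDeque<>();
        visited.add(start.name);
        frontier.add(start);
        int depth = 0;
        // Expand one level per iteration; the visited set stops cycles from looping.
        while (!frontier.isEmpty() && (maxDepth < 0 || depth < maxDepth)) {
            Deque<Node> nextFrontier = new ArrayDeque<>();
            for (Node node : frontier) {
                for (Node next : node.related) {
                    if (visited.add(next.name)) {
                        nextFrontier.add(next);
                    }
                }
            }
            frontier = nextFrontier;
            depth++;
        }
        return visited;
    }

    // Builds a cyclic chain n0 -> n1 -> ... -> n(size-1) -> n0 and returns n0.
    static Node ring(int size) {
        List<Node> nodes = new ArrayList<>();
        for (int i = 0; i < size; i++) {
            nodes.add(new Node("n" + i));
        }
        for (int i = 0; i < size; i++) {
            nodes.get(i).related.add(nodes.get((i + 1) % size));
        }
        return nodes.get(0);
    }

    public static void main(String[] args) {
        Node root = ring(5);
        System.out.println("depth 1:  " + reachable(root, 1));
        System.out.println("no limit: " + reachable(root, -1));
    }
}
```

With a depth limit only the immediate neighbourhood is collected; without one, every node connected to the start node is pulled in, which is the effect described above for a densely connected model.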

Thanks @anthapu
Yes, I understood that.
Any idea how to address it? I have tons of relationships, and it was working before SDN 6 (with SDN 4).

Cheers

I know nothing about SDN

But for the Cypher query part, it's important to have a constraint (and therefore an index) on your node property, or it will take years to get your result.

CREATE CONSTRAINT dataBaseObjectStId ON (d:DatabaseObject) ASSERT (d.stId) IS UNIQUE

Thanks @tard.gabriel . I already have this constraint!

As an example I am trying a trivial but heavily used query -

MATCH (p:Pathway{dbId:69620})-[r]->(m) RETURN p,r,m

In Java with SDN 6, I am doing this, which doesn't work:

    @Query("MATCH (p:Pathway{dbId:$dbId})-[r]->(m) RETURN p,r,m")
    Pathway customFindById(@Param("dbId") Long dbId);

The exception says the query returned more than one record. Then I tried with a List, which returns only p and no relationships.

For this special use-case your query has to look like:

MATCH (p:Pathway{dbId:$dbId})-[r]->(m) RETURN p,collect(r),collect(m)
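
The reason is that `RETURN p,r,m` yields one record per matched relationship, whereas `collect(...)` groups them into a single record per `p`. A rough plain-Java analogy of that grouping is sketched below. The `Row` class, the `collectByRoot` method and the relationship names are hypothetical and only stand in for result records; this is not SDN or driver code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class CollectSketch {

    // One flat result row: root id, relationship type, target id (all hypothetical).
    static final class Row {
        final long rootId;
        final String relType;
        final long targetId;

        Row(long rootId, String relType, long targetId) {
            this.rootId = rootId;
            this.relType = relType;
            this.targetId = targetId;
        }
    }

    // Groups flat rows by root id, similar in spirit to
    // RETURN p, collect(r), collect(m): one entry per root.
    static Map<Long, List<Row>> collectByRoot(List<Row> rows) {
        Map<Long, List<Row>> grouped = new LinkedHashMap<>();
        for (Row row : rows) {
            grouped.computeIfAbsent(row.rootId, id -> new ArrayList<>()).add(row);
        }
        return grouped;
    }

    public static void main(String[] args) {
        // Three flat rows for the same root, as RETURN p,r,m would produce.
        List<Row> rows = Arrays.asList(
                new Row(69620L, "REL_A", 1L),
                new Row(69620L, "REL_B", 2L),
                new Row(69620L, "REL_A", 3L));
        Map<Long, List<Row>> grouped = collectByRoot(rows);
        System.out.println("flat rows: " + rows.size());
        System.out.println("grouped records: " + grouped.size());
    }
}
```

A method that returns a single `Pathway` needs the grouped shape, which is why the ungrouped query fails with a "more than one record" error.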

To get more control over what gets loaded, have a look at projections in the Spring Data Neo4j documentation.
They let you define what gets loaded for specific use cases. Right now you can define projections, and also projections nested within projections, but the filtered list of properties and relationships is only respected for the top-level projection. For nested projections, the underlying entity class with all its properties and relationships is used. This is nothing we can change in SDN alone; it has to be solved in Spring Data Commons. We have already raised this and are working with VMware on a solution.

Taking one step ahead because I think you did not mention this:
I looked at the domain model in the repository you provided. SDN 6 has limitations regarding inheritance that mean it won't work with your model: inheritance that should be reflected in the graph is not supported (see the Spring Data Neo4j documentation).

There is an ongoing discussion right now if it makes sense to enable this but it needs more time for us to have a look over the implications this might have.

Thanks @gerrit.meier. Thanks for taking the time to look into our repo.

Ok, the query in that way worked. Thanks. I will continue exploring Projections.


I keep asking myself whether I am heading in the right direction, given how much has changed from SDN 4 to SDN 6. As you may have seen in our repo, the repository package is huge and full of custom queries.

I am now struggling with some missing methods of Neo4jOperations, such as loadByProperty or query:

Result result = neo4jTemplate.query(query, map);
if (result != null && result.iterator().hasNext())
     return (T) result.iterator().next().get("n");

Thanks

Hi @gerrit.meier @gerrit.meier1

Taking one step ahead because I think you did not mention this:
I looked at the domain in the repository you have provided. There are limitations in SDN 6 regarding inheritance that won't work with the model you are providing because using inheritance that should be reflected in the graph is not supported (Spring Data Neo4j)


Based on your statement, I've added the primaryLabel to all domain classes and I am still seeing some weird behaviour. Every time I run a query for which I expect e.g. a Pathway, I get a result back, but as an instance of the wrong class.

@Repository
public interface DatabaseObjectRepository extends Neo4jRepository<DatabaseObject, Long> {

    @Query("MATCH (a:DatabaseObject{dbId:$dbId})-[r]->(m) RETURN a,collect(r) as C,collect(m) as G")
    <T extends DatabaseObject> T run(@Param("dbId") Long dbId);
}

This time I got a Figure, next time a Publication, and only rarely the correct one...

The branch has the changes with the primaryLabel and again I can't replicate the behaviour we had prior to SDN6.

Thank you very much, again

I have pushed the changes with @Node primaryLabel according to the docs. I am not sure if I did it right, because I am not getting the expected results.