Efficiently getting all out-degrees in a large graph

I have a reasonably large and dense graph of 1.5 million nodes. I'd like to get a list of the out-degrees of all nodes for relationships of a certain type. I've played around a bit, but the queries I've come up with are all slow enough that I'm looking for feedback on whether I can tune them further.

I'm currently using the following:

MATCH (p:Package)-[r:PACKAGE_DEPENDS_ON]->()
RETURN p.name, COUNT(r) AS count

I'm not actually interested in p.name, but without it, the query just returns a single number. My question is whether there's a way to write this query that doesn't return unnecessary information, and whether that might improve performance?
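
One possible way to drop p.name, as a minimal sketch: use the node itself as the grouping key in a WITH clause, then return only the count. This assumes the same Package label and PACKAGE_DEPENDS_ON type as in the query above; outDegree is just an illustrative alias. Packages with no outgoing PACKAGE_DEPENDS_ON relationship will not appear in the result, because MATCH produces no rows for them.

```cypher
// Group by the node p itself (not by p.name), then project only the count.
MATCH (p:Package)-[r:PACKAGE_DEPENDS_ON]->()
WITH p, count(r) AS outDegree
RETURN outDegree
```

This avoids reading the name property for every node, but whether it is measurably faster depends on the plan, so it is worth comparing both versions with PROFILE. Depending on the Neo4j version, an expression such as size((p)-[:PACKAGE_DEPENDS_ON]->()) (older releases) or COUNT { (p)-[:PACKAGE_DEPENDS_ON]->() } (Neo4j 5) may also let the planner read the degree from the node rather than expanding each relationship.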

5 REPLIES

  1. Can you share the schema?
  2. Can you share the EXPLAIN plan for the query?

Yes, here is the schema. I've added the table format since the visualization did not show all labels.

Graph schema

[Image: graph schema visualization]

  • Red: Version
  • Blue: VersionRequirement
  • Orange: Package
  • Beige: User

Nodes

[
  {
    "identity": -3,
    "labels": ["VersionRequirement"],
    "properties": {
      "indexes": [],
      "name": "VersionRequirement",
      "constraints": []
    }
  },
  {
    "identity": -2,
    "labels": ["User"],
    "properties": {
      "indexes": [],
      "name": "User",
      "constraints": []
    }
  },
  {
    "identity": -4,
    "labels": ["Version"],
    "properties": {
      "indexes": [],
      "name": "Version",
      "constraints": []
    }
  },
  {
    "identity": -1,
    "labels": ["Package"],
    "properties": {
      "indexes": [],
      "name": "Package",
      "constraints": []
    }
  }
]

Relationships

[
  {
    "identity": -6,
    "start": -4,
    "end": -4,
    "type": "DEPENDS_ON_RESOLVES_TO",
    "properties": {}
  },
  {
    "identity": -8,
    "start": -1,
    "end": -1,
    "type": "PACKAGE_DEPENDS_ON",
    "properties": {}
  },
  {
    "identity": -4,
    "start": -3,
    "end": -4,
    "type": "RESOLVES_TO",
    "properties": {}
  },
  {
    "identity": -7,
    "start": -4,
    "end": -4,
    "type": "NEXT_VERSION",
    "properties": {}
  },
  {
    "identity": -1,
    "start": -4,
    "end": -3,
    "type": "DEPENDS_ON",
    "properties": {}
  },
  {
    "identity": -2,
    "start": -2,
    "end": -4,
    "type": "MAINTAINS",
    "properties": {}
  },
  {
    "identity": -3,
    "start": -3,
    "end": -1,
    "type": "REQUIREMENT_OF",
    "properties": {}
  },
  {
    "identity": -5,
    "start": -4,
    "end": -1,
    "type": "VERSION_OF",
    "properties": {}
  }
]

And the explain plan:
[Image: explain plan for the query]

@taobojlen I guess you could use degree centrality by carefully choosing the arguments. In your case, you could specify "OUTGOING" since you only want to count the outgoing edges. https://neo4j.com/blog/graph-algorithms-neo4j-degree-centrality/

I might be missing something from this link, but it looks like their query is very similar to the one I have? In my query I do specify a direction for the relationship.

Well, running the query today seems totally fine -- yesterday I was waiting for ~30 minutes and assumed that it was due to a badly written query. It seems like Neo4j was stuck in a bad state. Sometimes turning it off and on again does work...!