Neo4j on AWS EKS: half of our replicas stuck at 97% CPU after running a few queries

We have been running Neo4j on AWS EKS, and some of our replicas get stuck at 97% CPU load after we send a few simple queries to them. We can't access GraphQL on port 7474, but using cypher-shell we can see the stuck queries with "CALL dbms.listQueries();".
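As a rough illustration of triaging the `dbms.listQueries()` output, here is a minimal Python sketch that picks out long-running queries and builds a `CALL dbms.killQueries([...])` statement for them. The helper names (`stuck_query_ids`, `kill_statement`), the 30-second threshold, and the sample rows are assumptions made up for the example; only the column names `queryId` and `elapsedTimeMillis` and the procedure names come from Neo4j 3.5. Note that in our case killing the queries did not work, so this only covers the general approach.

```python
# Hypothetical helpers: rows are dicts shaped like dbms.listQueries() records.

def stuck_query_ids(rows, min_elapsed_ms=30_000):
    """Return the queryId of every row that has been running at least min_elapsed_ms."""
    ids = []
    for row in rows:
        # elapsedTimeMillis may be missing or NULL (None) in a row; treat that as 0.
        elapsed = row.get("elapsedTimeMillis") or 0
        if elapsed >= min_elapsed_ms:
            ids.append(row["queryId"])
    return ids


def kill_statement(query_ids):
    """Build a Cypher statement that asks the server to kill the given queries."""
    quoted = ", ".join("'" + query_id + "'" for query_id in query_ids)
    return "CALL dbms.killQueries([" + quoted + "]);"


if __name__ == "__main__":
    # Made-up sample rows, not taken from the cluster in this post.
    sample_rows = [
        {"queryId": "query-5", "status": "running", "elapsedTimeMillis": 60000},
        {"queryId": "query-9", "status": "waiting", "elapsedTimeMillis": 52825},
        {"queryId": "query-12", "status": "running", "elapsedTimeMillis": 40},
    ]
    print(kill_statement(stuck_query_ids(sample_rows)))
```

The resulting statement can be pasted into cypher-shell on the affected replica.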

The Docker init logs for a working replica and a non-working one look exactly the same: both can connect to the core servers, and neither shows any errors.
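Comparing two startup logs by eye is error-prone because timestamps, pod names, and line order differ between pods. The following is a minimal Python sketch of one way to compare them: it strips the leading timestamp, replaces the replica pod name with a placeholder, and reports lines that appear in only one log regardless of order. The function names and the `neo4j-cluster-replica-N` placeholder are assumptions for the example, and the regular expressions are written against the log format shown in this post.

```python
import re
from collections import Counter

# Matches the leading timestamp of a log line, e.g. "2019-08-13 15:43:39.835+0000 ".
TIMESTAMP = re.compile(r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}\+\d{4}\s+")
# Matches the per-pod replica name so that replica-0 and replica-1 compare equal.
REPLICA = re.compile(r"neo4j-cluster-replica-\d+")


def normalize(log_text):
    """Return the log lines with timestamps removed and pod names masked."""
    lines = []
    for line in log_text.splitlines():
        line = TIMESTAMP.sub("", line)
        line = REPLICA.sub("neo4j-cluster-replica-N", line)
        lines.append(line.rstrip())
    return lines


def only_in_one(log_a, log_b):
    """Return (lines only in log_a, lines only in log_b), ignoring line order."""
    count_a = Counter(normalize(log_a))
    count_b = Counter(normalize(log_b))
    return sorted((count_a - count_b).elements()), sorted((count_b - count_a).elements())
```

Two empty lists mean the logs contain the same lines once timestamps and pod names are ignored. The logs could be captured with `kubectl logs <pod> neo4j > file` and read into strings before calling `only_in_one`.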

Queries get stuck on the replica, CPU reaches 97% after 5-7 tries, and we can't kill the stuck queries:
````
| "idp" | "slotted" |
| "2019-08-13T15:47:47.099Z" | "embedded" | NULL
| NULL | "waiting" | {waitTimeMillis: 52463, queryId: "query-5"} | 0 | 52825 | NULL | 52463 | NULL | NULL
| 776840876 | 0 | NULL |
````

=============
Docker init log for the non-working replica

kubectl logs neo4j-cluster-replica-1 neo4j --follow
command failed: the provided initial password was not set because existing Neo4j users were detected at `/var/lib/neo4j/data/dbms/auth`. Please remove the existing `auth` and `roles` files if you want to reset your database to only have a default user with the provided password.
Active database: graph.db
Directories in use:
  home:         /var/lib/neo4j
  config:       /var/lib/neo4j/conf
  logs:         /logs
  plugins:      /plugins
  import:       /var/lib/neo4j/import
  data:         /var/lib/neo4j/data
  certificates: /var/lib/neo4j/certificates
  run:          /var/lib/neo4j/run
Starting Neo4j.
2019-08-13 15:43:22.281+0000 WARN  ha.host.data is deprecated.
2019-08-13 15:43:22.296+0000 INFO  ======== Neo4j 3.5.3 ========
2019-08-13 15:43:22.300+0000 INFO  Starting...
2019-08-13 15:43:23.208+0000 INFO  Initiating metrics...
2019-08-13 15:43:23.332+0000 INFO  Resolved initial host 'neo4j-cluster-core.default.svc.cluster.local:5000' to [10.9.8.132:5000, 10.9.237.10:5000, 10.9.80.109:5000]
2019-08-13 15:43:26.417+0000 INFO  Connected to neo4j-cluster-core-2.neo4j-cluster-core.default.svc.cluster.local/10.9.8.132:6000 [catchup version:1]
2019-08-13 15:43:33.113+0000 INFO  Connected to neo4j-cluster-core-1.neo4j-cluster-core.default.svc.cluster.local/10.9.237.10:6000 [catchup version:1]
2019-08-13 15:43:37.792+0000 INFO  Sending metrics to CSV file at /var/lib/neo4j/metrics
2019-08-13 15:43:38.194+0000 INFO  Bolt enabled on 0.0.0.0:7687.
2019-08-13 15:43:39.129+0000 INFO  Connected to neo4j-cluster-core-0.neo4j-cluster-core.default.svc.cluster.local/10.9.80.109:6000 [catchup version:1]
2019-08-13 15:43:39.835+0000 INFO  Started.
2019-08-13 15:43:39.996+0000 INFO  Mounted REST API at: /db/manage
2019-08-13 15:43:40.010+0000 INFO  Mounted unmanaged extension [org.neo4j.graphql] at [/graphql]
2019-08-13 15:43:40.062+0000 INFO  Server thread metrics have been registered successfully
2019-08-13 15:43:40.471+0000 WARN  The following warnings have been detected with resource and/or provider classes:
  WARNING: A sub-resource method, public final javax.ws.rs.core.Response org.neo4j.graphql.GraphQLResourceExperimental.executeOperation(java.lang.String), with URI template, "", is treated as a resource method
  WARNING: A sub-resource method, public final javax.ws.rs.core.Response org.neo4j.graphql.GraphQLResourceExperimental.options(javax.ws.rs.core.HttpHeaders), with URI template, "", is treated as a resource method
  WARNING: A sub-resource method, public final javax.ws.rs.core.Response org.neo4j.graphql.GraphQLResourceExperimental.get(java.lang.String,java.lang.String), with URI template, "", is treated as a resource method
  WARNING: A sub-resource method, public final javax.ws.rs.core.Response org.neo4j.graphql.ManagementResource.executeOperation(java.lang.String), with URI template, "", is treated as a resource method
  WARNING: A sub-resource method, public final javax.ws.rs.core.Response org.neo4j.graphql.ManagementResource.options(javax.ws.rs.core.HttpHeaders), with URI template, "", is treated as a resource method
  WARNING: A sub-resource method, public final javax.ws.rs.core.Response org.neo4j.graphql.ManagementResource.get(java.lang.String,java.lang.String), with URI template, "", is treated as a resource method
  WARNING: A sub-resource method, public final javax.ws.rs.core.Response org.neo4j.graphql.GraphQLResource.executeOperation(java.lang.String), with URI template, "", is treated as a resource method
  WARNING: A sub-resource method, public final javax.ws.rs.core.Response org.neo4j.graphql.GraphQLResource.options(javax.ws.rs.core.HttpHeaders), with URI template, "", is treated as a resource method
  WARNING: A sub-resource method, public final javax.ws.rs.core.Response org.neo4j.graphql.GraphQLResource.get(java.lang.String,java.lang.String), with URI template, "", is treated as a resource method
2019-08-13 15:43:40.821+0000 INFO  Remote interface available at http://neo4j-cluster-replica-1.neo4j-cluster-replica.default.svc.cluster.local:7474/

=============

Docker init log for the working replica

command failed: the provided initial password was not set because existing Neo4j users were detected at `/var/lib/neo4j/data/dbms/auth`. Please remove the existing `auth` and `roles` files if you want to reset your database to only have a default user with the provided password.
Active database: graph.db
Directories in use:
  home:         /var/lib/neo4j
  config:       /var/lib/neo4j/conf
  logs:         /logs
  plugins:      /plugins
  import:       /var/lib/neo4j/import
  data:         /var/lib/neo4j/data
  certificates: /var/lib/neo4j/certificates
  run:          /var/lib/neo4j/run
Starting Neo4j.
2019-08-13 15:51:16.848+0000 WARN  ha.host.data is deprecated.
2019-08-13 15:51:16.863+0000 INFO  ======== Neo4j 3.5.3 ========
2019-08-13 15:51:16.867+0000 INFO  Starting...
2019-08-13 15:51:17.829+0000 INFO  Initiating metrics...
2019-08-13 15:51:17.959+0000 INFO  Resolved initial host 'neo4j-cluster-core.default.svc.cluster.local:5000' to [10.9.8.132:5000, 10.9.237.10:5000, 10.9.80.109:5000]
2019-08-13 15:51:21.031+0000 INFO  Connected to neo4j-cluster-core-2.neo4j-cluster-core.default.svc.cluster.local/10.9.8.132:6000 [catchup version:1]
2019-08-13 15:51:27.717+0000 INFO  Connected to neo4j-cluster-core-1.neo4j-cluster-core.default.svc.cluster.local/10.9.237.10:6000 [catchup version:1]
2019-08-13 15:51:30.731+0000 INFO  Connected to neo4j-cluster-core-0.neo4j-cluster-core.default.svc.cluster.local/10.9.80.109:6000 [catchup version:1]
2019-08-13 15:51:32.549+0000 INFO  Sending metrics to CSV file at /var/lib/neo4j/metrics
2019-08-13 15:51:33.036+0000 INFO  Bolt enabled on 0.0.0.0:7687.
2019-08-13 15:51:34.696+0000 INFO  Started.
2019-08-13 15:51:34.862+0000 INFO  Mounted REST API at: /db/manage
2019-08-13 15:51:34.876+0000 INFO  Mounted unmanaged extension [org.neo4j.graphql] at [/graphql]
2019-08-13 15:51:34.929+0000 INFO  Server thread metrics have been registered successfully
2019-08-13 15:51:35.331+0000 WARN  The following warnings have been detected with resource and/or provider classes:
  WARNING: A sub-resource method, public final javax.ws.rs.core.Response org.neo4j.graphql.GraphQLResourceExperimental.options(javax.ws.rs.core.HttpHeaders), with URI template, "", is treated as a resource method
  WARNING: A sub-resource method, public final javax.ws.rs.core.Response org.neo4j.graphql.GraphQLResourceExperimental.executeOperation(java.lang.String), with URI template, "", is treated as a resource method
  WARNING: A sub-resource method, public final javax.ws.rs.core.Response org.neo4j.graphql.GraphQLResourceExperimental.get(java.lang.String,java.lang.String), with URI template, "", is treated as a resource method
  WARNING: A sub-resource method, public final javax.ws.rs.core.Response org.neo4j.graphql.ManagementResource.options(javax.ws.rs.core.HttpHeaders), with URI template, "", is treated as a resource method
  WARNING: A sub-resource method, public final javax.ws.rs.core.Response org.neo4j.graphql.ManagementResource.executeOperation(java.lang.String), with URI template, "", is treated as a resource method
  WARNING: A sub-resource method, public final javax.ws.rs.core.Response org.neo4j.graphql.ManagementResource.get(java.lang.String,java.lang.String), with URI template, "", is treated as a resource method
  WARNING: A sub-resource method, public final javax.ws.rs.core.Response org.neo4j.graphql.GraphQLResource.options(javax.ws.rs.core.HttpHeaders), with URI template, "", is treated as a resource method
  WARNING: A sub-resource method, public final javax.ws.rs.core.Response org.neo4j.graphql.GraphQLResource.executeOperation(java.lang.String), with URI template, "", is treated as a resource method
  WARNING: A sub-resource method, public final javax.ws.rs.core.Response org.neo4j.graphql.GraphQLResource.get(java.lang.String,java.lang.String), with URI template, "", is treated as a resource method
2019-08-13 15:51:35.706+0000 INFO  Remote interface available at http://neo4j-cluster-replica-0.neo4j-cluster-replica.default.svc.cluster.local:7474/

=========

Non-working replica 1 debug log
http://snippi.com/s/irfil7r

=======

Update

Running the Cypher query directly, bypassing the GraphQL plugin (https://github.com/neo4j-graphql/neo4j-graphql), returns results on the non-working replica and the query doesn't get stuck. A potential plugin issue?

1 REPLY

We temporarily resolved this by switching from the GraphQL plugin to GRANDstack (https://grandstack.io). We are still running some performance tests, but all our replicas are now responding to our GraphQL queries.