Neo4j

grpnpraveen · ‎10-26-2021

Hi, I am trying to create relationships between nodes using Python (the neo4j library). The main job, creating the nodes for the data, is done, but creating the relationships between them is taking far longer than I expected (almost 12 hours for exactly 213867 relations).
My main concern is that no process uses more than 20% of my CPU, especially the Zulu platform process, which does the actual work of creating the relations. Is there a way to speed up this process or to increase my CPU usage?

Bennu · ‎10-26-2021

Hi @grpnpraveen !

Can you share your model and the queries you are using in order to do the import?

I assume you have already created some indexes to speed up the MATCH step.

Bennu

grpnpraveen · ‎10-26-2021

@Bennu how can I share the model with you here? By the way, it is 213867 relations, and it just completed.
Yes, I used some ids to match on.
Thanks in advance.

Bennu · ‎10-27-2021

Hi @grpnpraveen!

You can always draw it yourself on the Arrow website.

Maybe you can share the queries prefixed with EXPLAIN, so we can check whether the indexes are being used properly by your query.

Bennu

grpnpraveen · ‎10-27-2021

@Bennu thanks for the reply.
I don't think the nodes are that complex. First, let me explain what I am facing.
I first read a CSV in which each row holds the cast ids for one movie, e.g. "0991993|11688537|1190847"; these are the ids of the cast members that need to be connected to the movie.
This is the code.

Please do ask if anything is unclear.
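
For illustration only, the row format described above could be split into one parameter dict per cast member along these lines. This is a minimal sketch, not the poster's code: `parse_cast_ids` and `relation_rows` are hypothetical helper names, and converting the ids to integers is an assumption (it matches the later query, where 0958345 is written as 958345).

```python
def parse_cast_ids(cell, sep="|"):
    """Split a cell such as "0991993|11688537|1190847" into integer ids.

    Assumption: ids are stored as integers on the nodes, so leading
    zeros are dropped by int().
    """
    return [int(part) for part in cell.split(sep) if part.strip()]


def relation_rows(movie_id, cast_cell):
    """Build one {"actor_id", "movie_id"} dict per cast member of a movie."""
    return [
        {"actor_id": actor_id, "movie_id": movie_id}
        for actor_id in parse_cast_ids(cast_cell)
    ]
```

Each dict can then be passed as the parameters of a single relationship-creating query, or collected into a list and sent in batches.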

Bennu · ‎10-27-2021

Hi @grpnpraveen !

Can you share the result pipeline of this query?

EXPLAIN MATCH(p:actor),(m:movie)
WHERE p.id = 0991993 AND m.id = setMovieIdHere
return *

grpnpraveen · ‎10-27-2021

Sure @Bennu !
I used this. Here, instead of 0958345, I put 958345:

EXPLAIN MATCH(p:actor),(m:movie)
WHERE p.id = 958345 AND m.id = "tt0037711"
return *

I have created the actor relations for only 5000 movies so far! It took more than 12 hours. I still need to add them for 85000 movies.

Bennu · ‎10-27-2021

Hi @grpnpraveen !

Just as I thought. You have no indexes! If you trust us enough, try this:

CREATE INDEX ID_ACTOR for (n:actor) on (n.id);
CREATE INDEX ID_MOVIE for (n:movie) on (n.id);

Then change your query to:

MATCH (p:actor)
WHERE p.id = $actor_id
WITH p
MATCH (m:movie)
WHERE m.id = $movie_id
WITH p, m
// Backticks are needed because the relationship type contains a hyphen
CREATE (p)-[r:`acted-in`]->(m)
RETURN type(r)

Bennu
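
A minimal sketch of how the parameterized query above might be driven from Python. Batching with UNWIND is an addition that was not proposed in the thread: it sends many id pairs per round trip instead of one. `BATCH_QUERY`, `chunks`, `create_relations`, and the batch size of 1000 are hypothetical names and values; `session` is assumed to be a session object from the official neo4j Python driver, whose `run(query, **parameters)` returns a result with a `consume()` method.

```python
# Cypher that creates one relationship per row in $rows.
# The backticks are needed because the type name contains a hyphen.
BATCH_QUERY = """
UNWIND $rows AS row
MATCH (p:actor {id: row.actor_id})
MATCH (m:movie {id: row.movie_id})
CREATE (p)-[:`acted-in`]->(m)
"""


def chunks(items, size):
    """Yield consecutive slices of `items` with at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def create_relations(session, rows, batch_size=1000):
    """Send `rows` (dicts with actor_id and movie_id) to Neo4j in batches.

    Returns the number of rows sent, not the number of relationships
    created: rows whose nodes are not found produce no relationship.
    """
    sent = 0
    for batch in chunks(rows, batch_size):
        session.run(BATCH_QUERY, rows=batch).consume()
        sent += len(batch)
    return sent
```

With the real driver this would typically be called inside `with driver.session() as session:`. The indexes on `actor.id` and `movie.id` still matter here, since each row does two lookups by id.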

grpnpraveen · ‎10-27-2021

@Bennu right now I am creating the relations for directors and writers, which may take some time.

I am confident enough to add those. I will do it once the directors and writers have finished.
If you don't mind, can you explain what this INDEX is?

Bennu · ‎10-27-2021

Hi!

You should add indexes on your directors and writers as well.

Index in general:

Database index - Wikipedia.

Index in particular:

Bennu

PS: With this you can easily go from 12 hours to 12 minutes or much less.

grpnpraveen · ‎10-27-2021

Can I add this index even now? I ask because I have already added all the nodes.

Also,

CREATE INDEX ID_ACTOR for (n:actor) on (n.id);
CREATE INDEX ID_MOVIE for (n:movie) on (n.id);

What are ID_ACTOR and ID_MOVIE here?

Bennu · ‎10-27-2021

Sure. But I strongly suggest you retry the whole process once the indexes are online, in order to verify everything that has been said.
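
One way to confirm that the indexes are online before re-running the import is sketched below. It assumes Neo4j 4.2 or later, where `SHOW INDEXES` returns `name` and `state` columns (older 4.x versions used `CALL db.indexes()` instead). `offline_indexes` is a hypothetical helper name, and `session` is assumed to be a neo4j Python driver session.

```python
def offline_indexes(session):
    """Return the names of indexes whose state is not ONLINE.

    Assumption: SHOW INDEXES is available (Neo4j 4.2+) and each
    record exposes "name" and "state" fields.
    """
    return [
        record["name"]
        for record in session.run("SHOW INDEXES")
        if record["state"] != "ONLINE"
    ]
```

An empty list means every index has finished populating, so the MATCH lookups can use them.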

grpnpraveen · ‎10-27-2021

okay ! thanks a lot .

Bennu · ‎10-27-2021

Those are the names of the indexes.

Spend some time reading the articles shared. It may help a lot.

grpnpraveen · ‎10-27-2021

@Bennu OMG!! THANKS a million!
It just completed in 10 minutes for 5000 movies!

Bennu · ‎10-27-2021

Told ya

It's all about indexing

Less efficiency while performing create relation with neo4j from python