Neo4j

JB47394 · ‎08-24-2020

If I have a bunch of nodes identified by a MATCH, how do I get the relationships between those nodes and only those nodes? The Neo4j browser shows this to me when it shows the nodes matched by a query, but I've been unable to duplicate it.

I thought the following would work, but it gives me no results.

MATCH (rt:Something)
WITH rt,collect(rt) as rtc
MATCH (rt)-[r]-(rt2:Something)
WHERE rt2 in rtc
return r

I'm running an old project on 3.5.20.

tony_chiboucas · ‎08-24-2020

Do you have a sample of your graph? There's either more than one step between the nodes, or no relationships between them...

Try this first

MATCH p=(:Something)-[]-(:Something)
RETURN p

For longer paths:

https://neo4j.com/docs/cypher-manual/current/syntax/patterns/

MATCH p=(:Something)-[*]-(:Something)
RETURN p

JB47394 · ‎08-24-2020

Nope.

I can issue the first line MATCH in the browser to see the set of nodes that I'm after. There are 7 nodes matched, and they are fully and directly interconnected by one type of relationship - which the browser shows me. So if the nodes are 1 through 7, 1 has one relationship with each of 2 through 7, 2 has one relationship with 1 and 3 through 7, etc. I can see the exact structure that I expect, but I cannot come up with a query that will return the relationships between those 7 nodes (and only those nodes).

How would I go about providing a sample?

tony_chiboucas · ‎08-24-2020

Just get a screen capture of the visualization in the Browser...

or, even better, create it in the Neo4j console

JB47394 · ‎08-24-2020

Well, here are the seven nodes connected by relationships, for whatever value that provides. The nodes are returned by the initial MATCH. The browser fills in the relationships.

The node type is RecordTopology and the relationship is CONNECTED_TO.

tony_chiboucas · ‎08-24-2020

I think I'm missing a big piece of the equation here...

...you just did.

It looks like you're simplifying the problem to make it easier for us to help you (THANK YOU!). In this case, though, I think the complexity of your graph, and how to isolate those "7 nodes", is the real problem you need to solve.

Could you give me a little more context, and maybe some of the data on those nodes, and more nodes?

MATCH p=(:RecordTopology)-[]-(:RecordTopology)
RETURN p LIMIT 120

Could you share the table format of that result?

JB47394 · ‎08-24-2020

No, I have a query that returns the nodes. I don't have a means of referring to the relationships so that I can modify their attributes. That's the reason I'm here.

Could you elaborate on that? Why does the manner in which nodes are located have anything to do with obtaining additional information about them?

It's a bunch of RecordTopology nodes that are heavily interconnected by CONNECTED_TO relationships.

Requested Table.txt (80.3 KB)

tony_chiboucas · ‎08-24-2020

Okay, here's a simplified snapshot of your data: http://console.neo4j.org/?id=a3cnq0

Let's start by creating a collection of "matched" nodes.

MATCH (n:Topo) WITH n LIMIT 7
WITH collect(n) AS startingSet
RETURN startingSet

But we could just as easily specify by id, or name, or other property:
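
For instance, a minimal sketch that selects by id, reusing the ids that appear in the later queries of this thread (any other id list or property filter would work the same way):

MATCH (n:Topo)
WHERE id(n) IN [1,2,3,4,7,8,19]
// gather the matched nodes into one list, as in the LIMIT 7 version
WITH collect(n) AS startingSet
RETURN startingSet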

Getting it into a workable list of nodes

Unwinding that starting set is the same as not collecting them in the first place, but since it looks like you have a collection to work with, I'll show both methods:

MATCH (n:Topo) WITH n LIMIT 7 WITH collect(n) as startingSet 
UNWIND startingSet AS n RETURN n

...will give you the exact same data to work with as...

MATCH (n:Topo) WITH n LIMIT 7

Get the relationships

Hang on a minute... this sounds like what you're running into... let's take a closer look at the graph...

... ah-ha... Topos 0 through 6 don't link to each other at all, but most of them link to 7 and 8, so let's use a more specific subset:

MATCH (n:Topo) 
WHERE id(n) IN [1,2,3,4,7,8,19]
MATCH (n)-[r:TO]-()
RETURN r

That looks more like it, and now you can mutate those relationships to your heart's content.

tony_chiboucas · ‎08-24-2020

Wait a minute... I think I see what you're getting at...

tony_chiboucas · ‎08-24-2020

I think you can ignore most of my previous response... that was my working out. Maybe it'll help?

Here's a thing that will do the thing you want.
http://console.neo4j.org/r/dlpbhx

MATCH (n:Topo) WHERE id(n) IN [1,2,3,4,7,8,19]
WITH n
MATCH (n2:Topo) WHERE id(n2) IN [1,2,3,4,7,8,19]
MATCH (n)-[r:TO]-(n2)
RETURN r
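
Since the stated goal was to modify attributes on those relationships, a rough sketch of doing that with the same pattern might look like the following. The property name reviewed is purely hypothetical; substitute whatever attribute needs changing.

MATCH (n:Topo) WHERE id(n) IN [1,2,3,4,7,8,19]
MATCH (n2:Topo) WHERE id(n2) IN [1,2,3,4,7,8,19]
MATCH (n)-[r:TO]-(n2)
// "reviewed" is a made-up property used only for illustration
SET r.reviewed = true
RETURN count(DISTINCT r)

The undirected pattern reaches each relationship once from each end, so the SET runs twice per relationship; that is harmless here because it assigns the same value both times.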

JB47394 · ‎08-24-2020

Thank you a ton for going through the work to figure this out. It's much appreciated.

This is what I ended up going with:

MATCH <something that produces an n>
WITH collect(id(n)) AS c
MATCH (n:Topo) WHERE id(n) IN c
MATCH (n2:Topo) WHERE id(n2) IN c
MATCH (n)-[r:TO]-(n2)
RETURN r

I'm a little disappointed that Cypher doesn't have a more natural way of referring to an element of a pattern match (n) twice. But I probably just don't understand how to Cypher well enough yet.

Thanks again.
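
A note on the very first query in this thread, which may explain why it returned nothing: in WITH rt, collect(rt) AS rtc, the rt column acts as the grouping key, so every row gets a one-element list containing only its own rt. The rt2 IN rtc test can then only pass when rt2 is rt itself. A sketch that collects first and unwinds afterwards, keeping the Something label from that query:

MATCH (rt:Something)
// no other column in this WITH, so all nodes land in a single list
WITH collect(rt) AS rtc
// turn the list back into rows while keeping the full list available
UNWIND rtc AS rt
MATCH (rt)-[r]-(rt2:Something)
WHERE rt2 IN rtc
RETURN DISTINCT r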

anthapu · ‎08-24-2020

You could also try

MATCH (n)-[r:TO]-(n2)
WHERE id(n) in c and id(n2) in c
RETURN r
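
Spelled out in full, with the step that builds c included, this might look like the sketch below. It assumes the same id list used earlier in the thread as the source of the nodes, and adds DISTINCT because an undirected pattern matches each relationship once per direction.

MATCH (seed:Topo) WHERE id(seed) IN [1,2,3,4,7,8,19]
WITH collect(id(seed)) AS c
MATCH (n)-[r:TO]-(n2)
WHERE id(n) IN c AND id(n2) IN c
// each relationship is found from both ends, so deduplicate
RETURN DISTINCT r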

JB47394 · ‎08-24-2020

Yes, that's a much cleaner notation and it works just fine. Thank you. The resulting performance of the query is unchanged.

I noted that when I PROFILE each style of query (mine with node labels and your queries without them),

MATCH (n)-[r:TO]-(n2)

will generate 344 dbhits while

MATCH (n:Topo)-[r:TO]-(n2:Topo)

will generate 408 dbhits.

That sure looks like the unlabeled query provides better performance - which I find somewhat counterintuitive. I would have thought that providing more details about what I'm after would always be better.
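
For anyone wanting to reproduce the comparison, a sketch of the two profiled statements follows. It assumes the literal id list from earlier in the thread stands in for c; the db hit counts will differ on other data.

// run each statement separately in the browser

PROFILE
MATCH (n)-[r:TO]-(n2)
WHERE id(n) IN [1,2,3,4,7,8,19] AND id(n2) IN [1,2,3,4,7,8,19]
RETURN r

PROFILE
MATCH (n:Topo)-[r:TO]-(n2:Topo)
WHERE id(n) IN [1,2,3,4,7,8,19] AND id(n2) IN [1,2,3,4,7,8,19]
RETURN r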

anthapu · ‎08-25-2020

That's because when you use node IDs, you get the node directly. When you add a label to the query, the query engine has to do one extra check to see whether the node returned by ID has the label you specified. That's why you see more db hits.

For index lookups, having a label is mandatory. In traversals, if you know the relationship traversal identifies the node distinctly, leaving the label off makes the query faster.

The same goes for retrieving nodes by node ID.

tony_chiboucas · ‎08-25-2020

Also, if you have properties on those nodes that you are using to isolate the specific nodes you're after, then you should add them to an index and use those properties in your WHERE clause. That should speed things up a bit.
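
As a rough sketch of that suggestion on 3.5.x, assuming the nodes carry a property called name (a hypothetical property; the thread never says which properties exist) and using the RecordTopology and CONNECTED_TO names mentioned earlier:

// 3.5-style index creation; run this as its own statement
CREATE INDEX ON :RecordTopology(name)

// then select the nodes by the indexed property instead of by id
// ('a', 'b', 'c' are placeholder values)
MATCH (seed:RecordTopology)
WHERE seed.name IN ['a', 'b', 'c']
WITH collect(seed) AS nodes
MATCH (n)-[r:CONNECTED_TO]-(n2)
WHERE n IN nodes AND n2 IN nodes
RETURN DISTINCT r

Newer Neo4j versions use a different index syntax (CREATE INDEX FOR (n:RecordTopology) ON (n.name)), so the first statement would need adjusting after an upgrade.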