Neo4j

ask_singh90 · ‎03-02-2022

Hi, I am a newbie to Neo4j and stuck with the following problem.
My dataset is as below:
Object, Size, Color
A,200,White
A,300,Black
A,200,Pink
B,300,White
B,300,Black

The expected output is two nodes, A and B, each with two properties, Size and Color, but duplicate sizes should not be displayed. For example, node A has size 200 in both White and Pink, but the Size property should show 200 only once, i.e. (200, 300), and the Color property should show (White, Black, Pink). Can somebody please help me with the Cypher code?

bennu_neo · ‎03-02-2022

Hi @ask.singh90 ,

How have you modelled this problem? Or is that actually the question?

Are you planning on creating an instance of each object, with a property/tag that gives you the type, while each instance has a specific size/color combination? Or dense Object nodes with HAS_SIZE and HAS_COLOR relationships to other dense nodes?

Bennu

Oh, y’all wanted a twist, ey?

ask_singh90 · ‎03-02-2022

I want to show just the object nodes, with multiple values for a property, like A with Size property (200, 300) and Color property (White, Black, Pink).

bennu_neo · ‎03-02-2022

Hi @ask.singh90 ,

Based on data like:

CREATE (o:OBJECT {type : 'A', size : 200, color : 'White'});
CREATE (o:OBJECT {type : 'A', size : 300, color : 'Black'});
CREATE (o:OBJECT {type : 'A', size : 200, color : 'Pink'});
CREATE (o:OBJECT {type : 'B', size : 300, color : 'White'});
CREATE (o:OBJECT {type : 'B', size : 300, color : 'Black'});

You can get the expected result with:

MATCH (o:OBJECT)
WITH o.type AS type, collect(DISTINCT o.size) AS sizes, collect(DISTINCT o.color) AS colors
RETURN type, sizes, colors

Note that this may be misleading, because the two lists suggest combinations that don't exist in the data: for example, there is no object A with size 300 and color Pink.
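
If the actual size/color pairs matter, one option is to collect them together instead of as two independent lists. This is only a sketch against the same OBJECT nodes created above; the alias `combinations` and the map keys are names picked for illustration:

```cypher
// Collect each distinct (size, color) pair per type as a map,
// so only combinations that really exist in the data are returned.
MATCH (o:OBJECT)
RETURN o.type AS type,
       collect(DISTINCT {size: o.size, color: o.color}) AS combinations
```

This works as a query result, but a list of maps cannot be stored as a node property, so it does not replace the two separate lists if the goal is to save the values on the node.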

Bennu

ask_singh90 · ‎03-02-2022

Hi Bennu, thanks for your reply, but I want to make the query generic. I have a spreadsheet with thousands of rows like this. I wrote the following code, but the problem is that it keeps appending each new value regardless of whether it is already present, so A gets size property (200, 300, 200) instead of the expected (200, 300):
LOAD CSV WITH HEADERS FROM "file:///myfile.csv" AS nodeRecord
MERGE (n:Object {object: nodeRecord.object})
ON CREATE SET n.size = [nodeRecord.size]
ON CREATE SET n.color = [nodeRecord.color]
ON MATCH SET n.size = n.size + [nodeRecord.size]
ON MATCH SET n.color = n.color + [nodeRecord.color]

bennu_neo · ‎03-02-2022

Hi @ask.singh90 ,

Are you planning to use APOC in your application?

This may help you

Bennu

ask_singh90 · ‎03-02-2022

Hi Bennu, I am sorry, I don't know much about using APOC. I am just doing a small PoC to learn more about Neo4j and Cypher, and my data set has the values I mentioned in my original post.

glilienfield · ‎03-02-2022

I don't see anything wrong with your query that would cause duplicate A and B nodes. I ran it locally and got a single A node and a single B node, each with the aggregated data.

As a note, the query creates size and color lists that contain duplicate values because it appends every new value to each list without checking whether it is already there.

You could use a query like the following to avoid this. The query processes the entire import file first to calculate the aggregates of distinct values, then creates the nodes. Note that this requires more memory for huge files, since all rows are aggregated before any node is written.

LOAD CSV WITH HEADERS FROM "file:///myfile.csv" AS nodeRecord
WITH nodeRecord.object AS object, collect(DISTINCT nodeRecord.size) AS size, collect(DISTINCT nodeRecord.color) AS color
MERGE (n:Object {object: object})
SET n.size = size, n.color = color
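
If the file is too large to aggregate in one pass, a row-by-row variant of the original MERGE can skip values that are already in the list. This is only a sketch, assuming the CSV headers are object, size and color as in the queries above:

```cypher
// Row-by-row variant: append a value only when the list
// does not already contain it.
LOAD CSV WITH HEADERS FROM "file:///myfile.csv" AS nodeRecord
MERGE (n:Object {object: nodeRecord.object})
ON CREATE SET
  n.size  = [nodeRecord.size],
  n.color = [nodeRecord.color]
ON MATCH SET
  n.size  = CASE WHEN nodeRecord.size IN n.size
                 THEN n.size
                 ELSE n.size + [nodeRecord.size] END,
  n.color = CASE WHEN nodeRecord.color IN n.color
                 THEN n.color
                 ELSE n.color + [nodeRecord.color] END
```

LOAD CSV reads every field as a string, so the lists hold '200' rather than 200; wrapping the value in toInteger() is one option if numeric sizes are wanted. If APOC is installed, apoc.coll.toSet(n.size + [nodeRecord.size]) should give the same de-duplication without the CASE expression.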

ask_singh90 · ‎03-02-2022

Thanks @glilienfield, it works well.

Multiple Values for a property