cancel
Showing results for 
Search instead for 
Did you mean: 

Head's Up! These forums are read-only. All users and content have migrated. Please join us at community.neo4j.com.

Table backfills data for empty values

gq16
Node Clone

Here is my problem: I want to look at varying sales each day of the month for April, May, and June, but I don't have any sales for some days in April. Why does Neo4j Backfill the values for May into April and June into May if there is no value for April? Can someone help!

Here is the table I'm expecting:

gq16_0-1667578149400.png

Here is the table I'm getting (It's just the first few days but the issue is clear):

Neo4j Incorrect Table Drawing.png

Here is my query:

MATCH (p:Party)-[:AGREED_TO]->(c:Contract)-[:HAS_LINE]->(cl:ContractLine)<-[:IS_ASSIGNED_TO]-(pr:Product),(cl)<-[:ACTUALIZED_ON]-(s:Shipment)-[:SHIPPED_ON]->(d:Date)
WHERE p.name = 'Winfield Solutions LLC' AND pr.type = 'UREA' AND d.id.year = 2022 AND d.id.month IN [4,5,6]
WITH sum(cl.quantity32e) AS Quantity32e,d.id.month AS Month, d.id.day AS Day
ORDER BY Month
WITH Day, collect([Month,Quantity32e]) AS Quantity_Month_Pairs
RETURN Day,
	Quantity_Month_Pairs[0][1] AS `April`,
	Quantity_Month_Pairs[1][1] AS `May` ,
    Quantity_Month_Pairs[2][1] AS `June`
ORDER BY Day

 

1 ACCEPTED SOLUTION

I take it back! It works, you just made an error in the coalesce statement saying that i[0] = 'April' rather than saying i[0] = 4 since the months are saved in the Quantity_Month_Pairs as numerical values, not the month names.

Here is the modified cypher:

MATCH (p:Party)-[:AGREED_TO]->(c:Contract)-[:HAS_LINE]->(cl:ContractLine)<-[:IS_ASSIGNED_TO]-(pr:Product),(cl)<-[:ACTUALIZED_ON]-(s:Shipment)-[:SHIPPED_ON]->(d:Date)
WHERE p.name = 'Winfield Solutions LLC' AND pr.type = 'UREA' AND d.id.year = 2022 AND d.id.month IN [4,5,6]
WITH sum(cl.quantity32e) AS Quantity32e,d.id.month AS Month, d.id.day AS Day
ORDER BY Month
WITH Day, collect([Month,Quantity32e]) AS Quantity_Month_Pairs
RETURN Day,
	COALESCE([i in Quantity_Month_Pairs where i[0] = 4| i[1]][0], 0) AS April,
	COALESCE([i in Quantity_Month_Pairs where i[0] = 5| i[1]][0], 0) AS May,
	COALESCE([i in Quantity_Month_Pairs where i[0] = 6 | i[1]][0], 0) AS June
ORDER BY Day

And here are the awesome results:

gq16_0-1667923820764.png

 

Amazing teamwork! Thank you @glilienfield & @ameyasoft  for your help!

View solution in original post

9 REPLIES 9

ameyasoft
Graph Maven
Try this:

MATCH (p:Party)-[:AGREED_TO]->(c:Contract)-[:HAS_LINE]->(cl:ContractLine)<-[:IS_ASSIGNED_TO]-(pr:Product),(cl)<-[:ACTUALIZED_ON]-(s:Shipment)-[:SHIPPED_ON]->(d:Date)
WHERE p.name = 'Winfield Solutions LLC' AND pr.type = 'UREA' AND d.id.year = 2022 AND d.id.month IN [4,5,6]
WITH d, COALESCE(cl.quantity32e, 0) as qty32e
WITH sum(qty32e) AS Quantity32e,d.id.month AS Month, d.id.day AS Day
ORDER BY Month
WITH Day, collect([Month,Quantity32e]) AS Quantity_Month_Pairs
RETURN Day,
	Quantity_Month_Pairs[0][1] AS `April`,
	Quantity_Month_Pairs[1][1] AS `May` ,
    Quantity_Month_Pairs[2][1] AS `June`
ORDER BY Day

 Seems to still be giving the same result. @ameyasoft 

gq16_0-1667594091280.png

 

 

Try this:

MATCH (p:Party)-[:AGREED_TO]->(c:Contract)-[:HAS_LINE]->(cl:ContractLine)<-[:IS_ASSIGNED_TO]-(pr:Product),(cl)<-[:ACTUALIZED_ON]-(s:Shipment)-[:SHIPPED_ON]->(d:Date)
WHERE p.name = 'Winfield Solutions LLC' AND pr.type = 'UREA' AND d.id.year = 2022 AND d.id.month IN [4,5,6]
WITH sum(cl.quantity32e) AS Quantity32e,d.id.month AS Month, d.id.day AS Day

WITH COALESCE(Quantity32e, 0) as Quantity32e, Month, Day ORDER BY Month

WITH Day, collect([Month,Quantity32e]) AS Quantity_Month_Pairs
RETURN Day,
	Quantity_Month_Pairs[0][1] AS `April`,
	Quantity_Month_Pairs[1][1] AS `May` ,
    Quantity_Month_Pairs[2][1] AS `June`
ORDER BY Day

 

Still the same.😭 

gq16_0-1667596859317.png

 

glilienfield
Ninja
Ninja

I believe the root cause is that each month does not have data for every day. As such, when you are accessing the first index in your two-dimensional array, it is not always true that index 0 represents 'April', index 1 represents 'May', and index 2 represents 'June'.  When the day is missing for a month, the later months shift down so the months that do have days always start at index 0. That explains the behavior you are seeing. 

Try this:

MATCH (p:Party)-[:AGREED_TO]->(c:Contract)-[:HAS_LINE]->(cl:ContractLine)<-[:IS_ASSIGNED_TO]-(pr:Product),(cl)<-[:ACTUALIZED_ON]-(s:Shipment)-[:SHIPPED_ON]->(d:Date)
WHERE p.name = 'Winfield Solutions LLC' AND pr.type = 'UREA' AND d.id.year = 2022 AND d.id.month IN [4,5,6]
WITH sum(cl.quantity32e) AS Quantity32e,d.id.month AS Month, d.id.day AS Day
ORDER BY Month
WITH Day, collect([Month,Quantity32e]) AS Quantity_Month_Pairs
RETURN Day,
	COALESCE([i in Quantity_Month_Pairs where i[0] = 'April' | i[1]][0], 0),
	COALESCE([i in Quantity_Month_Pairs where i[0] = 'May' | i[1]][0], 0),
	COALESCE([i in Quantity_Month_Pairs where i[0] = 'June' | i[1]][0], 0)
ORDER BY Day

If it does work, you should be able to remove the 'ORDER BY' on line 4. 

I take it back! It works, you just made an error in the coalesce statement saying that i[0] = 'April' rather than saying i[0] = 4 since the months are saved in the Quantity_Month_Pairs as numerical values, not the month names.

Here is the modified cypher:

MATCH (p:Party)-[:AGREED_TO]->(c:Contract)-[:HAS_LINE]->(cl:ContractLine)<-[:IS_ASSIGNED_TO]-(pr:Product),(cl)<-[:ACTUALIZED_ON]-(s:Shipment)-[:SHIPPED_ON]->(d:Date)
WHERE p.name = 'Winfield Solutions LLC' AND pr.type = 'UREA' AND d.id.year = 2022 AND d.id.month IN [4,5,6]
WITH sum(cl.quantity32e) AS Quantity32e,d.id.month AS Month, d.id.day AS Day
ORDER BY Month
WITH Day, collect([Month,Quantity32e]) AS Quantity_Month_Pairs
RETURN Day,
	COALESCE([i in Quantity_Month_Pairs where i[0] = 4| i[1]][0], 0) AS April,
	COALESCE([i in Quantity_Month_Pairs where i[0] = 5| i[1]][0], 0) AS May,
	COALESCE([i in Quantity_Month_Pairs where i[0] = 6 | i[1]][0], 0) AS June
ORDER BY Day

And here are the awesome results:

gq16_0-1667923820764.png

 

Amazing teamwork! Thank you @glilienfield & @ameyasoft  for your help!

Now it simply gives nothing but zeros. It seems to me that the inside of the coalesce statement is not finding any data, and therefore is coalescing all the values to 0.

gq16_0-1667921395912.png