cancel
Showing results for 
Search instead for 
Did you mean: 

Head's Up! These forums are read-only. All users and content have migrated. Please join us at community.neo4j.com.

Is reduce() the right way to do Cumulative Sums in Cypher

gq16
Node Clone

If I wanted to perform a cumulative sum operation similar to that of:

int x = 0;

int i = 0;

for ( i = 0; i = 10, i++)

{

x = x +5

}

system.out.println(x);

where I would get a returned value of 55.

How would I perform this action in Neo4j using cypher if the value I am accumulating is a node property? Is the reduce() function the proper way to do this?

 

1 ACCEPTED SOLUTION

glilienfield
Ninja
Ninja

I think this is it. I show the result for the test data, where the data is the exact same for each month. You can see each years starts with the January total and the output is the same for each year, which is should since the sales are identical for the test data.

 

 

 

with [[date('2019-01-01'),10],[date('2019-01-01'),10],[date('2019-02-01'),10],[date('2019-02-01'),10],[date('2019-02-01'),10],[date('2019-03-01'),10],[date('2019-03-01'),10],[date('2019-03-01'),10],[date('2019-04-01'),10],[date('2019-05-01'),10],[date('2019-06-01'),10],[date('2019-06-01'),10],[date('2019-07-01'),10],[date('2019-07-01'),10],[date('2019-08-01'),10],[date('2019-08-01'),10],[date('2019-09-01'),10],[date('2019-10-01'),10],[date('2019-10-01'),10],[date('2019-11-01'),10],[date('2019-11-01'),10],[date('2019-11-01'),10],[date('2019-12-01'),10],
[date('2020-01-01'),10],[date('2020-01-01'),10],[date('2020-02-01'),10],[date('2020-02-01'),10],[date('2020-02-01'),10],[date('2020-03-01'),10],[date('2020-03-01'),10],[date('2020-03-01'),10],[date('2020-04-01'),10],[date('2020-05-01'),10],[date('2020-06-01'),10],[date('2020-06-01'),10],[date('2020-07-01'),10],[date('2020-07-01'),10],[date('2020-08-01'),10],[date('2020-08-01'),10],[date('2020-09-01'),10],[date('2020-10-01'),10],[date('2020-10-01'),10],[date('2020-11-01'),10],[date('2020-11-01'),10],[date('2020-11-01'),10],[date('2020-12-01'),10],
[date('2021-01-01'),10],[date('2021-01-01'),10],[date('2021-02-01'),10],[date('2021-02-01'),10],[date('2021-02-01'),10],[date('2021-03-01'),10],[date('2021-03-01'),10],[date('2021-03-01'),10],[date('2021-04-01'),10],[date('2021-05-01'),10],[date('2021-06-01'),10],[date('2021-06-01'),10],[date('2021-07-01'),10],[date('2021-07-01'),10],[date('2021-08-01'),10],[date('2021-08-01'),10],[date('2021-09-01'),10],[date('2021-10-01'),10],[date('2021-10-01'),10],[date('2021-11-01'),10],[date('2021-11-01'),10],[date('2021-11-01'),10],[date('2021-12-01'),10]] as monthlyStats
unwind monthlyStats as stat
WITH stat[0].month AS month, stat[0].year AS year, sum(stat[1]) AS quantity
ORDER BY date({year: year, month: month}) ASC
WITH year, collect({month: month, quantity: quantity}) as monthlyStats
UNWIND range(1, 12) as month
WITH year, month, reduce(s=0, x in [i in monthlyStats where i.month <= month | i.quantity] | s + x) as monthTotal
WITH month, collect({year: year, sum: monthTotal}) as monthlyTotals
RETURN month as Month, 
[i in monthlyTotals where i.year = 2019 | i.sum ][0] as `2019`, 
[i in monthlyTotals where i.year = 2020 | i.sum ][0] as `2020`,
[i in monthlyTotals where i.year = 2021 | i.sum ][0] as `2021`
ORDER BY Month

 

 

 

Screen Shot 2022-09-15 at 12.58.47 PM.png 

Screen Shot 2022-09-15 at 12.58.57 PM.png

Adapting it to your query, this should work:

 

 

 

MATCH (q:Quantity)<-[:HAS_QUANTITY]-(sh:Shipment)-[:ON_ORDER]->(s:Sales)-[:FOR_CONTRACTED]->(a:Account{name: $neodash_account_name})
WHERE EXISTS((s)-[:HAS_ENTRY]->(:OrderEntry)-[:HAS_LINE]->(:OrderLine)-[:HAS_PRODUCT_ORIGIN]->(:ProductOrigin)-[:HAS_PRODUCT]->(:Product)-[:HAS_PRODUCT_TYPE]->(:ProductType{name: $neodash_producttype_name}))
WITH sh.shippedDate.month AS month, sh.shippedDate.year AS year, sum(q.quantity) AS quantity
ORDER BY date({year: year, month: month}) ASC
WITH year, collect({month: month, quantity: quantity}) as monthlyStats
UNWIND range(1, 12) as month
WITH year, month, reduce(s=0, x in [i in monthlyStats where i.month <= month | i.quantity] | s + x) as monthTotal
WITH month, collect({year: year, sum: monthTotal}) as monthlyTotals
RETURN month as Month, 
[i in monthlyTotals where i.year = 2019 | i.sum ][0] as `2019`, 
[i in monthlyTotals where i.year = 2020 | i.sum ][0] as `2020`,
[i in monthlyTotals where i.year = 2021 | i.sum ][0] as `2021`
ORDER BY Month

 

 

BTW- the query does not limit itself to only the years 2019-2021. Only the output is limited to those years due to selecting only those years in lines 10-12. If you data goes back further in years, the query is calculating those years but not using the totals. You can add a predicate to the 'where' clause on line 2 to to restrict the years to only those needed for the report. 

You can also easily add or remove specific years for your report by adding/removing years on lines 10-12. For instance, to add 2022, just add a line for it as follows:

 

[i in monthlyTotals where i.year = 2019 | i.sum ][0] as `2019`, 
[i in monthlyTotals where i.year = 2020 | i.sum ][0] as `2020`,
[i in monthlyTotals where i.year = 2021 | i.sum ][0] as `2021`,
[i in monthlyTotals where i.year = 2022 | i.sum ][0] as `2022`

 

 

View solution in original post

20 REPLIES 20

glilienfield
Ninja
Ninja

The 'reduce' works over a list of elements, so if you already have a list that is a good method. If you don't, then you can use the 'sum' function to sum the value over each record (or a subset if you have a grouping). 

If you have a collection, you can also 'unwind' the collection into rows and use the 'sum' over the rows. The best approach depends on your scenario. 

I provide examples using your scenario. The test data is generated from the following:

 

unwind range(0, 10) as i
create(n:Test{value:i})

 

With 'sum' function over rows of data:

 

match(n:Test)
return sum(n.value)

 

 With 'reduce', after creating the list of values for the purpose of demonstration:

 

match(n:Test)
with collect(n.value) as values
return reduce(s = 0, i in values | s + i) as sum

 

Both return 55.  

@glilienfield I appreciate the response, the actual application of how I'm trying to use this is to create a shipment pace output that is cumulative (meaning the values add to the previous sum and the line chart should never go down) It doesn't seem to be working as expected for me though.

This is one of my attempts at the Cypher (lines 3 & 5 particularly):

MATCH path =(q:Quantity)<-[hq:HAS_QUANTITY]-(sh:Shipment)-[oo:ON_ORDER]->(s:Sales)-[fc:FOR_CONTRACTED]->(a:Account),(s)-[he:HAS_ENTRY]->(oe:OrderEntry)-[hl:HAS_LINE]->(ol:OrderLine)-[hpo:HAS_PRODUCT_ORIGIN]->(po:ProductOrigin)-[hp:HAS_PRODUCT]->(p:Product)-[hpt:HAS_PRODUCT_TYPE]->(pt:ProductType)
WHERE a.name = $neodash_account_name AND sh.shippedDate.year >= 2019 AND pt.name = $neodash_producttype_name
WITH collect(q.quantity) AS Quantities, sh.shippedDate.year AS Year, sh.shippedDate.month AS Month
ORDER BY Year ASC //reduce(total = 0, x IN Quantities | total + x.quantity)
WITH Month, collect([Year,reduce(total = 0, x IN Quantities | total + x.quantity)]) AS Quantity_Year_Pairs
RETURN Month,
	Quantity_Year_Pairs[0][1] AS `2019` ,
	Quantity_Year_Pairs[1][1] AS `2020` ,
    Quantity_Year_Pairs[2][1] AS `2021`
ORDER BY Month

Here is what I'm going for (note how all the lines only increase in value over time since it is a cumulative sum):

gq16_0-1662047827739.png

But I get the following error and I can't really find a workaround:

"Type mismatch: expected a map but was Double(1.099750e+02)"

I think your error is a result of the following expression on line 5:

total + x.quantity

'x' is an element of 'Quantities', which is the collection of 'q.quantities', which is a double value. The expression 'x.quantity' is expecting 'x' to be a map, as you are using dot notation to access a member of the map. Since 'x' is a double value, you get the error message.

I gathered from reading your explanation and the code, you are trying to compute the cumulative sum of the monthly quantity shipped. I just refactored the code to the following. Sorry, I don't have any test data to try it on. It runs without error, but that doesn't mean it works. 

MATCH (q:Quantity)<-[:HAS_QUANTITY]-(sh:Shipment)-[:ON_ORDER]->(s:Sales)-[:FOR_CONTRACTED]->(a:Account{name: $neodash_account_name})
WHERE  sh.shippedDate.year >= 2019
AND exists((s)-[:HAS_ENTRY]->(:OrderEntry)-[:HAS_LINE]->(:OrderLine)-[:HAS_PRODUCT_ORIGIN]->(:ProductOrigin)-[:HAS_PRODUCT]->(:Product)-[:HAS_PRODUCT_TYPE]->(:ProductType{name: $neodash_producttype_name}))
WITH sh.shippedDate.month AS Month, sh.shippedDate.year AS Year, sum(q.quantity) AS monthlyTotal
ORDER BY date({year: Year, month: Month}) ASC
WITH collect({year: Year, month: Month, quantity: monthlyTotal}) as monthlyStats
WITH monthlyStats, range(0, size(monthlyStats)-1) as indexes
UNWIND indexes as index
RETURN monthlyStats[index].month as Month, monthlyStats[index].year as Year, reduce(s=0, x in [i in indexes where i <= index | monthlyStats[i].quantity] | s + x) as cummulatedQuantity

Line 1 and 2 match on the account to get the quantity shipped, while restricting the shipping to the years greater and equal to 2019, as well as ensuring the each sales contains the specified product type. Question, it doesn't seem that the quantity being calculated is just for those products with the indicated type. Isn't the quantity the total quantity shipped for the entire sale?

Line 4 sums the quantity for each month and year, while line 5 orders the monthly totals by year and month. This is to order them for the accumulation calculation. 

Line 6 collects each month into a map, so we can calculate the cumulative sums on the list. 

Line 7 creates a list of integers representing each element in the list. This will be used to access each element by index.

Line 8 unwinds the indexes into a single integer (index) that represents how far we want to reduce the list of monthly stats by, i.e. one month, two months, etc. 

Line 9 returns the month and year, along with the cumulative sum of the monthly status up and including that month. 

As a note, this is not very efficient, as each cumulative sum starts from index 0 each time. I don't know of a way to it otherwise.

Let me know if this doesn't work and what the issue is. I can try to fix it. Or, if it is totally off, what I missed. Do you have any test data I can use?

 

Firstly, thank you for the robust response, but it isn't working as expected. Here is the result:

gq16_0-1662139118390.png

What seems to be missing is the collection of each line for each year, since we see the sum restarts back at month 1 (Jan) once it reaches the end of the year. I'll work on solving this as well and share the solution if/when I figure it out.

To answer your question: "Question, it doesn't seem that the quantity being calculated is just for those products with the indicated type. Isn't the quantity the total quantity shipped for the entire sale?"

The quantity being calculated in the tableau image I shared is from all products and all shipments regardless of party (for the sake of example), but what I am trying to make an output for just the indicated product for a specific party. 

It looks like the graph is by month number and does not consider the year. This seem problematic for data that spans multiple years. I would think the graph would start again at the origin when a year boundary is crossed, since the year component is being truncated. 

I don't think your query will give you that, since the product type is eventually related to a sale, while the quantity is related to a sale. I don't see a relationship between the product and how much quantity was included on the sale. The quantity in your query seems to represent the total quantity shipped for the sale. 

You're correct, It does represent the total quantity shipped for the sale. The reason that isn't really an issue is that in this context, every sale only contains a single product type. One sale cannot have multiple products.

@glilienfield @niels_dejong I've gotten really close to figuring it out! I've been able to get the lines to split accordingly.

It looks like there is just an issue now of the count not starting back at zero (for 2019 and 2021 lines, see green arrow in attached image) and that the sum goes down at random towards the end of the chart for the 2020 line (see red drawn line for expected behavior).

Here is my cypher, let me me know if you have any advice!

MATCH (q:Quantity)<-[:HAS_QUANTITY]-(sh:Shipment)-[:ON_ORDER]->(s:Sales)-[:FOR_CONTRACTED]->(a:Account{name: $neodash_account_name})
WHERE EXISTS((s)-[:HAS_ENTRY]->(:OrderEntry)-[:HAS_LINE]->(:OrderLine)-[:HAS_PRODUCT_ORIGIN]->(:ProductOrigin)-[:HAS_PRODUCT]->(:Product)-[:HAS_PRODUCT_TYPE]->(:ProductType{name: $neodash_producttype_name}))
WITH sh.shippedDate.month AS Month, sh.shippedDate.year AS Year, sum(q.quantity) AS monthlyTotal
ORDER BY date({year: Year, month: Month}) ASC
WITH collect({year: Year, month: Month, quantity: monthlyTotal}) as monthlyStats
WITH monthlyStats, range(0, size(monthlyStats)-1) as indexes
UNWIND indexes as index
WITH monthlyStats[index].month as Month, collect([monthlyStats[index].year, reduce(s=0, x in [i in indexes where i <= index | monthlyStats[i].quantity] | s + x)]) as cummulatedQuantity
RETURN Month,
	cummulatedQuantity[0][1] AS `2019` ,
	cummulatedQuantity[1][1] AS `2020` ,
    cummulatedQuantity[2][1] AS `2021`
ORDER BY Month

Here is the result I'm getting:
 

gq16_0-1662591157288.png

I appreciate any feedback!

GQ16

I see, you need the data by month so you can plot the cumulative shipped quantity on the y-axis. You are plotting on a chart by month that spans only one year 's of months. Do you want the yearly values to start at zero when the year begins, or should it start where it's last year ended? Did the red line drop to zero in the out years because there was not data for those years and Tableau platted it as zero? 

Correct. I want the annual sum to reset to zero at the end of the previous year, but I want the yearly values to start at whatever the total sales for month 1 (January) are, so they wouldn't technically start at zero. I hope that makes sense. Essentially, the point at January, should be the quantity of sales in the month of January, so I would expect it to start above zero. So in reality, the light green line for 2019, is correct (ignore the green arrow I had drawn pointing to zero) but I want the orange line for 2020 to NOT start where the previous year left off. I want this chart to compare the TREND in our sales. which is why I'd like to overlay them.


Also, in this data set there is no sales data after October 2020 which is why there shouldn't be any increase, but I would expect that the red line would continue horizontally until the year is over, but that isn't a big deal if we can't get that figured out.

I got it. My mistake, the origin is January, so it should represent January’s sales. I will look into it 

extending line into the future doesn’t seem right.  It’s not showing anything is not showing the trend.  The line should probably just stop at the last month. 

Give this a try:

MATCH (q:Quantity)<-[:HAS_QUANTITY]-(sh:Shipment)-[:ON_ORDER]->(s:Sales)-[:FOR_CONTRACTED]->(a:Account{name: $neodash_account_name})
WHERE EXISTS((s)-[:HAS_ENTRY]->(:OrderEntry)-[:HAS_LINE]->(:OrderLine)-[:HAS_PRODUCT_ORIGIN]->(:ProductOrigin)-[:HAS_PRODUCT]->(:Product)-[:HAS_PRODUCT_TYPE]->(:ProductType{name: $neodash_producttype_name}))
WITH sh.shippedDate.month AS month, sh.shippedDate.year AS year, sum(q.quantity) AS quantity
ORDER BY date({year: year, month: month}) ASC
WITH collect({year: year, month: month, quantity: quantity}) as monthlyStats
WITH monthlyStats, range(0, size(monthlyStats)-1) as indexes
UNWIND indexes as index
WITH monthlyStats[index].month as month, monthlyStats[index].year as year, reduce(s=0, i in [x in indexes where x <= index | monthlyStats[x].quantity] | s + i) as sum
WITH month, collect({year: year, sum: sum}) as monthlyTotals
RETURN month as Month, 
[i in monthlyTotals where i.year = '2019' | i.sum ][0] as `2019`, 
[i in monthlyTotals where i.year = '2020' | i.sum ][0] as `2020`,
[i in monthlyTotals where i.year = '2021' | i.sum ][0] as `2021`
ORDER BY Month

@glilienfield 

Seems the quantities are not there or are getting lost. This was the result I got:

gq16_0-1663077824680.png

 

I see the error in my ways.  The 'year' property of a date is returned as a number.  I was testing as a string.  Try this:

 

MATCH (q:Quantity)<-[:HAS_QUANTITY]-(sh:Shipment)-[:ON_ORDER]->(s:Sales)-[:FOR_CONTRACTED]->(a:Account{name: $neodash_account_name})
WHERE EXISTS((s)-[:HAS_ENTRY]->(:OrderEntry)-[:HAS_LINE]->(:OrderLine)-[:HAS_PRODUCT_ORIGIN]->(:ProductOrigin)-[:HAS_PRODUCT]->(:Product)-[:HAS_PRODUCT_TYPE]->(:ProductType{name: $neodash_producttype_name}))
WITH sh.shippedDate.month AS month, sh.shippedDate.year AS year, sum(q.quantity) AS quantity
ORDER BY date({year: year, month: month}) ASC
WITH collect({year: year, month: month, quantity: quantity}) as monthlyStats
WITH monthlyStats, range(0, size(monthlyStats)-1) as indexes
UNWIND indexes as index
WITH monthlyStats[index].month as month, monthlyStats[index].year as year, reduce(s=0, i in [x in indexes where x <= index | monthlyStats[x].quantity] | s + i) as sum
WITH month, collect({year: year, sum: sum}) as monthlyTotals
RETURN month as Month, 
[i in monthlyTotals where i.year = 2019 | i.sum ][0] as `2019`, 
[i in monthlyTotals where i.year = 2020 | i.sum ][0] as `2020`,
[i in monthlyTotals where i.year = 2021 | i.sum ][0] as `2021`
ORDER BY Month

 

Result looks pretty similar to the one I was able to get earlier, the sum is working, but it doesn't restart the sum for the next year. 🤔

gq16_0-1663091595286.png

 

What year is the green line that has not shipments?  Is that correct?  It looks like it is starting the new year were it left off, plus the January shipments.  Assume the green line is 2019, which ends with zero shipments.  I assume the blue line is year 2020, which does start above where 2019 left off.  Year 2022, which I assume is red, starts off above where the 2020 line ends. where the difference should be the January 2022 sales.  No?

Something is wrong with the blue line, as all lines should be non-decreasing.  Can I get the shipment counts by year and month?

MATCH (q:Quantity)<-[:HAS_QUANTITY]-(sh:Shipment)-[:ON_ORDER]->(s:Sales)-[:FOR_CONTRACTED]->(a:Account{name: $neodash_account_name})
WHERE EXISTS((s)-[:HAS_ENTRY]->(:OrderEntry)-[:HAS_LINE]->(:OrderLine)-[:HAS_PRODUCT_ORIGIN]->(:ProductOrigin)-[:HAS_PRODUCT]->(:Product)-[:HAS_PRODUCT_TYPE]->(:ProductType{name: $neodash_producttype_name}))
RETURN sh.shippedDate.month AS month, sh.shippedDate.year AS year, sum(q.quantity) AS quantity

I went ahead and replaced your query results with some fixed data over the three years to observe the computed results.  Each shipment has a quantity of 10 to keep it easy, but the number of shipments per month varies. The accumulated sum month-over-month is correct, with the years continuing as desired. 

 

with [[date('2019-01-01'),10],[date('2019-01-01'),10],[date('2019-02-01'),10],[date('2019-02-01'),10],[date('2019-02-01'),10],[date('2019-03-01'),10],[date('2019-03-01'),10],[date('2019-03-01'),10],[date('2019-04-01'),10],[date('2019-05-01'),10],[date('2019-06-01'),10],[date('2019-06-01'),10],[date('2019-07-01'),10],[date('2019-07-01'),10],[date('2019-08-01'),10],[date('2019-08-01'),10],[date('2019-09-01'),10],[date('2019-10-01'),10],[date('2019-10-01'),10],[date('2019-11-01'),10],[date('2019-11-01'),10],[date('2019-11-01'),10],[date('2019-12-01'),10],
[date('2020-01-01'),10],[date('2020-01-01'),10],[date('2020-02-01'),10],[date('2020-02-01'),10],[date('2020-02-01'),10],[date('2020-03-01'),10],[date('2020-03-01'),10],[date('2020-03-01'),10],[date('2020-04-01'),10],[date('2020-05-01'),10],[date('2020-06-01'),10],[date('2020-06-01'),10],[date('2020-07-01'),10],[date('2020-07-01'),10],[date('2020-08-01'),10],[date('2020-08-01'),10],[date('2020-09-01'),10],[date('2020-10-01'),10],[date('2020-10-01'),10],[date('2020-11-01'),10],[date('2020-11-01'),10],[date('2020-11-01'),10],[date('2020-12-01'),10],
[date('2021-01-01'),10],[date('2021-01-01'),10],[date('2021-02-01'),10],[date('2021-02-01'),10],[date('2021-02-01'),10],[date('2021-03-01'),10],[date('2021-03-01'),10],[date('2021-03-01'),10],[date('2021-04-01'),10],[date('2021-05-01'),10],[date('2021-06-01'),10],[date('2021-06-01'),10],[date('2021-07-01'),10],[date('2021-07-01'),10],[date('2021-08-01'),10],[date('2021-08-01'),10],[date('2021-09-01'),10],[date('2021-10-01'),10],[date('2021-10-01'),10],[date('2021-11-01'),10],[date('2021-11-01'),10],[date('2021-11-01'),10],[date('2021-12-01'),10]] as monthlyStats
unwind monthlyStats as stat
WITH stat[0].month AS month, stat[0].year AS year, sum(stat[1]) AS quantity
ORDER BY date({year: year, month: month}) ASC
WITH collect({year: year, month: month, quantity: quantity}) as monthlyStats
WITH monthlyStats, range(0, size(monthlyStats)-1) as indexes
UNWIND indexes as index
WITH monthlyStats[index].month as month, monthlyStats[index].year as year, reduce(s=0, i in [x in indexes where x <= index | monthlyStats[x].quantity] | s + i) as sum
WITH month, collect({year: year, sum: sum}) as monthlyTotals
RETURN month as Month, 
[i in monthlyTotals where i.year = 2019 | i.sum ][0] as `2019`, 
[i in monthlyTotals where i.year = 2020 | i.sum ][0] as `2020`,
[i in monthlyTotals where i.year = 2021 | i.sum ][0] as `2021`
ORDER BY Month

 

Screen Shot 2022-09-13 at 5.45.12 PM.png 

Screen Shot 2022-09-13 at 5.45.20 PM.png

This is great, but I DO want the lines to restart at the value for the January Sales for THAT year. So if 2019 sales ended at 230, and January 2020 sales were , I would want the 2020 line to start at the y value 40, not 270 (in this example). So instead of the 2020 line going from 270 to 460, I would want it to go from 40 to 230. This would make the lines overlap in a way such that I can compare the sales trends from one year to the next. That way I can achieve an output similar to that of the first tableau report I shared with you. Hopefully the image with the drawing below shows you what I mean in terms of where the 2020 and 2021 lines should be.
 gq16_1-1663257123011.png

I'm just stumped at how to create 3 different sums in a way that achieves the desired outcome but is also scalable.

Thank you so much for your time and devotion to this problem! @glilienfield 

 

I got it now.  You want three independent lines that sum just that year. Ok, I will fix it 

glilienfield
Ninja
Ninja

I think this is it. I show the result for the test data, where the data is the exact same for each month. You can see each years starts with the January total and the output is the same for each year, which is should since the sales are identical for the test data.

 

 

 

with [[date('2019-01-01'),10],[date('2019-01-01'),10],[date('2019-02-01'),10],[date('2019-02-01'),10],[date('2019-02-01'),10],[date('2019-03-01'),10],[date('2019-03-01'),10],[date('2019-03-01'),10],[date('2019-04-01'),10],[date('2019-05-01'),10],[date('2019-06-01'),10],[date('2019-06-01'),10],[date('2019-07-01'),10],[date('2019-07-01'),10],[date('2019-08-01'),10],[date('2019-08-01'),10],[date('2019-09-01'),10],[date('2019-10-01'),10],[date('2019-10-01'),10],[date('2019-11-01'),10],[date('2019-11-01'),10],[date('2019-11-01'),10],[date('2019-12-01'),10],
[date('2020-01-01'),10],[date('2020-01-01'),10],[date('2020-02-01'),10],[date('2020-02-01'),10],[date('2020-02-01'),10],[date('2020-03-01'),10],[date('2020-03-01'),10],[date('2020-03-01'),10],[date('2020-04-01'),10],[date('2020-05-01'),10],[date('2020-06-01'),10],[date('2020-06-01'),10],[date('2020-07-01'),10],[date('2020-07-01'),10],[date('2020-08-01'),10],[date('2020-08-01'),10],[date('2020-09-01'),10],[date('2020-10-01'),10],[date('2020-10-01'),10],[date('2020-11-01'),10],[date('2020-11-01'),10],[date('2020-11-01'),10],[date('2020-12-01'),10],
[date('2021-01-01'),10],[date('2021-01-01'),10],[date('2021-02-01'),10],[date('2021-02-01'),10],[date('2021-02-01'),10],[date('2021-03-01'),10],[date('2021-03-01'),10],[date('2021-03-01'),10],[date('2021-04-01'),10],[date('2021-05-01'),10],[date('2021-06-01'),10],[date('2021-06-01'),10],[date('2021-07-01'),10],[date('2021-07-01'),10],[date('2021-08-01'),10],[date('2021-08-01'),10],[date('2021-09-01'),10],[date('2021-10-01'),10],[date('2021-10-01'),10],[date('2021-11-01'),10],[date('2021-11-01'),10],[date('2021-11-01'),10],[date('2021-12-01'),10]] as monthlyStats
unwind monthlyStats as stat
WITH stat[0].month AS month, stat[0].year AS year, sum(stat[1]) AS quantity
ORDER BY date({year: year, month: month}) ASC
WITH year, collect({month: month, quantity: quantity}) as monthlyStats
UNWIND range(1, 12) as month
WITH year, month, reduce(s=0, x in [i in monthlyStats where i.month <= month | i.quantity] | s + x) as monthTotal
WITH month, collect({year: year, sum: monthTotal}) as monthlyTotals
RETURN month as Month, 
[i in monthlyTotals where i.year = 2019 | i.sum ][0] as `2019`, 
[i in monthlyTotals where i.year = 2020 | i.sum ][0] as `2020`,
[i in monthlyTotals where i.year = 2021 | i.sum ][0] as `2021`
ORDER BY Month

 

 

 

Screen Shot 2022-09-15 at 12.58.47 PM.png 

Screen Shot 2022-09-15 at 12.58.57 PM.png

Adapting it to your query, this should work:

 

 

 

MATCH (q:Quantity)<-[:HAS_QUANTITY]-(sh:Shipment)-[:ON_ORDER]->(s:Sales)-[:FOR_CONTRACTED]->(a:Account{name: $neodash_account_name})
WHERE EXISTS((s)-[:HAS_ENTRY]->(:OrderEntry)-[:HAS_LINE]->(:OrderLine)-[:HAS_PRODUCT_ORIGIN]->(:ProductOrigin)-[:HAS_PRODUCT]->(:Product)-[:HAS_PRODUCT_TYPE]->(:ProductType{name: $neodash_producttype_name}))
WITH sh.shippedDate.month AS month, sh.shippedDate.year AS year, sum(q.quantity) AS quantity
ORDER BY date({year: year, month: month}) ASC
WITH year, collect({month: month, quantity: quantity}) as monthlyStats
UNWIND range(1, 12) as month
WITH year, month, reduce(s=0, x in [i in monthlyStats where i.month <= month | i.quantity] | s + x) as monthTotal
WITH month, collect({year: year, sum: monthTotal}) as monthlyTotals
RETURN month as Month, 
[i in monthlyTotals where i.year = 2019 | i.sum ][0] as `2019`, 
[i in monthlyTotals where i.year = 2020 | i.sum ][0] as `2020`,
[i in monthlyTotals where i.year = 2021 | i.sum ][0] as `2021`
ORDER BY Month

 

 

BTW- the query does not limit itself to only the years 2019-2021. Only the output is limited to those years due to selecting only those years in lines 10-12. If you data goes back further in years, the query is calculating those years but not using the totals. You can add a predicate to the 'where' clause on line 2 to to restrict the years to only those needed for the report. 

You can also easily add or remove specific years for your report by adding/removing years on lines 10-12. For instance, to add 2022, just add a line for it as follows:

 

[i in monthlyTotals where i.year = 2019 | i.sum ][0] as `2019`, 
[i in monthlyTotals where i.year = 2020 | i.sum ][0] as `2020`,
[i in monthlyTotals where i.year = 2021 | i.sum ][0] as `2021`,
[i in monthlyTotals where i.year = 2022 | i.sum ][0] as `2022`

 

 

You @glilienfield , are a hero.

I will study this query to understand it's exact nuance and how you got everything working. I also really appreciate the comment on being able to add multiple years. That is awesome. Thank you for staying with me through this!