cancel
Showing results for 
Search instead for 
Did you mean: 

Head's Up! These forums are read-only. All users and content have migrated. Please join us at community.neo4j.com.

Count and nodes for each day of week and each hour

chrszrkl
Node Clone

I'd like to query and count orders in my database and return a row for each day of the week (Mon-Sun) and 24 columns for each hour of the day (0:00 - 23:00). The return should be like this:

DayOfWeek	| 0:00 |	 1:00 |	 2:00 |	 3:00 |	 4:00 |	 5:00 |	 ... |	 22:00 |	 23:00 | 
Mon						0			0			 1			2				0			1			...			 0	 		5	
Tue						1			0			 0			0				0			2			...			 2	 		2
Wed						0			0			 1			3				5			0			...			 4	 		3
Thu						4			0			 1			0				0			1			...			 3	 		9
Fri						0			0			 0			2				2			6			...			10	 		6
Sat						2			0			 0			6				0			1			...			12	 		9
Sun						1			0			 1			2				1			4			...			 0	 		2

My cypher currently looks like this:

MATCH (o:Order)
WITH datetime(o.createdAt) as dt, o 
WITH ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"][dt.dayOfWeek] as WeekDay, dt.hour as Hour, o
RETURN WeekDay, Hour, count(o) as Orders

Which returns the number of orders for a single day and hour.

I need to transform this query that I have a fixed result of 7 rows (for each day) and add the results for the count of order into the respective columns 0-23. Furthermore, I'd need to fill the rows with zeros if there are no order for some hours.

Any ideas how I could achieve this without implementing 7*24 subquerys for each hour of the day?

1 ACCEPTED SOLUTION

It depends on how much data you have, but you could consider using list comprehensions to do your filtering. Something like this: (You will have to adjust)

UNWIND [{ "name": "Mon", idx: 0 }, { name: "Tue", idx: 1 } /* ... */ ] AS WeekDay
MATCH (o:Order)
WHERE datetime(o.createdAt).dayOfWeek = WeekDay.idx
WITH WeekDay, collect(datetime(o.created_at)) as OrderTimestamps
RETURN 
     WeekDay.name as DayName,
    size([ts IN OrderTimestamps WHERE ts.hour = 0]) as zeroHour,
    size([ts IN OrderTimestamps WHERE ts.hour = 1]) as oneHour,
 /* ...  */

The way this query works (will need some adjustment) is that it starts with the day names, and collects up all orders that happened on that day into a list of timestamps (OrderTimestamps)

Then, to output how many are in zero hour, 1st hour, etc -- it uses a list comprehension to repeatedly filter the list of timestamps to only a list of those in the zero hour. It then counts the size of that list with size(), so you get counts per-hour, per-day.

View solution in original post

3 REPLIES 3

It depends on how much data you have, but you could consider using list comprehensions to do your filtering. Something like this: (You will have to adjust)

UNWIND [{ "name": "Mon", idx: 0 }, { name: "Tue", idx: 1 } /* ... */ ] AS WeekDay
MATCH (o:Order)
WHERE datetime(o.createdAt).dayOfWeek = WeekDay.idx
WITH WeekDay, collect(datetime(o.created_at)) as OrderTimestamps
RETURN 
     WeekDay.name as DayName,
    size([ts IN OrderTimestamps WHERE ts.hour = 0]) as zeroHour,
    size([ts IN OrderTimestamps WHERE ts.hour = 1]) as oneHour,
 /* ...  */

The way this query works (will need some adjustment) is that it starts with the day names, and collects up all orders that happened on that day into a list of timestamps (OrderTimestamps)

Then, to output how many are in zero hour, 1st hour, etc -- it uses a list comprehension to repeatedly filter the list of timestamps to only a list of those in the zero hour. It then counts the size of that list with size(), so you get counts per-hour, per-day.

chrszrkl
Node Clone

Thank you, that helped me quite a lot!

I continued with your example:

WITH [{idx: 1, name: 'Mon'}, {idx: 2, name: 'Tue'}, {idx: 3, name: 'Wed'}, {idx: 4, name: 'Thu'}, {idx: 5, name: 'Fri'}, {idx: 6, name: 'Sat'}, {idx: 7, name: 'Sun'} ] AS WeekDays
UNWIND WeekDays as WeekDay
MATCH (o:Order)
WHERE datetime(o.createdAt).dayOfWeek = WeekDay.idx
WITH WeekDay, collect(datetime(o.createdAt)) as OrderTimestamps
RETURN 
	WeekDay.name as DayName,
	size([ts IN OrderTimestamps WHERE ts.hour = 0]) as h00,
	size([ts IN OrderTimestamps WHERE ts.hour = 1]) as h01,
	size([ts IN OrderTimestamps WHERE ts.hour = 2]) as h02,
	size([ts IN OrderTimestamps WHERE ts.hour = 3]) as h03,
	size([ts IN OrderTimestamps WHERE ts.hour = 4]) as h04,
	size([ts IN OrderTimestamps WHERE ts.hour = 5]) as h05,
	size([ts IN OrderTimestamps WHERE ts.hour = 6]) as h06,
	size([ts IN OrderTimestamps WHERE ts.hour = 7]) as h07,
	size([ts IN OrderTimestamps WHERE ts.hour = 8]) as h08,
	size([ts IN OrderTimestamps WHERE ts.hour = 9]) as h09,
	size([ts IN OrderTimestamps WHERE ts.hour = 10]) as h10,
	size([ts IN OrderTimestamps WHERE ts.hour = 11]) as h11,
	size([ts IN OrderTimestamps WHERE ts.hour = 12]) as h12,
	size([ts IN OrderTimestamps WHERE ts.hour = 13]) as h13,
	size([ts IN OrderTimestamps WHERE ts.hour = 14]) as h14,
	size([ts IN OrderTimestamps WHERE ts.hour = 15]) as h15,
	size([ts IN OrderTimestamps WHERE ts.hour = 16]) as h16,
	size([ts IN OrderTimestamps WHERE ts.hour = 17]) as h17,
	size([ts IN OrderTimestamps WHERE ts.hour = 18]) as h18,
	size([ts IN OrderTimestamps WHERE ts.hour = 19]) as h19,
	size([ts IN OrderTimestamps WHERE ts.hour = 20]) as h20,
	size([ts IN OrderTimestamps WHERE ts.hour = 21]) as h21,
	size([ts IN OrderTimestamps WHERE ts.hour = 22]) as h22,
	size([ts IN OrderTimestamps WHERE ts.hour = 23]) as h23

Instead of executing 7 MATCH queries during UNWIND, I changed it to 1 MATCH query for all orders and do the filtering of DayOfWeek and Hour in the list comprehension on the timestamps. I'm not sure it is actually faster, I would just guess so?

MATCH (o:Order)
WITH collect(datetime(o.createdAt)) as OrderTimestamps, [{idx: 1, name: 'Mon'}, {idx: 2, name: 'Tue'}, {idx: 3, name: 'Wed'}, {idx: 4, name: 'Thu'}, {idx: 5, name: 'Fri'}, {idx: 6, name: 'Sat'}, {idx: 7, name: 'Sun'} ] AS WeekDays
UNWIND WeekDays as WeekDay
RETURN 
	WeekDay.name as DayName,
	size([ts IN OrderTimestamps WHERE ts.dayOfWeek = WeekDay.idx and ts.hour = 0]) as h00,
	size([ts IN OrderTimestamps WHERE ts.dayOfWeek = WeekDay.idx and ts.hour = 1]) as h01,
	size([ts IN OrderTimestamps WHERE ts.dayOfWeek = WeekDay.idx and ts.hour = 2]) as h02,
	size([ts IN OrderTimestamps WHERE ts.dayOfWeek = WeekDay.idx and  ts.hour = 3]) as h03,
	size([ts IN OrderTimestamps WHERE ts.dayOfWeek = WeekDay.idx and ts.hour = 4]) as h04,
	size([ts IN OrderTimestamps WHERE ts.dayOfWeek = WeekDay.idx and ts.hour = 5]) as h05,
	size([ts IN OrderTimestamps WHERE ts.dayOfWeek = WeekDay.idx and  ts.hour = 6]) as h06,
	size([ts IN OrderTimestamps WHERE ts.dayOfWeek = WeekDay.idx and  ts.hour = 7]) as h07,
	size([ts IN OrderTimestamps WHERE ts.dayOfWeek = WeekDay.idx and  ts.hour = 8]) as h08,
	size([ts IN OrderTimestamps WHERE ts.dayOfWeek = WeekDay.idx and  ts.hour = 9]) as h09,
	size([ts IN OrderTimestamps WHERE ts.dayOfWeek = WeekDay.idx and  ts.hour = 10]) as h10,
	size([ts IN OrderTimestamps WHERE ts.dayOfWeek = WeekDay.idx and  ts.hour = 11]) as h11,
	size([ts IN OrderTimestamps WHERE ts.dayOfWeek = WeekDay.idx and  ts.hour = 12]) as h12,
	size([ts IN OrderTimestamps WHERE ts.dayOfWeek = WeekDay.idx and  ts.hour = 13]) as h13,
	size([ts IN OrderTimestamps WHERE ts.dayOfWeek = WeekDay.idx and  ts.hour = 14]) as h14,
	size([ts IN OrderTimestamps WHERE ts.dayOfWeek = WeekDay.idx and  ts.hour = 15]) as h15,
	size([ts IN OrderTimestamps WHERE ts.dayOfWeek = WeekDay.idx and  ts.hour = 16]) as h16,
	size([ts IN OrderTimestamps WHERE ts.dayOfWeek = WeekDay.idx and  ts.hour = 17]) as h17,
	size([ts IN OrderTimestamps WHERE ts.dayOfWeek = WeekDay.idx and  ts.hour = 18]) as h18,
	size([ts IN OrderTimestamps WHERE ts.dayOfWeek = WeekDay.idx and  ts.hour = 19]) as h19,
	size([ts IN OrderTimestamps WHERE ts.dayOfWeek = WeekDay.idx and  ts.hour = 20]) as h20,
	size([ts IN OrderTimestamps WHERE ts.dayOfWeek = WeekDay.idx and  ts.hour = 21]) as h21,
	size([ts IN OrderTimestamps WHERE ts.dayOfWeek = WeekDay.idx and  ts.hour = 22]) as h22,
	size([ts IN OrderTimestamps WHERE ts.dayOfWeek = WeekDay.idx and  ts.hour = 23]) as h23

Is there any way to further improve this cypher query?

chrszrkl
Node Clone

I further improved my Cypher query with a variable StartDate and calculated EndDate, which will be calculated by adding X days/weeks/months etc. onto the StartDate. Moreover, instead of providing a static list of WeekDays like in my example above, I now generate a list of days starting from the StartDate and iteratively add 0, 1, 2, ..., 6 days. So my week does not necessarily start at Monday, but at the given StartDate and ends 6 days later (the EndDate is irrelevant here).

I found this post as a very helpful: Creating date nodes by looping through a start date and an end date - #3 by tony.chiboucas

WITH datetime("2021-03-01") AS StartDate
WITH StartDate, StartDate + duration({days: 7}) AS EndDate // Add 7 days because we use < symbol in the WHERE clause
MATCH (o:Order)
WHERE datetime(o.createdAt) >= StartDate AND datetime(o.createdAt) < EndDate //IMPORTANT: use less than (<) EndDate 
WITH StartDate, EndDate, [i IN range(0, 6) | StartDate+duration({days: i})] as WeekDays, collect(datetime(o.createdAt)) as OrderTimestamps
UNWIND WeekDays AS WeekDay
RETURN 
	apoc.date.format(WeekDay.epochMillis, 'ms', 'E') as WeekDay,
	size([ts IN OrderTimestamps WHERE ts.dayOfWeek = WeekDay.dayOfWeek and ts.hour = 0]) as h00,
	size([ts IN OrderTimestamps WHERE ts.dayOfWeek = WeekDay.dayOfWeek and ts.hour = 1]) as h01,
	size([ts IN OrderTimestamps WHERE ts.dayOfWeek = WeekDay.dayOfWeek and ts.hour = 2]) as h02,
	size([ts IN OrderTimestamps WHERE ts.dayOfWeek = WeekDay.dayOfWeek and ts.hour = 3]) as h03,
	size([ts IN OrderTimestamps WHERE ts.dayOfWeek = WeekDay.dayOfWeek and ts.hour = 4]) as h04,
	size([ts IN OrderTimestamps WHERE ts.dayOfWeek = WeekDay.dayOfWeek and ts.hour = 5]) as h05,
	size([ts IN OrderTimestamps WHERE ts.dayOfWeek = WeekDay.dayOfWeek and ts.hour = 6]) as h06,
	size([ts IN OrderTimestamps WHERE ts.dayOfWeek = WeekDay.dayOfWeek and ts.hour = 7]) as h07,
	size([ts IN OrderTimestamps WHERE ts.dayOfWeek = WeekDay.dayOfWeek and ts.hour = 8]) as h08,
	size([ts IN OrderTimestamps WHERE ts.dayOfWeek = WeekDay.dayOfWeek and ts.hour = 9]) as h09,
	size([ts IN OrderTimestamps WHERE ts.dayOfWeek = WeekDay.dayOfWeek and ts.hour = 10]) as h10,
	size([ts IN OrderTimestamps WHERE ts.dayOfWeek = WeekDay.dayOfWeek and ts.hour = 11]) as h11,
	size([ts IN OrderTimestamps WHERE ts.dayOfWeek = WeekDay.dayOfWeek and ts.hour = 12]) as h12,
	size([ts IN OrderTimestamps WHERE ts.dayOfWeek = WeekDay.dayOfWeek and ts.hour = 13]) as h13,
	size([ts IN OrderTimestamps WHERE ts.dayOfWeek = WeekDay.dayOfWeek and ts.hour = 14]) as h14,
	size([ts IN OrderTimestamps WHERE ts.dayOfWeek = WeekDay.dayOfWeek and ts.hour = 15]) as h15,
	size([ts IN OrderTimestamps WHERE ts.dayOfWeek = WeekDay.dayOfWeek and ts.hour = 16]) as h16,
	size([ts IN OrderTimestamps WHERE ts.dayOfWeek = WeekDay.dayOfWeek and ts.hour = 17]) as h17,
	size([ts IN OrderTimestamps WHERE ts.dayOfWeek = WeekDay.dayOfWeek and ts.hour = 18]) as h18,
	size([ts IN OrderTimestamps WHERE ts.dayOfWeek = WeekDay.dayOfWeek and ts.hour = 19]) as h19,
	size([ts IN OrderTimestamps WHERE ts.dayOfWeek = WeekDay.dayOfWeek and ts.hour = 20]) as h20,
	size([ts IN OrderTimestamps WHERE ts.dayOfWeek = WeekDay.dayOfWeek and ts.hour = 21]) as h21,
	size([ts IN OrderTimestamps WHERE ts.dayOfWeek = WeekDay.dayOfWeek and ts.hour = 22]) as h22,
	size([ts IN OrderTimestamps WHERE ts.dayOfWeek = WeekDay.dayOfWeek and ts.hour = 23]) as h23