Head's Up! These forums are read-only. All users and content have migrated. Please join us at community.neo4j.com.
03-16-2021 03:43 AM
I'd like to query and count orders in my database and return a row for each day of the week (Mon-Sun) and 24 columns for each hour of the day (0:00 - 23:00). The return should be like this:
DayOfWeek | 0:00 | 1:00 | 2:00 | 3:00 | 4:00 | 5:00 | ... | 22:00 | 23:00 |
Mon 0 0 1 2 0 1 ... 0 5
Tue 1 0 0 0 0 2 ... 2 2
Wed 0 0 1 3 5 0 ... 4 3
Thu 4 0 1 0 0 1 ... 3 9
Fri 0 0 0 2 2 6 ... 10 6
Sat 2 0 0 6 0 1 ... 12 9
Sun 1 0 1 2 1 4 ... 0 2
My cypher currently looks like this:
MATCH (o:Order)
WITH datetime(o.createdAt) as dt, o
WITH ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"][dt.dayOfWeek] as WeekDay, dt.hour as Hour, o
RETURN WeekDay, Hour, count(o) as Orders
Which returns the number of orders for a single day and hour.
I need to transform this query that I have a fixed result of 7 rows (for each day) and add the results for the count of order into the respective columns 0-23. Furthermore, I'd need to fill the rows with zeros if there are no order for some hours.
Any ideas how I could achieve this without implementing 7*24 subquerys for each hour of the day?
Solved! Go to Solution.
03-16-2021 07:55 AM
It depends on how much data you have, but you could consider using list comprehensions to do your filtering. Something like this: (You will have to adjust)
UNWIND [{ "name": "Mon", idx: 0 }, { name: "Tue", idx: 1 } /* ... */ ] AS WeekDay
MATCH (o:Order)
WHERE datetime(o.createdAt).dayOfWeek = WeekDay.idx
WITH WeekDay, collect(datetime(o.created_at)) as OrderTimestamps
RETURN
WeekDay.name as DayName,
size([ts IN OrderTimestamps WHERE ts.hour = 0]) as zeroHour,
size([ts IN OrderTimestamps WHERE ts.hour = 1]) as oneHour,
/* ... */
The way this query works (will need some adjustment) is that it starts with the day names, and collects up all orders that happened on that day into a list of timestamps (OrderTimestamps)
Then, to output how many are in zero hour, 1st hour, etc -- it uses a list comprehension to repeatedly filter the list of timestamps to only a list of those in the zero hour. It then counts the size of that list with size()
, so you get counts per-hour, per-day.
03-16-2021 07:55 AM
It depends on how much data you have, but you could consider using list comprehensions to do your filtering. Something like this: (You will have to adjust)
UNWIND [{ "name": "Mon", idx: 0 }, { name: "Tue", idx: 1 } /* ... */ ] AS WeekDay
MATCH (o:Order)
WHERE datetime(o.createdAt).dayOfWeek = WeekDay.idx
WITH WeekDay, collect(datetime(o.created_at)) as OrderTimestamps
RETURN
WeekDay.name as DayName,
size([ts IN OrderTimestamps WHERE ts.hour = 0]) as zeroHour,
size([ts IN OrderTimestamps WHERE ts.hour = 1]) as oneHour,
/* ... */
The way this query works (will need some adjustment) is that it starts with the day names, and collects up all orders that happened on that day into a list of timestamps (OrderTimestamps)
Then, to output how many are in zero hour, 1st hour, etc -- it uses a list comprehension to repeatedly filter the list of timestamps to only a list of those in the zero hour. It then counts the size of that list with size()
, so you get counts per-hour, per-day.
03-17-2021 12:00 PM
Thank you, that helped me quite a lot!
I continued with your example:
WITH [{idx: 1, name: 'Mon'}, {idx: 2, name: 'Tue'}, {idx: 3, name: 'Wed'}, {idx: 4, name: 'Thu'}, {idx: 5, name: 'Fri'}, {idx: 6, name: 'Sat'}, {idx: 7, name: 'Sun'} ] AS WeekDays
UNWIND WeekDays as WeekDay
MATCH (o:Order)
WHERE datetime(o.createdAt).dayOfWeek = WeekDay.idx
WITH WeekDay, collect(datetime(o.createdAt)) as OrderTimestamps
RETURN
WeekDay.name as DayName,
size([ts IN OrderTimestamps WHERE ts.hour = 0]) as h00,
size([ts IN OrderTimestamps WHERE ts.hour = 1]) as h01,
size([ts IN OrderTimestamps WHERE ts.hour = 2]) as h02,
size([ts IN OrderTimestamps WHERE ts.hour = 3]) as h03,
size([ts IN OrderTimestamps WHERE ts.hour = 4]) as h04,
size([ts IN OrderTimestamps WHERE ts.hour = 5]) as h05,
size([ts IN OrderTimestamps WHERE ts.hour = 6]) as h06,
size([ts IN OrderTimestamps WHERE ts.hour = 7]) as h07,
size([ts IN OrderTimestamps WHERE ts.hour = 8]) as h08,
size([ts IN OrderTimestamps WHERE ts.hour = 9]) as h09,
size([ts IN OrderTimestamps WHERE ts.hour = 10]) as h10,
size([ts IN OrderTimestamps WHERE ts.hour = 11]) as h11,
size([ts IN OrderTimestamps WHERE ts.hour = 12]) as h12,
size([ts IN OrderTimestamps WHERE ts.hour = 13]) as h13,
size([ts IN OrderTimestamps WHERE ts.hour = 14]) as h14,
size([ts IN OrderTimestamps WHERE ts.hour = 15]) as h15,
size([ts IN OrderTimestamps WHERE ts.hour = 16]) as h16,
size([ts IN OrderTimestamps WHERE ts.hour = 17]) as h17,
size([ts IN OrderTimestamps WHERE ts.hour = 18]) as h18,
size([ts IN OrderTimestamps WHERE ts.hour = 19]) as h19,
size([ts IN OrderTimestamps WHERE ts.hour = 20]) as h20,
size([ts IN OrderTimestamps WHERE ts.hour = 21]) as h21,
size([ts IN OrderTimestamps WHERE ts.hour = 22]) as h22,
size([ts IN OrderTimestamps WHERE ts.hour = 23]) as h23
Instead of executing 7 MATCH queries during UNWIND, I changed it to 1 MATCH query for all orders and do the filtering of DayOfWeek and Hour in the list comprehension on the timestamps. I'm not sure it is actually faster, I would just guess so?
MATCH (o:Order)
WITH collect(datetime(o.createdAt)) as OrderTimestamps, [{idx: 1, name: 'Mon'}, {idx: 2, name: 'Tue'}, {idx: 3, name: 'Wed'}, {idx: 4, name: 'Thu'}, {idx: 5, name: 'Fri'}, {idx: 6, name: 'Sat'}, {idx: 7, name: 'Sun'} ] AS WeekDays
UNWIND WeekDays as WeekDay
RETURN
WeekDay.name as DayName,
size([ts IN OrderTimestamps WHERE ts.dayOfWeek = WeekDay.idx and ts.hour = 0]) as h00,
size([ts IN OrderTimestamps WHERE ts.dayOfWeek = WeekDay.idx and ts.hour = 1]) as h01,
size([ts IN OrderTimestamps WHERE ts.dayOfWeek = WeekDay.idx and ts.hour = 2]) as h02,
size([ts IN OrderTimestamps WHERE ts.dayOfWeek = WeekDay.idx and ts.hour = 3]) as h03,
size([ts IN OrderTimestamps WHERE ts.dayOfWeek = WeekDay.idx and ts.hour = 4]) as h04,
size([ts IN OrderTimestamps WHERE ts.dayOfWeek = WeekDay.idx and ts.hour = 5]) as h05,
size([ts IN OrderTimestamps WHERE ts.dayOfWeek = WeekDay.idx and ts.hour = 6]) as h06,
size([ts IN OrderTimestamps WHERE ts.dayOfWeek = WeekDay.idx and ts.hour = 7]) as h07,
size([ts IN OrderTimestamps WHERE ts.dayOfWeek = WeekDay.idx and ts.hour = 8]) as h08,
size([ts IN OrderTimestamps WHERE ts.dayOfWeek = WeekDay.idx and ts.hour = 9]) as h09,
size([ts IN OrderTimestamps WHERE ts.dayOfWeek = WeekDay.idx and ts.hour = 10]) as h10,
size([ts IN OrderTimestamps WHERE ts.dayOfWeek = WeekDay.idx and ts.hour = 11]) as h11,
size([ts IN OrderTimestamps WHERE ts.dayOfWeek = WeekDay.idx and ts.hour = 12]) as h12,
size([ts IN OrderTimestamps WHERE ts.dayOfWeek = WeekDay.idx and ts.hour = 13]) as h13,
size([ts IN OrderTimestamps WHERE ts.dayOfWeek = WeekDay.idx and ts.hour = 14]) as h14,
size([ts IN OrderTimestamps WHERE ts.dayOfWeek = WeekDay.idx and ts.hour = 15]) as h15,
size([ts IN OrderTimestamps WHERE ts.dayOfWeek = WeekDay.idx and ts.hour = 16]) as h16,
size([ts IN OrderTimestamps WHERE ts.dayOfWeek = WeekDay.idx and ts.hour = 17]) as h17,
size([ts IN OrderTimestamps WHERE ts.dayOfWeek = WeekDay.idx and ts.hour = 18]) as h18,
size([ts IN OrderTimestamps WHERE ts.dayOfWeek = WeekDay.idx and ts.hour = 19]) as h19,
size([ts IN OrderTimestamps WHERE ts.dayOfWeek = WeekDay.idx and ts.hour = 20]) as h20,
size([ts IN OrderTimestamps WHERE ts.dayOfWeek = WeekDay.idx and ts.hour = 21]) as h21,
size([ts IN OrderTimestamps WHERE ts.dayOfWeek = WeekDay.idx and ts.hour = 22]) as h22,
size([ts IN OrderTimestamps WHERE ts.dayOfWeek = WeekDay.idx and ts.hour = 23]) as h23
Is there any way to further improve this cypher query?
03-21-2021 02:43 AM
I further improved my Cypher query with a variable StartDate and calculated EndDate, which will be calculated by adding X days/weeks/months etc. onto the StartDate. Moreover, instead of providing a static list of WeekDays like in my example above, I now generate a list of days starting from the StartDate and iteratively add 0, 1, 2, ..., 6 days. So my week does not necessarily start at Monday, but at the given StartDate and ends 6 days later (the EndDate is irrelevant here).
I found this post as a very helpful: Creating date nodes by looping through a start date and an end date - #3 by tony.chiboucas
WITH datetime("2021-03-01") AS StartDate
WITH StartDate, StartDate + duration({days: 7}) AS EndDate // Add 7 days because we use < symbol in the WHERE clause
MATCH (o:Order)
WHERE datetime(o.createdAt) >= StartDate AND datetime(o.createdAt) < EndDate //IMPORTANT: use less than (<) EndDate
WITH StartDate, EndDate, [i IN range(0, 6) | StartDate+duration({days: i})] as WeekDays, collect(datetime(o.createdAt)) as OrderTimestamps
UNWIND WeekDays AS WeekDay
RETURN
apoc.date.format(WeekDay.epochMillis, 'ms', 'E') as WeekDay,
size([ts IN OrderTimestamps WHERE ts.dayOfWeek = WeekDay.dayOfWeek and ts.hour = 0]) as h00,
size([ts IN OrderTimestamps WHERE ts.dayOfWeek = WeekDay.dayOfWeek and ts.hour = 1]) as h01,
size([ts IN OrderTimestamps WHERE ts.dayOfWeek = WeekDay.dayOfWeek and ts.hour = 2]) as h02,
size([ts IN OrderTimestamps WHERE ts.dayOfWeek = WeekDay.dayOfWeek and ts.hour = 3]) as h03,
size([ts IN OrderTimestamps WHERE ts.dayOfWeek = WeekDay.dayOfWeek and ts.hour = 4]) as h04,
size([ts IN OrderTimestamps WHERE ts.dayOfWeek = WeekDay.dayOfWeek and ts.hour = 5]) as h05,
size([ts IN OrderTimestamps WHERE ts.dayOfWeek = WeekDay.dayOfWeek and ts.hour = 6]) as h06,
size([ts IN OrderTimestamps WHERE ts.dayOfWeek = WeekDay.dayOfWeek and ts.hour = 7]) as h07,
size([ts IN OrderTimestamps WHERE ts.dayOfWeek = WeekDay.dayOfWeek and ts.hour = 8]) as h08,
size([ts IN OrderTimestamps WHERE ts.dayOfWeek = WeekDay.dayOfWeek and ts.hour = 9]) as h09,
size([ts IN OrderTimestamps WHERE ts.dayOfWeek = WeekDay.dayOfWeek and ts.hour = 10]) as h10,
size([ts IN OrderTimestamps WHERE ts.dayOfWeek = WeekDay.dayOfWeek and ts.hour = 11]) as h11,
size([ts IN OrderTimestamps WHERE ts.dayOfWeek = WeekDay.dayOfWeek and ts.hour = 12]) as h12,
size([ts IN OrderTimestamps WHERE ts.dayOfWeek = WeekDay.dayOfWeek and ts.hour = 13]) as h13,
size([ts IN OrderTimestamps WHERE ts.dayOfWeek = WeekDay.dayOfWeek and ts.hour = 14]) as h14,
size([ts IN OrderTimestamps WHERE ts.dayOfWeek = WeekDay.dayOfWeek and ts.hour = 15]) as h15,
size([ts IN OrderTimestamps WHERE ts.dayOfWeek = WeekDay.dayOfWeek and ts.hour = 16]) as h16,
size([ts IN OrderTimestamps WHERE ts.dayOfWeek = WeekDay.dayOfWeek and ts.hour = 17]) as h17,
size([ts IN OrderTimestamps WHERE ts.dayOfWeek = WeekDay.dayOfWeek and ts.hour = 18]) as h18,
size([ts IN OrderTimestamps WHERE ts.dayOfWeek = WeekDay.dayOfWeek and ts.hour = 19]) as h19,
size([ts IN OrderTimestamps WHERE ts.dayOfWeek = WeekDay.dayOfWeek and ts.hour = 20]) as h20,
size([ts IN OrderTimestamps WHERE ts.dayOfWeek = WeekDay.dayOfWeek and ts.hour = 21]) as h21,
size([ts IN OrderTimestamps WHERE ts.dayOfWeek = WeekDay.dayOfWeek and ts.hour = 22]) as h22,
size([ts IN OrderTimestamps WHERE ts.dayOfWeek = WeekDay.dayOfWeek and ts.hour = 23]) as h23
All the sessions of the conference are now available online