Head's Up! These forums are read-only. All users and content have migrated. Please join us at community.neo4j.com.
09-24-2019 09:54 AM
I'm trying to convert a date into a YYYYMMDD and YYYYMM format in order to create an index for a GraphAware TimeTree, as suggested in this post.
As noted in their blog, the authors originally used as MM/DD/YYYY format for the UUID, which leads to some issues with sorting.
I've tried converting the original code as follows:
MATCH (root:TimeTreeRoot)-[:CHILD]->(year:Year)
WITH collect(year) AS years UNWIND years AS y
SET y.uuid = toString(y.value)
WITH y
MATCH (y)-[:CHILD]->(month:Month)
WITH y, collect(month) AS months UNWIND months AS m
SET m.uuid = y.value + substring("00", size(m.value)) + m.value
WITH y, m
MATCH (m)-[:CHILD]->(day:Day)
WITH y, m, collect(day) AS days UNWIND days AS d
SET d.uuid = y.value + substring("00", size(m.value)) + m.value + substring("00", size(d.value)) + d.value
;
First of all this does not seem very efficient to me. Second, I'm not getting the result I would expect. The days and months less than 10 are left padded with a0, but so are the days and months greater than 9. Running
MATCH (n:Month) WHERE EXISTS(n.uuid) RETURN DISTINCT "node" as entity, n.uuid AS uuid LIMIT 25 UNION ALL MATCH ()-[r]-() WHERE EXISTS(r.uuid) RETURN DISTINCT "relationship" AS entity, r.uuid AS uuid LIMIT 25
returns
|"node"|"201901"|
|"node"|"197001"|
|"node"|"201902"|
|"node"|"201903"|
|"node"|"201904"|
|"node"|"201905"|
|"node"|"201906"|
|"node"|"201907"|
|"node"|"201908"|
|"node"|"201909"|
|"node"|"2019010"|
|"node"|"2019011"|
|"node"|"2019012"|
Running the same query of Day returns
|node"|"201901027"|
|"node"|"201901028"|
|"node"|"201901029"|
|"node"|"201901030"|
|"node"|"201901031"|
|"node"|"20190201"|
|"node"|"20190202"|
|"node"|"20190203"|
|"node"|"20190204"|
|"node"|"20190205"|
|"node"|"20190206"|
|"node"|"20190207"|
|"node"|"20190208"|
|"node"|"20190209"|
|"node"|"201902010"|
|"node"|"201902011"|
|"node"|"201902012"|
|"node"|"201902013"|
|"node"|"201902014"|
|"node"|"201902015"|
|"node"|"201902016"|
|"node"|"201902017"|
I was looking for a format statement along the lines of the code below, but I cannot find anything in the Neo4j manuals.
SET m.uuid = y.value + format("00", m.value)
#GraphAware #TimeTree #format #convert-time
09-24-2019 09:59 AM
Are you sure you're trying to use the GraphAware timetree ? If yes, then it only accepts ms timestamps as time property.
09-24-2019 10:10 AM
Yes, I am indeed using GraphAware Time Tree. Yes, I know it uses ms timestamps to create the tree, but as the blog pointed out searching the time tree is very inefficient without an index on the nodes, so I'm trying to create the "uuid" values for the nodes so that they can be traversed very quickly.
The issue I'm having is not with creating the Time Tree, but rather with creating the "uuid" values on the Month and Day nodes.
09-24-2019 10:03 AM
Also make sure you actually need a timetree. Since indexable native temporal types were added in 3.4.x, many use cases can now be handled by using these new types and adding appropriate indexes.
That said, there are still several cases where timetrees remain useful. Can you give us some info on what you're using this for?
09-24-2019 10:19 AM
Good question, and I may be approaching this from the incorrect angle. I need to create a time series so that I can represent the state of a piece of equipment at different times. Let's say that the equipment has the following states: Off, Idle, Failure, Setup, Working. I need to capture the state changes over time.
I also need to follow a product during different stages of production and distribution. Think of a job application as it moves from initial inquiry to interviews to acceptance to start.
Hope this helps.
09-24-2019 11:16 AM
You could model your state/event nodes with temporal types as properties on those nodes, and relationships to the product in question going through these events. And while you can easily order and collect a product's events, you could even create a linked list between the event nodes themselves, if your queries require it.
Two primary cases come to mind as to why you might need a time tree (and I'm sure Christophe might come up with several more):
If you need to traverse through dates when there are not relationships connecting events on the same date, so starting with an event, finding other events that are on the same date via traversal rather than by executing an additional indexed query.
If you need to explore temporal elements across differing larger temporal contexts...for example, days spanning multiple of week/month/year (every day of the week...every first Monday of every month), or the same month/months spanning years (every October). Time trees can make it easier to query across discrete time elements. For instance, a query to find events in September would perform an index lookup for the :Month of September (or traverse from years to months and filter) then expand to events under each September month node.
By contrast, looking at temporal types, although we can index date values on nodes, lookup of nodes by specific date elements (WHERE n.eventDate.month = 9
) don't use the index, and would fall back to a label scan. Using the time tree for these cases would likely be more efficient.
09-24-2019 11:53 AM
Thanks Andrew, this is very useful information and confirms that I will need the Time Tree.
Which means I am still left with my original issue: How to create sortable indices on Month and Day nodes. The most sensible to me is integer values of YYYYMMDD, so Jan 1, 2019 will become 20190101 and Oct 1, 2019 will become 20191001.
Any thoughts on how to do this in Cypher?
09-24-2019 11:58 AM
APOC date/time conversion functions should do the trick, but if all you need is padding then you can use APOC text functions for this:
return apoc.text.lpad('51', 2, '0')
returns '51' while
return apoc.text.lpad('1', 2, '0')
returns '01'
09-24-2019 12:04 PM
Thank you Andrew, I solved it in Cypher. See below. Is an index on a string more efficient than on an integer? I'm using an integer.
I was hoping to avoid apoc, but maybe I should branch out.
MATCH (root:TimeTreeRoot)-[:CHILD]->(year:Year)
WITH collect(year) AS years UNWIND years AS y
SET y.uuid = toInt(y.value)
WITH y
MATCH (y)-[:CHILD]->(month:Month)
WITH y, collect(month) AS months UNWIND months AS m
SET m.uuid = toInt(y.value*100 + m.value)
WITH y, m
MATCH (m)-[:CHILD]->(day:Day)
WITH y, m, collect(day) AS days UNWIND days AS d
SET d.uuid = toInt(y.value*10000 + m.value*100 + d.value)
;
09-25-2019 12:20 PM
Andrew, you must have missed my related question:
Is an Integer more efficient as an index that a String?
09-26-2019 08:57 PM
I'm not currently aware of any notable performance difference here. If I find out otherwise I'll let you know.
All the sessions of the conference are now available online