08-05-2019 06:11 AM
What is the best way of computing histograms of all properties of all selected nodes in Neo4j?
For example, assume that we have 3 nodes like this:
We want to show histograms of all attributes to the user (here, a histogram for age, one for Country, and one for gender).
08-06-2019 11:57 AM
You can use something like matplotlib to take graph data and convert to a histogram. The data science online training actually will show you how to do this hands-on. The link to the free online class is on our site. Hope this helps!
Cheers,
Jennifer
08-09-2019 11:53 PM
Thank you. However, my main problem is how to calculate the histogram values, not necessarily how to show them. We have an unknown set of properties, and each of them is available in only a subset of the nodes. What is the best way to calculate histogram data for every one of these properties?
08-22-2019 02:28 PM
Ah. I'm not super familiar with creating histograms with Neo4j data, but let's see what we can do.
We mostly need to create a query that pulls all the properties on User nodes and looks at all the values of those. That's what the query below handles. It pulls the properties for User nodes in our database and flattens all the values into a list based on the property.
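// Collect the distinct property keys that appear on any User node
MATCH (n:User)
WITH apoc.coll.toSet(apoc.coll.flatten(collect(keys(n)))) AS allKeys
// For each key, gather its values across all User nodes
// (collect() skips nodes where the property is missing)
MATCH (n:User)
UNWIND allKeys AS key
WITH COLLECT(n[key]) AS values, key
// Count how often each value occurs for that key
RETURN key, apoc.coll.frequenciesAsMap(values) AS freq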
Then, if you're doing it in Python, you can dump those results to a DataFrame and use the DataFrame.hist function, which will iterate over each key and give a histogram for each. Otherwise, tools like matplotlib or other chart visualizations could work, too. Hope this helps!
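If you would rather do the counting on the client side, here is a minimal sketch of the same per-property frequency count in plain Python. It assumes the node properties have already been fetched as a list of dicts (one dict per node) and that the values are hashable scalars such as strings and numbers. The function name `property_histograms` and the sample data are hypothetical and only for illustration.

```python
from collections import Counter, defaultdict


def property_histograms(nodes):
    """Count how often each value occurs for every property key.

    `nodes` is an iterable of dicts, one per node. A key that is missing
    from a node is simply never counted for that node.
    Assumption: values are hashable (no list-valued properties).
    """
    histograms = defaultdict(Counter)
    for props in nodes:
        for key, value in props.items():
            if value is not None:
                histograms[key][value] += 1
    # Convert to plain dicts so the result is easy to print or serialize
    return {key: dict(counts) for key, counts in histograms.items()}


# Hypothetical sample data; not taken from the original question.
sample_nodes = [
    {"age": 30, "country": "US", "gender": "F"},
    {"age": 30, "country": "DE"},
    {"age": 41, "gender": "F"},
]

histograms = property_histograms(sample_nodes)
for key, counts in sorted(histograms.items()):
    print(key, counts)
```

Each property gets its own value-to-count map, so a property that only exists on some nodes is counted only where it is present. For a numeric property like age with many distinct values, you would probably want to group the values into bins before counting.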
Cheers,
Jennifer