03-04-2021 09:16 AM
I'm working my way through ~50 metrics for potential use as machine learning features. I want to compare different queries that retrieve this information, and to compare resource allocation when the data is stored under different graph models and indexing schemes.
I've created three versions of the data in Neo4j Desktop (Neo4j version 4.1.3). I access the databases individually using the Neo4j Python driver and dump the results into a Jupyter Notebook.
The Cypher Workflow document indicates the queries return a stream of records, header and footer metadata, and a result summary that contains additional information relating to query execution and result content (which includes the information for EXPLAIN and PROFILE).
The Knowledge Base has an article on how to get much of this information using Cypher Shell, but I can't find the directions or functions that will let me access it through the Python driver.
To help optimize my queries, graph models, and indexing, I want to compare query performance information such as heap size, physical memory, run time, caching, CPU use, thread count, and records returned.
This article on Data Science Stack Exchange references something similar done in Neo4j in Action.
I'd like to pull this information programmatically, in real time, into the Jupyter Notebook whenever I run a specific query. The performance information can be returned on its own or together with the query results. I'd rather not have to monitor a separate application such as Halin.
03-12-2021 12:17 PM
For those interested, I was able to work out the answer.
In the Jupyter notebook I ran the query and stored the Result object, then was able to access the metadata:
from neo4j import GraphDatabase
uri = "bolt://localhost:7687"
driver = GraphDatabase.driver(uri, auth=('neo4j', 'password'))
session = driver.session()
count1 = session.run('''PROFILE MATCH (c)
RETURN count(c) as node_count''')
count1.consume().metadata
Other information available using the .consume() method can be found by replacing metadata with options like result_available_after, result_consumed_after, or profile.
Additional details can be found in the Result section of the Python Driver API Documentation.
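For comparing several queries side by side, the summary fields mentioned above could be gathered into one dict per query. This is only a minimal sketch: summarize, total_db_hits, and profile_query are hypothetical helper names, and it assumes the profile comes back as a nested dict with dbHits and children keys, which is how the 4.x driver appears to return it. Check the keys against your own driver version.

```python
def total_db_hits(plan):
    """Recursively sum the dbHits of a profile plan node and its children."""
    # Assumption: each plan node is a dict with optional "dbHits" and "children" keys.
    own = plan.get("dbHits", 0)
    return own + sum(total_db_hits(child) for child in plan.get("children", []))


def summarize(summary):
    """Pull timing and profile numbers out of a ResultSummary-like object."""
    profile = summary.profile  # None unless the query started with PROFILE
    return {
        "available_after_ms": summary.result_available_after,
        "consumed_after_ms": summary.result_consumed_after,
        "db_hits": total_db_hits(profile) if profile else None,
    }


def profile_query(uri, auth, query):
    """Run one query and return its summary numbers (needs a running database)."""
    # Imported here so the two helpers above can be used without the driver installed.
    from neo4j import GraphDatabase

    driver = GraphDatabase.driver(uri, auth=auth)
    try:
        with driver.session() as session:
            return summarize(session.run(query).consume())
    finally:
        driver.close()
```

Calling profile_query("bolt://localhost:7687", ("neo4j", "password"), "PROFILE MATCH (c) RETURN count(c) AS node_count") against each database would give one dict per run, which can then go into a list or a DataFrame for comparison. Note that this only covers what the result summary exposes; heap size, CPU use, and thread count are not part of it.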
A working example of the code can be found in a notebook on my GitHub page.