10-17-2018 01:13 AM
Hi All,
I am using Neo4j 3.4.1 Community Edition.
I am trying to build a graph to find the most connected user for a specific topic. Below is the simple query I tried to find that user:
MATCH (p1:Username)-[:interest]->(p3:Topic)<-[:interest]-(p2:Username) WHERE(p3.topic STARTS WITH 'dc02 c drive high utilization')
RETURN *
Query response,
Below is the sample data I loaded:
date_time,keywords,message_subject,recipient_address,sender_address
2018-09-10T17:36:48.823Z,dc02 c drive high utilization,PDX-VWIN-DC02 C drive High utilization,gavin.debeer@test.com,NarendraChoudaryB@test.com
2018-09-10T17:37:48.823Z,dc02 c drive high utilization,PDX-VWIN-DC02 C drive High utilization,Lisa.Sorenson@test.com,NarendraChoudaryB@test.com
2018-09-10T17:38:48.823Z,dc02 c drive high utilization,PDX-VWIN-DC02 C drive High utilization,PrathamK@test.com,NarendraChoudaryB@test.com
2018-09-10T17:39:48.823Z,dc02 c drive high utilization,PDX-VWIN-DC02 C drive High utilization,ArunR@test.com,NarendraChoudaryB@test.com
2018-09-10T17:40:48.823Z,dc02 c drive high utilization,PDX-VWIN-DC02 C drive High utilization,RMCTeam@test.com,NarendraChoudaryB@test.com
2018-09-10T17:41:48.823Z,dc02 c drive high utilization,PDX-VWIN-DC02 C drive High utilization,mike.blasberg@test.com,NarendraChoudaryB@test.com
2018-09-10T17:51:06.798Z,dc02 c drive high utilization,RE: PDX-VWIN-DC02 C drive High utilization,Lisa.Sorenson@test.com,GauravSi1@test.com
If you look at the sender address in the data, the user "Narendra" has sent mail to 6 different users, but in the graph the "interest" relationship shows up only 5 times.
I would like to know why it shows up like that. Please correct me if I am doing anything wrong.
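As a quick cross-check of the sample above, here is a minimal Python sketch that counts the distinct recipients per sender. The (sender, recipient) pairs are copied from the sample rows; the names ROWS and recipients_by_sender are illustrative only and not part of the original workflow.

```python
from collections import defaultdict

# (sender, recipient) pairs copied from the sample rows above, in file order.
ROWS = [
    ("NarendraChoudaryB@test.com", "gavin.debeer@test.com"),
    ("NarendraChoudaryB@test.com", "Lisa.Sorenson@test.com"),
    ("NarendraChoudaryB@test.com", "PrathamK@test.com"),
    ("NarendraChoudaryB@test.com", "ArunR@test.com"),
    ("NarendraChoudaryB@test.com", "RMCTeam@test.com"),
    ("NarendraChoudaryB@test.com", "mike.blasberg@test.com"),
    ("GauravSi1@test.com", "Lisa.Sorenson@test.com"),
]


def recipients_by_sender(rows):
    """Return a dict mapping each sender to the set of distinct recipients."""
    grouped = defaultdict(set)
    for sender, recipient in rows:
        grouped[sender].add(recipient)
    return dict(grouped)


counts = {sender: len(recips) for sender, recips in recipients_by_sender(ROWS).items()}
print(counts)
```

For this sample it reports 6 distinct recipients for NarendraChoudaryB@test.com and 1 for GauravSi1@test.com, so the data itself does contain six different recipients for Narendra.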
Regards,
Ganeshbabu R
10-17-2018 03:52 AM
Also, I want to add one more thing here.
We tried using different sender and receiver addresses instead of the data above, and in that case we can see 6 interest relationships for Narendra in the graph. Below is the response:
I am stuck trying to understand these relationships. Kindly share your thoughts; it would be really helpful.
Thanks,
Ganeshbabu R
10-17-2018 10:52 AM
Can you provide the Cypher query used to generate the sample result data?
10-17-2018 09:55 PM
Below is the query I used to load the sample data,
LOAD CSV WITH HEADERS FROM "file:///data.csv" AS row
WITH row WHERE row.message_subject <> 'none' AND row.keywords <> 'none'
MERGE (p1:Username {mail_id: row.sender_address}) ON CREATE SET p1.timestamp = row.date_time
MERGE (p2:Username {mail_id: row.recipient_address})
MERGE (p3:Topic {topic: row.keywords})
WITH p1, p2, p3, row, COUNT(*) AS count
MERGE (p1)-[rel:sent]->(p2) ON CREATE SET rel.time = row.date_time
MERGE (p1)-[:interest]->(p3)<-[:interest]-(p2)
SET rel.count = count
To generate the sample result data we used Python, which creates a new CSV file with these columns:
date_time,keywords,message_subject,recipient_address,sender_address
Sample result data
2018-09-10T17:36:48.823Z,dc02 c drive high utilization,PDX-VWIN-DC02 C drive High utilization,gavin.debeer@test.com,NarendraChoudaryB@test.com
2018-09-10T17:37:48.823Z,dc02 c drive high utilization,PDX-VWIN-DC02 C drive High utilization,Lisa.Sorenson@test.com,NarendraChoudaryB@test.com
2018-09-10T17:38:48.823Z,dc02 c drive high utilization,PDX-VWIN-DC02 C drive High utilization,PrathamK@test.com,NarendraChoudaryB@test.com
2018-09-10T17:39:48.823Z,dc02 c drive high utilization,PDX-VWIN-DC02 C drive High utilization,ArunR@test.com,NarendraChoudaryB@test.com
2018-09-10T17:40:48.823Z,dc02 c drive high utilization,PDX-VWIN-DC02 C drive High utilization,RMCTeam@test.com,NarendraChoudaryB@test.com
2018-09-10T17:41:48.823Z,dc02 c drive high utilization,PDX-VWIN-DC02 C drive High utilization,mike.blasberg@test.com,NarendraChoudaryB@test.com
2018-09-10T17:51:06.798Z,dc02 c drive high utilization,RE: PDX-VWIN-DC02 C drive High utilization,Lisa.Sorenson@test.com,GauravSi1@test.com
Thanks,
Ganeshbabu R
10-18-2018 05:15 PM
There's some buggy behavior in play here.
When I do your import in Neo4j 3.4.1 I get:
Added 9 labels, created 9 nodes, set 27 properties, created 19 relationships, completed after 107 ms.
12 :interest relationships are created.
When I do this in 3.4.8 I get:
Added 9 labels, created 9 nodes, set 25 properties, created 21 relationships, completed after 185 ms.
14 :interest relationships are created.
I'm not quite sure which bug is in play here, but in general it's important to be up to date with patch releases so you avoid buggy behavior that has since been fixed.
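The 3.4.8 numbers can be cross-checked against the documented MERGE rule that a pattern is either matched as a whole or created as a whole. The following is a simplified Python model of that rule, not Neo4j itself; the function names and the list-of-tuples storage are assumptions made for illustration. It compares the single two-relationship pattern from the import query with a hypothetical variant that merges each :interest relationship separately.

```python
TOPIC = "dc02 c drive high utilization"

# (sender, recipient) pairs from the sample CSV, in file order.
ROWS = [
    ("NarendraChoudaryB@test.com", "gavin.debeer@test.com"),
    ("NarendraChoudaryB@test.com", "Lisa.Sorenson@test.com"),
    ("NarendraChoudaryB@test.com", "PrathamK@test.com"),
    ("NarendraChoudaryB@test.com", "ArunR@test.com"),
    ("NarendraChoudaryB@test.com", "RMCTeam@test.com"),
    ("NarendraChoudaryB@test.com", "mike.blasberg@test.com"),
    ("GauravSi1@test.com", "Lisa.Sorenson@test.com"),
]


def merge_whole_pattern(rows):
    """Model MERGE (p1)-[:interest]->(p3)<-[:interest]-(p2) as one pattern."""
    rels = []  # a list, so duplicate (user, topic) relationships are possible
    for sender, recipient in rows:
        # The pattern matches only if BOTH relationships already exist.
        if (sender, TOPIC) in rels and (recipient, TOPIC) in rels:
            continue
        # Otherwise the whole pattern is created, even if one half existed.
        rels.append((sender, TOPIC))
        rels.append((recipient, TOPIC))
    return rels


def merge_split(rows):
    """Model two separate MERGE clauses, one per :interest relationship."""
    rels = []
    for sender, recipient in rows:
        for user in (sender, recipient):
            if (user, TOPIC) not in rels:
                rels.append((user, TOPIC))
    return rels


whole = merge_whole_pattern(ROWS)
split = merge_split(ROWS)
sent = set(ROWS)  # one :sent relationship per distinct (sender, recipient)

print(len(whole), len(split), len(sent))
```

Under this model the single pattern produces 14 :interest relationships (6 of them from Narendra to the topic), and together with the 7 :sent relationships that gives 21, which agrees with the 3.4.8 output. The split variant produces 8, one per user. The model does not explain the 12 seen in 3.4.1.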
10-22-2018 05:40 AM
Hi @andrew.bowman,
I tried the same in Neo4j version 3.4.9.
Below is the response in the console; it is the same response I got with version 3.4.1:
Added 9 labels, created 9 nodes, set 29 properties, created 19 relationships, completed after 234 ms.
12 interest relationships are created.
I am using the Community Edition of 3.4.9. May I know which edition of Neo4j you tried?
Regards,
Ganeshbabu R
10-18-2018 07:07 PM
Thanks @andrew.bowman
I will upgrade Neo4j, run the same query again, and share my feedback.
Regards,
Ganeshbabu R
10-18-2018 09:14 PM
Hi,
Try this query:
LOAD CSV WITH HEADERS FROM "file:///data.csv" AS row
WITH row
MERGE (p1:Username {mail_id: row.sender_address}) ON CREATE SET p1.timestamp = row.date_time
MERGE (p2:Username {mail_id: row.recipient_address})
MERGE (p3:Topic {topic: row.keywords})
WITH p1, p2, p3, row, COUNT(*) AS cnt
MERGE (p1)-[rel:sent]->(p2) ON CREATE SET rel.time = row.date_time,rel.count = cnt
MERGE (p1)-[:interest]->(p3)<-[:interest]-(p2);
Here is the result:
Also,
-Kamal
10-22-2018 05:45 AM
Hi @ameyasoft
Yes, I tried the same but didn't get the expected output. Below is the response in the console:
Are you using the Community Edition of Neo4j 3.4.9?
Let me know your thoughts.
Regards,
Ganeshbabu R
10-22-2018 02:12 PM
Hi,
My version was 3.3.1. I installed version 3.4.9 and got the same result as version 3.3.1. Here is the screenshot with version 3.4.9.
Make sure you remove 'SET rel.count = count' from your query if it is still there, as in your original query.
If you are still not getting it right, send me your .csv file and I can check on my instance.
-Kamal
10-22-2018 04:29 PM
Hi,
I tried your original query in 3.4.9 and got the same result as with my query.
-Kamal
10-18-2018 09:16 PM
Hi,
Sorry, I didn't finish the last sentence. Now the 'sent' relationship has a count property, and it will always be 1 because the count is evaluated per row.
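To illustrate why the count is always 1, here is a small Python sketch of the grouping, under the assumption that COUNT(*) groups by every non-aggregated item in the WITH clause (p1, p2, p3 and row). The tuples stand in for the CSV rows, and the variable names are illustrative only.

```python
from collections import Counter

# (date_time, sender, recipient) taken from the sample CSV rows.
ROWS = [
    ("2018-09-10T17:36:48.823Z", "NarendraChoudaryB@test.com", "gavin.debeer@test.com"),
    ("2018-09-10T17:37:48.823Z", "NarendraChoudaryB@test.com", "Lisa.Sorenson@test.com"),
    ("2018-09-10T17:38:48.823Z", "NarendraChoudaryB@test.com", "PrathamK@test.com"),
    ("2018-09-10T17:39:48.823Z", "NarendraChoudaryB@test.com", "ArunR@test.com"),
    ("2018-09-10T17:40:48.823Z", "NarendraChoudaryB@test.com", "RMCTeam@test.com"),
    ("2018-09-10T17:41:48.823Z", "NarendraChoudaryB@test.com", "mike.blasberg@test.com"),
    ("2018-09-10T17:51:06.798Z", "GauravSi1@test.com", "Lisa.Sorenson@test.com"),
]

# Grouping that includes the whole row: each row has a unique date_time,
# so every group contains exactly one row.
per_row = Counter((sender, recipient, date_time) for date_time, sender, recipient in ROWS)

# Grouping by sender only: rows from the same sender collapse into one group.
per_sender = Counter(sender for _, sender, _ in ROWS)

print(set(per_row.values()))
print(per_sender)
```

With the row in the grouping key every count is 1. Grouping by the sender alone gives 6 for NarendraChoudaryB@test.com and 1 for GauravSi1@test.com, so a meaningful count would require leaving row out of the grouping keys.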
-Kamal