05-20-2020 12:26 AM
Hello, I am trying to compute the preferential attachment score for a large sample of nodes.
def Prefer_Attachment_query2(listval):
    customer_id = listval[0]
    merchant_id = listval[1]
    #print(x,y)
    prefquery = """MATCH (p1:CUSTOMER {WALLETID: '%s'})
MATCH (p2:MERCHANT {WALLETID: '%s'})
RETURN gds.alpha.linkprediction.preferentialAttachment(p1, p2,{relationshipQuery: "PAYMENT"}) as score""" % (customer_id, merchant_id)
    #print(prefquery)
    return prefquery
This function is called from a nested loop. My sample has 1,000 customer IDs and nearly 50k merchant IDs, so each customer ID is compared against all 50k merchant IDs to get its preferential attachment score with each merchant. The code works, but it is very slow: for 2 customer IDs against 50k merchant IDs, the scores take about 500 seconds to compute. I also tried the full sample of 1,000 customer IDs, but it had not completed after more than 24 hours of running; my log showed that only 350 of them had been processed.
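To put those timings in perspective, here is a rough back-of-the-envelope calculation. It assumes the cost per customer stays constant as the sample grows, which may not hold exactly; the variable names are only for illustration.

```python
# Figures reported above: 2 customers x 50,000 merchants took about 500 s.
seconds_measured = 500
customers_measured = 2
merchants = 50_000

# Average cost of one (customer, merchant) query and of one full customer.
seconds_per_pair = seconds_measured / (customers_measured * merchants)
seconds_per_customer = seconds_measured / customers_measured

# Extrapolate linearly to the full sample of 1,000 customers.
customers_total = 1_000
total_hours = customers_total * seconds_per_customer / 3600

print(f"{seconds_per_pair * 1000:.1f} ms per pair")                          # 5.0 ms
print(f"{total_hours:.1f} hours for {customers_total} customers")            # 69.4 hours
print(f"{350 * seconds_per_customer / 3600:.1f} hours for 350 customers")    # 24.3 hours
```

So at roughly 5 ms per query the full run would need close to three days, and 350 customers in a little over 24 hours is about what this rate predicts. The per-query round trip looks like the bottleneck rather than the machine hanging.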
I have also tried the multiprocessing package, but the output was unstable and it produced this error:
Failed to read from defunct connection Address(host='localhost', port=7687) (Address(host='127.0.0.1', port=7687))
Failed to read from defunct connection Address(host='localhost', port=7687) (Address(host='127.0.0.1', port=7687))
ServiceUnavailable: Failed to read from defunct connection Address(host='localhost', port=7687) (Address(host='127.0.0.1', port=7687))
I searched and found that using the multiprocessing package with the Neo4j Bolt driver is what causes this error.
Is there any way to use the APOC library to run this query in parallel or batch-wise? Kindly help me out.
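For reference, this is the kind of batching I have in mind, written as a minimal sketch rather than a tested solution. It sends one customer and a list of merchant IDs per round trip using query parameters and UNWIND, instead of one query per pair. The names BATCH_QUERY, chunked, batch_parameters and score_customer are hypothetical, the batch size of 5,000 is an arbitrary guess, and `session` is assumed to be an open session from the official Neo4j Python driver. It would probably only help if WALLETID is indexed on both labels.

```python
from typing import Dict, Iterator, List, Sequence

# Assumed rewrite of the query above: one customer, many merchants per call.
# Parameters ($customer_id, $merchant_ids) replace the '%s' string formatting.
BATCH_QUERY = """
MATCH (p1:CUSTOMER {WALLETID: $customer_id})
UNWIND $merchant_ids AS merchant_id
MATCH (p2:MERCHANT {WALLETID: merchant_id})
RETURN merchant_id,
       gds.alpha.linkprediction.preferentialAttachment(
           p1, p2, {relationshipQuery: "PAYMENT"}) AS score
"""


def chunked(items: Sequence[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of `items` with at most `size` elements."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


def batch_parameters(customer_id: str, merchant_ids: Sequence[str],
                     batch_size: int = 5000) -> Iterator[dict]:
    """Build one parameter map per batch of merchant IDs."""
    for batch in chunked(merchant_ids, batch_size):
        yield {"customer_id": customer_id, "merchant_ids": batch}


def score_customer(session, customer_id: str, merchant_ids: Sequence[str],
                   batch_size: int = 5000) -> Dict[str, float]:
    """Run the batched query and collect merchant_id -> score."""
    scores: Dict[str, float] = {}
    for params in batch_parameters(customer_id, merchant_ids, batch_size):
        for record in session.run(BATCH_QUERY, params):
            scores[record["merchant_id"]] = record["score"]
    return scores
```

With 50k merchants and a batch size of 5,000 this would be 10 queries per customer instead of 50,000. Whether something like apoc.periodic.iterate would do better for this kind of read-only scoring is what I am unsure about.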