Count how many times a node matches

lx2pwnd
Node Clone

I have two groups of nodes in Neo4j. The Mail nodes have this form:

{
    time : year/month/day,
    content : Hey Anna Sorry I took awhile getting ...
}

The Word nodes have this form:

{
  word : ***
}

I want to count how many times the Word nodes match the content of the mail nodes. How can I do this? There are no relationships between the nodes. For example, if I have

{
  word : cat
}


{
    time : year/month/day,
    content : the cat is on the table
 },
 {
    time : year/month/day,
    content : the cat is on the floor
  },
 {
    time : year/month/day,
    content : the cat is somewhere over the rainbow
 }

a query like this should return 3:

MATCH (n:Content) 
WHERE 
  n.time>'2001/01/01' and n.time<'2001/02/01'
MATCH (w:Word) 
WHERE 
  ANY(word IN split(n.content,' ') WHERE word = w.word)
return w.word, #number of count#
5 REPLIES

accounts
Node Clone

A. Do you not want to add a relationship between Word and Mail? For example:

(Word)-[:CONTAINS{count}]-(Mail)

Then you'd populate these values on insert and never have to calculate them again.

B. If you say no to A, are you looking for how many times a word occurs in each mail, or how many mails contain that word?
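
As a rough illustration of option A (not something posted in the thread), the relationship could be written at the moment a mail is inserted. This is only a sketch under several assumptions: `$time` and `$content` are hypothetical query parameters, the relationship is directed from Mail to Word (the pattern above leaves the direction open), and words are matched as whole space-separated tokens.

```cypher
// Hypothetical parameters: $time and $content describe the new mail.
CREATE (m:Mail {time: $time, content: $content})
WITH m, split(m.content, ' ') AS tokens
MATCH (w:Word)
// Count whole-token matches of this word in the new mail.
WITH m, w, size([t IN tokens WHERE t = w.word]) AS occurrences
WHERE occurrences > 0
// Direction Mail -> Word is an assumption made for this sketch.
MERGE (m)-[c:CONTAINS]->(w)
SET c.count = occurrences
```

Because the split is on single spaces only, a token such as `message,` (with trailing punctuation) would not match the word `message`; real mail text would likely need some normalization first.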

Thanks for your answer. I suppose that A) is the best solution, but I don't understand what you mean by B).

I think what @accounts means is: do you want to compute the word count per mail, or the number of mails in which the specified word was found? The query would need to be adjusted depending on the answer.

From your original question, I believe you want to compute the number of occurrences of a Word per Mail. Is that right?
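
To make the two readings concrete, here is a sketch that computes both directly, without any relationship between the nodes. It assumes the mail nodes carry the label `Mail` (the query in the question uses `Content`), that `time` is stored as a `year/month/day` string so that string comparison orders dates correctly, and that words are separated by single spaces. The column names are made up for this example.

```cypher
MATCH (m:Mail)
WHERE m.time > '2001/01/01' AND m.time < '2001/02/01'
MATCH (w:Word)
// Occurrences of this word in this mail, matched as whole tokens.
WITH w, m, size([t IN split(m.content, ' ') WHERE t = w.word]) AS occurrences
WHERE occurrences > 0
RETURN w.word AS word,
       count(m) AS mailsContainingWord,    // reading 1: how many mails contain the word
       sum(occurrences) AS totalOccurrences // reading 2: how often the word occurs overall
```

For the three `cat` mails in the question, and assuming all of them fall inside the date window, both columns should come out as 3 for `cat`, since each mail contains the word exactly once.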

I made a quick example, if that helps:

create (:Mail{content:'this is a word count mail message'})
create (:Mail{content:'this is not a word count mail message'})
create (:Mail{content:'this is funny message'})
create (:Mail{content:'this is not so funny message'})
create (:Mail{content:'the message was lost in translation'})
create (:Mail{content:'this is a secret message'})
create (:Mail{content:'this is a not so secret message'})
create (:Mail{content:'this is a not so secret message, but this is a secret message'})
create (:Mail{content:'this is not so funny message, but this is a funny message'})

create (:Word{word:'this'})
create (:Word{word:'is'})
create (:Word{word:'word'})
create (:Word{word:'count'})
create (:Word{word:'mail'})
create (:Word{word:'message'})
create (:Word{word:'not'})
create (:Word{word:'funny'})
create (:Word{word:'interesting'})
create (:Word{word:'boring'})

match (m:Mail)
match (w:Word)
where m.content contains w.word
merge (w)-[o:OCCURS_IN]->(m)
set o.count = size(apoc.text.indexesOf(m.content, w.word,0, -1))

match(fun:Word{word:'funny'})-[o:OCCURS_IN]->(m:Mail)
return fun, o, m
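
Once `OCCURS_IN` relationships with a `count` property exist, both kinds of count discussed earlier could be read straight off them. This is a sketch, not part of the original reply; the column names are illustrative.

```cypher
MATCH (w:Word)-[o:OCCURS_IN]->(m:Mail)
RETURN w.word AS word,
       count(m) AS mailsContainingWord,  // one relationship per (word, mail) pair
       sum(o.count) AS totalOccurrences  // stored per-mail counts added up
ORDER BY word
```

One thing to keep in mind: `contains` and `apoc.text.indexesOf` both match substrings, so the word `is` is also found inside `this`. If only whole words should count, the relationships would have to be built from a tokenized comparison instead.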

ameyasoft
Graph Maven
Try this:

MATCH (w:Word)
with collect(distinct w.word) as w1

MATCH (a:Mail)
with split(a.content, " ") as s1, w1, a
with a, s1, w1, apoc.coll.removeAll(w1, s1) as rmvd

return a.content,  size(s1) as wordCount,  (size(w1) - size(rmvd)) as matchedWords

Result: