12-16-2021 01:13 AM
I have a huge amount of text data in my database, so calculating the bigrams and trigrams of those texts on the fly is very expensive, as you can imagine. I am looking for a way to store all the n-grams on the nodes and then group them in the database to get the most frequent n-grams (for example, the top 5). Here is an example.
For a sentence like the following:
"you have no idea how much i love neo4j graph database"
I want to import the n-grams with this query:
match (p)
where p.Id = 'da786ecb-7965-4ab6-84fd-3811da9b31a0'
set p.Ngrams = [['you', 'have'], ['you', 'have', 'no'], ['have', 'no'], ['have', 'no', 'idea'], ['no', 'idea'], ['no', 'idea', 'how'], ['idea', 'how'], ['idea', 'how', 'much'], ['how', 'much'], ['how', 'much', 'i'], ['much', 'i'], ['much', 'i', 'love'], ['i', 'love'], ['i', 'love', 'neo4j'], ['love', 'neo4j'], ['love', 'neo4j', 'graph'], ['neo4j', 'graph'], ['neo4j', 'graph', 'database'], ['graph', 'database']]
The query gives me the following error:
neo4j.exceptions.CypherTypeError: {code: Neo.ClientError.Statement.TypeError} {message: Property values can only be of primitive types or arrays thereof}
Any idea how to solve a situation like this? Thanks.
12-16-2021 01:57 AM
Hello @taylangezici1
You can store a list of strings, but you cannot store a list of lists of strings. In your case, you should create a new node type, NGram, store one n-gram per node, and then link these new nodes to the node that holds the sentence.
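A minimal sketch of that approach, assuming each n-gram is joined into a single space-separated string so it fits in one property. The `NGram` label comes from the suggestion above; the `HAS_NGRAM` relationship type, the `text` property, and the helper names are hypothetical choices made only for illustration. The code builds the n-grams and the query parameters but does not open a database connection.

```python
from typing import Dict, List, Sequence

# Hypothetical schema: (p)-[:HAS_NGRAM]->(:NGram {text: "..."}).
# Only the NGram label comes from the thread; the rest is assumed.
LINK_QUERY = """
MATCH (p {Id: $id})
UNWIND $ngrams AS ng
MERGE (n:NGram {text: ng})
MERGE (p)-[:HAS_NGRAM]->(n)
"""

# Counts how many sentence nodes are linked to each n-gram, top 5 first.
TOP_QUERY = """
MATCH (n:NGram)<-[:HAS_NGRAM]-(p)
RETURN n.text AS ngram, count(p) AS freq
ORDER BY freq DESC
LIMIT 5
"""


def ngrams(text: str, sizes: Sequence[int] = (2, 3)) -> List[str]:
    """Return the n-grams of `text` as space-joined strings."""
    tokens = text.split()
    result = []
    for start in range(len(tokens)):
        for size in sizes:
            # Skip windows that would run past the end of the sentence.
            if start + size <= len(tokens):
                result.append(" ".join(tokens[start:start + size]))
    return result


def link_params(node_id: str, text: str) -> Dict[str, object]:
    """Build the parameter map expected by LINK_QUERY."""
    return {"id": node_id, "ngrams": ngrams(text)}
```

Because `MERGE` is used for the relationship, an n-gram that occurs twice in the same sentence is linked only once, so `TOP_QUERY` counts sentences containing an n-gram rather than total occurrences. The same joined strings could also be stored directly as a list-of-strings property, which is allowed, if separate nodes are not wanted.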
Regards,
Cobra
12-17-2021 08:01 AM
Cobra's idea is probably the best one from a graph performance perspective.
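However, it is possible to represent a list of lists within a query: use a map.
The main syntactic difference is that a map is wrapped in curly braces instead of square brackets. Note that a map cannot be stored directly as a property value either, so it would need to be serialized (for example, to a JSON string) first, and there are other performance implications as well.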
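If the nested structure has to live in a single property, one workaround consistent with the error message above is to serialize it to a string on the client side. The sketch below assumes JSON as the encoding; the function names are hypothetical, and the trade-off is that the database can no longer group or filter on individual n-grams without decoding the string.

```python
import json
from typing import List


def encode_ngrams(ngram_lists: List[List[str]]) -> str:
    """Serialize a list of token lists into one string property value."""
    return json.dumps(ngram_lists)


def decode_ngrams(value: str) -> List[List[str]]:
    """Restore the nested list from the stored string."""
    return json.loads(value)


# The encoded string can be passed as a parameter, e.g. SET p.Ngrams = $ngrams
params = {"ngrams": encode_ngrams([["you", "have"], ["you", "have", "no"]])}
```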