

Saving an actual search and its results within the graph: technique

PeteM
Node Clone

Hi, so we are developing a new web application and utilising Neo4j to build reports on data.

Here is a use case scenario:

  1. The user enters some search parameters into a web form (selecting gender, age, etc.)
  2. A query is built and run against the database
  3. The results are presented to the user
  4. The user now wants to save this search and its results
    ..
  5. Later on, the user wants to run a further search against those saved results

I can think of a couple of different ways of achieving point 4, but I wondered if the more experienced here might be able to offer better solutions or whether there is an efficient way of doing it?

One crude technique:

  1. Save the search query as a node and relate it to the user node in the graph

(u:User)-[:MADE_SEARCH]->(s:Search)

  2. Create a relationship between that search node and the result nodes it found

(s:Search)-[:RESULT]->(c:Campaign)
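
A minimal sketch of those two steps, written as parameterised Cypher held in Python strings. The property names `userId`, `searchId`, `campaignId`, `params` and `createdAt`, the helpers `chunked` and `link_batches`, and the batch size are all assumptions made for illustration; none of them come from the post. Linking results in batches is one way to keep each transaction small when a search returns millions of nodes.

```python
from itertools import islice
from typing import Any, Dict, Iterable, Iterator, List, Tuple

# Step 1: create the Search node and attach it to the user.
# Property names (userId, searchId, params, createdAt) are hypothetical.
SAVE_SEARCH = """
MATCH (u:User {userId: $userId})
CREATE (u)-[:MADE_SEARCH]->(s:Search {
  searchId: $searchId,
  params: $params,
  createdAt: datetime()
})
"""

# Step 2: link one batch of result nodes to the saved search.
# campaignId is a hypothetical identifying property on Campaign.
LINK_RESULTS = """
MATCH (s:Search {searchId: $searchId})
UNWIND $campaignIds AS campaignId
MATCH (c:Campaign {campaignId: campaignId})
MERGE (s)-[:RESULT]->(c)
"""


def chunked(ids: Iterable[int], size: int) -> Iterator[List[int]]:
    """Yield successive lists of at most `size` ids."""
    if size < 1:
        raise ValueError("size must be positive")
    it = iter(ids)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def link_batches(
    search_id: str, campaign_ids: Iterable[int], size: int = 10_000
) -> Iterator[Tuple[str, Dict[str, Any]]]:
    """Yield one (query, parameters) pair per batch of result ids."""
    for batch in chunked(campaign_ids, size):
        yield LINK_RESULTS, {"searchId": search_id, "campaignIds": batch}
```

Each yielded pair could then be run in its own transaction, for example with the official Python driver's `session.run(query, parameters)`.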

However, the database is reasonably big (85m nodes, 600m+ relationships), and there will be many different searches run per day, each returning anywhere from a couple of hundred results to potentially a few million.

This would create a lot of relationships from each search node to its results. If we then wanted to run a search against those results, would having that many relationships be efficient?

Am I just worrying over nothing, and is this an OK solution?

I am worried that, say, after one year of running this web app, there will be thousands of saved searches, with millions of those search-campaign relationships.
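
A back-of-the-envelope estimate can put a rough number on that worry. The inputs below (searches saved per day, average results per search, retention period) are made-up placeholders, not measurements from this application, and the function name is hypothetical.

```python
def estimate_result_relationships(
    saved_per_day: int, avg_results: int, retention_days: int
) -> int:
    """Steady-state count of :RESULT relationships if every saved search
    is kept for `retention_days` and then deleted."""
    return saved_per_day * avg_results * retention_days


# Hypothetical inputs: 10 saved searches a day, 50,000 results each.
one_year = estimate_result_relationships(10, 50_000, 365)
ninety_days = estimate_result_relationships(10, 50_000, 90)
print(f"{one_year:,} vs {ninety_days:,}")  # 182,500,000 vs 45,000,000
```

With these placeholder inputs, a year of saved searches would add about 182.5 million relationships on top of the 600m+ already in the graph, so the retention period is the lever that matters most.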

Thanks

2 REPLIES

clem
Graph Steward

Are there any properties associated with the Nodes or Relationships that can be indexed and limit the scope of the query? E.g. Date/Time, Info about the User, the Campaign, etc.

If so, it might not be so bad.

Yes, I hadn't considered that, but some attributes would be indexed.

First, the User node will have a user ID.
The Search node will have a unique search ID as well. So in practice, when searching against those search results, we would match by search ID to find the records related to it.
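
As a sketch of what matching by search ID could look like, the helper below builds a parameterised Cypher query that starts from one saved search and filters only its results. The property name `searchId`, the constraint name, the helper `build_refine_query`, and the idea that `gender` and `age` are properties on the result nodes are all assumptions for illustration.

```python
from typing import Any, Dict, Tuple

# Uniqueness constraint on the search ID (Neo4j 4.4+ syntax). It is backed
# by an index, so the lookup by search ID does not scan every Search node.
SEARCH_ID_CONSTRAINT = """
CREATE CONSTRAINT search_id_unique IF NOT EXISTS
FOR (s:Search) REQUIRE s.searchId IS UNIQUE
"""

# Hypothetical whitelist of result properties a user may filter on.
ALLOWED_FILTERS = {"gender", "age"}


def build_refine_query(
    search_id: str, filters: Dict[str, Any]
) -> Tuple[str, Dict[str, Any]]:
    """Return (query, parameters) that search within one saved search."""
    unknown = set(filters) - ALLOWED_FILTERS
    if unknown:
        raise ValueError(f"unsupported filters: {sorted(unknown)}")
    lines = ["MATCH (s:Search {searchId: $searchId})-[:RESULT]->(c:Campaign)"]
    if filters:
        # Property names are interpolated into the query text, which is
        # why they are checked against the whitelist; values stay parameters.
        conditions = " AND ".join(
            f"c.{name} = ${name}" for name in sorted(filters)
        )
        lines.append(f"WHERE {conditions}")
    lines.append("RETURN c")
    return "\n".join(lines), {"searchId": search_id, **filters}
```

Because the traversal starts at a single indexed Search node, the work done is proportional to that search's own result count rather than to the size of the whole graph.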

I guess my concern is having hundreds of thousands or potentially several million nodes all pointing to one single node. There would then be multiple instances of this pattern within the graph.

Most of these will also become redundant as the user loses interest in those search results. In that case, we could perhaps notify the user that a stored search will be deleted after 3 months of inactivity and offer them the choice to keep it.
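
One way such an expiry policy could be sketched is below. The properties `lastUsedAt` (assumed to be stored as a datetime) and `keep`, the query constants, and the helper `stale_search_ids` are hypothetical; the application would have to maintain those properties itself. The Python helper mirrors the selection rule in memory and approximates "3 months" as 90 days.

```python
from datetime import datetime, timedelta, timezone
from typing import Iterable, List, Tuple

# Database-side selection of stale searches (hypothetical properties).
FIND_STALE = """
MATCH (s:Search)
WHERE s.lastUsedAt < datetime() - duration({months: 3})
  AND coalesce(s.keep, false) = false
RETURN s.searchId AS searchId
"""

# Remove result links in batches so a search with millions of results is
# not deleted in one large transaction; repeat until `deleted` is 0.
DELETE_RESULTS_BATCH = """
MATCH (:Search {searchId: $searchId})-[r:RESULT]->()
WITH r LIMIT $batchSize
DELETE r
RETURN count(*) AS deleted
"""

# Finally remove the Search node and its remaining relationships.
DELETE_SEARCH = "MATCH (s:Search {searchId: $searchId}) DETACH DELETE s"

INACTIVITY_LIMIT = timedelta(days=90)  # approximates "3 months"


def stale_search_ids(
    searches: Iterable[Tuple[str, datetime, bool]], now: datetime
) -> List[str]:
    """Return ids of searches unused for longer than the limit, skipping
    those the user chose to keep. Each item is (id, last_used_at, keep)."""
    return [
        search_id
        for search_id, last_used_at, keep in searches
        if not keep and now - last_used_at > INACTIVITY_LIMIT
    ]
```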