03-31-2022 05:39 AM
I am very new to graph databases and to Neo4j. I am trying to understand the limitations of the tools and would like to see whether I can solve the following problem with the database alone (without using the application layer at all).
I have the following graph:
Also defined by this graphql schema:
type Process {
id: ID! @id
name: String! @unique
assets: [Asset!]! @relationship(type: "PROCESS_ASSET", direction: OUT)
score: Int! @computed(from: ["id"])
}
type Asset {
id: ID! @id
name: String! @unique
type: AssetType! @relationship(type: "TYPE_OF", direction: OUT)
childAssets: [Asset!]! @relationship(type: "CHILD_ASSET", direction: OUT)
attributes: [AttributeValue!]!
@relationship(type: "ATTRIBUTE_VALUE_OF", direction: OUT)
score: Int @computed(from: ["id"])
}
type AssetType {
id: ID! @id
name: String! @unique
attributes: [AttributeWeight!]!
@relationship(type: "ATTRIBUTE_WEIGHT_OF", direction: OUT)
}
type Attribute {
id: ID! @id
name: String! @unique
}
type AttributeValue {
id: ID! @id
attribute: Attribute!
@relationship(type: "VALUE_ATTRIBUTE_OF", direction: OUT)
value: Int!
}
type AttributeWeight {
id: ID! @id
attribute: Attribute!
@relationship(type: "WEIGHT_ATTRIBUTE_OF", direction: OUT)
weight: Int!
}
My goal is to calculate the score properties of the Asset nodes and the Process nodes. The score should be the average of the attributes related to each asset and its children, weighted by the type. If the attribute isn't associated with the type, the attribute should be weighted at 0. In the case of a circular reference, each child should only appear once in the calculation.
My direct questions are:
My progress so far:
Calculating the average value of attributes without consideration for weight (from type).
For Process -->
MATCH path = (p:Process {name: "process 1"})-[:PROCESS_ASSET*]->(a:Asset)-[:CHILD_ASSET*]->(b:Asset)-[:ATTRIBUTE_VALUE_OF]->(v)
RETURN avg(v.value)
For Asset --> (I would expect this can be done in a more elegant way) // doesn't work
MATCH (a:Asset {name: "asset 1"})-[:CHILD_ASSET*]->(b:Asset)-[:ATTRIBUTE_VALUE_OF]->(v1)
MATCH (a:Asset {name: "asset 1"})-[:ATTRIBUTE_VALUE_OF]->(v2)
RETURN (avg(v1.value) + avg(v2.value)) / 2
Updated -->
MATCH (:Asset {name: "asset 1"})-[:CHILD_ASSET*]->(c)
WITH collect(c) AS uc
MATCH (p:Asset) WHERE p.name = "asset 1"
WITH collect(p) AS up, uc
UNWIND (up + uc) AS v
WITH v
MATCH (v)-[:ATTRIBUTE_VALUE_OF]-(r)
RETURN avg(r.value)
03-31-2022 08:27 AM
First, welcome aboard.
Is there a reason you have broken out the attribute and weight values as entities? Could we simplify the schema by incorporating these as relationship properties? Maybe something like the following:
Does this accurately represent your domain model? We can help once we understand the data model.
03-31-2022 08:38 AM
Hi Gary, Thanks for the warm welcome and the fast and helpful response.
To answer your question I have no problem simplifying the model into this format. What you have shown looks good to me, I will work on updating things on my end and keep working (and updating the OP as I go).
03-31-2022 08:38 AM
What result do you expect from the second query? I think the query will include attribute '1' multiple times. I think the first match will include it twice and the second match once, for the graph presented.
03-31-2022 08:39 AM
I have updated the query with something that actually works. NB: as mentioned, this does not take the weights into consideration. I will have to address that once I understand things better.
MATCH (:Asset {name: "asset 1"})-[:CHILD_ASSET*]->(c)
WITH collect(c) AS uc
MATCH (p:Asset) WHERE p.name = "asset 1"
WITH collect(p) AS up, uc
UNWIND (up + uc) AS v
WITH v
MATCH (v)-[:ATTRIBUTE_VALUE_OF]-(r)
RETURN avg(r.value)
03-31-2022 09:00 AM
I believe your query will result in the average equalling avg(r1.value, r1.value, r2.value). Is this what you want? If so, I think you can get the same result with the following query:
MATCH (a:Asset {name: "asset 1"})-[:CHILD_ASSET*0..]->(:Asset)-[:ATTRIBUTE_VALUE_OF]->(v)
RETURN avg(v.value)
The key difference is that it uses a variable-length path pattern with a lower bound of 0. As such, it can include attribute 1, which is attached directly to asset 1, as well as the other two paths through the "CHILD_ASSET" relationship. A bare "*" means one or more hops, so it never matches the zero-length path.
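As a rough illustration of the difference, the plain-Python sketch below lists the end node of every path from a start node with a minimum hop count of 1 (like "*") or 0 (like "*0.."). The diamond-shaped toy graph and the `path_ends` helper are hypothetical, and the model assumes the graph has no cycles. Because a Cypher MATCH returns one row per matching path, a node reachable by two routes shows up twice.

```python
# Hypothetical acyclic toy graph: asset 4 is reachable via asset 2 and asset 3.
children = {
    "asset 1": ["asset 2", "asset 3"],
    "asset 2": ["asset 4"],
    "asset 3": ["asset 4"],
}


def path_ends(root, min_hops):
    """End node of every path from root with at least min_hops hops.

    One entry per path, not per distinct node. Assumes no cycles.
    """
    ends = []
    frontier = [(root, 0)]
    while frontier:
        node, hops = frontier.pop(0)
        if hops >= min_hops:
            ends.append(node)
        for child in children.get(node, []):
            frontier.append((child, hops + 1))
    return ends


print(path_ends("asset 1", 1))  # like "*": the start node itself is missing
print(path_ends("asset 1", 0))  # like "*0..": the start node is included
```

With a minimum of 0 the start node is included once, and "asset 4" appears twice in both results, once for each path that reaches it. If each asset should count only once, the matched nodes would need to be de-duplicated before averaging.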
03-31-2022 09:12 AM
I have to make a correction. The intermediate node can't have a type.
MATCH (a:Asset {name: "asset 1"})-[:CHILD_ASSET*0..]->()-[:ATTRIBUTE_VALUE_OF]->(v)
RETURN avg(v.value)
03-31-2022 11:21 AM
To give you a little guidance on a way to incorporate the weight, the following is an example assuming the data model in the diagram and the above query are correct. It finds all the paths to attributes that are related to types. Attributes without a type will not be included, which is effectively the same as assuming a weight of zero in the average calculation. The 'where' clause ensures only those paths that also tie back to the original asset are included. It then averages over the product of the attribute's value and the type's weight. Maybe it can also help with ideas if you end up with a different data model.
MATCH (a:Asset {name: "asset 1"})-[:HAS_ASSET*0..]->()-[r1:HAS_ATTRIBUTE]->(:Attribute)<-[r2:HAS_ATTRIBUTE]-(t:Type)
where exists((a)-[:HAS_TYPE]->(t))
RETURN avg(r1.value * r2.weight)
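One detail worth checking against your requirement: a pattern match drops attributes the type does not have, while counting them with a weight of 0 keeps them in the denominator, and the two give different averages. The plain-Python sketch below compares both readings. The data and the names `asset_values`, `type_weights` and `mean` are hypothetical.

```python
# Hypothetical rows: (attribute name, r1.value) for each matched asset path.
asset_values = [("cost", 4), ("cost", 2), ("risk", 10)]
# Hypothetical r2.weight per attribute for the root asset's type.
type_weights = {"cost": 3}


def mean(xs):
    return sum(xs) / len(xs)


# Pattern-match behaviour: a row exists only if the type also has the attribute.
matched = [v * type_weights[a] for a, v in asset_values if a in type_weights]

# Alternative reading: an attribute missing from the type is weighted at 0.
zero_filled = [v * type_weights.get(a, 0) for a, v in asset_values]

print(mean(matched))      # (12 + 6) / 2 = 9.0
print(mean(zero_filled))  # (12 + 6 + 0) / 3 = 6.0
```

If the zero-weight reading is the one you need, the query would have to keep the unmatched attributes (for example with an optional match and a default of 0) instead of filtering them out.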
03-31-2022 12:22 PM
Wow! Thanks a tonne, this has me very excited for Neo4j.
Final GraphQL Schema -->
type Process {
id: ID! @id
name: String! @unique
assets: [Asset!]! @relationship(type: "HAS_ASSET", direction: OUT)
score: Float!
@cypher(
statement: "MATCH (p:Process {id: this.id})-[:HAS_ASSET]->(a:Asset)-[:HAS_ASSET*0..]->()-[r1:HAS_ATTRIBUTE]->(:Attribute)<-[r2:HAS_ATTRIBUTE]-(t:AssetType) WHERE EXISTS((a)-[:HAS_TYPE]->(t)) RETURN avg(r1.value * r2.value)"
)
}
type Asset {
id: ID! @id
name: String! @unique
type: AssetType! @relationship(type: "HAS_TYPE", direction: OUT)
childAssets: [Asset!]! @relationship(type: "HAS_ASSET", direction: OUT)
attributes: [Attribute!]!
@relationship(
type: "HAS_ATTRIBUTE"
direction: OUT
properties: "HasAttribute"
)
score: Float!
@cypher(
statement: "MATCH (a:Asset {id: this.id})-[:HAS_ASSET*0..]->()-[r1:HAS_ATTRIBUTE]->(:Attribute)<-[r2:HAS_ATTRIBUTE]-(t:AssetType) WHERE EXISTS((a)-[:HAS_TYPE]->(t)) RETURN avg(r1.value * r2.value)"
)
}
type AssetType {
id: ID! @id
name: String! @unique
attributes: [Attribute!]!
@relationship(
type: "HAS_ATTRIBUTE"
direction: OUT
properties: "HasAttribute"
)
}
type Attribute {
id: ID! @id
name: String! @unique
}
interface HasAttribute @relationshipProperties {
value: Int!
}
Graph -->
03-31-2022 03:50 PM
Terrific, and glad to help. Your excitement is warranted. Cypher is so powerful and rocks over SQL.