08-04-2022 11:13 AM
I'm constructing a biomedical knowledge graph from data I collected from different open sources. All values in each node are unique, and there are no duplicate rows in the relationships (verified thoroughly).
These are my nodes: assays, cells, clinicals, compounds, disorders, drugs, foods, genes, metabolites, organisms, pathways, peptides, proteins, targets, therapeutics.
These are my relationships: cell_FROM_species, clinical_IS_ASSOCIATED_disorder, clinical_IS_ASSOCIATED_drug, compound_IS_ASSOCIATED_protein, drug_CAUSES_disorder, drug_INTERACTS_target, food_IS_ASSOCIATED_compound, metabolite_IS_ASSOCIATED_pathway, peptide_TESTED_IN_assay, peptide_BINDS_TO_protein, peptide_IS_ASSOCIATED_therapeutics, protein_IS_ASSOCIATED_disorder, protein_IS_ASSOCIATED_gene, protein_COMES_FROM_organism, protein_IS_EXPRESSED_IN_pathway.
I used neo4j-admin to import the data with the command below (since it's a long one, I've only included a sample):
C:/Users/mypc/.Neo4jDesktop/relate-data/dbmss/bin/neo4j-admin import --database=db1 --nodes=import/assays.csv --nodes=import/cells.csv --nodes=import/clinicals.csv --………………………………………. --relationships=import/ cell_FROM_species.csv --relationships=import/ clinical_IS_ASSOCIATED_disorder.csv …………………………………………………………………………--multiline-fields=true
I ended up with this schema, in which I can see that some new relationships have been created between nodes, for example:
Can someone correct me where I'm going wrong? Thanks in advance.
#neo4j-admin #relationships
08-04-2022 06:21 PM - edited 08-04-2022 08:23 PM
Did you use 'db.schema.visualization' to get the schema? I recall helping someone months ago whose schema did not accurately represent their data. I believe someone else in the community mentioned there is a known issue with this method. The relationships did not actually exist in his data. I suggest you query your data to verify whether those relationships really exist. Something like this for each relationship you don't expect:
return exists( (:Peptide)-[:IS_ASSOCIATED]->(:Compound) )
The query below should give you an inventory of your relationships. It assumes your data model has only one label per node.
match(n)-[r]->(m)
return labels(n)[0] as `start node`, type(r) as `relationship type`, labels(m)[0] as `end node`, count(*) as count
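If you want to compare that inventory against what you intended to import, here is a minimal Python sketch. It is only an illustration under some assumptions: that your relationship names follow the `<start>_<TYPE>_<end>` pattern from your list, that node labels are the capitalized start and end names (your actual labels may differ, so adjust accordingly), and that you have copied the query result into a list of rows by hand. The function names and the sample rows are hypothetical, not part of any Neo4j API.

```python
def parse_relationship_name(name):
    """Split e.g. 'drug_INTERACTS_target' into (start label, type, end label).

    Assumption: upper-case tokens form the relationship type, and the
    first and last tokens name the start and end nodes.
    """
    tokens = name.split("_")
    rel_type = "_".join(t for t in tokens if t.isupper())
    # Assumption: labels are the capitalized node names, e.g. 'Peptide'.
    return (tokens[0].capitalize(), rel_type, tokens[-1].capitalize())


def unexpected_triples(expected_names, inventory_rows):
    """Return inventory triples that match none of the expected names.

    Each inventory row is (start node, relationship type, end node, count),
    i.e. the four columns returned by the inventory query.
    """
    expected = {parse_relationship_name(n) for n in expected_names}
    return sorted(
        tuple(row[:3]) for row in inventory_rows if tuple(row[:3]) not in expected
    )


# Hypothetical sample data, for illustration only.
expected_names = ["peptide_BINDS_TO_protein", "drug_INTERACTS_target"]
inventory_rows = [
    ("Peptide", "BINDS_TO", "Protein", 120),
    ("Peptide", "IS_ASSOCIATED", "Compound", 3),
]
extra = unexpected_triples(expected_names, inventory_rows)
print(extra)
```

An empty result means that every relationship type actually stored in the database matches one you intended to import, which would suggest the extra edges appear only in the visualization.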
08-07-2022 02:07 AM
Thanks for your response. Yes, I used 'db.schema.visualization' to get this schema. I ran your Cypher query and found that those relationships do not actually exist, but I don't know why the schema was showing them. I also found that this issue was fixed with the APOC version.