07-27-2022 08:17 AM
Hi Neo4j community,
Looking for some general advice on the most efficient way of converting/importing an ontology file (Thesaurus_22.06d.OWL (https://evs.nci.nih.gov/ftp1/NCI_Thesaurus/Thesaurus_22.06d.OWL.zip)) into a neo4j knowledge graph.
I have explored the Neosemantics (n10s) plugin, and although I have had some success, I'm not convinced it's importing the data in its entirety. I used the following commands:
CREATE CONSTRAINT n10s_unique_uri ON (r:Resource)
ASSERT r.uri IS UNIQUE;
CALL n10s.graphconfig.init({ handleMultival: "ARRAY" });
CALL n10s.onto.preview.fetch("file:///home/xxxxx/.config/Neo4j Desktop/Application/relate-data/dbmss/dbms-785d18d6-677e-4238-b22b-a94227bc4930/import/Thesaurus.owl","RDF/XML");
The result is below. The gap between triplesLoaded and triplesParsed is somewhat disconcerting, and as far as I can tell not all relationships are showing up.
I'm not sure whether it's related to the initial graph config; I have tried several variations to no avail.
╒═══════════════════╤═══════════════╤═══════════════╤════════════╤═══════════╤════════════╕
│"terminationStatus"│"triplesLoaded"│"triplesParsed"│"namespaces"│"extraInfo"│"callParams"│
╞═══════════════════╪═══════════════╪═══════════════╪════════════╪═══════════╪════════════╡
│"OK"               │921897         │8734522        │null        │""         │{}          │
└───────────────────┴───────────────┴───────────────┴────────────┴───────────┴────────────┘
Any advice would be greatly appreciated.
Using Neo4j Desktop 1.4.15
Thanks
Chris
07-28-2022 01:40 PM - edited 07-28-2022 02:08 PM
Hey there @ceiag - maybe I can be of some help,
Right off the bat - you are correct: the ontology import is not bringing in the whole file, and it does not preserve all triples.
That said:
There are a few things you can do here. The n10s documentation on importing ontologies notes that only the following six kinds of statements are picked up by the ontology import:
- Named class (category) declarations with both rdfs:Class and owl:Class.
- Explicit class hierarchies defined with rdf:subClassOf statements.
- Property definitions with owl:ObjectProperty, owl:DatatypeProperty and rdfs:Property.
- Explicit property hierarchies defined with rdfs:subPropertyOf statements.
- Domain and range information for properties described as rdfs:domain and rdfs:range statements.
- Restrictions defined with owl:Restriction.
(I believe rdf:subClassOf is a typo in the docs and should read rdfs:subClassOf.)
A Solution (possibly helpful alternative):
Rather than using:
n10s.onto.import.fetch(url :: STRING?, format :: STRING?)
Try using:
n10s.rdf.import.fetch(url :: STRING?, format :: STRING?)
This imports the file as plain RDF rather than as an ontology, so all triples are preserved, not just the statements matching the list above.
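For reference, here is a minimal sketch of the full sequence with the generic RDF import swapped in. The constraint and graph config are copied from the original post; the file URL is a placeholder (an assumption) and needs to be replaced with the real location of the OWL file. As far as I know, n10s.graphconfig.init refuses to run on a graph that already contains imported data, so if an earlier import left anything behind you may need to start from an empty database.

```cypher
// Uniqueness constraint on Resource.uri, as in the original post.
// This is the Neo4j 4.x syntax; newer versions use FOR ... REQUIRE.
CREATE CONSTRAINT n10s_unique_uri ON (r:Resource)
ASSERT r.uri IS UNIQUE;

// Same graph config as in the original post:
// multi-valued properties are stored as arrays.
CALL n10s.graphconfig.init({ handleMultival: "ARRAY" });

// Generic RDF import instead of the ontology-specific import.
// The file URL is a placeholder, not the real path.
CALL n10s.rdf.import.fetch("file:///path/to/import/Thesaurus.owl", "RDF/XML");
```

Run the statements one at a time, or enable multi-statement mode in Neo4j Browser.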
Hope this helps! Feel free to loop back for more help.
I ran an example import with your configuration, but using n10s.rdf.import.fetch(), and here is the output:
terminationStatus: "OK" | triplesLoaded: 8734522 | triplesParsed: 8734522
{
"owl": "http://www.w3.org/2002/07/owl#",
"rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
"ns0": "http://purl.org/dc/elements/1.1/",
"rdfs": "http://www.w3.org/2000/01/rdf-schema#",
"ns2": "http://ncicb.nci.nih.gov/xml/owl/EVS/Thesaurus.owl#",
"ns1": "http://protege.stanford.edu/plugins/owl/protege#",
"ns3": "http://www.geneontology.org/formats/oboInOwl#"
}
We can now see that triplesLoaded is equal to triplesParsed 🙂
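If you want a quick sanity check on what actually landed in the graph, something along these lines may help. It is only a sketch: it assumes the default n10s behaviour of labelling every imported node as Resource, and the exact relationship type names you see will depend on your graph config.

```cypher
// Total number of imported resources.
MATCH (r:Resource)
RETURN count(r) AS resources;

// Relationship types between resources, most frequent first.
MATCH (:Resource)-[rel]->(:Resource)
RETURN type(rel) AS relationshipType, count(*) AS occurrences
ORDER BY occurrences DESC
LIMIT 25;
```

Comparing the relationship types before and after switching procedures should show which ones the ontology import was leaving out.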
Best Regards,
Rob
07-29-2022 04:22 AM
Hi Rob,
Thanks for the reply, really appreciate it, and yep, that appears to have done the trick!
I'm going to take a closer look at the data. I may have some further questions down the line, but for now thanks so much.
Regards
Chris