Neo4j

rahel · ‎06-03-2021

Hello everyone,
I am using the apoc.load.html() procedure for scraping. I have stored a list of HTML links in the urls property of a (u:Url) node, so the procedure reads the URLs from u.urls.
It works fine for links that return status 200 (available pages). The problem is that when it hits a link/URL that has been removed or deleted, the process terminates with an error like this:

To solve it, I tried manually replacing the dead URL with a working one, but this is not feasible.
So my question is: is there any APOC procedure or Cypher code that allows the process to continue even if some of the links are unavailable (404)?

I am thinking of something that catches the failed link and carries on getting the data for the working links.

Thank you

giuseppe_villan · ‎07-07-2021

@rahel
No. At the moment there is only a failSilently configuration, which lets execution continue without errors, but it applies only when the file itself is read (i.e. to parse errors).
I suggest you open an issue so that this can be worked on.
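
In the meantime, one possible workaround is to check the links outside Neo4j before they ever reach apoc.load.html. The snippet below is only a rough sketch of that idea in Python, not an APOC feature: the names http_status and split_urls are made up for illustration, and it assumes the servers answer HEAD requests (some do not, in which case a GET would be needed instead).

```python
import urllib.error
import urllib.request
from typing import Callable, Iterable, List, Optional, Tuple


def http_status(url: str, timeout: float = 10.0) -> Optional[int]:
    """Return the HTTP status code for url, or None if no response was received."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status such as 404.
        return err.code
    except (urllib.error.URLError, ValueError, OSError):
        # DNS failure, refused connection, timeout, or malformed URL.
        return None


def split_urls(
    urls: Iterable[str],
    status_of: Callable[[str], Optional[int]] = http_status,
) -> Tuple[List[str], List[Tuple[str, Optional[int]]]]:
    """Separate URLs that return 200 from those that do not.

    status_of is a hypothetical hook so the check can be swapped out or faked.
    """
    ok: List[str] = []
    failed: List[Tuple[str, Optional[int]]] = []
    for url in urls:
        status = status_of(url)
        if status == 200:
            ok.append(url)
        else:
            failed.append((url, status))
    return ok, failed
```

The ok list could then be stored back in u.urls (or passed as a query parameter) so that apoc.load.html only sees links that responded with 200, while the failed list records which links were skipped and why.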

Failed to invoke procedure `apoc.load.html` when it link is not found (404)