01-30-2020 05:49 AM
Hi
Given the HTML file below:
<html>
<body>
<h2 id='KPILIST'> Blah Blah 1</h2>
<div>
<div>
<table>
<tbody>
<tr><th>Col 1 Header</th><th>Col 2 Header</th></tr>
<tr><td>Line 1.1 Value</td><td>Line 2.1 Header</td></tr>
<tr><td>Line 2.1 Value</td><td>Line 2.2 Value</td></tr>
</tbody>
</table>
</div>
</div>
<div>
<div>
<table>
<tbody>
<tr><th>Col 1 Header T2</th><th>Col 2 Header T2</th></tr>
<tr><td>Line 1.1 Value T2</td><td>Line 2.1 Header T2</td></tr>
<tr><td>Line 2.1 Value T2</td><td>Line 2.2 Value T2</td></tr>
</tbody>
</table>
</div>
</div>
</body>
</html>
I would expect the code below to return the first block of 3 rows, from the first table above:
// Blah Blah
CALL apoc.load.html("file:///XXX.html",{line: "#KPILIST ~ div:eq(0) div > table tr"}) yield value as lineList
unwind (lineList.line) as L
RETURN L
Unfortunately, nothing is returned.
Even a simple selector such as table:eq(0) doesn't behave as I expected: it returns both tables, not just the first one.
01-30-2020 06:16 AM
It seems that jsoup's :eq(n) does not count the way jQuery's does: here the first div after the h2 is at index 1, not 0. The following worked for me (note also the additional div):
CALL apoc.load.html("file:///var/lib/neo4j/import/dummy.html",
{line: "#KPILIST ~ div:eq(1) > div table tr"}
) yield value as lineList
unwind (lineList.line) as L
RETURN L
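A possible explanation for the index, offered as a sketch rather than a confirmed diagnosis: the jsoup selector documentation describes :eq(n) as matching elements whose sibling index equals n, counted from 0 among the parent's child elements. On that reading the h2 sits at index 0 under body, the two outer divs sit at 1 and 2, and table:eq(0) matches both tables because each is the first child of its own div. The query below reuses the placeholder path from the question; the keys firstTable and secondTable are names made up for this illustration.

```cypher
// Assumption: :eq(n) matches on sibling index (0-based), as the jsoup docs describe.
// body's child elements would then be: h2 (0), first div (1), second div (2).
CALL apoc.load.html("file:///XXX.html", {
  firstTable:  "body > div:eq(1) table tr",   // rows under the first outer div
  secondTable: "body > div:eq(2) table tr"    // rows under the second outer div
}) YIELD value
RETURN size(value.firstTable) AS firstRows, size(value.secondTable) AS secondRows
```

If that reading holds, each count should equal the number of rows in the corresponding table, which is easy to check against the sample file.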
01-30-2020 07:19 AM
Oh!
I should look more closely at jsoup; I was testing by comparing the results with jQuery's.
01-30-2020 08:26 AM
jsoup selector syntax is not quite jQuery-like:
https://jsoup.org/cookbook/extracting-data/selector-syntax
In my case I ended up using this query:
h2#KPILIST + div tr
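For completeness, here is how that selector might be dropped into the call from the question. This is a minimal sketch: the path is still the placeholder from the original post, and the key name line is kept as it was. The + combinator selects the element immediately following the h2, so no index is needed at all.

```cypher
// Same call as in the question, with the adjacent-sibling selector swapped in.
// "file:///XXX.html" is the placeholder path from the original post.
CALL apoc.load.html("file:///XXX.html", {line: "h2#KPILIST + div tr"}) YIELD value AS lineList
UNWIND lineList.line AS L
RETURN L
```

Avoiding :eq() here sidesteps the indexing difference between jsoup and jQuery altogether.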