Head's Up! These forums are read-only. All users and content have migrated. Please join us at community.neo4j.com.
06-27-2021 11:21 AM
Coming from a traditional RDBMS and having a DBA background, I'm familiar with database concepts, but feel as though i'm missing something big. I'm attempting to import firewall log data. Nothing fancy, just (ip A) - [:SENT {srcport, dstport, protocol}] -> (ip B)
I have an import model that works using the API, but it is SLOOOOOOOOOW. It takes 30 minutes for a 5MB file of approximately 56k rows.
In a standard relational database, this could be held in active page pool memory on a Rev1 Raspberry Pi board, so why is it so slow in NEO4J?
I would be happy to post how i'm interfacing with data and answer any questions.
06-27-2021 05:12 PM
Hi and welcome!
Could you share the code you are using to import the data? That will be a huge help in understanding what the issue(s) could be.
07-05-2021 10:31 AM
Sure thing! Sorry it took me a while to reply... some life events occurred. The below is a script i'm running in Powershell to interact with the API.
I've also attached all the relevant files. Please note, I tried loading all the "ports" and created an index because I thought maybe it was getting hung up on trying to create in memory relationships that it didn't need. It takes just as long whether or not you do that step...
please download and rename ports.txt to ports.json
download the UUID file names and rename to .csv extension
# LOAD PORTS #
$retval = get-content "C:\users\mban\documents\ports.json"
$url = "http://127.0.0.1:7474"
$credPair = "neo4j:test"
$encodedCredentials = [System.Convert]::ToBase64String([System.Text.Encoding]::ASCII.GetBytes($credPair))
$headers = @{"Authorization"="Basic $encodedCredentials"; "Accept"="application/json; charset=UTF-8";"Content-Type"="application/json"}
$response = Invoke-WebRequest -Uri "$($url)/db/data/transaction/commit" -Method Post -Headers $headers -Body $($retval)
$response.content
# CREATE INDEX #
$queries = @{}
$queries['statements'] = @()
$query = "CREATE INDEX ports_port_index FOR (n:ports) ON (n.port)"
$queries['statements'] += [ordered]@{'statement'="$($query)"}
$retval = $queries| ConvertTo-Json
$url = "http://127.0.0.1:7474"
$credPair = "neo4j:test"
$encodedCredentials = [System.Convert]::ToBase64String([System.Text.Encoding]::ASCII.GetBytes($credPair))
$headers = @{"Authorization"="Basic $encodedCredentials"; "Accept"="application/json; charset=UTF-8";"Content-Type"="application/json"}
$response = Invoke-WebRequest -Uri "$($url)/db/data/transaction/commit" -Method Post -Headers $headers -Body $($retval)
$response.content
# LOAD IPOBJ DATA #
$data = "C:\users\mban\Documents\golang\data"
foreach ($UUID in ($(Get-ChildItem "$($data)\go_output\*.csv" -File).name | ForEach-Object {$_.split("_")[0]} | Sort-Object -Unique))
{
$queries = @{}
$queries['statements'] = @()
$query = "LOAD CSV WITH HEADERS FROM 'file:///data/go_output/$($uuid)_send_data.csv' AS row
MERGE (from:ipobj {ip: row.srcip, ipversion: row.srcipver, internal: row.srcipinternal})
MERGE (to:ipobj {ip: row.dstip, ipversion: row.dstipver, internal: row.dstipinternal})"
$queries['statements'] += [ordered]@{'statement'="$($query)"}
$query = "LOAD CSV WITH HEADERS FROM 'file:///data/go_output/$($uuid)_receive_data.csv' AS row
MERGE (from:ipobj {ip: row.srcip, ipversion: row.srcipver, internal: row.srcipinternal})
MERGE (to:ipobj {ip: row.dstip, ipversion: row.dstipver, internal: row.dstipinternal})"
$queries['statements'] += [ordered]@{'statement'="$($query)"}
$query = "LOAD CSV WITH HEADERS FROM 'file:///data/go_output/$($uuid)_forward_data.csv' AS row
MERGE (from:ipobj {ip: row.srcip, ipversion: row.srcipver, internal: row.srcipinternal})
MERGE (to:ipobj {ip: row.dstip, ipversion: row.dstipver, internal: row.dstipinternal})"
$queries['statements'] += [ordered]@{'statement'="$($query)"}
$query = "LOAD CSV WITH HEADERS FROM 'file:///data/go_output/$($uuid)_unknown_data.csv' AS row
MERGE (from:ipobj {ip: row.srcip, ipversion: row.srcipver, internal: row.srcipinternal})
MERGE (to:ipobj {ip: row.dstip, ipversion: row.dstipver, internal: row.dstipinternal})"
$queries['statements'] += [ordered]@{'statement'="$($query)"}
$retval = $queries| ConvertTo-Json
$url = "http://127.0.0.1:7474"
$credPair = "neo4j:test"
$encodedCredentials = [System.Convert]::ToBase64String([System.Text.Encoding]::ASCII.GetBytes($credPair))
$headers = @{"Authorization"="Basic $encodedCredentials"; "Accept"="application/json; charset=UTF-8";"Content-Type"="application/json"}
$response = Invoke-WebRequest -Uri "$($url)/db/data/transaction/commit" -Method Post -Headers $headers -Body $($retval)
$response.content
}
# CREATE IPOBJ INDEX #
$queries = @{}
$queries['statements'] = @()
$query = "CREATE INDEX ipobj_ip_index FOR (n:ipobj) ON (n.ip)"
$queries['statements'] += [ordered]@{'statement'="$($query)"}
$retval = $queries| ConvertTo-Json
$url = "http://127.0.0.1:7474"
$credPair = "neo4j:test"
$encodedCredentials = [System.Convert]::ToBase64String([System.Text.Encoding]::ASCII.GetBytes($credPair))
$headers = @{"Authorization"="Basic $encodedCredentials"; "Accept"="application/json; charset=UTF-8";"Content-Type"="application/json"}
$response = Invoke-WebRequest -Uri "$($url)/db/data/transaction/commit" -Method Post -Headers $headers -Body $($retval)
$response.content
# CREATE RELATIONSHIPS #
foreach ($UUID in ($(Get-ChildItem "$($data)\go_output\*.csv" -File).name | ForEach-Object {$_.split("_")[0]} | Sort-Object -Unique))
{
$queries = @{}
$queries['statements'] = @()
$query = "LOAD CSV WITH HEADERS FROM 'file:///data/go_output/$($uuid)_send_data.csv' AS row
MATCH (from:ipobj {ip: row.srcip})
MATCH (to:ipobj {ip: row.dstip})
MATCH (srcports:ports {port: row.srcport})
MATCH (dstports:ports {port: row.dstport})
MERGE (from)-[datatransfer:SENT {type: row.action, srcport: srcports.port, dstport: dstports.port}]->(to)"
$queries['statements'] += [ordered]@{'statement'="$($query)"}
$query = "LOAD CSV WITH HEADERS FROM 'file:///data/go_output/$($uuid)_receive_data.csv' AS row
MATCH (from:ipobj {ip: row.srcip})
MATCH (to:ipobj {ip: row.dstip})
MATCH (srcports:ports {port: row.srcport})
MATCH (dstports:ports {port: row.dstport})
MERGE (from)-[datatransfer:RECEIVED {type: row.action, srcport: srcports.port, dstport: dstports.port}]->(to)"
$queries['statements'] += [ordered]@{'statement'="$($query)"}
$query = "LOAD CSV WITH HEADERS FROM 'file:///data/go_output/$($uuid)_forward_data.csv' AS row
MATCH (from:ipobj {ip: row.srcip})
MATCH (to:ipobj {ip: row.dstip})
MATCH (srcports:ports {port: row.srcport})
MATCH (dstports:ports {port: row.dstport})
MERGE (from)-[datatransfer:FORWARDD {type: row.action, srcport: srcports.port, dstport: dstports.port}]->(to)"
$queries['statements'] += [ordered]@{'statement'="$($query)"}
$query = "LOAD CSV WITH HEADERS FROM 'file:///data/go_output/$($uuid)_unknown_data.csv' AS row
MATCH (from:ipobj {ip: row.srcip})
MATCH (to:ipobj {ip: row.dstip})
MATCH (srcports:ports {port: row.srcport})
MATCH (dstports:ports {port: row.dstport})
MERGE (from)-[datatransfer:UNKNOWN {type: row.action, srcport: srcports.port, dstport: dstports.port}]->(to)"
$queries['statements'] += [ordered]@{'statement'="$($query)"}
$retval = $queries| ConvertTo-Json
$url = "http://127.0.0.1:7474"
$credPair = "neo4j:test"
$encodedCredentials = [System.Convert]::ToBase64String([System.Text.Encoding]::ASCII.GetBytes($credPair))
$headers = @{"Authorization"="Basic $encodedCredentials"; "Accept"="application/json; charset=UTF-8";"Content-Type"="application/json"}
$response = Invoke-WebRequest -Uri "$($url)/db/data/transaction/commit" -Method Post -Headers $headers -Body $($retval)
$response.content
}
All the sessions of the conference are now available online