Head's Up! These forums are read-only. All users and content have migrated. Please join us at community.neo4j.com.
05-05-2022 10:32 AM
I'm running neo4j server community 4.4.5 with browser 4.4.3 on Ubuntu 20.04 machine.
This is from the online manual:
This is the error:
And this is the query
:auto LOAD CSV WITH HEADERS FROM 'file:///transactions.csv' AS row FIELDTERMINATOR ';'
CALL {
WITH row
WITH row
WHERE row.id IS NOT NULL
OPTIONAL MATCH (organisation:Organisation {id: toInteger(row.organisation_id)})
OPTIONAL MATCH (bankAccount:BankAccount {id: toInteger(row.bank_account_id)})
MERGE (transaction:Transaction { id : toInteger(row.id) })
SET transaction = row,
transaction.uuid = apoc.create.uuid(),
transaction.id = toInteger(row.id),
transaction.consumer_id = NULL // remove the attribute no more used
WITH transaction, row, bankAccount, organisation
CALL {
WITH transaction, row, bankAccount
WITH transaction, row, bankAccount
WHERE bankAccount IS NOT NULL
MERGE (transaction)-[h:HAS]->(bankAccount)
}
CALL {
WITH transaction, row, bankAccount
WITH transaction, row, bankAccount
WHERE bankAccount IS NULL
MERGE (error:Error)
MERGE (transaction)-[h:HAS_NO_BANK_ACCOUNT]->(error)
}
CALL {
WITH transaction, row, organisation
WITH transaction, row, organisation
WHERE organisation IS NOT NULL
MERGE (transaction)-[h:BELONGS_TO]->(organisation)
}
CALL {
WITH transaction, row, organisation
WITH transaction, row, organisation
WHERE organisation IS NULL
MERGE (error:Error)
MERGE (transaction)-[h:HAS_NO_ORGANISATION]->(error)
}
} IN TRANSACTIONS;
I tried to make some changes but with no luck!
Any suggestion?
Solved! Go to Solution.
05-06-2022 10:05 AM
Try this code. See in the each cypher statement in each 'with' clause, it terminates with a 'RETURN 1' statement. This causes the apoc 'when' method to return a value in the 'yield value' statement, which gets merged with the current result. Without returning a value, the current result gets merged with 'null' and the query stops since there is no result to process. At least this is my hypothesis. I got it to work locally when I returned a value.
:auto
LOAD CSV WITH HEADERS FROM 'file:///Book1.csv' AS row FIELDTERMINATOR ';'
WITH row
WHERE row.id IS NOT NULL
CALL {
WITH row
MERGE (transaction:Transaction { id : toInteger(row.id) })
SET transaction = row,
transaction.uuid = apoc.create.uuid(),
transaction.id = toInteger(row.id),
transaction.consumer_id = NULL // remove the attribute no more used
WITH transaction, row
OPTIONAL MATCH (bankAccount:BankAccount {id: toInteger(row.bank_account_id)})
CALL apoc.do.when(bankAccount IS NOT NULL,
'WITH $tran as tran, $bankAcct as bankAcct MERGE (tran)-[h:HAS]->(bankAcct) RETURN 1',
'WITH $tran as tran MERGE (error:Error) MERGE (tran)-[h:HAS_NO_BANK_ACCOUNT]->(error) RETURN 1',
{tran: transaction, bankAcct: bankAccount}) yield value
with transaction, row
OPTIONAL MATCH (organisation:Organisation {id: toInteger(row.organisation_id)})
CALL apoc.do.when(organisation IS NOT NULL,
'WITH $tran as tran, $org as org MERGE (tran)-[h:BELONGS_TO]->(org) RETURN 1',
'WITH $tran as tran MERGE (error:Error) MERGE (tran)-[h:HAS_NO_ORGANISATION]->(error) RETURN 1',
{tran: transaction, org: organisation}) yield value
RETURN 1
} IN TRANSACTIONS
RETURN 1
05-05-2022 11:39 AM
Hey there!
From a brief peek at your cypher code snippet, it appears you are hitting that error due to a syntax error on row 5? I was able to reproduce your query above (different data) using v4.4.5 (I am in v4.4.5 browser though).
I believe you are seeing an error because you are following the CALL { with row with row
with a WHERE
clause.
i.e. -> You have:
:auto LOAD CSV WITH HEADERS FROM 'file:///transactions.csv' AS row
CALL {
WITH row
WITH row
WHERE row.id IS NOT NULL
...
What you may want is to tuck that where clause in your OPTIONAL MATCH
. Something like this:
:auto LOAD CSV WITH HEADERS FROM 'file:///transactions.csv' AS row
CALL { WITH row WITH row
OPTIONAL MATCH (organisation:Organisation {id: toInteger(row.organisation_id)}) WHERE ...
Reading through regarding auto transactions, I don't think you can use a WHERE clause there...
I hope this somewhat helps and there isn't much else to it. Post back here if you are seeing same thing. I'd be curious what is throwing that error if it isn't how you are stating WHERE row.id IS NOT NULL
.
Cheers,
Rob
05-05-2022 01:32 PM
You may recall you had a query a few weeks back that had nested CALL subqueries and returned the same error. In that case, I was not able to get rid of the error and ended up refactoring the query not to have nested CALLs. I did the same in this case using the apoc 'when' method to replace the use of the inner CALL subqueries used to mimic if/else functionality.
The query executes, but I don't have test data to validate it is actually working. The 'RETURN 1' statements were necessary to make it execute, since I got errors about 'can't terminate without a RETURN statement'.
:auto LOAD CSV WITH HEADERS FROM 'file:///Book1.csv' AS row FIELDTERMINATOR ';'
WITH row
WHERE row.id IS NOT NULL
CALL {
WITH row
OPTIONAL MATCH (organisation:Organisation {id: toInteger(row.organisation_id)})
OPTIONAL MATCH (bankAccount:BankAccount {id: toInteger(row.bank_account_id)})
MERGE (transaction:Transaction { id : toInteger(row.id) })
SET transaction = row,
transaction.uuid = apoc.create.uuid(),
transaction.id = toInteger(row.id),
transaction.consumer_id = NULL // remove the attribute no more used
WITH transaction, row, bankAccount, organisation
CALL apoc.when(bankAccount IS NOT NULL,
'MERGE ($tran)-[h:HAS]->($bankAcct)',
'MERGE (error:Error) MERGE ($tran)-[h:HAS_NO_BANK_ACCOUNT]->(error)',
{tran: transaction, bankAcct: bankAccount}) yield value
with transaction, organisation
CALL apoc.when(organisation IS NOT NULL,
'MERGE ($tran)-[h:BELONGS_TO]->($org)',
'MERGE (error:Error) MERGE ($tran)-[h:HAS_NO_ORGANISATION]->(error)',
{tran: transaction, org: organisation}) yield value
RETURN 1
} IN TRANSACTIONS
RETURN 1
05-06-2022 07:18 AM
Unfortunately, it looks nice but run badly
Why not simply this way? (We need apoc.do.when
because we are writing ....)
05-06-2022 07:38 AM
That is what I originally had, but I thought it would not recognize the variables 'transaction', 'organization', and bankAccount' when the apoc routine executed the cypher. I did not try it to verify. Did you try it?
The error looks like you can't use the passed parameters in the match pattern.
05-06-2022 07:42 AM
Hi @glilienfield ,
With $tran he will look for the variable within the outer scope. With tran alone should work.
Bennu
05-06-2022 01:07 AM
The second WITH is used to filter the row.id, and it is trick because in a call the first WITH cannot be filtered.
It has ran until now, I don't think it is the problem, because if I remove it the problem remains.
I'm just moving to the latest 4.4.6, then let you know ....
05-05-2022 01:55 PM
I tried another approach that allowed me to eliminate the extra return statements. I used some 'list filtering' to simulate an 'if' capability. Each of the lists 'bankAcctNotNull', 'bankAcctIsNull', 'organisationNotNull', and 'organisationIsNull' will either be empty or contain a single value of '1', depending on the 'is null / is not null' conditions. As such, the 'foreach' loops will either execute one time or zero times.
Again, the query runs but I have not tested it.
:auto LOAD CSV WITH HEADERS FROM 'file:///Book1.csv' AS row FIELDTERMINATOR ';'
WITH row
WHERE row.id IS NOT NULL
CALL {
WITH row
OPTIONAL MATCH (organisation:Organisation {id: toInteger(row.organisation_id)})
OPTIONAL MATCH (bankAccount:BankAccount {id: toInteger(row.bank_account_id)})
MERGE (transaction:Transaction { id : toInteger(row.id) })
SET transaction = row,
transaction.uuid = apoc.create.uuid(),
transaction.id = toInteger(row.id),
transaction.consumer_id = NULL // remove the attribute no more used
WITH transaction, row, bankAccount, organisation,
[i in [1] where bankAccount IS NOT NULL] as bankAcctNotNull,
[i in [1] where bankAccount IS NULL] as bankAcctIsNull,
[i in [1] where organisation IS NOT NULL] as organisationNotNull,
[i in [1] where organisation IS NULL] as organisationIsNull
foreach(i in bankAcctNotNull |
MERGE (transaction)-[h:HAS]->(bankAccount)
)
foreach(i in bankAcctIsNull |
MERGE (error:Error)
MERGE (transaction)-[h:HAS_NO_BANK_ACCOUNT]->(error)
)
foreach(i in organisationNotNull |
MERGE (transaction)-[h:BELONGS_TO]->(organisation)
)
foreach(i in organisationIsNull |
MERGE (error:Error)
MERGE (transaction)-[h:HAS_NO_ORGANISATION]->(error)
)
} IN TRANSACTIONS
05-06-2022 05:04 AM
@glilienfield @Rcolinp I used the first solution from @glilienfield : it is much more readable of the second, despite the return statement.
You can remove the second return 1
statement: it runs without.
So, the problem key are the nested calls: if you have a running query with CALL {} IN TRANSACTIONS
and add a nested CALL {}
doing nothing, the entire query fails with an error.
While the same query without the inner call, succeeds (without any return!):
This latest to demonstrate the usage of the double WITH ROW
inside the CALL. With only one it doesn't run with the WHERE
clause:
Now a question is still open: What is the meaning of using :auto
combined with using WITH TRANSACTIONS OF 100 ROWS
? Who has the precedence? :auto
or 100
?
05-06-2022 08:03 AM
I had passed the variables from the outer scope as parameters to the apoc method, then referenced the parameters in the cypher. It looks like it does not like using parameters in replacement of match binder variables.
I just did an experiment. the apoc method did not recognize variables defined outside the method. This make sense to me, as the cypher will not have knowledge of these variables when it is executed on the server.
I tried this as a work around. I bound the passed parameters to variables in a 'with' clause, which can them be used in the match pattern.
:auto LOAD CSV WITH HEADERS FROM 'file:///Book1.csv' AS row FIELDTERMINATOR ';'
WITH row
WHERE row.id IS NOT NULL
CALL {
WITH row
OPTIONAL MATCH (organisation:Organisation {id: toInteger(row.organisation_id)})
OPTIONAL MATCH (bankAccount:BankAccount {id: toInteger(row.bank_account_id)})
MERGE (transaction:Transaction { id : toInteger(row.id) })
SET transaction = row,
transaction.uuid = apoc.create.uuid(),
transaction.id = toInteger(row.id),
transaction.consumer_id = NULL // remove the attribute no more used
WITH transaction, row, bankAccount, organisation
CALL apoc.do.when(bankAccount IS NOT NULL,
'WITH $tran as tran, $bankAcct as bankAcct MERGE (tran)-[h:HAS]->(bankAcct)',
'WITH $tran as tran MERGE (error:Error) MERGE (tran)-[h:HAS_NO_BANK_ACCOUNT]->(error)',
{tran: transaction, bankAcct: bankAccount}) yield value
with transaction, organisation
CALL apoc.do.when(organisation IS NOT NULL,
'WITH $tran as tran, $org as org MERGE (tran)-[h:BELONGS_TO]->(org)',
'WITH $tran as tran MERGE (error:Error) MERGE (tran)-[h:HAS_NO_ORGANISATION]->(error)',
{tran: transaction, org: organisation}) yield value
RETURN 1
} IN TRANSACTIONS
RETURN 1
05-06-2022 08:12 AM
I'll give you another point to think about: the :auto
keyword doesn't like to the cypher-shell
!
05-06-2022 08:30 AM
Well, I gave it this version and it partially run:
But I have 895 nodes, and if I check I have 895 :HAS_NO_BANK_ACCOUNT
but 0 :HAS_NO_ORGANIZATION
AND 0 :BELONGS_TO
. Still cannot explain
05-06-2022 08:49 AM
@glilienfield @Rcolinp @bennu.neo Yes, I tried a lot of times, the first APOC.DO.WHEN runs fine, but the second APOC.DO.WHEN is never executed!
05-06-2022 09:13 AM
Can you please try reversing the order? Does the 'organisation' one now run and the 'bankAccount' when doesn't?
05-06-2022 09:16 AM
Already done: it stops after the first one!
05-06-2022 09:44 AM
I have already tried this too: moving the optional matches below doesn't change anything.
BTW: You other solutions run as a charme, but I would like to figure this out and understand this unusual behavior
05-06-2022 09:45 AM
with the return statements added to the 'when' cypher? I tested it locally with similar code and it executed both 'when' statements.
05-06-2022 09:50 AM
Yes, It don't execute the second when
. I changed the names to the relationships, so I can check if it creates one branch or the other, and I have no relationships from the second WHEN.
WITH row
WHERE row.transactionid IS NOT NULL
MERGE (transaction:Transaction { transactionid : row.transactionid })
SET transaction = row,
transaction.uuid = apoc.create.uuid(),
transaction.consumer_id = NULL // remove the attribute no more used
with transaction, row
OPTIONAL MATCH (organisation:Organisation {id: toInteger(row.organisation_id)})
CALL apoc.do.when(organisation IS NOT NULL,
'WITH $tran as tran, $org as org MERGE (tran)-[h:TRANSACTION_BELONGS_TO_ORGANISATION]->(org)',
'WITH $tran as tran MERGE (error:Error) MERGE (tran)-[h:TRANSACTION_HAS_NO_ORGANISATION]->(error)',
{tran: transaction, org: organisation}) yield value AS X
WITH transaction, row
OPTIONAL MATCH (bankAccount:BankAccount {id: toInteger(row.bank_account_id)})
CALL apoc.do.when(bankAccount IS NOT NULL,
'WITH $tran as tran, $bankAcct as bankAcct MERGE (tran)-[h:TRANSACTION_HAS_BANK_ACCOUNT]->(bankAcct)',
'WITH $tran as tran MERGE (error:Error) MERGE (tran)-[h:TRANSACTION_HAS_NO_BANK_ACCOUNT]->(error)',
{tran: transaction, bankAcct: bankAccount}) yield value AS Y
return 1
} IN TRANSACTIONS OF 5000 ROWS
return 1;
05-06-2022 09:56 AM
You can change the return values from all 1's to different values and return them. You can then see which cypher statements executed. maybe add a 'limit 10' or so after the first 'where' clause so it doesn't execute all the rows.
The above code you pasted does not have the 'return' statements in the apoc 'when' cypher. I was able to get both 'when' statements to execute in my prototype only when I returned something from the first 'when' cypher; otherwise, the query stops since it produces no results.
05-06-2022 09:59 AM
Please, show me what you mean with a return statements in the apoc.when
cypher.
Anyway, I can't try now because neo4j is not responding: it' loading millions transactions using your other query and, even I split the input file in chunks of 10000lines (with shell split) and I'm running in the cypher shell, the browser doesn't respond. I'll have to wait it will finish to make more tests!
05-06-2022 10:05 AM
Try this code. See in the each cypher statement in each 'with' clause, it terminates with a 'RETURN 1' statement. This causes the apoc 'when' method to return a value in the 'yield value' statement, which gets merged with the current result. Without returning a value, the current result gets merged with 'null' and the query stops since there is no result to process. At least this is my hypothesis. I got it to work locally when I returned a value.
:auto
LOAD CSV WITH HEADERS FROM 'file:///Book1.csv' AS row FIELDTERMINATOR ';'
WITH row
WHERE row.id IS NOT NULL
CALL {
WITH row
MERGE (transaction:Transaction { id : toInteger(row.id) })
SET transaction = row,
transaction.uuid = apoc.create.uuid(),
transaction.id = toInteger(row.id),
transaction.consumer_id = NULL // remove the attribute no more used
WITH transaction, row
OPTIONAL MATCH (bankAccount:BankAccount {id: toInteger(row.bank_account_id)})
CALL apoc.do.when(bankAccount IS NOT NULL,
'WITH $tran as tran, $bankAcct as bankAcct MERGE (tran)-[h:HAS]->(bankAcct) RETURN 1',
'WITH $tran as tran MERGE (error:Error) MERGE (tran)-[h:HAS_NO_BANK_ACCOUNT]->(error) RETURN 1',
{tran: transaction, bankAcct: bankAccount}) yield value
with transaction, row
OPTIONAL MATCH (organisation:Organisation {id: toInteger(row.organisation_id)})
CALL apoc.do.when(organisation IS NOT NULL,
'WITH $tran as tran, $org as org MERGE (tran)-[h:BELONGS_TO]->(org) RETURN 1',
'WITH $tran as tran MERGE (error:Error) MERGE (tran)-[h:HAS_NO_ORGANISATION]->(error) RETURN 1',
{tran: transaction, org: organisation}) yield value
RETURN 1
} IN TRANSACTIONS
RETURN 1
05-06-2022 10:06 AM
Understood: will try ASAP. Thank you anyway!
05-09-2022 07:09 AM
@glilienfield @bennu.neo @Rcolinp Justr for your info, I loaded 11 million transactions of different kind, using the latest approach proposed by @glilienfield.
This approach using apoc turned out to be by far the most readable one (after a bit of a sprucing up of the query!)
This is the result:
LOAD CSV WITH HEADERS FROM 'file:///transactions.csv' AS row FIELDTERMINATOR ';'
CALL {
WITH row
WITH row
WHERE row.transactionid IS NOT NULL
MERGE (transaction:Transaction { transactionid : row.transactionid })
SET transaction = row,
transaction.uuid = apoc.create.uuid(),
transaction.consumer_id = NULL // remove the attribute no more used
with transaction, row
OPTIONAL MATCH (organisation:Organisation {id: toInteger(row.organisation_id)})
CALL apoc.do.when(organisation IS NOT NULL,
'WITH $tran as tran, $org as org
MERGE (tran)-[h:TRANSACTION_BELONGS_TO_ORGANISATION]->(org)
RETURN 1',
'WITH $tran as tran
MERGE (error:Error)
MERGE (tran)-[h:TRANSACTION_HAS_NO_ORGANISATION]->(error)
RETURN 1',
{tran: transaction, org: organisation}) yield value AS X
WITH transaction, row
OPTIONAL MATCH (bankAccount:BankAccount {id: toInteger(row.bank_account_id)})
CALL apoc.do.when(bankAccount IS NOT NULL,
'WITH $tran as tran, $bankAcct as bankAcct
MERGE (tran)-[h:TRANSACTION_HAS_BANK_ACCOUNT]->(bankAcct)
RETURN 1',
'WITH $tran as tran
MERGE (error:Error)
MERGE (tran)-[h:TRANSACTION_HAS_NO_BANK_ACCOUNT]->(error)
RETURN 1',
{tran: transaction, bankAcct: bankAccount}) yield value AS Y
return transaction
} IN TRANSACTIONS OF 5000 ROWS
return count(transaction)
The story will follow with a new discussion .... but this is an achieved result I'd like to share!
Thank you to everybody!
05-09-2022 07:14 AM
Super @paolodipietro58 !
So @glilienfield is suggesting APOC solutions now.
05-09-2022 08:00 AM
Well, we make some trial, and the APOC.do.when
looks better, also in terms of query readability.
05-27-2022 12:47 AM
@glilienfield @bennu.neo @Rcolinp I have a little good new on this:
I discovered tha you can remove the WITH
statement inside and this further simplify the query, just keep attention to the names!
CALL apoc.do.when(organisation IS NOT NULL,
' MERGE (tran)-[h:TRANSACTION_BELONGS_TO_ORGANISATION]->(org)
RETURN 1',
' MERGE (error:Error)
MERGE (tran)-[h:TRANSACTION_HAS_NO_ORGANISATION]->(error)
RETURN 1',
{tran: transaction, org: organisation}) yield value AS X
05-06-2022 09:42 AM
I think it may be caused by not returning a value from the 'when' cypher statements. I added returns. I also rearranged some things, but that should have no effect.
Try this:
:auto
LOAD CSV WITH HEADERS FROM 'file:///Book1.csv' AS row FIELDTERMINATOR ';'
WITH row
WHERE row.id IS NOT NULL
CALL {
WITH row
MERGE (transaction:Transaction { id : toInteger(row.id) })
SET transaction = row,
transaction.uuid = apoc.create.uuid(),
transaction.id = toInteger(row.id),
transaction.consumer_id = NULL // remove the attribute no more used
WITH transaction, row
OPTIONAL MATCH (bankAccount:BankAccount {id: toInteger(row.bank_account_id)})
CALL apoc.do.when(bankAccount IS NOT NULL,
'WITH $tran as tran, $bankAcct as bankAcct MERGE (tran)-[h:HAS]->(bankAcct) return 1',
'WITH $tran as tran MERGE (error:Error) MERGE (tran)-[h:HAS_NO_BANK_ACCOUNT]->(error) return 1',
{tran: transaction, bankAcct: bankAccount}) yield value
with transaction, row
OPTIONAL MATCH (organisation:Organisation {id: toInteger(row.organisation_id)})
CALL apoc.do.when(organisation IS NOT NULL,
'WITH $tran as tran, $org as org MERGE (tran)-[h:BELONGS_TO]->(org) return 1',
'WITH $tran as tran MERGE (error:Error) MERGE (tran)-[h:HAS_NO_ORGANISATION]->(error) return 1',
{tran: transaction, org: organisation}) yield value
RETURN 1
} IN TRANSACTIONS
RETURN 1
05-09-2022 09:15 AM
I agree, it is much more readable than the other approach.
05-27-2022 07:48 AM
That is very interesting, so I looked at the source code to understand how that works. It turns out the apoc procedure passes the parameter map to the neo4j API 'execute' method as parameters, so they can be referenced as parameters, i.e. with a '$' sign. It also turns out the the procedure creates a 'WITH' clause with all the parameter values individually assigned to their parameter's names, i.e. 'WITH value1 as key1, value2 as key2, ...' This 'WITH' clause is prepended to both the 'if' cypher and the 'else' cypher before they are passed to the 'execute' method. This allows the parameter values to be referenced directly as cypher variables. In summary, the apoc.do.when allows passed parameters to be references as parameters (with '$') and directly as cypher variables. This explains why both methods work. This approach is used in all of the apoc cypher methods that pass a parameter map.
Good find.
All the sessions of the conference are now available online