Head's Up! These forums are read-only. All users and content have migrated. Please join us at community.neo4j.com.
01-28-2022 08:07 AM
According to the documentation, for the following update Neo4j automatically acquires a writing lock:
MATCH (n:Example {id: 42})
SET n.prop = n.prop + 1
I have tested it as part of different queries. The first one returns the entire node and is actually thread safe:
MATCH (n:Example {id: 42})
SET n.prop = n.prop + 1
RETURN n
The other one returns only a property and is not thread safe; apparently, no lock is acquired:
MATCH (n:Example {id: 42})
SET n.prop = n.prop + 1
RETURN n.prop
Have also tried to pass the node in a WITH clause, and then return the property - still doesn't work:
MATCH (n:Example {id: 42})
SET n.prop = n.prop + 1
WITH n
RETURN n.prop
Is it possible to update a property in a thread safe way and then return it alone, without the entire node?
01-28-2022 02:19 PM
Can you elaborate on how you determined it was not thread safe? Also, what version of Neo4j is being used here?
At the point of writing, a lock is taken on the node. That lock is not released until the transaction ends, so the property read at the return should be the property resulting from the earlier SET operation.
If not, we may be looking at a bug here.
01-28-2022 02:59 PM
also as a test and if you have APOC installed you could run
MATCH (n:Example {id: 42})
SET n.prop = n.prop + 1
WITH n call apoc.util.sleep(100000)
RETURN n.prop
which would find the node, as a result of the SET create a lock, update the property and then sleep for 100000ms. During these 100000ms if you then run
MATCH (n:Example {id: 42})
SET n.testing_lock = true
WITH n
RETURN n.prop
this should hang until the apoc.util.sleep completes in 100000ms. If there was no lock then the 2nd cypher should be < 10ms. But if it takes > 10ms its because the lock is present
01-31-2022 01:50 AM
I found that the lock also doesn't work in a MERGE clause, even if the entire node is returned:
@Test
void testMerge() {
List<?> values = IntStream.range(0, 5)
.parallel()
.mapToObj(n -> merge())
.collect(toList());
System.out.println(values);
//Fail, may prints: [0, 0, 0, 0, 0]
assertThat(
values,
containsInAnyOrder(1L, 2L, 3L, 4L, 5L)
);
}
private long merge() {
String query = "MERGE (n:ExampleMerged {id: 42}) " +
"ON CREATE SET n.prop = 0 " +
"ON MATCH SET n.prop = n.prop + 1 " +
"RETURN n";
try (Transaction tx = database.beginTx()) {
long value = (long) tx.execute(query)
.<Entity>columnAs("n")
.next()
.getProperty("prop");
tx.commit();
return value;
}
}
It would work, however, if the node already exists before running the test, i.e. only the ON SET clause runs in parallel threads.
01-29-2022 05:11 AM
Below is my test:
public class WritingLockTest {
private final GraphDatabaseService database;
public WritingLockTest() {
this.database = Neo4jBuilders.newInProcessBuilder()
.build()
.databaseManagementService()
.database(DEFAULT_DATABASE_NAME);
}
@BeforeEach
void create() {
try (Transaction tx = database.beginTx()) {
tx.execute("CREATE (n:Example {id: 42, prop: 0}) ");
tx.commit();
}
}
@Test
void testGetNode() {
List<?> values = IntStream.range(0, 5)
.parallel()
.mapToObj(n -> getPropertyFromNode())
.collect(toList());
assertThat(
values,
containsInAnyOrder(1L, 2L, 3L, 4L, 5L)
);
}
private long getPropertyFromNode() {
String query = "MATCH (n:Example {id: 42}) " +
"SET n.prop = n.prop + 1 " +
"RETURN n";
try (Transaction tx = database.beginTx()) {
long value = (long) tx.execute(query)
.<Entity>columnAs("n")
.next()
.getProperty("prop");
tx.commit();
return value;
}
}
@Test
void testGetProperty() {
List<?> values = IntStream.range(0, 5)
.parallel()
.mapToObj(n -> getProperty())
.collect(toList());
System.out.println(values);
//Fail, may print or smth. like that: [0, 0, 1, 0, 0]
assertThat(
values,
containsInAnyOrder(1L, 2L, 3L, 4L, 5L)
);
}
private long getProperty() {
String query = "MATCH (n:Example {id: 42}) " +
"SET n.prop = n.prop + 1 " +
"RETURN n.prop as prop";
try (Transaction tx = database.beginTx()) {
long value = tx.execute(query)
.<Long>columnAs("prop")
.next();
tx.commit();
return value;
}
}
}
Both cases run the query in a parallel stream and then retrieve the property: either from the node or on its own. The first test passes, the second one fails.
Obviously, this test only covers the embedded database created by the harness. I will later run the same code in a server extension, deployed in a desktop installation.
01-29-2022 07:34 AM
Have run the testGetProperty code in a server extension, deployed in a Desktop installation. It works fine and prints no duplicates, e.g. [1,4,3,5,2].
So apparently, the bug is in the harness. The version I am using is 4.4.2.
While debugging, I found that the needsExclusiveLock flag in org.neo4j.cypher.internal.runtime.interpreted.pipes.SetEntityPropertyOperation is set to true in first test and false in the second one. Hope this helps.
01-31-2022 02:12 AM
Comparing the execution plans reveals potentially interesting differences:
Returning node (thread safe):
MATCH (n:Example {id: 42})
SET n.prop = n.prop + 1
RETURN n
+------------------+------------------------------+----------------+
| Operator | Details | Estimated Rows |
+------------------+------------------------------+----------------+
| +ProduceResults | n | 1 |
| | +------------------------------+----------------+
| +SetProperty | n.prop = n.prop + $autoint_1 | 1 |
| | +------------------------------+----------------+
| +Eager | | 1 |
| | +------------------------------+----------------+
| +Filter | n.id = $autoint_0 | 1 |
| | +------------------------------+----------------+
| +NodeByLabelScan | n:Example | 10 |
+------------------+------------------------------+----------------+
Returning property (non thread safe):
MATCH (n:Example {id: 42})
SET n.prop = n.prop + 1
RETURN n.prop as prop
+------------------+-------------------------------------+----------------+
| Operator | Details | Estimated Rows |
+------------------+-------------------------------------+----------------+
| +ProduceResults | prop | 1 |
| | +-------------------------------------+----------------+
| +Projection | cache[n.prop] AS prop | 1 |
| | +-------------------------------------+----------------+
| +Eager | | 1 |
| | +-------------------------------------+----------------+
| +SetProperty | n.prop = cache[n.prop] + $autoint_1 | 1 |
| | +-------------------------------------+----------------+
| +Eager | | 1 |
| | +-------------------------------------+----------------+
| +Filter | n.id = $autoint_0 | 1 |
| | +-------------------------------------+----------------+
| +NodeByLabelScan | n:Example | 10 |
+------------------+-------------------------------------+----------------+
Don't know what cache[n.prop] means in the second plan, but it is an obvious difference. However, the third query doesn't exhibit this behaviour although it is also non thread safe:
MERGE (n:ExampleMerged {id: 42})
ON CREATE SET n.prop = 0
ON MATCH SET n.prop = n.prop + 1
RETURN n
+------------------+---------------------------------------------------------------------------------------+----------------+
| Operator | Details | Estimated Rows |
+------------------+---------------------------------------------------------------------------------------+----------------+
| +ProduceResults | n | 1 |
| | +---------------------------------------------------------------------------------------+----------------+
| +Merge | CREATE (n:ExampleMerged {id: $autoint_0}), ON MATCH SET n.prop = n.prop + $autoint_2, | 1 |
| | | ON CREATE SET n.prop = $autoint_1 | |
| | +---------------------------------------------------------------------------------------+----------------+
| +Filter | n.id = $autoint_0 | 1 |
| | +---------------------------------------------------------------------------------------+----------------+
| +NodeByLabelScan | n:ExampleMerged | 10 |
+------------------+---------------------------------------------------------------------------------------+----------------+
All the sessions of the conference are now available online