Neo4j

guille_rr · ‎10-13-2021

Hi all,
I have to filter the result of a apoc.when. When two property values are equals I need the node with the closest date only one.
The problem is that the order by limit 1 does not work
The conde:

//Codigo para sacar subgrafo entre el centro y la barra
MATCH (cen:CENTRO{name:'C0010810'})<-[r5:APLICA_DIRECTO_INCLUYE]-(rapc:RAPC)
MATCH (rapc)<-[r4:APLICA_CENTROS]-(cab:CabeceraAP)
//WHERE (COALESCE(rapc.fIniEfe,"") + COALESCE(rapc.cHoraIni,"")) <= '201909050000' <= (COALESCE(rapc.fFinEfe,"") + COALESCE(rapc.cHoraFin,""))
MATCH (cab)-[r3:APLICA_MERCANCIA_C|APLICA_MERCANCIA_D]->(rapm:RAPM)
MATCH (rapm)-[r2]->(n)
MATCH p = (b:BARRA{id:'B001014861102136'})-[r6:PERTENECE_A*..3]->(n) WITH *, relationships(p) AS rel
//OPTIONAL MATCH (b)<-[:PERTENECE_A]-(t:TALLA)
//MATCH (cab)-[:APLICA_MERCANCIA_C|APLICA_MERCANCIA_D]->(ra:RAPM)-[]->(n)
MATCH ruta = ((cab)-[:APLICA_MERCANCIA_C|APLICA_MERCANCIA_D]->(ra:RAPM)-[re]->(n))


WITH count(nodes(ruta)) as con, ra, cab, n, re
CALL apoc.when(
   con > 1,
  //codigo para cuando una cabecera tiene mas de un rapm, sacar el min del grupo
  'WITH ra.cNivelIn as nivel, ra, cab, n, re
		//cambiar la M por 8,5
		CALL apoc.when(
          nivel = "M",
          "RETURN 8.5 as nivel",
          "RETURN nivel as nivel",
          {nivel:nivel}) YIELD value
          WITH MIN(value.nivel) as min, cab, ra, re, n
   WHERE (ra.cNivelIn) = min
   
   // cuando hay varios con el mismo nivelIn ver el nivelEci 
   WITH ra, cab, re, n
   WHERE ra.nivelEci = "M1"
   WITH ra, cab, re, n
   WHERE TYPE(re) <> "APLICA_DIRECTO_EXCLUYE|APLICA_COMB_EXCLUYE_M|APLICA_COMB_EXCLUYE_F"
   WITH cab, ra, re, n
   WHERE cab.dTipProm = "P"
   
   //seach cab node with the same value on property cCarpeta
   WITH cab, cab as cab1, ra, re, n
    CALL apoc.when(
      cab.cCarpeta = cab1.cCarpeta,
      //if the nodes are equals, I use the node with the closest date
        "WITH MAX(ra.fIniEfe + ra.cHoraIni) as maxFechas, cab, ra
        	WHERE (ra.fIniEfe + ra.cHoraIni) = maxFechas
            RETURN cab, ra, maxFechas ORDER BY maxFechas DESC limit 1
        ",
        "",
        {cab:cab,ra:ra}) YIELD value as val
    RETURN val
   ',
   
  //codigo para cuando una cabecera tiene un solo rapm
  '',
  {ra:ra,cab:cab,n:n, re:re}) YIELD value 
//RETURN value.cab as cab, value.ra as rapm, value.nivel, value.min, value.n as n
RETURN value.val

the detail is this:

Thanks in advance.
Regards.

clair_sullivan · ‎10-13-2021

Can you please let us know what error message you are getting or what the expected output is versus what you are getting?

guille_rr · ‎10-13-2021

Hello, no error appears, but the output is not what I expect.
If you look at the drawing, you have to obtain for each group (marked in green) the date closest to the current one (the largest date).
The result has to be the nodes with the date in blue. Only 2 nodes (dates in blue)
Thanks!

I think the problem is that aggregations do not work when they are part of a case.when
The code that has to do the aggregation is in the first if of the case.
I think that is the problem.

andrew_bowman · ‎10-13-2021

You are correct here. Procedure calls, including this one, execute per row. Prior to the call, you have:

WITH cab, cab as cab1, ra, re, n

Try returning all variables at that point in the query and view the table of results. For each of those result rows, apoc.when() will be called. This means the aggregations you perform in the procedure call are only executing on the single input row for which the procedure call was made (it's just making multiple calls, one per row).

Also make sure you remember that aggregations are applied with respect to your grouping keys, so make sure that your grouping key combination is correct for the context of the aggregation.

guille_rr · ‎10-18-2021

but it does not matter the variables that meta, since the false part of the case and the true part are decoupled.
I would like to know how I can make the true part of the case have no relation to the rest and can execute an aggregation

all the code that is inside the first apoc.when, cannot be grouped

andrew_bowman · ‎10-18-2021

As mentioned before, any procedure call executes only on the single given input row. It does not consider any other rows, and it doesn't perform aggregations implicitly. The aggregation you're doing within it max() is only considering that single input row. That is repeating per input row.

So for row1, apoc.when() is called, max() operates on that single row, it passes your WHERE filter because the single row's sum of properties is equal to the max of the sum of properties for the single row aggregated, and you get the single row back from your LIMIT 1.

Then for row2, apoc.when() is called, max() operates on that single row, it passes your WHERE filter because the single row's sum of properties is equal to the max of the sum of properties for the single row aggregated, and you get the single row back from your LIMIT 1.

Effectively your aggregation and filtering are broken because apoc.when() is only executing on a single row at a time, so your aggregation can't consider more than that single input row.

If you want apoc.do.when() to be able to aggregate across multiple rows, then the MATCH (or any other operation) that generates multiple rows must be called within the procedure, not outside of it.

Otherwise, you should abandon the idea of using the procedure to perform the aggregation. Maybe a subquery call would be better, but you would still need to do the MATCH that generates the rows within the subquery, not outside.

You might consider using apoc.agg.maxItems() to help you out, this is a custom aggregation function that will get you the item associated with a maximum value, according to the grouping key at the point of the aggregation.

For example,

WITH re, n, cab.cCarpeta as cCarpeta, maxItems(cab, ra.fIniEfe + ra.cHoraIni, 1) as result
WITH re, n, result.items[0] as cab, result.value as maxFechas
...

This reads as:

per a distinct combination of re, n, and cCarpeta value, get me the single cab node with the maximum value of the ra.fIniEfe + ra.cHoraIni for its row.

The resulting structure is: {items:[], value:n} where value is the maximum value for the grouping key, and items is all of the items (in this case cab nodes) tieing for the max value. Since we called the function with 1 as that last arg, that items list will have at most 1 value, and we get it by grabbing the 0th element of the items list.

I think this gets around your problem. I think you were trying to use the WITH cab, cab as cab1 to try to compare cabs on different rows with each other, but that doesn't work, since you're aliasing the same node, you're not actually doing the comparison you want.

The apoc.agg.maxItems() approach should work because cab.cCarpeta is the grouping key, the aggregation executes with respect to cab nodes having the same value, and the ra.fIniEfe + ra.cHoraIni maxFechas value is used in determination of which row is considered the max value, and its associated cab node is selected, we then retrieve it from the resulting structure, including the maxFechas value for that entry if you want it.

guille_rr · ‎10-19-2021

Hi Andrew, first of all, thanks for the help you are giving me. I'm learning a lot.
In the drawing you can see the problem that I have to solve:

like keeping the cab and ra nodes that if they have the same cCarpeta, the tiebreaker is done by looking at the dates of the ra nodes + the name field of the cab node.
That's why I group by cCarpeta and I keep the one with the highest date + name.
I am able to remove the nodes correctly but not the 2 types (ra + cab) because when doing the aggregation, if I add another type of node it no longer works well.
.

MATCH (nf)-[]-(raf)-[]-(cabf)
WITH  apoc.agg.maxItems(cabf, cabf.name + raf.fIniEfe + raf.cHoraIni, 1) as result, cCarpeta

This is the result, it´s ok but ra nodes are missing

And I had to take the aggregation out of the apoc and put a match as you explained to me yesterday.
But even so, if in the aggregation I put the two nodes that I need in the output (ra + cab) instead of 3 nodes, cab / ra takes the 5 out of the total

Thanks a lot for everything, kind regards

andrew_bowman · ‎10-20-2021

It looks like you have one RA node per cab node, so that indicates to me that it ought to be included with the items aggregated, not the grouping key outside the aggregation. In that case, make your aggregation be a structure that includes both the cab node at the RA node:

WITH re, n, cab.cCarpeta as cCarpeta, maxItems([cab, ra], ra.fIniEfe + ra.cHoraIni, 1) as result
WITH re, n, result.items[0][0] as cab, result.items[0][1] as ra, result.value as maxFechas
...

You could use a map structure instead, but the point is that your item can be a structure to hold multiple items to aggregate.

guille_rr · ‎10-21-2021

It's works! thanks
but the condition is that they have the same cFolder and that the cab.dTipProm = "P"
And I don't know how to filter cab nodes from Result list

WITH value.n as nf, value.ra as raf, value.cab as cabf, value.cab.cCarpeta as cCarpeta
WITH apoc.agg.maxItems([raf,cabf], cabf.name + raf.fIniEfe + raf.cHoraIni) as result, cCarpeta
WHERE result.items.cabf.dTipProm = "P"  ????

Regards!

EDIT:
It doesn't work

Now take out the red arrows and I would have to take out the green

But if I remove the name (cabf.name) in the concat string it's work fine

WITH value.n as nf, value.ra as raf, value.cab as cabf, value.cab.cCarpeta as cCarpeta
WITH apoc.agg.maxItems([raf,cabf],  raf.fIniEfe + raf.cHoraIni) as result, cCarpeta

what is the reason?

Thanks

andrew_bowman · ‎10-21-2021

The thing we're trying to get is the cabf (and its raf) associated with the max value of your date (raf.fIniEfe + raf.cHoraIni, which was previously being calculated as a maxFechas value).

Appending the cabf.name skews all of that, because now those terms don't represent a date, and they have the individual node name tacked on. If cabf nodes all have different names, then, per cabf nodes with the same cCarpeta, because you're appending the string, the cabf node with the first ordered name (according to string sorting) will be selected as the max, since that is the value that is being evaluated to determine the max value.

I don't think you should be appending anything here except for the value under evaluation.

guille_rr · ‎10-25-2021

Hi Andrew,
it works if I put the name after the dates

WITH apoc.agg.maxItems([raf,cabf], raf.fIniEfe + raf.cHoraIni + cabf.name) as result, cCarpeta

But I have to limit the number of cabf with
WHERE cab.dTipProm = "P"
but I don't know how I can access the node to filter by this field dTipProm = "P"
The condition is that they have the same cCarpeta and the dTipProm = "P"
Because it is a list
WITH apoc.agg.maxItems([raf,cabf]

WITH apoc.agg.maxItems([raf,cabf], raf.fIniEfe + raf.cHoraIni + cabf.name) as result, cCarpeta
RETURN result.items as elem

What kind of structure it is ?

I have to acces to one field of a node (dTipProm = "P")
How can I access?
Thanks in advance

guille_rr · ‎10-26-2021

This is the answer to access

result.items[0][1].dTipProm = "P"

this is the to access another node

result.items[0][0].name

Regards

Neo4j

Why don´t work this order by desc limit 1