10-08-2018 11:19 AM
Hi All,
We are building a graph from software and hardware information. Each hardware node has a list of installed software, and I am using the Jaccard Similarity algorithm to find hardware that has similar software installed.
I followed this link to write the Cypher query to get similar hardware:
https://neo4j.com/docs/graph-algorithms/current/algorithms/similarity-jaccard/
Below is the query I tried executing in the Neo4j browser, but I didn't get any response.
MATCH (s:Software)-[:installed]->(Hardware)
WITH {item:id(s), categories: collect(id(Hardware))} as userData
WITH collect(userData) as data
CALL algo.similarity.jaccard.stream(data)
YIELD item1, item2, count1, count2, intersection, similarity
RETURN algo.getNodeById(item1).name AS from, algo.getNodeById(item2).name AS to, intersection, similarity
ORDER BY similarity DESC LIMIT 20
Please correct me if I am doing anything wrong.
I attached a screenshot of the EXPLAIN output for the Cypher query.
Thanks,
Ganeshbabu R
10-09-2018 02:06 PM
Can you run the data collection query on its own and see if that works?
i.e.
MATCH (s:Software)-[:installed]->(Hardware)
WITH {item:id(s), categories: collect(id(Hardware))} as userData
WITH collect(userData) as data
CALL algo.similarity.jaccard.stream(data) YIELD item1, item2, count1, count2, intersection, similarity
RETURN count(*);
What happens if you run your query with PROFILE instead of EXPLAIN?
10-12-2018 12:50 AM
Below is the response I got when I ran the query with PROFILE:
Please correct me if I am doing anything wrong, and let me know your thoughts.
Regards,
Ganeshbabu R
10-12-2018 05:56 PM
But is there any overlap between the software and the hardware it is installed on?
What happens if you run:
MATCH (s:Software)-[:installed]->(Hardware)
WITH {item:id(s), categories: collect(id(Hardware))} as userData
RETURN userData LIMIT 100;
If you look at the data (especially the categories), are there any overlaps?
10-15-2018 12:03 PM
Hi @michael.hunger,
Below is the response when I ran the above query,
https://pastebin.com/LMCgAegW
Please check and let me know your thoughts. Also, I am not sure how to check whether there is any overlap in the categories data.
Regards,
Ganeshbabu R
10-16-2018 11:55 AM
Odd, I cannot access pastebin.
10-16-2018 09:34 PM
Sorry about that, I thought it was publicly viewable. Below is the response of the query:
userData
{item:2990,categories:[2972]}
{item:2685,categories:[2680,2774]}
{item:3340,categories:[3334]}
{item:2344,categories:[2338]}
{item:2012,categories:[3587,2193,3086,2007,3504]}
{item:137,categories:[1899,2774,3112,112,3273,2927,3133,3272,2680]}
{item:3107,categories:[3479,3106]}
{item:2021,categories:[2927,2016]}
{item:891,categories:[2689,2742,2648,3097,2149,1862,1690,2602,2152,3232,3571,1743,2170,3565,1456,2181,925,1733,2423,1962,3315,2633,857,2385,2716,3302,3589,1918,1970,1542,3173]}
{item:1205,categories:[2715,3589,2580,1870,1201,3417,2777,2967,3272]}
{item:550,categories:[2607,1646,1456,517]}
{item:864,categories:[857]}
{item:146,categories:[2774,112,2927,2680]}
{item:2775,categories:[2774]}
{item:559,categories:[557,2347,2379]}
{item:218,categories:[2088,3496,2723,3151,298,3112,292,2733,215,3546,1103,3285,3152,2715,1944,3173,3176,989,1611,3101,711,1576]}
{item:568,categories:[557]}
{item:227,categories:[2131,2327,3497,2193,711,3614,944,1494,2727,2729,3546,2916,2159,2824,3509,1947,1097,2623,1946,2750,2602,1226,989,2521,1637,2520,1568,598,3505,1576,1671,2591,3504,664,681,215,2809,3534,298,1956,3382,2949,914,742,2185,3061,3515,2915,1813,1812,3151,1051,1250,3320,3157,1177,2755,332]}
{item:1752,categories:[3254,1862,3233,1743,2964,2378,3101,2742,3255,2114]}
{item:2981,categories:[2972]}
{item:2640,categories:[2639]}
{item:765,categories:[762]}
{item:2649,categories:[2648]}
{item:1833,categories:[1831]}
{item:1115,categories:[3159,1690,1110]}
{item:774,categories:[2407,762,1025,836,1882,2347,3353,3546,2841,1056,2379]}
{item:433,categories:[3232,351]}
{item:92,categories:[2533,2405,3498,557,662,3479,3552,762,3497,681,614,956,2927,3086,112,3230,2639,944,2548,3609,818,1712,3441,1630,3179,2100,2002,2918,1611,2155,635,2575,789,3315,3587,2804,1795,3380,3489,1726,1220,2826,2558,1663,1949,2607,3159,1177,1926,920,3305,3504,1690,1069,1051,1354,2809,2423,2774,1952,3418,2159,1171,3211,2653,1910,1882,2249,2916,3078,298,1226,3450,3454,1247,1890,3226,78,3272,1941,2076,1193,3065,3310]}
{item:3403,categories:[3401]}
{item:1528,categories:[1527]}
{item:1187,categories:[1181]}
{item:1501,categories:[2418,1497,1962,3429,2358]}
{item:846,categories:[836,3159]}
{item:1160,categories:[1158]}
{item:442,categories:[2804,2868,3211,1807,1497,2639,351]}
{item:101,categories:[2916,666,3226,1171,944,2159,3489,2249,1630,1226,1663,3078,2918,1247,3272,2076,1051,920,1690,1882,78,1949,3418,1354,2405,1910,1726,3315,762,614,3310,557,1193,1177,956,3305,2002,662,818,2716,2620,1611,2653,2575,3159,635,2155]}
{item:1196,categories:[2116,3096,3143,1193]}
{item:200,categories:[3273,2927,112,2774,2680,1899]}
{item:514,categories:[496]}
{item:3062,categories:[3061]}
{item:173,categories:[3272,2680,1899,3273,3133,2774,3112,112,2927]}
{item:209,categories:[2607,2558,2121,3086,2575,2723,2100,3331,2701,2533,2620,1970,3302,1831,246,2423,2672,112,2456,2916,2380,3353,2584,3429,2526,944,2580,2645,1949,1887,1768,1527,2569,2193,1429,2152,836,2407]}
{item:523,categories:[1456,2607,1646,517,1726]}
{item:182,categories:[2680,3272,1899,3112,3273,2774,112,2927,3133]}
{item:1393,categories:[2322,1373,3085,3052,3049]}
{item:3618,categories:[3617]}
{item:3277,categories:[3273]}
{item:254,categories:[3520,1795,2602,2860,3550,3000,2648,3145,3441,1497,2805,3026,2100,3545,3180,3417,246,1566,3454,2803,3539,2607,1247,1148,2972,1690,3401,3395,3450,1918,3583,2599,3302,3571,2596,1106,2423,2002,1646,3389,2531,3211,2171,1573,2526,1354,789,1862,1080,3029,2321,2003,3199,2380,2548,2307,2916,1910,1063,1056,3462,2170,2479,3173,3522,3459,3282,986,2840,3052,925,1762,2650,3546,3609,2214,2185,2181,1359,3382,3387,3587,3498,3096,2772,3495,1949,2474,3259,2316,2728,2958,2716,2653,1527,1578,1051,3315,1726,2443,3617,635,3067,2131,2959,3380,3001,2633,2347,2621,2239,2752,2639,3204,1373,662,2522,3589,920,2456,3563,2146,3232,2742,557,1568,2703,2604,3540,3552,1250,2757,914,711,3097,2821,1890,1970,3565,2966,857,2808,2385,1923,3367,1712,1743,3591,2198,2868,1707,2580,1456,3226,3463,1542,2809,2628,3224,303,3078,2969,1171,847,3505,2277,1952,2322,3256,2584,2241,1962,1887,2149,2801,1725,762,1232,2822,2229,3202,3298,2689,3085,3479,3060,2152,3334,1822,1733,1429,3106,2437,2949,1292,855,3322,2771]}
{item:191,categories:[1429,3386,2680,2607,2774,3418,2007,1043,2456,3112,2575,2953,2844,2927,1415,989,3067,2569,1949,1066,3502,3498,1926,3522,2700,3310,3331,3239,3133,1882,3273,112,987,664,2407,2964,2797,3305,2566,3272,3071,956,3442,2722,1899]}
{item:2461,categories:[2456]}
{item:1402,categories:[3085,3052,2959,3049,1373,2322]}
{item:1061,categories:[3502,2241,2953,1059,1292]}
{item:720,categories:[2626,818,711]}
{item:2156,categories:[2155]}
{item:1815,categories:[1813,2078,2729,2116]}
{item:2470,categories:[2456]}
{item:1474,categories:[1456,2607]}
{item:1411,categories:[2476,1407,2680,3589,2453]}
{item:2129,categories:[2121]}
{item:1070,categories:[1069]}
{item:729,categories:[833,3459,2964,2146,3247,711,2912]}
{item:2165,categories:[2164]}
{item:1824,categories:[1822]}
{item:1483,categories:[2152,1456,1918,3302,2689]}
{item:828,categories:[818]}
{item:1142,categories:[3591,1110,2016]}
{item:424,categories:[2569,1373,3056,3232,3001,2584,351,3204,2822,3054,2239,3052,2868]}
{item:738,categories:[711]}
{item:83,categories:[78,3311,2744]}
{item:3349,categories:[3334]}
{item:1178,categories:[1637,1177,1956,3509,1807,1947,1708,1812]}
{item:1492,categories:[1456]}
{item:837,categories:[836,2249,2597,3565,2131,1807]}
{item:1151,categories:[1148]}
{item:155,categories:[2927,2680,3272,1899,956,2774,3112,112,3273,3133]}
{item:810,categories:[789]}
{item:469,categories:[2076,2832,1816,2702,2358,1454,2327,1676,2733,453,3031,3247,2729,2338,3056,1201,742,1230,581,2777,3285,2706,1884,3427,2006,2723,1569]}
{item:3017,categories:[3001]}
{item:505,categories:[3454,1962,3179,1419,1361,2307,2233,2918,2321,3395,2804,3389,2772,3199,1762,1086,3000,3026,2969,3112,3455,3285,3583,2324,2322,3085,3202,1250,3298,2229,2808,2869,3031,2155,2437,2966,2821,2277,2239,1295,2959,3353,1226,789,742,3387,2752,496,956,3563,3197,2840,1816,2379,2771,3462,3052,2972,1102,2443,2241,3029]}
{item:819,categories:[3450,818,2723,1106]}
{item:164,categories:[2680,956,3272,1899,3273,3112,2774,3133,112,2927]}
{item:3430,categories:[3429]}
{item:478,categories:[2768,2279,453,2214,2607,2832,1795]}
{item:2371,categories:[2706,2358]}
{item:2030,categories:[2164,2358,2016,2100]}
{item:1689,categories:[2378,3101,1676,2742,3243,3255,3254,2114,1862,3233,1743]}
{item:577,categories:[3232,557,2016,2358,2068,2378,3243,2385,2569,3455,2584,1648,2526]}
{item:2784,categories:[2777]}
{item:3439,categories:[3429]}
{item:3098,categories:[3097]}
{item:2039,categories:[2016]}
{item:1698,categories:[3097,3232,3173,2602,1862,2742,1743,1690,3315,2716]}
{item:245,categories:[3230,3272,1232,2661,1676,3243,3540,2347,2701,2378,3310,2121,3246,1454,3254,3386,603,3239,2474,1816,3255,3562,3417,3442,3441,2076,2114,3427,2566,3558,1690,1638,1415,3443,1884,2607,1831,1743,2742,496,2774,2152,2083,2672,2379,2797,1527,1862,2476,2964,3418,3247,2620,2722,2453,1826,3610,2802,246,3259]}
{item:2452,categories:[2450]}
{item:2111,categories:[2456,2100]}
{item:1052,categories:[2648,1051,2088,2650,3617]}
{item:1770,categories:[1768]}
{item:2425,categories:[3029,2822,3179,2423]}
{item:1366,categories:[1361]}
{item:2084,categories:[2772,2439,2083]}
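One way to check output like this for overlap outside the database is to invert it: map each category id to the items that list it, and keep only the categories that appear under more than one item. The following is a minimal Python sketch of that idea, using five rows copied from the output above. The helper name `shared_categories` is made up for illustration and is not part of any Neo4j library.

```python
from collections import defaultdict

# Five rows copied from the userData output posted above.
user_data = [
    {"item": 2990, "categories": [2972]},
    {"item": 2981, "categories": [2972]},
    {"item": 2685, "categories": [2680, 2774]},
    {"item": 2775, "categories": [2774]},
    {"item": 3340, "categories": [3334]},
]


def shared_categories(rows):
    """Map each category id to the items listing it, keeping only shared ones.

    Hypothetical helper for illustration only.
    """
    items_by_category = defaultdict(list)
    for row in rows:
        # set() guards against a category being listed twice for one item.
        for category in set(row["categories"]):
            items_by_category[category].append(row["item"])
    return {c: items for c, items in items_by_category.items() if len(items) > 1}


overlaps = shared_categories(user_data)
print(overlaps)
```

Any category that shows up in the result is shared by at least two items, so those items have a non-zero Jaccard similarity.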
10-20-2018 06:14 PM
Just looking at the data visually I see several overlaps.
If I take it and run it just with your data it also returns the appropriate data:
WITH [
{item:2990,categories:[2972]},
{item:2685,categories:[2680,2774]},
{item:3340,categories:[3334]},
{item:2344,categories:[2338]},
{item:2012,categories:[3587,2193,3086,2007,3504]},
{item:137,categories:[1899,2774,3112,112,3273,2927,3133,3272,2680]},
{item:3107,categories:[3479,3106]},
{item:2021,categories:[2927,2016]},
{item:891,categories:[2689,2742,2648,3097,2149,1862,1690,2602,2152,3232,3571,1743,2170,3565,1456,2181,925,1733,2423,1962,3315,2633,857,2385,2716,3302,3589,1918,1970,1542,3173]},
{item:1205,categories:[2715,3589,2580,1870,1201,3417,2777,2967,3272]},
{item:550,categories:[2607,1646,1456,517]},
{item:864,categories:[857]},
{item:146,categories:[2774,112,2927,2680]},
{item:2775,categories:[2774]},
{item:559,categories:[557,2347,2379]},
{item:218,categories:[2088,3496,2723,3151,298,3112,292,2733,215,3546,1103,3285,3152,2715,1944,3173,3176,989,1611,3101,711,1576]},
{item:568,categories:[557]},
{item:227,categories:[2131,2327,3497,2193,711,3614,944,1494,2727,2729,3546,2916,2159,2824,3509,1947,1097,2623,1946,2750,2602,1226,989,2521,1637,2520,1568,598,3505,1576,1671,2591,3504,664,681,215,2809,3534,298,1956,3382,2949,914,742,2185,3061,3515,2915,1813,1812,3151,1051,1250,3320,3157,1177,2755,332]},
{item:1752,categories:[3254,1862,3233,1743,2964,2378,3101,2742,3255,2114]},
{item:2981,categories:[2972]},
{item:2640,categories:[2639]},
{item:765,categories:[762]},
{item:2649,categories:[2648]},
{item:1833,categories:[1831]},
{item:1115,categories:[3159,1690,1110]},
{item:774,categories:[2407,762,1025,836,1882,2347,3353,3546,2841,1056,2379]},
{item:433,categories:[3232,351]},
{item:92,categories:[2533,2405,3498,557,662,3479,3552,762,3497,681,614,956,2927,3086,112,3230,2639,944,2548,3609,818,1712,3441,1630,3179,2100,2002,2918,1611,2155,635,2575,789,3315,3587,2804,1795,3380,3489,1726,1220,2826,2558,1663,1949,2607,3159,1177,1926,920,3305,3504,1690,1069,1051,1354,2809,2423,2774,1952,3418,2159,1171,3211,2653,1910,1882,2249,2916,3078,298,1226,3450,3454,1247,1890,3226,78,3272,1941,2076,1193,3065,3310]},
{item:3403,categories:[3401]},
{item:1528,categories:[1527]},
{item:1187,categories:[1181]},
{item:1501,categories:[2418,1497,1962,3429,2358]},
{item:846,categories:[836,3159]},
{item:1160,categories:[1158]},
{item:442,categories:[2804,2868,3211,1807,1497,2639,351]},
{item:101,categories:[2916,666,3226,1171,944,2159,3489,2249,1630,1226,1663,3078,2918,1247,3272,2076,1051,920,1690,1882,78,1949,3418,1354,2405,1910,1726,3315,762,614,3310,557,1193,1177,956,3305,2002,662,818,2716,2620,1611,2653,2575,3159,635,2155]},
{item:1196,categories:[2116,3096,3143,1193]},
{item:200,categories:[3273,2927,112,2774,2680,1899]},
{item:514,categories:[496]},
{item:3062,categories:[3061]},
{item:173,categories:[3272,2680,1899,3273,3133,2774,3112,112,2927]},
{item:209,categories:[2607,2558,2121,3086,2575,2723,2100,3331,2701,2533,2620,1970,3302,1831,246,2423,2672,112,2456,2916,2380,3353,2584,3429,2526,944,2580,2645,1949,1887,1768,1527,2569,2193,1429,2152,836,2407]},
{item:523,categories:[1456,2607,1646,517,1726]},
{item:182,categories:[2680,3272,1899,3112,3273,2774,112,2927,3133]},
{item:1393,categories:[2322,1373,3085,3052,3049]},
{item:3618,categories:[3617]},
{item:3277,categories:[3273]},
{item:254,categories:[3520,1795,2602,2860,3550,3000,2648,3145,3441,1497,2805,3026,2100,3545,3180,3417,246,1566,3454,2803,3539,2607,1247,1148,2972,1690,3401,3395,3450,1918,3583,2599,3302,3571,2596,1106,2423,2002,1646,3389,2531,3211,2171,1573,2526,1354,789,1862,1080,3029,2321,2003,3199,2380,2548,2307,2916,1910,1063,1056,3462,2170,2479,3173,3522,3459,3282,986,2840,3052,925,1762,2650,3546,3609,2214,2185,2181,1359,3382,3387,3587,3498,3096,2772,3495,1949,2474,3259,2316,2728,2958,2716,2653,1527,1578,1051,3315,1726,2443,3617,635,3067,2131,2959,3380,3001,2633,2347,2621,2239,2752,2639,3204,1373,662,2522,3589,920,2456,3563,2146,3232,2742,557,1568,2703,2604,3540,3552,1250,2757,914,711,3097,2821,1890,1970,3565,2966,857,2808,2385,1923,3367,1712,1743,3591,2198,2868,1707,2580,1456,3226,3463,1542,2809,2628,3224,303,3078,2969,1171,847,3505,2277,1952,2322,3256,2584,2241,1962,1887,2149,2801,1725,762,1232,2822,2229,3202,3298,2689,3085,3479,3060,2152,3334,1822,1733,1429,3106,2437,2949,1292,855,3322,2771]},
{item:191,categories:[1429,3386,2680,2607,2774,3418,2007,1043,2456,3112,2575,2953,2844,2927,1415,989,3067,2569,1949,1066,3502,3498,1926,3522,2700,3310,3331,3239,3133,1882,3273,112,987,664,2407,2964,2797,3305,2566,3272,3071,956,3442,2722,1899]},
{item:2461,categories:[2456]},
{item:1402,categories:[3085,3052,2959,3049,1373,2322]},
{item:1061,categories:[3502,2241,2953,1059,1292]},
{item:720,categories:[2626,818,711]},
{item:2156,categories:[2155]},
{item:1815,categories:[1813,2078,2729,2116]},
{item:2470,categories:[2456]},
{item:1474,categories:[1456,2607]},
{item:1411,categories:[2476,1407,2680,3589,2453]},
{item:2129,categories:[2121]},
{item:1070,categories:[1069]},
{item:729,categories:[833,3459,2964,2146,3247,711,2912]},
{item:2165,categories:[2164]},
{item:1824,categories:[1822]},
{item:1483,categories:[2152,1456,1918,3302,2689]},
{item:828,categories:[818]},
{item:1142,categories:[3591,1110,2016]},
{item:424,categories:[2569,1373,3056,3232,3001,2584,351,3204,2822,3054,2239,3052,2868]},
{item:738,categories:[711]},
{item:83,categories:[78,3311,2744]},
{item:3349,categories:[3334]},
{item:1178,categories:[1637,1177,1956,3509,1807,1947,1708,1812]},
{item:1492,categories:[1456]},
{item:837,categories:[836,2249,2597,3565,2131,1807]},
{item:1151,categories:[1148]},
{item:155,categories:[2927,2680,3272,1899,956,2774,3112,112,3273,3133]},
{item:810,categories:[789]},
{item:469,categories:[2076,2832,1816,2702,2358,1454,2327,1676,2733,453,3031,3247,2729,2338,3056,1201,742,1230,581,2777,3285,2706,1884,3427,2006,2723,1569]},
{item:3017,categories:[3001]},
{item:505,categories:[3454,1962,3179,1419,1361,2307,2233,2918,2321,3395,2804,3389,2772,3199,1762,1086,3000,3026,2969,3112,3455,3285,3583,2324,2322,3085,3202,1250,3298,2229,2808,2869,3031,2155,2437,2966,2821,2277,2239,1295,2959,3353,1226,789,742,3387,2752,496,956,3563,3197,2840,1816,2379,2771,3462,3052,2972,1102,2443,2241,3029]},
{item:819,categories:[3450,818,2723,1106]},
{item:164,categories:[2680,956,3272,1899,3273,3112,2774,3133,112,2927]},
{item:3430,categories:[3429]},
{item:478,categories:[2768,2279,453,2214,2607,2832,1795]},
{item:2371,categories:[2706,2358]},
{item:2030,categories:[2164,2358,2016,2100]},
{item:1689,categories:[2378,3101,1676,2742,3243,3255,3254,2114,1862,3233,1743]},
{item:577,categories:[3232,557,2016,2358,2068,2378,3243,2385,2569,3455,2584,1648,2526]},
{item:2784,categories:[2777]},
{item:3439,categories:[3429]},
{item:3098,categories:[3097]},
{item:2039,categories:[2016]},
{item:1698,categories:[3097,3232,3173,2602,1862,2742,1743,1690,3315,2716]},
{item:245,categories:[3230,3272,1232,2661,1676,3243,3540,2347,2701,2378,3310,2121,3246,1454,3254,3386,603,3239,2474,1816,3255,3562,3417,3442,3441,2076,2114,3427,2566,3558,1690,1638,1415,3443,1884,2607,1831,1743,2742,496,2774,2152,2083,2672,2379,2797,1527,1862,2476,2964,3418,3247,2620,2722,2453,1826,3610,2802,246,3259]},
{item:2452,categories:[2450]},
{item:2111,categories:[2456,2100]},
{item:1052,categories:[2648,1051,2088,2650,3617]},
{item:1770,categories:[1768]},
{item:2425,categories:[3029,2822,3179,2423]},
{item:1366,categories:[1361]},
{item:2084,categories:[2772,2439,2083]}
] as data
CALL algo.similarity.jaccard.stream(data, {similarityCutoff:0.1}) YIELD item1, item2, count1, count2, intersection, similarity
RETURN item1, item2, count1, count2, intersection, similarity LIMIT 10
To get more meaningful results I added a similarityCutoff, but even without it you see proper results.
╒═══════╤═══════╤════════╤════════╤══════════════╤═══════════════════╕
│"item1"│"item2"│"count1"│"count2"│"intersection"│"similarity" │
╞═══════╪═══════╪════════╪════════╪══════════════╪═══════════════════╡
│92 │101 │84 │47 │44 │0.5057471264367817 │
├───────┼───────┼────────┼────────┼──────────────┼───────────────────┤
│191 │200 │45 │6 │6 │0.13333333333333333│
├───────┼───────┼────────┼────────┼──────────────┼───────────────────┤
│191 │209 │45 │38 │9 │0.12162162162162163│
├───────┼───────┼────────┼────────┼──────────────┼───────────────────┤
│191 │245 │45 │60 │13 │0.14130434782608695│
├───────┼───────┼────────┼────────┼──────────────┼───────────────────┤
│92 │191 │84 │45 │14 │0.12173913043478261│
├───────┼───────┼────────┼────────┼──────────────┼───────────────────┤
│92 │254 │84 │198 │40 │0.1652892561983471 │
├───────┼───────┼────────┼────────┼──────────────┼───────────────────┤
│200 │1411 │6 │5 │1 │0.1 │
├───────┼───────┼────────┼────────┼──────────────┼───────────────────┤
│200 │2021 │6 │2 │1 │0.14285714285714285│
├───────┼───────┼────────┼────────┼──────────────┼───────────────────┤
│200 │2685 │6 │2 │2 │0.3333333333333333 │
├───────┼───────┼────────┼────────┼──────────────┼───────────────────┤
│200 │2775 │6 │1 │1 │0.16666666666666666│
└───────┴───────┴────────┴────────┴──────────────┴───────────────────┘
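For anyone reading the table: the similarity column can be recomputed from the other three columns, since Jaccard similarity is the size of the intersection divided by the size of the union, and the union size is count1 + count2 - intersection. Here is a small Python sketch that checks a few rows of the table. The function name is just an illustration, not a library call.

```python
def jaccard_from_counts(count1, count2, intersection):
    """Jaccard similarity |A ∩ B| / |A ∪ B|, with |A ∪ B| = |A| + |B| - |A ∩ B|."""
    union = count1 + count2 - intersection
    return intersection / union if union else 0.0


# (item1, item2, count1, count2, intersection) copied from the result table above.
rows = [
    (92, 101, 84, 47, 44),
    (191, 200, 45, 6, 6),
    (200, 1411, 6, 5, 1),
    (200, 2685, 6, 2, 2),
]

for item1, item2, count1, count2, intersection in rows:
    similarity = jaccard_from_counts(count1, count2, intersection)
    print(item1, item2, round(similarity, 4))
```

For example, items 92 and 101 share 44 categories out of 84 + 47 - 44 = 87 distinct ones, which gives 44/87, the 0.5057 in the first row.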
11-28-2018 05:40 AM
Hi Michael,
Regarding the use of the Jaccard algorithm:
My graph looks like this -
[1]
When I run this query:
MATCH (s:claimIntimationRequestHeader)-[:Request]-(claimIntimationRequestBody)
WITH {item:id(s), categories: collect(id(claimIntimationRequestBody))} as userData
RETURN userData
I get this output -
userData
{
"item": 1929,
"categories": [
1928
]
}
So using it like this -
WITH[{
item: 1929,
categories: [
1928
]
}] as data
CALL algo.similarity.jaccard.stream(data, {similarityCutoff:0.1}) YIELD item1, item2, count1, count2, intersection, similarity
RETURN item1, item2, count1, count2, intersection, similarity LIMIT 10
I get this output -
(no changes, no records)
So I guess this is because I have only one item in userData.
And when I count the results -
MATCH (s:claimIntimationRequestHeader)-[:Request]-(claimIntimationRequestBody)
WITH {item:id(s), categories: collect(id(claimIntimationRequestBody))} as userData
WITH collect(userData) as data
CALL algo.similarity.jaccard.stream(data) YIELD item1, item2, count1, count2, intersection, similarity
RETURN count(*)
I get this output -
count(*)
0
[2]
When I try this Cypher query:
MATCH (s:claimIntimationRequestBody)-[:Parameter]-(requestID)
WITH {item:id(s), categories: collect(id(requestID))} as userData
RETURN userData
Output -
userData
{
"item": 0,
"categories": [
1794,
1793,
1792,
1791,
1790,
1789,
1788,
1787,
1786,
1785,
1784,
1783,
1782,
1781,
1780,
1779,
1778
]
}
And then when I use this output in this query -
WITH[{
item: 0,
categories: [
1794,
1793,
1792,
1791,
1790,
1789,
1788,
1787,
1786,
1785,
1784,
1783,
1782,
1781,
1780,
1779,
1778
]
}] as data
CALL algo.similarity.jaccard.stream(data, {similarityCutoff:0.1}) YIELD item1, item2, count1, count2, intersection, similarity
RETURN item1, item2, count1, count2, intersection, similarity LIMIT 10
Output -
(no changes, no records)
Please help me write a proper Jaccard algorithm query.
11-28-2018 06:21 AM
You have to have more than one item in your list.
It seems that in each of your cases you have only one item.
If you want to compute similarities between cIRH (claimIntimationRequestHeader) and cIRB (claimIntimationRequestBody), you have to have at least a few cIRH in your query result.
Also, from your model there is at least one extra node in between the two, so your query would not return anything. You basically just return "postClaims" as you don't use a label on the end node. In (claimIntimationRequestBody) the name is a variable, not a label; a label would be written as (:claimIntimationRequestBody).
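To illustrate why a list with a single item yields no records: the similarity is computed between pairs of items, and one item forms no pair. The following is a plain-Python sketch of pairwise Jaccard, not the library's actual implementation. The function name `jaccard_pairs` is hypothetical, and the two-item example reuses items 200 and 2685 from the hardware/software data earlier in the thread.

```python
from itertools import combinations


def jaccard_pairs(data):
    """Return (item1, item2, similarity) for every unordered pair of items.

    Illustrative only; the library procedure may work differently internally.
    """
    results = []
    for a, b in combinations(data, 2):
        set_a, set_b = set(a["categories"]), set(b["categories"])
        union = set_a | set_b
        similarity = len(set_a & set_b) / len(union) if union else 0.0
        results.append((a["item"], b["item"], similarity))
    return results


# One item only, as in case [1] above: there is nothing to compare it with.
single = [{"item": 1929, "categories": [1928]}]
print(jaccard_pairs(single))

# Two items from the earlier data in this thread: exactly one pair.
two = [
    {"item": 200, "categories": [3273, 2927, 112, 2774, 2680, 1899]},
    {"item": 2685, "categories": [2680, 2774]},
]
print(jaccard_pairs(two))
```

With one item the result list is empty, which matches the "no records" output; with two items there is one pair to report.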
01-15-2019 04:56 AM
Hello,
Could you please tell me why the returned write value is false for the following query?
MATCH (user:User) WHERE size((user)-[:CONNECT]->())>20 WITH user
MATCH (user)-[r:CONNECT]->(k:Keyword)
WHERE r.weight > 10
WITH {item:id(user), categories: collect(id(k))} AS userData
WITH collect(userData) AS data
CALL algo.similarity.jaccard(data, {write:TRUE, graph:'HUGE', writeRelationshipType: 'SIMILARITY', writeProperty:'keywords_jaccard'})
YIELD nodes, similarityPairs, write, writeRelationshipType, writeProperty
RETURN nodes, similarityPairs, write, writeRelationshipType, writeProperty
nodes | similarityPairs | write | writeRelationshipType | writeProperty |
---|---|---|---|---|
9684 | 46885086 | false | "SIMILARITY" | "keywords_jaccard" |
Thanks in advance
01-15-2019 05:30 AM
Probably because you didn't specify a similarityCutoff value, which avoids writing the "0" similarity pairs.
Try adding: similarityCutoff:0.1
or whatever makes sense in your case.
01-16-2019 02:41 AM
Thanks for your quick answer! Just to get an idea about hardware: I saw in https://towardsdatascience.com/tagoverflow-correlating-tags-in-stackoverflow-66e2b0e1117b that you ran Jaccard similarity (17K x 17K) in about 13 min. Could you please tell me how much RAM and heap memory you used to get that result? In my case, I'm running with:
After more than an hour, it doesn't work. In the logs: ERROR [o.n.b.t.p.HouseKeeper] Fatal error occurred when handling a client connection.
Thanks in advance.
01-16-2019 05:17 AM
We ran it on an 8 CPU AWS machine with 32G RAM.
Perhaps your category lists are much larger?
As you can see from your unfiltered output you get about 46M similarity pairs.
I also used topK which limits the pairs per element to K.
Do you have duplicate connections to the keywords or only unique ones per user?
Otherwise use collect(distinct id(k))
I would run it with write:false first to see the pure output + statistics.
How long did the computation you shared above run?
And how long does the following run, and what does it output?
MATCH (user:User) WHERE size((user)-[:CONNECT]->())>20 WITH user
MATCH (user)-[r:CONNECT]->(k:Keyword)
WHERE r.weight > 10
WITH id(user) as item, count(id(k)) as categories, count(distinct id(k)) as uniqueCategories
RETURN count(*), max(categories), max(uniqueCategories)
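As a side note on the numbers: the 46885086 similarityPairs in the unfiltered output shared above is exactly the number of unordered pairs among 9684 items, n * (n - 1) / 2, which suggests that every pair was counted when no cutoff was set. The Python sketch below checks that arithmetic. The topK bound in it is an assumption about how topK behaves (at most K pairs kept per item), not something confirmed in this thread beyond the description above.

```python
def pair_count(n):
    """Number of unordered pairs among n items: n * (n - 1) / 2."""
    return n * (n - 1) // 2


def top_k_upper_bound(n, k):
    """Upper bound on kept pairs if each item keeps at most k neighbours.

    Assumption: topK limits the pairs per item to k, as described above.
    """
    return n * min(k, max(n - 1, 0))


nodes = 9684  # the `nodes` value from the output shared above

print(pair_count(nodes))            # all pairs, no cutoff and no topK
print(top_k_upper_bound(nodes, 5))  # at most 5 pairs per item
```

The second number is only an upper bound; a similarityCutoff can reduce the result further.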
01-16-2019 06:05 AM
Thanks again!
The answer to your query is
|count(*)|max(categories)|max(uniqueCategories)|
|9684|2183|2183|
Categories are unique, the number of items is 9684 ~ 10K.
How long did the compute above run that you shared? >>>
Without similarityCutoff:0.1 it doesn't write any result.
With similarityCutoff:0.1 it never finishes; the connection is lost and I have to take Docker down and restart it.
With similarityCutoff:0.3 in the original query and write:FALSE, I get:
|nodes|similarityPairs|write|writeRelationshipType|writeProperty|
|9684|12817736|false|"SIMILARITY"|"keywords_jaccard"|
>>> It's done in 59 sec.
About 13 million pairs is not that much for my machine (16 CPU, 56 GB), so it should finish in a reasonable time (I hope in less than 30 min).
When I set write:TRUE in the same query, I get: Connection to server lost. Reconnecting... Then I have to take Docker down and start it again (without deleting the entire database).
I appreciate your help.
01-16-2019 07:04 AM
What kind of disk do you have?
The relationship writing currently happens in batches of 100k. Can you check, while it's running with the 0.3 cutoff (i.e. 13M rels), what the CPU or I/O load looks like on your machine?
01-17-2019 02:45 AM
Thanks a lot!
Finally, I realized that I have a memory problem if I try to write 13 or 18 million similarityPairs (and that many pairs is not useful anyway). However, by adding the topK parameter I reduced the number, and it now writes without problems.
In your example you actually write 2864 similarityPairs (using topK:5), not the total of ~292 million. I'm sorry for the misunderstanding.
01-17-2019 03:45 AM
It should actually batch the writes so it should progress and finish in parallel, but we can check that again.