@ray6080 commented Jul 30, 2025

Description

See title.

Some performance numbers for 1M node insertions (one insertion per transaction). The scaling bottleneck mainly comes from locks within the index.

#threads      time (ms)
1 (master)    76925
1             80102
2             43622
4             30092
8             25541

COPY performance is almost the same on the LDBC 100 comment table.

@ray6080 requested a review from benjaminwinger as a code owner, July 30, 2025 02:53

codecov bot commented Jul 31, 2025

Codecov Report

❌ Patch coverage is 75.41899% with 44 lines in your changes missing coverage. Please review.
✅ Project coverage is 85.94%. Comparing base (2d7dd95) to head (cce3231).
⚠️ Report is 14 commits behind head on master.

Files with missing lines Patch % Lines
test/transaction/transaction_test.cpp 69.65% 41 Missing and 3 partials ⚠️

❌ Your patch status has failed because the patch coverage (75.41%) is below the target coverage (80.00%). You can increase the patch coverage or adjust the target coverage.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #5828      +/-   ##
==========================================
- Coverage   85.97%   85.94%   -0.03%     
==========================================
  Files        1620     1620              
  Lines       73600    73807     +207     
  Branches     8798     8828      +30     
==========================================
+ Hits        63277    63434     +157     
- Misses      10096    10142      +46     
- Partials      227      231       +4     
Flag Coverage Δ
extension 63.19% <18.85%> (-0.14%) ⬇️
in-mem 81.69% <89.72%> (+0.02%) ⬆️
on-disk 86.59% <92.46%> (+<0.01%) ⬆️
recovery 86.59% <92.46%> (+0.01%) ⬆️

Flags with carried forward coverage won't be shown.



Benchmark Result

Master commit hash: b2a29262b27c734f9c0e91bdc8d642cd8a2e929c
Branch commit hash: 5d53fdf4f5e59900a390b2956d2114302b72e4cb

Query Group Query Name Mean Time - Commit (ms) Mean Time - Master (ms) Diff
ldbc_snb_ic q35 8.28 6.46 1.82 (28.16%)
Other queries
Query Group Query Name Mean Time - Commit (ms) Mean Time - Master (ms) Diff
aggregation q24 703.42 700.83 2.59 (0.37%)
aggregation q28 7686.01 7724.55 -38.55 (-0.50%)
filter q14 60.39 59.29 1.10 (1.86%)
filter q15 60.62 63.89 -3.27 (-5.12%)
filter q16 278.04 278.33 -0.29 (-0.10%)
filter q17 379.71 382.92 -3.21 (-0.84%)
filter q18 1933.15 1817.16 115.99 (6.38%)
filter zonemap-node 22.93 24.43 -1.49 (-6.11%)
filter zonemap-node-lhs-cast 23.18 23.98 -0.80 (-3.34%)
filter zonemap-node-null 22.81 23.58 -0.77 (-3.27%)
filter zonemap-rel 5559.48 5425.75 133.73 (2.46%)
fixed_size_expr_evaluator q07 617.50 625.49 -7.99 (-1.28%)
fixed_size_expr_evaluator q08 902.78 912.55 -9.77 (-1.07%)
fixed_size_expr_evaluator q09 904.71 912.57 -7.86 (-0.86%)
fixed_size_expr_evaluator q10 190.55 197.23 -6.67 (-3.38%)
fixed_size_expr_evaluator q11 191.31 197.53 -6.22 (-3.15%)
fixed_size_expr_evaluator q12 167.46 175.74 -8.27 (-4.71%)
fixed_size_expr_evaluator q13 1498.70 1495.87 2.83 (0.19%)
fixed_size_seq_scan q23 48.35 54.17 -5.82 (-10.75%)
join q29 782.72 828.27 -45.55 (-5.50%)
join q30 1810.35 1835.48 -25.13 (-1.37%)
join q31 6.42 5.74 0.68 (11.84%)
join SelectiveTwoHopJoin 53.70 53.16 0.53 (1.00%)
ldbc_snb_ic q36 89.46 91.19 -1.74 (-1.90%)
ldbc_snb_is q32 5.39 5.66 -0.27 (-4.79%)
ldbc_snb_is q33 11.14 12.84 -1.70 (-13.26%)
ldbc_snb_is q34 1.47 1.27 0.19 (15.34%)
limit push-down-limit-into-distinct 1918.17 2005.29 -87.12 (-4.34%)
multi-rel multi-rel-large-scan 1556.05 1463.35 92.69 (6.33%)
multi-rel multi-rel-lookup 8.84 10.72 -1.88 (-17.53%)
multi-rel multi-rel-small-scan 171.07 198.22 -27.16 (-13.70%)
order_by q25 65.52 68.14 -2.62 (-3.85%)
order_by q26 384.96 384.66 0.29 (0.08%)
order_by q27 1291.73 1303.74 -12.01 (-0.92%)
recursive_join recursive-join-bidirection 364.84 363.59 1.25 (0.34%)
recursive_join recursive-join-dense 7042.90 6959.59 83.32 (1.20%)
recursive_join recursive-join-path 23317.49 23511.06 -193.56 (-0.82%)
recursive_join recursive-join-sparse 9.68 9.82 -0.15 (-1.49%)
recursive_join recursive-join-trail 7014.02 6895.05 118.97 (1.73%)
scan_after_filter q01 101.71 112.97 -11.26 (-9.96%)
scan_after_filter q02 90.22 89.09 1.13 (1.27%)
shortest_path_ldbc100 q37 78.34 82.58 -4.25 (-5.14%)
shortest_path_ldbc100 q38 337.44 329.70 7.74 (2.35%)
shortest_path_ldbc100 q39 86.94 86.66 0.28 (0.32%)
shortest_path_ldbc100 q40 506.17 520.47 -14.30 (-2.75%)
var_size_expr_evaluator q03 2089.80 2116.73 -26.93 (-1.27%)
var_size_expr_evaluator q04 2132.35 2114.50 17.85 (0.84%)
var_size_expr_evaluator q05 2518.31 2618.15 -99.84 (-3.81%)
var_size_expr_evaluator q06 1255.21 1260.86 -5.65 (-0.45%)
var_size_seq_scan q19 1336.06 1345.72 -9.66 (-0.72%)
var_size_seq_scan q20 2644.59 2496.69 147.91 (5.92%)
var_size_seq_scan q21 2192.24 2169.41 22.83 (1.05%)
var_size_seq_scan q22 107.97 109.78 -1.81 (-1.65%)

@andyfengHKU left a comment


@benjaminwinger should come back and review the lock

@ray6080 merged commit 3d154c8 into master, Aug 8, 2025
27 of 28 checks passed
@ray6080 deleted the mvcc-insertions branch August 8, 2025 04:24
The review comment below is attached to this change in reserveInserts, which replaces the unsynchronized one-liner with a locked version:

-void reserveInserts(uint64_t newEntries) { localInsertions.reserve(newEntries); }
+void reserveInserts(uint64_t newEntries) {
+    std::unique_lock xLock{mtx};
+    localInsertions.reserve(newEntries);
+}

I don't think this is the right behaviour here (though it should be safe). If we have multiple threads bulk reserving, we should reserve the combined amount.
This also applies to reserveSpaceForAppend. Maybe the solution is to track capacity explicitly and merge the two functions, so that in both situations we just add to the tracked capacity. Even now, I think reserveInserts could be replaced by reserveSpaceForAppend, since reserveInserts is only called when the number of existing entries is zero. That should be safe: local storage gets cleared during rollbacks, so if there is an interruption we won't just keep growing the capacity when the reserve is retried.
