Skip to content

Conversation

nikita-tkachenko-datadog
Copy link
Contributor

@nikita-tkachenko-datadog nikita-tkachenko-datadog commented Sep 1, 2025

What Does This Do

Migrates AgentTestRunner (base class for instrumentation tests) from JUnit4-based SpockRunner to JUnit5-based SpockExtension.

Summary of the changes:

  • Replaced SpockRunner with JUnit 5-based TestClassShadowingExtension and TooManyInvocationsErrorHandler
  • Renamed AgentTestRunner to InstrumentationSpecification
  • Extracted instrumentation-specific testing logic to a separate module: :dd-java-agent:instrumentation-testing
  • Extracted Appsec test fixtures to a separate module: :dd-java-agent:appsec:appsec-test-fixtures
  • Extracted IAST test fixtures to a separate module: :dd-java-agent:agent-iast:iast-test-fixtures
  • Extracted CI Visibility test fixtures to separate modules: :dd-java-agent:agent-ci-visibility:civisibility-test-fixtures (shared between CI Vis instrumentation and smoke tests) and :dd-java-agent:agent-ci-visibility:civisibility-instrumentation-test-fixtures (for CI Vis instrumentation tests)
  • Fixed individual tests failing after the migration (see notes below)

Motivation

Required for supporting JDK 25 and JUnit 6 instrumentation.

The tracer uses a Spock version that runs on top of JUnit 5.
Previously it was using JUnitPlatformRunner to execute tests "in a JUnit 4 environment".
In recent JUnit versions the platform runner is no longer supported so we have to fully migrate to JUnit 5.

Additional Notes

Instrumentation tests need to patch the bootstrap classpath adding some core tracer classes to it.
The classes added to the bootstrap classpath cannot be loaded before the classpath patching takes place - if they're loaded by the application classloader first, then these application CP versions will be used instead of the bootstrap ones.

When running instrumentation tests with JUnit 5 this becomes a problem: the framework scans classpath to determine which tests to run. When scanning the classpath, it loads classes - including the classes that should be appended to the bootstrap classpath.

To make sure classpath patching happens before scanning, a custom org.junit.platform.launcher.LauncherSessionListener implementation was added: datadog.trace.agent.test.BootstrapClasspathSetup.
LauncherSessionListener implementations are discovered using Java ServiceLoader from META-INF/services/org.junit.platform.launcher.LauncherSessionListener files available on the classpath.
The overridden org.junit.platform.launcher.LauncherSessionListener#launcherSessionOpened method is executed early in the testing framework lifecycle - before the classpath scanning occurs.

One problem with this approach is that it will trigger classpath patching for every execution that has BootstrapClasspathSetup in the classpath (since it will be discovered by the service loader automatically).
This includes regular unit tests, where patching the bootstrap classpath caused failures in some cases.

To avoid this, BootstrapClasspathSetup was moved to a new Gradle module instrumentation-testing, which is only added to the dependencies of the instrumentation modules that need it.

Test fixtures in some of the product modules (Appsec, IAST, CI Visibility) had to be split into separate modules for the same reason: they referenced AgentTestRunner and BootstrapClasspathSetup, so they had to be excluded from the test classpath of their parent modules (:dd-java-agent:agent-appsec, :dd-java-agent:agent-iast, :dd-java-agent:agent-civisibility) to avoid interfering with the unit tests there.

Some instrumentation tests started failing after migration to JUnit 5. They are fixed in this PR. Here are the main reasons for the failures:

  • tests were not discovered with JUnit 4 so they weren't run before this change
  • change in the order of tests execution and the resulting uncovered dependencies between tests (such as not clearing captured traces before/after tests, or not resetting config system properties)
  • JUnit 5's classpath scanning loaded some classes that the tested instrumentation needed to patch; the classes were loaded before the instrumentation could patch them (and subsequent retransformation failed because superclass/interfaces needed to be changed)

Contributor Checklist

Jira ticket: [PROJ-IDENT]

@datadog-datadog-prod-us1
Copy link
Contributor

datadog-datadog-prod-us1 bot commented Sep 1, 2025

🎯 Code Coverage
Patch Coverage: 100.00%
Total Coverage: 60.13% (+2.27%)

View detailed report

This comment will be updated automatically if new data arrives.
🔗 Commit SHA: 36f5715 | Docs | Was this helpful? Give us feedback!

@pr-commenter
Copy link

pr-commenter bot commented Sep 1, 2025

Benchmarks

Startup

Parameters

Baseline Candidate
baseline_or_candidate baseline candidate
git_branch master nikita-tkachenko/junit5-testing
git_commit_date 1757412728 1757440709
git_commit_sha f284153 36f5715
release_version 1.54.0-SNAPSHOT~f284153719 1.53.0-SNAPSHOT~36f5715401
See matching parameters
Baseline Candidate
application insecure-bank insecure-bank
ci_job_date 1757442676 1757442676
ci_job_id 1121142283 1121142283
ci_pipeline_id 75989158 75989158
cpu_model Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz
kernel_version Linux runner-zfyrx7zua-project-304-concurrent-0-umxsuvco 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux Linux runner-zfyrx7zua-project-304-concurrent-0-umxsuvco 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
module Agent Agent
parent None None

Summary

Found 1 performance improvements and 0 performance regressions! Performance is the same for 44 metrics, 14 unstable metrics.

scenario Δ mean execution_time candidate mean execution_time baseline mean execution_time
scenario:startup:petclinic:iast:Remote Config better
[-50.562µs; -13.346µs] or [-8.006%; -2.113%]
599.630µs 631.584µs
Startup time reports for petclinic
gantt
    title petclinic - global startup overhead: candidate=1.53.0-SNAPSHOT~36f5715401, baseline=1.54.0-SNAPSHOT~f284153719

    dateFormat X
    axisFormat %s
section tracing
Agent [baseline] (1.045 s) : 0, 1045312
Total [baseline] (10.748 s) : 0, 10748374
Agent [candidate] (1.047 s) : 0, 1046645
Total [candidate] (10.694 s) : 0, 10694264
section appsec
Agent [baseline] (1.225 s) : 0, 1224740
Total [baseline] (10.788 s) : 0, 10788375
Agent [candidate] (1.222 s) : 0, 1221830
Total [candidate] (10.808 s) : 0, 10807532
section iast
Agent [baseline] (1.2 s) : 0, 1199539
Total [baseline] (11.031 s) : 0, 11030838
Agent [candidate] (1.181 s) : 0, 1180677
Total [candidate] (10.949 s) : 0, 10949265
section profiling
Agent [baseline] (1.205 s) : 0, 1205474
Total [baseline] (11.069 s) : 0, 11069208
Agent [candidate] (1.22 s) : 0, 1219816
Total [candidate] (10.944 s) : 0, 10943854
Loading
  • baseline results
Module Variant Duration Δ tracing
Agent tracing 1.045 s -
Agent appsec 1.225 s 179.428 ms (17.2%)
Agent iast 1.2 s 154.227 ms (14.8%)
Agent profiling 1.205 s 160.162 ms (15.3%)
Total tracing 10.748 s -
Total appsec 10.788 s 40.002 ms (0.4%)
Total iast 11.031 s 282.464 ms (2.6%)
Total profiling 11.069 s 320.834 ms (3.0%)
  • candidate results
Module Variant Duration Δ tracing
Agent tracing 1.047 s -
Agent appsec 1.222 s 175.185 ms (16.7%)
Agent iast 1.181 s 134.032 ms (12.8%)
Agent profiling 1.22 s 173.17 ms (16.5%)
Total tracing 10.694 s -
Total appsec 10.808 s 113.268 ms (1.1%)
Total iast 10.949 s 255.001 ms (2.4%)
Total profiling 10.944 s 249.589 ms (2.3%)
gantt
    title petclinic - break down per module: candidate=1.53.0-SNAPSHOT~36f5715401, baseline=1.54.0-SNAPSHOT~f284153719

    dateFormat X
    axisFormat %s
section tracing
crashtracking [baseline] (1.45 ms) : 0, 1450
crashtracking [candidate] (1.449 ms) : 0, 1449
BytebuddyAgent [baseline] (730.704 ms) : 0, 730704
BytebuddyAgent [candidate] (732.178 ms) : 0, 732178
GlobalTracer [baseline] (242.532 ms) : 0, 242532
GlobalTracer [candidate] (242.583 ms) : 0, 242583
AppSec [baseline] (30.026 ms) : 0, 30026
AppSec [candidate] (29.992 ms) : 0, 29992
Debugger [baseline] (6.423 ms) : 0, 6423
Debugger [candidate] (6.38 ms) : 0, 6380
Remote Config [baseline] (667.263 µs) : 0, 667
Remote Config [candidate] (665.638 µs) : 0, 666
Telemetry [baseline] (12.382 ms) : 0, 12382
Telemetry [candidate] (12.226 ms) : 0, 12226
section appsec
crashtracking [baseline] (1.457 ms) : 0, 1457
crashtracking [candidate] (1.447 ms) : 0, 1447
BytebuddyAgent [baseline] (755.997 ms) : 0, 755997
BytebuddyAgent [candidate] (754.335 ms) : 0, 754335
GlobalTracer [baseline] (235.935 ms) : 0, 235935
GlobalTracer [candidate] (234.803 ms) : 0, 234803
AppSec [baseline] (169.287 ms) : 0, 169287
AppSec [candidate] (170.038 ms) : 0, 170038
Debugger [baseline] (7.501 ms) : 0, 7501
Debugger [candidate] (7.554 ms) : 0, 7554
Remote Config [baseline] (623.903 µs) : 0, 624
Remote Config [candidate] (617.879 µs) : 0, 618
Telemetry [baseline] (9.341 ms) : 0, 9341
Telemetry [candidate] (8.5 ms) : 0, 8500
IAST [baseline] (23.479 ms) : 0, 23479
IAST [candidate] (23.481 ms) : 0, 23481
section iast
crashtracking [baseline] (1.488 ms) : 0, 1488
crashtracking [candidate] (1.457 ms) : 0, 1457
BytebuddyAgent [baseline] (866.936 ms) : 0, 866936
BytebuddyAgent [candidate] (852.015 ms) : 0, 852015
GlobalTracer [baseline] (235.755 ms) : 0, 235755
GlobalTracer [candidate] (233.238 ms) : 0, 233238
AppSec [baseline] (28.023 ms) : 0, 28023
AppSec [candidate] (26.796 ms) : 0, 26796
Debugger [baseline] (7.101 ms) : 0, 7101
Debugger [candidate] (6.979 ms) : 0, 6979
Remote Config [baseline] (631.584 µs) : 0, 632
Remote Config [candidate] (599.63 µs) : 0, 600
Telemetry [baseline] (8.543 ms) : 0, 8543
Telemetry [candidate] (8.303 ms) : 0, 8303
IAST [baseline] (29.713 ms) : 0, 29713
IAST [candidate] (30.124 ms) : 0, 30124
section profiling
ProfilingAgent [baseline] (109.65 ms) : 0, 109650
ProfilingAgent [candidate] (110.996 ms) : 0, 110996
crashtracking [baseline] (1.423 ms) : 0, 1423
crashtracking [candidate] (1.461 ms) : 0, 1461
BytebuddyAgent [baseline] (763.716 ms) : 0, 763716
BytebuddyAgent [candidate] (775.236 ms) : 0, 775236
GlobalTracer [baseline] (225.356 ms) : 0, 225356
GlobalTracer [candidate] (225.403 ms) : 0, 225403
AppSec [baseline] (30.893 ms) : 0, 30893
AppSec [candidate] (31.174 ms) : 0, 31174
Debugger [baseline] (7.465 ms) : 0, 7465
Debugger [candidate] (6.869 ms) : 0, 6869
Remote Config [baseline] (744.075 µs) : 0, 744
Remote Config [candidate] (702.379 µs) : 0, 702
Telemetry [baseline] (15.676 ms) : 0, 15676
Telemetry [candidate] (16.551 ms) : 0, 16551
Profiling [baseline] (110.339 ms) : 0, 110339
Profiling [candidate] (111.698 ms) : 0, 111698
Loading
Startup time reports for insecure-bank
gantt
    title insecure-bank - global startup overhead: candidate=1.53.0-SNAPSHOT~36f5715401, baseline=1.54.0-SNAPSHOT~f284153719

    dateFormat X
    axisFormat %s
section tracing
Agent [baseline] (1.046 s) : 0, 1046471
Total [baseline] (8.64 s) : 0, 8640311
Agent [candidate] (1.048 s) : 0, 1048035
Total [candidate] (8.632 s) : 0, 8631900
section iast
Agent [baseline] (1.177 s) : 0, 1177439
Total [baseline] (9.375 s) : 0, 9374510
Agent [candidate] (1.18 s) : 0, 1179955
Total [candidate] (9.377 s) : 0, 9377311
Loading
  • baseline results
Module Variant Duration Δ tracing
Agent tracing 1.046 s -
Agent iast 1.177 s 130.968 ms (12.5%)
Total tracing 8.64 s -
Total iast 9.375 s 734.199 ms (8.5%)
  • candidate results
Module Variant Duration Δ tracing
Agent tracing 1.048 s -
Agent iast 1.18 s 131.92 ms (12.6%)
Total tracing 8.632 s -
Total iast 9.377 s 745.411 ms (8.6%)
gantt
    title insecure-bank - break down per module: candidate=1.53.0-SNAPSHOT~36f5715401, baseline=1.54.0-SNAPSHOT~f284153719

    dateFormat X
    axisFormat %s
section tracing
crashtracking [baseline] (1.455 ms) : 0, 1455
crashtracking [candidate] (1.458 ms) : 0, 1458
BytebuddyAgent [baseline] (733.322 ms) : 0, 733322
BytebuddyAgent [candidate] (733.05 ms) : 0, 733050
GlobalTracer [baseline] (242.44 ms) : 0, 242440
GlobalTracer [candidate] (242.14 ms) : 0, 242140
AppSec [baseline] (30.153 ms) : 0, 30153
AppSec [candidate] (30.065 ms) : 0, 30065
Debugger [baseline] (6.401 ms) : 0, 6401
Debugger [candidate] (6.414 ms) : 0, 6414
Remote Config [baseline] (672.594 µs) : 0, 673
Remote Config [candidate] (674.611 µs) : 0, 675
Telemetry [baseline] (10.83 ms) : 0, 10830
Telemetry [candidate] (13.124 ms) : 0, 13124
section iast
crashtracking [baseline] (1.449 ms) : 0, 1449
crashtracking [candidate] (1.45 ms) : 0, 1450
BytebuddyAgent [baseline] (849.765 ms) : 0, 849765
BytebuddyAgent [candidate] (851.526 ms) : 0, 851526
GlobalTracer [baseline] (232.204 ms) : 0, 232204
GlobalTracer [candidate] (232.798 ms) : 0, 232798
AppSec [baseline] (28.578 ms) : 0, 28578
AppSec [candidate] (26.118 ms) : 0, 26118
Debugger [baseline] (7.01 ms) : 0, 7010
Debugger [candidate] (7.837 ms) : 0, 7837
Remote Config [baseline] (606.656 µs) : 0, 607
Remote Config [candidate] (673.609 µs) : 0, 674
Telemetry [baseline] (8.261 ms) : 0, 8261
Telemetry [candidate] (8.324 ms) : 0, 8324
IAST [baseline] (28.538 ms) : 0, 28538
IAST [candidate] (30.237 ms) : 0, 30237
Loading

Load

Parameters

Baseline Candidate
baseline_or_candidate baseline candidate
git_branch master nikita-tkachenko/junit5-testing
git_commit_date 1757412728 1757440709
git_commit_sha f284153 36f5715
release_version 1.54.0-SNAPSHOT~f284153719 1.53.0-SNAPSHOT~36f5715401
See matching parameters
Baseline Candidate
application insecure-bank insecure-bank
ci_job_date 1757442262 1757442262
ci_job_id 1121142284 1121142284
ci_pipeline_id 75989158 75989158
cpu_model Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz
kernel_version Linux runner-zfyrx7zua-project-304-concurrent-0-kvczgj9c 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux Linux runner-zfyrx7zua-project-304-concurrent-0-kvczgj9c 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux

Summary

Found 2 performance improvements and 1 performance regressions! Performance is the same for 9 metrics, 12 unstable metrics.

scenario Δ mean http_req_duration Δ mean throughput candidate mean http_req_duration candidate mean throughput baseline mean http_req_duration baseline mean throughput
scenario:load:insecure-bank:no_agent:high_load worse
[+178.715µs; +285.523µs] or [+4.119%; +6.580%]
unstable
[-170.954op/s; +65.579op/s] or [-16.163%; +6.200%]
4.571ms 1005.000op/s 4.339ms 1057.688op/s
scenario:load:insecure-bank:iast_FULL:high_load better
[-1004.635µs; -412.873µs] or [-7.007%; -2.880%]
unstable
[-21.869op/s; +55.244op/s] or [-6.726%; +16.992%]
13.630ms 341.812op/s 14.338ms 325.125op/s
scenario:load:petclinic:iast:high_load better
[-2.797ms; -1.944ms] or [-6.057%; -4.210%]
unstable
[-1.861op/s; +12.761op/s] or [-1.837%; +12.593%]
43.810ms 106.787op/s 46.181ms 101.338op/s
Request duration reports for petclinic
gantt
    title petclinic - request duration [CI 0.99] : candidate=1.53.0-SNAPSHOT~36f5715401, baseline=1.54.0-SNAPSHOT~f284153719
    dateFormat X
    axisFormat %s
section baseline
no_agent (37.476 ms) : 37186, 37766
.   : milestone, 37476,
appsec (47.513 ms) : 47082, 47945
.   : milestone, 47513,
code_origins (44.513 ms) : 44131, 44895
.   : milestone, 44513,
iast (46.181 ms) : 45771, 46591
.   : milestone, 46181,
profiling (50.32 ms) : 49865, 50775
.   : milestone, 50320,
tracing (43.262 ms) : 42888, 43636
.   : milestone, 43262,
section candidate
no_agent (37.215 ms) : 36908, 37522
.   : milestone, 37215,
appsec (48.284 ms) : 47849, 48719
.   : milestone, 48284,
code_origins (44.729 ms) : 44343, 45115
.   : milestone, 44729,
iast (43.81 ms) : 43428, 44192
.   : milestone, 43810,
profiling (51.014 ms) : 50494, 51534
.   : milestone, 51014,
tracing (43.856 ms) : 43488, 44224
.   : milestone, 43856,
Loading
  • baseline results
Variant Request duration [CI 0.99] Δ no_agent
no_agent 37.476 ms [37.186 ms, 37.766 ms] -
appsec 47.513 ms [47.082 ms, 47.945 ms] 10.037 ms (26.8%)
code_origins 44.513 ms [44.131 ms, 44.895 ms] 7.037 ms (18.8%)
iast 46.181 ms [45.771 ms, 46.591 ms] 8.705 ms (23.2%)
profiling 50.32 ms [49.865 ms, 50.775 ms] 12.844 ms (34.3%)
tracing 43.262 ms [42.888 ms, 43.636 ms] 5.786 ms (15.4%)
  • candidate results
Variant Request duration [CI 0.99] Δ no_agent
no_agent 37.215 ms [36.908 ms, 37.522 ms] -
appsec 48.284 ms [47.849 ms, 48.719 ms] 11.069 ms (29.7%)
code_origins 44.729 ms [44.343 ms, 45.115 ms] 7.514 ms (20.2%)
iast 43.81 ms [43.428 ms, 44.192 ms] 6.595 ms (17.7%)
profiling 51.014 ms [50.494 ms, 51.534 ms] 13.799 ms (37.1%)
tracing 43.856 ms [43.488 ms, 44.224 ms] 6.642 ms (17.8%)
Request duration reports for insecure-bank
gantt
    title insecure-bank - request duration [CI 0.99] : candidate=1.53.0-SNAPSHOT~36f5715401, baseline=1.54.0-SNAPSHOT~f284153719
    dateFormat X
    axisFormat %s
section baseline
no_agent (4.339 ms) : 4291, 4387
.   : milestone, 4339,
iast (9.224 ms) : 9073, 9374
.   : milestone, 9224,
iast_FULL (14.338 ms) : 14055, 14622
.   : milestone, 14338,
iast_GLOBAL (10.544 ms) : 10356, 10732
.   : milestone, 10544,
profiling (8.884 ms) : 8733, 9034
.   : milestone, 8884,
tracing (7.821 ms) : 7709, 7934
.   : milestone, 7821,
section candidate
no_agent (4.571 ms) : 4520, 4622
.   : milestone, 4571,
iast (9.411 ms) : 9256, 9566
.   : milestone, 9411,
iast_FULL (13.63 ms) : 13364, 13896
.   : milestone, 13630,
iast_GLOBAL (10.577 ms) : 10388, 10765
.   : milestone, 10577,
profiling (8.628 ms) : 8493, 8763
.   : milestone, 8628,
tracing (7.767 ms) : 7648, 7885
.   : milestone, 7767,
Loading
  • baseline results
Variant Request duration [CI 0.99] Δ no_agent
no_agent 4.339 ms [4.291 ms, 4.387 ms] -
iast 9.224 ms [9.073 ms, 9.374 ms] 4.885 ms (112.6%)
iast_FULL 14.338 ms [14.055 ms, 14.622 ms] 9.999 ms (230.4%)
iast_GLOBAL 10.544 ms [10.356 ms, 10.732 ms] 6.205 ms (143.0%)
profiling 8.884 ms [8.733 ms, 9.034 ms] 4.545 ms (104.7%)
tracing 7.821 ms [7.709 ms, 7.934 ms] 3.482 ms (80.3%)
  • candidate results
Variant Request duration [CI 0.99] Δ no_agent
no_agent 4.571 ms [4.52 ms, 4.622 ms] -
iast 9.411 ms [9.256 ms, 9.566 ms] 4.84 ms (105.9%)
iast_FULL 13.63 ms [13.364 ms, 13.896 ms] 9.058 ms (198.2%)
iast_GLOBAL 10.577 ms [10.388 ms, 10.765 ms] 6.005 ms (131.4%)
profiling 8.628 ms [8.493 ms, 8.763 ms] 4.057 ms (88.7%)
tracing 7.767 ms [7.648 ms, 7.885 ms] 3.195 ms (69.9%)

Dacapo

Parameters

Baseline Candidate
baseline_or_candidate baseline candidate
git_branch master nikita-tkachenko/junit5-testing
git_commit_date 1757412728 1757440709
git_commit_sha f284153 36f5715
release_version 1.54.0-SNAPSHOT~f284153719 1.53.0-SNAPSHOT~36f5715401
See matching parameters
Baseline Candidate
application biojava biojava
ci_job_date 1757442783 1757442783
ci_job_id 1121142285 1121142285
ci_pipeline_id 75989158 75989158
cpu_model Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz
kernel_version Linux runner-zfyrx7zua-project-304-concurrent-0-pzfwlt0n 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux Linux runner-zfyrx7zua-project-304-concurrent-0-pzfwlt0n 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux

Summary

Found 0 performance improvements and 0 performance regressions! Performance is the same for 11 metrics, 1 unstable metrics.

Execution time for biojava
gantt
    title biojava - execution time [CI 0.99] : candidate=1.53.0-SNAPSHOT~36f5715401, baseline=1.54.0-SNAPSHOT~f284153719
    dateFormat X
    axisFormat %s
section baseline
no_agent (14.869 s) : 14869000, 14869000
.   : milestone, 14869000,
appsec (14.783 s) : 14783000, 14783000
.   : milestone, 14783000,
iast (18.564 s) : 18564000, 18564000
.   : milestone, 18564000,
iast_GLOBAL (17.457 s) : 17457000, 17457000
.   : milestone, 17457000,
profiling (15.511 s) : 15511000, 15511000
.   : milestone, 15511000,
tracing (14.851 s) : 14851000, 14851000
.   : milestone, 14851000,
section candidate
no_agent (15.437 s) : 15437000, 15437000
.   : milestone, 15437000,
appsec (14.988 s) : 14988000, 14988000
.   : milestone, 14988000,
iast (18.964 s) : 18964000, 18964000
.   : milestone, 18964000,
iast_GLOBAL (18.306 s) : 18306000, 18306000
.   : milestone, 18306000,
profiling (15.835 s) : 15835000, 15835000
.   : milestone, 15835000,
tracing (15.134 s) : 15134000, 15134000
.   : milestone, 15134000,
Loading
  • baseline results
Variant Execution Time [CI 0.99] Δ no_agent
no_agent 14.869 s [14.869 s, 14.869 s] -
appsec 14.783 s [14.783 s, 14.783 s] -86.0 ms (-0.6%)
iast 18.564 s [18.564 s, 18.564 s] 3.695 s (24.9%)
iast_GLOBAL 17.457 s [17.457 s, 17.457 s] 2.588 s (17.4%)
profiling 15.511 s [15.511 s, 15.511 s] 642.0 ms (4.3%)
tracing 14.851 s [14.851 s, 14.851 s] -18.0 ms (-0.1%)
  • candidate results
Variant Execution Time [CI 0.99] Δ no_agent
no_agent 15.437 s [15.437 s, 15.437 s] -
appsec 14.988 s [14.988 s, 14.988 s] -449.0 ms (-2.9%)
iast 18.964 s [18.964 s, 18.964 s] 3.527 s (22.8%)
iast_GLOBAL 18.306 s [18.306 s, 18.306 s] 2.869 s (18.6%)
profiling 15.835 s [15.835 s, 15.835 s] 398.0 ms (2.6%)
tracing 15.134 s [15.134 s, 15.134 s] -303.0 ms (-2.0%)
Execution time for tomcat
gantt
    title tomcat - execution time [CI 0.99] : candidate=1.53.0-SNAPSHOT~36f5715401, baseline=1.54.0-SNAPSHOT~f284153719
    dateFormat X
    axisFormat %s
section baseline
no_agent (1.474 ms) : 1462, 1485
.   : milestone, 1474,
appsec (3.589 ms) : 3377, 3801
.   : milestone, 3589,
iast (2.196 ms) : 2134, 2259
.   : milestone, 2196,
iast_GLOBAL (2.241 ms) : 2178, 2304
.   : milestone, 2241,
profiling (2.054 ms) : 2003, 2106
.   : milestone, 2054,
tracing (2.009 ms) : 1961, 2057
.   : milestone, 2009,
section candidate
no_agent (1.472 ms) : 1460, 1483
.   : milestone, 1472,
appsec (3.659 ms) : 3442, 3876
.   : milestone, 3659,
iast (2.204 ms) : 2141, 2266
.   : milestone, 2204,
iast_GLOBAL (2.238 ms) : 2174, 2301
.   : milestone, 2238,
profiling (2.058 ms) : 2006, 2110
.   : milestone, 2058,
tracing (2.015 ms) : 1966, 2064
.   : milestone, 2015,
Loading
  • baseline results
Variant Execution Time [CI 0.99] Δ no_agent
no_agent 1.474 ms [1.462 ms, 1.485 ms] -
appsec 3.589 ms [3.377 ms, 3.801 ms] 2.115 ms (143.5%)
iast 2.196 ms [2.134 ms, 2.259 ms] 722.812 µs (49.0%)
iast_GLOBAL 2.241 ms [2.178 ms, 2.304 ms] 767.481 µs (52.1%)
profiling 2.054 ms [2.003 ms, 2.106 ms] 580.851 µs (39.4%)
tracing 2.009 ms [1.961 ms, 2.057 ms] 535.077 µs (36.3%)
  • candidate results
Variant Execution Time [CI 0.99] Δ no_agent
no_agent 1.472 ms [1.46 ms, 1.483 ms] -
appsec 3.659 ms [3.442 ms, 3.876 ms] 2.187 ms (148.6%)
iast 2.204 ms [2.141 ms, 2.266 ms] 731.8 µs (49.7%)
iast_GLOBAL 2.238 ms [2.174 ms, 2.301 ms] 765.707 µs (52.0%)
profiling 2.058 ms [2.006 ms, 2.11 ms] 585.968 µs (39.8%)
tracing 2.015 ms [1.966 ms, 2.064 ms] 542.937 µs (36.9%)

@pr-commenter
Copy link

pr-commenter bot commented Sep 1, 2025

Kafka / producer-benchmark

Parameters

Baseline Candidate
baseline_or_candidate baseline candidate
git_branch master nikita-tkachenko/junit5-testing
git_commit_date 1757412728 1757440709
git_commit_sha f284153 36f5715
See matching parameters
Baseline Candidate
ci_job_date 1757441912 1757441912
ci_job_id 1121142288 1121142288
ci_pipeline_id 75989158 75989158
cpu_model Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz
jdkVersion 11.0.25 11.0.25
jmhVersion 1.36 1.36
jvm /usr/lib/jvm/java-11-openjdk-amd64/bin/java /usr/lib/jvm/java-11-openjdk-amd64/bin/java
jvmArgs -Dfile.encoding=UTF-8 -Djava.io.tmpdir=/go/src/github.com/DataDog/apm-reliability/dd-trace-java/platform/src/producer-benchmark/build/tmp/jmh -Duser.country=US -Duser.language=en -Duser.variant -Dfile.encoding=UTF-8 -Djava.io.tmpdir=/go/src/github.com/DataDog/apm-reliability/dd-trace-java/platform/src/producer-benchmark/build/tmp/jmh -Duser.country=US -Duser.language=en -Duser.variant
vmName OpenJDK 64-Bit Server VM OpenJDK 64-Bit Server VM
vmVersion 11.0.25+9-post-Ubuntu-1ubuntu122.04 11.0.25+9-post-Ubuntu-1ubuntu122.04

Summary

Found 0 performance improvements and 0 performance regressions! Performance is the same for 3 metrics, 0 unstable metrics.

See unchanged results
scenario Δ mean throughput
scenario:not-instrumented/KafkaProduceBenchmark.benchProduce same
scenario:only-tracing-dsm-disabled-benchmarks/KafkaProduceBenchmark.benchProduce same
scenario:only-tracing-dsm-enabled-benchmarks/KafkaProduceBenchmark.benchProduce same

@pr-commenter
Copy link

pr-commenter bot commented Sep 1, 2025

Kafka / consumer-benchmark

Parameters

Baseline Candidate
baseline_or_candidate baseline candidate
git_branch master nikita-tkachenko/junit5-testing
git_commit_date 1757412728 1757440709
git_commit_sha f284153 36f5715
See matching parameters
Baseline Candidate
ci_job_date 1757441954 1757441954
ci_job_id 1121142289 1121142289
ci_pipeline_id 75989158 75989158
cpu_model Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz
jdkVersion 11.0.25 11.0.25
jmhVersion 1.36 1.36
jvm /usr/lib/jvm/java-11-openjdk-amd64/bin/java /usr/lib/jvm/java-11-openjdk-amd64/bin/java
jvmArgs -Dfile.encoding=UTF-8 -Djava.io.tmpdir=/go/src/github.com/DataDog/apm-reliability/dd-trace-java/platform/src/consumer-benchmark/build/tmp/jmh -Duser.country=US -Duser.language=en -Duser.variant -Dfile.encoding=UTF-8 -Djava.io.tmpdir=/go/src/github.com/DataDog/apm-reliability/dd-trace-java/platform/src/consumer-benchmark/build/tmp/jmh -Duser.country=US -Duser.language=en -Duser.variant
vmName OpenJDK 64-Bit Server VM OpenJDK 64-Bit Server VM
vmVersion 11.0.25+9-post-Ubuntu-1ubuntu122.04 11.0.25+9-post-Ubuntu-1ubuntu122.04

Summary

Found 0 performance improvements and 0 performance regressions! Performance is the same for 3 metrics, 0 unstable metrics.

See unchanged results
scenario Δ mean throughput
scenario:not-instrumented/KafkaConsumerBenchmark.benchConsume same
scenario:only-tracing-dsm-disabled-benchmarks/KafkaConsumerBenchmark.benchConsume same
scenario:only-tracing-dsm-enabled-benchmarks/KafkaConsumerBenchmark.benchConsume same

@PerfectSlayer
Copy link
Contributor

I would like to have a look Today or early Monday if it’s not too late 🙏

Comment on lines +91 to +109
iastTest {
filter {
// This class must be excluded from scanning because it references class from "org.apache.pekko.http" package.
// When JUnit 5 scans this class, it loads every other class that is present in its method signatures (arguments, return types, throws).
// As the result, some classes from the datadog.trace.instrumentation.pekkohttp.iast.MakeTaintableInstrumentation#knownMatchingTypes list are loaded.
// JUnit scanning (and class loading that it triggers) happens before instrumentations are applied.
// Later when MakeTaintableInstrumentation is applied (in InstrumentationSpecification#setupSpec),
// it fails to retransform the already-loaded org.apache.pekko.http classes,
// because it needs to change the set of implemented interfaces, which is not possible for already loaded classes.
excludeTestsMatching("*PekkoIastTestWebServer*")
}
}

latestDepIastTest {
filter {
// Exclude the same problematic class as for iastTest to avoid class loading issues.
excludeTestsMatching("*PekkoIastTestWebServer*")
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

praise: Great investigation !

Copy link
Contributor

@PerfectSlayer PerfectSlayer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

📝 notes: Posting the current state of my review to not loose. I will continue later Today or Tomorrow but the changes look great already!

@nikita-tkachenko-datadog nikita-tkachenko-datadog requested a review from a team as a code owner September 8, 2025 15:03
Copy link
Contributor

@PerfectSlayer PerfectSlayer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I’m halfway through but posting comments as I fear to loose the review progress 😓 - the GitHub UI is not very stable with so many changes 😅

@@ -862,7 +862,22 @@ abstract class HttpClientTest extends VersionedNamingTestBase {
"$DDTags.PATHWAY_HASH" { String }
}
if (exception) {
errorTags(exception.class, exception.message)
// PlayWS classes throw different exception types for the same connection failures
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🎯 suggestion: ‏I wonder if there isn’t a cleaner way to avoid introducing coupling the generic test client with Play instrumentation - especially since it’s coupled with a lot of product already.

What about adding a method assertSpanErrorTag(Class<Throwable> errorType, message) that will delegate to the usual DSL TagsAssert.errorTags but can be overridden into the Play framework instrumentation tests to loosen the assert a bit?

@@ -3,6 +3,10 @@ plugins {
id 'me.champeau.jmh'
}

ext {
latestDepJava11TestMinJavaVersionForTests = JavaVersion.VERSION_11
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💭 thought: I’m surprised‏ it even runs on instrumentation / JDK8 jobs according the latest dependency requirements 🤷

Comment on lines +29 to +31
forkedTest {
timeout = Duration.of(15, ChronoUnit.MINUTES)
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

📝 notes: Similarly, we need to find out why the migration extended the timeout
cc @AlexeyKuznetsov-DD

Copy link
Contributor

@PerfectSlayer PerfectSlayer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Slowing getting there, 150 files to review left

callback?.call()
if (callback != null) {
// Execute callback in a separate thread to clear trace context
def thread = new Thread({ callback.call() })
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

❔ question: ‏Do you have some context about why this kind of change is needed here @amarziali?
I wonder if starting a new thread is the right solution in term of CI stability or if Tracer.muteTracing() could be a safer alternative?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think what was happening was that the callback was created as a child span, resulting in 2 spans returned instead of 1, which is why the callback is explicitly started in a new thread here. From what I can see, Tracer.muteTracing() mutes the additional span, so this seems like a safer alternative compared to creating a new thread. I can try implementing this!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh sorry -- we do actually need the callback span because that's what the test is checking for, and Tracer.muteTracing() mutes this entirely, so it does not work... Let me know if I'm understanding Tracer.muteTracing() or the test wrong though 😅

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are two tests that can be done:

  1. invoke the callback when the http client was running under a parent span -> the callback must be child of the parent
  2. invoke the callback when the http client was running without a parent span -> the callback must have no parent

PlayWSClientTestBase has testCallbackWithParent to false (I imagine because that one was just not working). So we are just testing the case that the callback must not have a parent. This means that, if the test was failing because the callback was attached to some parent does not means that the test has to be fixed but that the integration is broken. Having run the callback in a separate thread is wrong here.
Rather, we must have investigated why the instrumentation is capturing a bad parent. At this point I would suggest to log a ticket into the idm board in order to have play ws fixed in the future (cc @vandonr for awareness)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah I see. I logged a ticket in the intake column here: https://datadoghq.atlassian.net/browse/AIDM-101

Copy link
Contributor

@PerfectSlayer PerfectSlayer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 chore: Excluded classes from :dd-java-agent:testing build.grade could be updated to removed the old runner exclusion:

// Groovy generates unreachable lines see:
// https://issues.apache.org/jira/browse/GROOVY-9610
'datadog.trace.agent.test.AgentTestRunner',

🧹 chore: ‏Similarly, can we add @DataDog/asm-java as codeowner of the newly introduced modules :dd-java-agent:appsec:appsec-test-fixtures and :dd-java-agent:agent-iast:iast-test-fixtures here:

# @DataDog/asm-java (AppSec/IAST)

👏 praise: ‏Amazing work! Thanks for unblocking instrumented testing using JUnit 5 💪 We tried several times during R&D week and nobody went to the bottom of it. Thanks a lot!

That’s all for me, I will approve and we can follow up in PR  comment discussion or in follow up PR if you would like to get this one merge soon (to avoid merge conflict as much as possible).

Comment on lines +16 to +22
forkedTest {
timeout = Duration.of(15, ChronoUnit.MINUTES)
}

latestDepForkedTest {
timeout = Duration.of(15, ChronoUnit.MINUTES)
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

📝 notes: Same here about investigating timeout increase
cc @AlexeyKuznetsov-DD

@nikita-tkachenko-datadog nikita-tkachenko-datadog merged commit 67cbd2c into master Sep 9, 2025
675 of 676 checks passed
@nikita-tkachenko-datadog nikita-tkachenko-datadog deleted the nikita-tkachenko/junit5-testing branch September 9, 2025 19:53
@github-actions github-actions bot added this to the 1.54.0 milestone Sep 9, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
comp: testing Testing run-tests: all Run all tests tag: no release notes Changes to exclude from release notes
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants