Significant test suite regression #9536

@smklein

I've been investigating this using my hist branch, which analyzes test performance in CI.

#9091 appears to be causing our test suite to run about 16 minutes slower:

 $ cargo xtask hist compare -p linux -n 100     
    Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.25s
     Running `target/debug/xtask hist compare -p linux -n 100`
    Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.31s
     Running `target/debug/omicron-hist compare -p linux -n 100`
Analyzing historical junit-linux test timing for 100 recent commits...
  TOTAL TIME    TESTS   TEST SUITES                    ENVIRONMENT COMMIT
...
    4401.71s     2410           104                            aws 18058fca [feat, multicast] Multicast Group+Member Support (#9091)
    3378.10s     2273           103                            aws ca96cf3a BlueprintBuilder: remove SledEditor::baseboard_id() (#9405)
    3284.24s     2273           103                            aws 69dc064f [ereport-types] less ugly `Display` impl for `EreportId` (#9401)
    3243.36s     2273           103                            aws 0995c5ee Change LocalStorage dataset to non-zoned (#9404)
    3559.12s     2273           103                            aws 50600451 Fix omdb support-bundles download --output (#9400)
    3364.53s     2273           103                            aws 2f7f8074 Allow up to 254 vCPUs to a VM (#9385)
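
As a rough check on the size of the jump, the totals above can be compared directly. The following is a minimal sketch and not part of `omicron-hist`: the helper names `mean` and `slowdown_secs` are hypothetical, and the baseline is assumed to be the mean of the five commits listed before #9091.

```rust
/// Mean of a slice of durations, in seconds.
fn mean(xs: &[f64]) -> f64 {
    xs.iter().sum::<f64>() / xs.len() as f64
}

/// Slowdown of `after` relative to the mean of `baseline`, in seconds.
/// (Hypothetical helper, for illustration only.)
fn slowdown_secs(after: f64, baseline: &[f64]) -> f64 {
    after - mean(baseline)
}

fn main() {
    // TOTAL TIME values copied from the `hist compare` output above.
    let after = 4401.71; // 18058fca (#9091)
    let baseline = [3378.10, 3284.24, 3243.36, 3559.12, 3364.53];

    let delta = slowdown_secs(after, &baseline);
    println!("baseline mean: {:.2}s", mean(&baseline));
    println!("slowdown:      {:.2}s ({:.1} min)", delta, delta / 60.0);
}
```

On these six rows alone the difference comes out at roughly 17 minutes, in the same range as the estimate above. The exact figure depends on which commits are taken as the baseline.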

This is also visible in the following image:

[Image]

Comparing test suites before and after that PR:

# BEFORE

NAME                                                       DURATION   TEST COUNT                                                                                                              
omicron-nexus::test_all                                   10706.46s          493                                                                                                              
omicron-nexus                                              4783.38s          191                                                                                                              
nexus-db-queries                                           1407.59s          391                                                                                                              
oximeter-db                                                 781.55s          248                                                                                                              
nexus-reconfigurator-execution                              304.24s           16                                                                                                              
nexus-mgs-updates                                            83.40s           13                                                                                                              
nexus-reconfigurator-rendezvous                              61.43s            3                                                                                                              
omicron-omdb::test_all_output                                54.87s            3                                                                                                              
oximeter-db::integration_test                                49.13s            3

# AFTER

NAME                                                       DURATION   TEST COUNT              
omicron-nexus::test_all                                   13378.63s          551              
omicron-nexus                                              7653.73s          210              
nexus-db-queries                                           1823.54s          431              
oximeter-db                                                 853.86s          248
nexus-reconfigurator-execution                              378.04s           16
nexus-mgs-updates                                            86.93s           13
nexus-reconfigurator-rendezvous                              72.83s            3
nexus-reconfigurator-cli-integration-tests::integration      59.86s            1
omicron-omdb::test_all_output                                51.23s            3
trust-quorum-protocol::cluster                        

It's not yet clear to me which of the following is true:

  1. This PR was so massive that it added 16 minutes of "needed" tests. That seems like a lot.
  2. This PR caused a regression in the performance of existing tests.

Either way, 16 minutes is a lot; we should investigate.
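
One way to start separating the two possibilities is to look at suites whose test count did not change: new tests cannot explain a slowdown there. The following is a minimal sketch under that assumption. The `Suite` struct and `unchanged_count_deltas` function are hypothetical names, not part of the hist tooling, and the rows are copied from the BEFORE and AFTER tables above (suites without a value in both tables are omitted).

```rust
/// One row from the BEFORE/AFTER tables above (field names are illustrative).
struct Suite {
    name: &'static str,
    before_secs: f64,
    before_tests: u32,
    after_secs: f64,
    after_tests: u32,
}

/// Suites whose test count did not change, paired with their change in
/// duration in seconds. New tests cannot explain a slowdown in these.
fn unchanged_count_deltas(suites: &[Suite]) -> Vec<(&'static str, f64)> {
    suites
        .iter()
        .filter(|s| s.before_tests == s.after_tests)
        .map(|s| (s.name, s.after_secs - s.before_secs))
        .collect()
}

fn main() {
    let suites = [
        Suite { name: "omicron-nexus::test_all", before_secs: 10706.46, before_tests: 493, after_secs: 13378.63, after_tests: 551 },
        Suite { name: "omicron-nexus", before_secs: 4783.38, before_tests: 191, after_secs: 7653.73, after_tests: 210 },
        Suite { name: "nexus-db-queries", before_secs: 1407.59, before_tests: 391, after_secs: 1823.54, after_tests: 431 },
        Suite { name: "oximeter-db", before_secs: 781.55, before_tests: 248, after_secs: 853.86, after_tests: 248 },
        Suite { name: "nexus-reconfigurator-execution", before_secs: 304.24, before_tests: 16, after_secs: 378.04, after_tests: 16 },
        Suite { name: "nexus-mgs-updates", before_secs: 83.40, before_tests: 13, after_secs: 86.93, after_tests: 13 },
        Suite { name: "nexus-reconfigurator-rendezvous", before_secs: 61.43, before_tests: 3, after_secs: 72.83, after_tests: 3 },
        Suite { name: "omicron-omdb::test_all_output", before_secs: 54.87, before_tests: 3, after_secs: 51.23, after_tests: 3 },
    ];

    // Print the change in duration for every suite with an unchanged test count.
    for (name, delta) in unchanged_count_deltas(&suites) {
        println!("{name:<36} {delta:+8.2}s");
    }
}
```

On the rows shown, oximeter-db and nexus-reconfigurator-execution each grow by roughly 70 seconds with no change in test count, which would be consistent with some slowdown in existing tests. Differences of this size may also be run-to-run noise, and most of the total increase sits in the two omicron-nexus suites, which did gain tests.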
