-----============= acceptance-small: replay-single ============----- Mon Mar 16 09:38:15 EDT 2026
mgs: Rocky Linux release 8.10 (Green Obsidian)
MGS_OS_ID_LIKE=rhel centos fedora rocky
MGS_OS_VERSION_ID=8.10
MGS_OS_ID=rocky
MGS_OS_VERSION_CODE=134873088
mds1: Rocky Linux release 8.10 (Green Obsidian)
MDS1_OS_VERSION_ID=8.10
MDS1_OS_VERSION_CODE=134873088
MDS1_OS_ID_LIKE=rhel centos fedora rocky
MDS1_OS_ID=rocky
ost1: Rocky Linux release 8.10 (Green Obsidian)
OST1_OS_VERSION_CODE=134873088
OST1_OS_ID_LIKE=rhel centos fedora rocky
OST1_OS_VERSION_ID=8.10
OST1_OS_ID=rocky
client: Rocky Linux release 8.10 (Green Obsidian)
CLIENT_OS_ID=rocky
CLIENT_OS_VERSION_CODE=134873088
CLIENT_OS_VERSION_ID=8.10
CLIENT_OS_ID_LIKE=rhel centos fedora rocky
oleg146-server: ls: cannot access '/home/green/git/lustre-release/lustre/tests/except/replay-single.*ex': No such file or directory
excepting tests: 110f 131b 59 36
=== replay-single: start setup 09:38:24 (1773668304) ===
oleg146-client.virtnet: executing check_config_client /mnt/lustre
oleg146-client.virtnet: Checking config lustre mounted on /mnt/lustre
Checking servers environments
Checking clients oleg146-client.virtnet environments
Using TIMEOUT=20
osc.lustre-OST0000-osc-ffff9c2f8a548800.idle_timeout=debug
osc.lustre-OST0001-osc-ffff9c2f8a548800.idle_timeout=debug
disable quota as required
oleg146-server: oleg146-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
=== replay-single: finish setup 09:38:41 (1773668321) ===
Cleaning up AT ...
== replay-single test 100a: DNE: create striped dir, drop update rep from MDT1, fail MDT1 ========================================================== 09:38:43 (1773668323)
SKIP: replay-single test_100a needs >= 2 MDTs
SKIP 100a (2s)
== replay-single test 100b: DNE: create striped dir, fail MDT0 ========================================================== 09:38:45 (1773668325)
SKIP: replay-single test_100b needs >= 2 MDTs
SKIP 100b (2s)
== replay-single test 100c: DNE: create striped dir, abort_recov_mdt mds2 ========================================================== 09:38:47 (1773668327)
SKIP: replay-single test_100c needs >= 2 MDTs
SKIP 100c (2s)
== replay-single test 100d: DNE: cancel update logs upon recovery abort ========================================================== 09:38:49 (1773668329)
SKIP: replay-single test_100d needs > 1 MDTs
SKIP 100d (2s)
== replay-single test 100e: DNE: create striped dir on MDT0 and MDT1, fail MDT0, MDT1 ========================================================== 09:38:51 (1773668331)
SKIP: replay-single test_100e needs >= 2 MDTs
SKIP 100e (2s)
== replay-single test 101: Shouldn't reassign precreated objs to other files after recovery ========================================================== 09:38:53 (1773668333)
UUID                   1K-blocks        Used   Available Use% Mounted on
lustre-MDT0000_UUID      2210688        3200     2205440   1% /mnt/lustre[MDT:0]
lustre-OST0000_UUID      3771392        3072     3766272   1% /mnt/lustre[OST:0]
lustre-OST0001_UUID      3771392        3072     3766272   1% /mnt/lustre[OST:1]
filesystem_summary:      7542784        6144     7532544   1% /mnt/lustre
Stopping /mnt/lustre-mds1 (opts:) on oleg146-server
Failover mds1 to oleg146-server
oleg146-server.virtnet
Start mds1: mount -t lustre -o localrecov -o abort_recovery lustre-mdt1/mdt1 /mnt/lustre-mds1
oleg146-server: oleg146-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
pdsh@oleg146-client: oleg146-server: ssh exited with exit code 1
Started lustre-MDT0000
PASS 101 (60s)
== replay-single test 102a: check resend (request lost) with multiple modify RPCs in flight ========================================================== 09:39:54 (1773668394) creating 7 files ... fail_loc=0x159 launch 7 chmod in parallel (09:39:55) ... fail_loc=0 done (09:40:12) /mnt/lustre/d102a.replay-single/file-1 has perms 0600 OK /mnt/lustre/d102a.replay-single/file-2 has perms 0600 OK /mnt/lustre/d102a.replay-single/file-3 has perms 0600 OK /mnt/lustre/d102a.replay-single/file-4 has perms 0600 OK /mnt/lustre/d102a.replay-single/file-5 has perms 0600 OK /mnt/lustre/d102a.replay-single/file-6 has perms 0600 OK /mnt/lustre/d102a.replay-single/file-7 has perms 0600 OK PASS 102a (20s) == replay-single test 102b: check resend (reply lost) with multiple modify RPCs in flight ========================================================== 09:40:14 (1773668414) creating 7 files ... fail_loc=0x15a launch 7 chmod in parallel (09:40:16) ... fail_loc=0 done (09:40:33) /mnt/lustre/d102b.replay-single/file-1 has perms 0600 OK /mnt/lustre/d102b.replay-single/file-2 has perms 0600 OK /mnt/lustre/d102b.replay-single/file-3 has perms 0600 OK /mnt/lustre/d102b.replay-single/file-4 has perms 0600 OK /mnt/lustre/d102b.replay-single/file-5 has perms 0600 OK /mnt/lustre/d102b.replay-single/file-6 has perms 0600 OK /mnt/lustre/d102b.replay-single/file-7 has perms 0600 OK PASS 102b (21s) == replay-single test 102c: check replay w/o reconstruction with multiple mod RPCs in flight ========================================================== 09:40:35 (1773668435) creating 7 files ... UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 2209792 3328 2204416 1% /mnt/lustre[MDT:0] lustre-OST0000_UUID 3771392 3072 3351552 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3771392 6144 3749888 1% /mnt/lustre[OST:1] filesystem_summary: 7542784 9216 7101440 1% /mnt/lustre fail_loc=0x15a launch 7 chmod in parallel (09:40:38) ... 
fail_loc=0 Failing mds1 on oleg146-server Stopping /mnt/lustre-mds1 (opts:) on oleg146-server 09:40:42 (1773668442) shut down facet: mds1 facet_host: oleg146-server facet_failover_host: oleg146-server Failover mds1 to oleg146-server mount facets: mds1 Start mds1: mount -t lustre -o localrecov lustre-mdt1/mdt1 /mnt/lustre-mds1 oleg146-server: oleg146-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg146-client: oleg146-server: ssh exited with exit code 1 Started lustre-MDT0000 09:40:58 (1773668458) targets are mounted 09:40:58 (1773668458) facet_failover done oleg146-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec done (09:41:08) /mnt/lustre/d102c.replay-single/file-1 has perms 0600 OK /mnt/lustre/d102c.replay-single/file-2 has perms 0600 OK /mnt/lustre/d102c.replay-single/file-3 has perms 0600 OK /mnt/lustre/d102c.replay-single/file-4 has perms 0600 OK /mnt/lustre/d102c.replay-single/file-5 has perms 0600 OK /mnt/lustre/d102c.replay-single/file-6 has perms 0600 OK /mnt/lustre/d102c.replay-single/file-7 has perms 0600 OK PASS 102c (36s) == replay-single test 102d: check replay & reconstruction with multiple mod RPCs in flight ========================================================== 09:41:11 (1773668471) creating 7 files ... fail_loc=0x15a launch 7 chmod in parallel (09:41:12) ... 
fail_loc=0 Failing mds1 on oleg146-server Stopping /mnt/lustre-mds1 (opts:) on oleg146-server 09:41:16 (1773668476) shut down facet: mds1 facet_host: oleg146-server facet_failover_host: oleg146-server Failover mds1 to oleg146-server mount facets: mds1 Start mds1: mount -t lustre -o localrecov lustre-mdt1/mdt1 /mnt/lustre-mds1 oleg146-server: oleg146-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg146-client: oleg146-server: ssh exited with exit code 1 Started lustre-MDT0000 09:41:32 (1773668492) targets are mounted 09:41:32 (1773668492) facet_failover done oleg146-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec done (09:41:42) /mnt/lustre/d102d.replay-single/file-1 has perms 0600 OK /mnt/lustre/d102d.replay-single/file-2 has perms 0600 OK /mnt/lustre/d102d.replay-single/file-3 has perms 0600 OK /mnt/lustre/d102d.replay-single/file-4 has perms 0600 OK /mnt/lustre/d102d.replay-single/file-5 has perms 0600 OK /mnt/lustre/d102d.replay-single/file-6 has perms 0600 OK /mnt/lustre/d102d.replay-single/file-7 has perms 0600 OK PASS 102d (33s) == replay-single test 103: Check otr_next_id overflow ==== 09:41:44 (1773668504) fail_loc=0x80000162 total: 30 open/close in 0.24 seconds: 123.21 ops/second Failing mds1 on oleg146-server Stopping /mnt/lustre-mds1 (opts:) on oleg146-server 09:41:49 (1773668509) shut down facet: mds1 facet_host: oleg146-server facet_failover_host: oleg146-server Failover mds1 to oleg146-server mount facets: mds1 Start mds1: mount -t lustre -o localrecov lustre-mdt1/mdt1 /mnt/lustre-mds1 oleg146-server: oleg146-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg146-client: oleg146-server: ssh exited with exit code 1 Started lustre-MDT0000 09:42:04 (1773668524) targets are mounted 09:42:04 
(1773668524) facet_failover done oleg146-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec PASS 103 (29s) == replay-single test 110a: DNE: create striped dir, fail MDT1 ========================================================== 09:42:13 (1773668533) SKIP: replay-single test_110a needs >= 2 MDTs SKIP 110a (1s) == replay-single test 110b: DNE: create striped dir, fail MDT1 and client ========================================================== 09:42:14 (1773668534) SKIP: replay-single test_110b needs >= 2 MDTs SKIP 110b (2s) == replay-single test 110c: DNE: create striped dir, fail MDT2 ========================================================== 09:42:16 (1773668536) SKIP: replay-single test_110c needs >= 2 MDTs SKIP 110c (1s) == replay-single test 110d: DNE: create striped dir, fail MDT2 and client ========================================================== 09:42:17 (1773668537) SKIP: replay-single test_110d needs >= 2 MDTs SKIP 110d (2s) == replay-single test 110e: DNE: create striped dir, uncommit on MDT2, fail client/MDT1/MDT2 ========================================================== 09:42:19 (1773668539) SKIP: replay-single test_110e needs >= 2 MDTs SKIP 110e (2s) SKIP: replay-single test_110f skipping excluded test 110f == replay-single test 110g: DNE: create striped dir, uncommit on MDT1, fail client/MDT1/MDT2 ========================================================== 09:42:22 (1773668542) SKIP: replay-single test_110g needs >= 2 MDTs SKIP 110g (3s) == replay-single test 111a: DNE: unlink striped dir, fail MDT1 ========================================================== 09:42:25 (1773668545) SKIP: replay-single test_111a needs >= 2 MDTs SKIP 111a (5s) == replay-single test 111b: DNE: unlink striped dir, fail MDT2 ========================================================== 09:42:30 (1773668550) SKIP: replay-single test_111b needs >= 2 MDTs SKIP 
111b (4s) == replay-single test 111c: DNE: unlink striped dir, uncommit on MDT1, fail client/MDT1/MDT2 ========================================================== 09:42:35 (1773668555) SKIP: replay-single test_111c needs >= 2 MDTs SKIP 111c (4s) == replay-single test 111d: DNE: unlink striped dir, uncommit on MDT2, fail client/MDT1/MDT2 ========================================================== 09:42:40 (1773668560) SKIP: replay-single test_111d needs >= 2 MDTs SKIP 111d (4s) == replay-single test 111e: DNE: unlink striped dir, uncommit on MDT2, fail MDT1/MDT2 ========================================================== 09:42:44 (1773668564) SKIP: replay-single test_111e needs >= 2 MDTs SKIP 111e (5s) == replay-single test 111f: DNE: unlink striped dir, uncommit on MDT1, fail MDT1/MDT2 ========================================================== 09:42:49 (1773668569) SKIP: replay-single test_111f needs >= 2 MDTs SKIP 111f (4s) == replay-single test 111g: DNE: unlink striped dir, fail MDT1/MDT2 ========================================================== 09:42:53 (1773668573) SKIP: replay-single test_111g needs >= 2 MDTs SKIP 111g (4s) == replay-single test 112a: DNE: cross MDT rename, fail MDT1 ========================================================== 09:42:57 (1773668577) SKIP: replay-single test_112a needs >= 4 MDTs SKIP 112a (4s) == replay-single test 112b: DNE: cross MDT rename, fail MDT2 ========================================================== 09:43:01 (1773668581) SKIP: replay-single test_112b needs >= 4 MDTs SKIP 112b (4s) == replay-single test 112c: DNE: cross MDT rename, fail MDT3 ========================================================== 09:43:05 (1773668585) SKIP: replay-single test_112c needs >= 4 MDTs SKIP 112c (4s) == replay-single test 112d: DNE: cross MDT rename, fail MDT4 ========================================================== 09:43:09 (1773668589) SKIP: replay-single test_112d needs >= 4 MDTs SKIP 112d (4s) == replay-single test 112e: DNE: cross 
MDT rename, fail MDT1 and MDT2 ========================================================== 09:43:13 (1773668593) SKIP: replay-single test_112e needs >= 4 MDTs SKIP 112e (5s) == replay-single test 112f: DNE: cross MDT rename, fail MDT1 and MDT3 ========================================================== 09:43:18 (1773668598) SKIP: replay-single test_112f needs >= 4 MDTs SKIP 112f (4s) == replay-single test 112g: DNE: cross MDT rename, fail MDT1 and MDT4 ========================================================== 09:43:22 (1773668602) SKIP: replay-single test_112g needs >= 4 MDTs SKIP 112g (6s) == replay-single test 112h: DNE: cross MDT rename, fail MDT2 and MDT3 ========================================================== 09:43:28 (1773668608) SKIP: replay-single test_112h needs >= 4 MDTs SKIP 112h (5s) == replay-single test 112i: DNE: cross MDT rename, fail MDT2 and MDT4 ========================================================== 09:43:34 (1773668614) SKIP: replay-single test_112i needs >= 4 MDTs SKIP 112i (5s) == replay-single test 112j: DNE: cross MDT rename, fail MDT3 and MDT4 ========================================================== 09:43:38 (1773668618) SKIP: replay-single test_112j needs >= 4 MDTs SKIP 112j (4s) == replay-single test 112k: DNE: cross MDT rename, fail MDT1,MDT2,MDT3 ========================================================== 09:43:42 (1773668622) SKIP: replay-single test_112k needs >= 4 MDTs SKIP 112k (3s) == replay-single test 112l: DNE: cross MDT rename, fail MDT1,MDT2,MDT4 ========================================================== 09:43:45 (1773668625) SKIP: replay-single test_112l needs >= 4 MDTs SKIP 112l (3s) == replay-single test 112m: DNE: cross MDT rename, fail MDT1,MDT3,MDT4 ========================================================== 09:43:48 (1773668628) SKIP: replay-single test_112m needs >= 4 MDTs SKIP 112m (3s) == replay-single test 112n: DNE: cross MDT rename, fail MDT2,MDT3,MDT4 
========================================================== 09:43:51 (1773668631) SKIP: replay-single test_112n needs >= 4 MDTs SKIP 112n (3s) == replay-single test 115: failover for create/unlink striped directory ========================================================== 09:43:54 (1773668634) SKIP: replay-single test_115 needs >= 2 MDTs SKIP 115 (3s) == replay-single test 116a: large update log master MDT recovery ========================================================== 09:43:57 (1773668637) SKIP: replay-single test_116a needs >= 2 MDTs SKIP 116a (3s) == replay-single test 116b: large update log slave MDT recovery ========================================================== 09:44:00 (1773668640) SKIP: replay-single test_116b needs >= 2 MDTs SKIP 116b (3s) == replay-single test 117: DNE: cross MDT unlink, fail MDT1 and MDT2 ========================================================== 09:44:03 (1773668643) SKIP: replay-single test_117 needs >= 4 MDTs SKIP 117 (2s) == replay-single test 118: invalidate osp update will not cause update log corruption ========================================================== 09:44:06 (1773668646) SKIP: replay-single test_118 needs >= 2 MDTs SKIP 118 (2s) == replay-single test 119: timeout of normal replay does not cause DNE replay fails ========================================================== 09:44:08 (1773668648) SKIP: replay-single test_119 needs >= 2 MDTs SKIP 119 (3s) == replay-single test 120: DNE fail abort should stop both normal and DNE replay ========================================================== 09:44:11 (1773668651) SKIP: replay-single test_120 needs >= 2 MDTs SKIP 120 (3s) == replay-single test 121: lock replay timed out and race ========================================================== 09:44:14 (1773668654) multiop /mnt/lustre/f121.replay-single vs_s TMPPIPE=/tmp/multiop_open_wait_pipe.7552 Stopping /mnt/lustre-mds1 (opts:) on oleg146-server Failover mds1 to oleg146-server oleg146-server.virtnet fail_loc=0x721 
fail_val=0 at_max=0 Start mds1: mount -t lustre -o localrecov lustre-mdt1/mdt1 /mnt/lustre-mds1 oleg146-server: oleg146-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg146-client: oleg146-server: ssh exited with exit code 1 Started lustre-MDT0000 fail_loc=0x0 at_max=600 PASS 121 (43s) == replay-single test 130a: DoM file create (setstripe) replay ========================================================== 09:44:57 (1773668697) UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 2210688 3200 2205440 1% /mnt/lustre[MDT:0] lustre-OST0000_UUID 3771392 3072 3766272 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3771392 6144 3749888 1% /mnt/lustre[OST:1] filesystem_summary: 7542784 9216 7516160 1% /mnt/lustre Failing mds1 on oleg146-server Stopping /mnt/lustre-mds1 (opts:) on oleg146-server 09:45:06 (1773668706) shut down facet: mds1 facet_host: oleg146-server facet_failover_host: oleg146-server Failover mds1 to oleg146-server mount facets: mds1 Start mds1: mount -t lustre -o localrecov lustre-mdt1/mdt1 /mnt/lustre-mds1 oleg146-server: oleg146-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg146-client: oleg146-server: ssh exited with exit code 1 Started lustre-MDT0000 09:45:36 (1773668736) targets are mounted 09:45:36 (1773668736) facet_failover done oleg146-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec PASS 130a (51s) == replay-single test 130b: DoM file create (inherited) replay ========================================================== 09:45:48 (1773668748) UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 2210688 3328 2205312 1% /mnt/lustre[MDT:0] lustre-OST0000_UUID 3771392 3072 3766272 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3771392 6144 3749888 1% /mnt/lustre[OST:1] 
filesystem_summary: 7542784 9216 7516160 1% /mnt/lustre Failing mds1 on oleg146-server Stopping /mnt/lustre-mds1 (opts:) on oleg146-server 09:45:57 (1773668757) shut down facet: mds1 facet_host: oleg146-server facet_failover_host: oleg146-server Failover mds1 to oleg146-server mount facets: mds1 Start mds1: mount -t lustre -o localrecov lustre-mdt1/mdt1 /mnt/lustre-mds1 oleg146-server: oleg146-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg146-client: oleg146-server: ssh exited with exit code 1 Started lustre-MDT0000 09:46:17 (1773668777) targets are mounted 09:46:17 (1773668777) facet_failover done oleg146-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec PASS 130b (40s) == replay-single test 131a: DoM file write lock replay === 09:46:28 (1773668788) UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 2210688 3328 2205312 1% /mnt/lustre[MDT:0] lustre-OST0000_UUID 3771392 3072 3766272 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3771392 6144 3749888 1% /mnt/lustre[OST:1] filesystem_summary: 7542784 9216 7516160 1% /mnt/lustre 1+0 records in 1+0 records out 8 bytes copied, 0.0107223 s, 0.7 kB/s Failing mds1 on oleg146-server Stopping /mnt/lustre-mds1 (opts:) on oleg146-server 09:46:38 (1773668798) shut down facet: mds1 facet_host: oleg146-server facet_failover_host: oleg146-server Failover mds1 to oleg146-server mount facets: mds1 Start mds1: mount -t lustre -o localrecov lustre-mdt1/mdt1 /mnt/lustre-mds1 oleg146-server: oleg146-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg146-client: oleg146-server: ssh exited with exit code 1 Started lustre-MDT0000 09:46:59 (1773668819) targets are mounted 09:46:59 (1773668819) facet_failover done oleg146-client.virtnet: executing wait_import_state_mount 
(FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec PASS 131a (43s) SKIP: replay-single test_131b skipping excluded test 131b == replay-single test 132a: PFL new component instantiate replay ========================================================== 09:47:13 (1773668833) UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 2210688 3328 2205312 1% /mnt/lustre[MDT:0] lustre-OST0000_UUID 3771392 3072 3766272 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3771392 6144 3749888 1% /mnt/lustre[OST:1] filesystem_summary: 7542784 9216 7516160 1% /mnt/lustre 1+0 records in 1+0 records out 1048576 bytes (1.0 MB, 1.0 MiB) copied, 0.0727712 s, 14.4 MB/s /mnt/lustre/f132a.replay-single lcm_layout_gen: 3 lcm_mirror_count: 1 lcm_entry_count: 2 lcme_id: 1 lcme_mirror_id: 0 lcme_flags: init lcme_extent.e_start: 0 lcme_extent.e_end: 1048576 lmm_stripe_count: 1 lmm_stripe_size: 1048576 lmm_pattern: raid0 lmm_layout_gen: 0 lmm_stripe_offset: 1 lmm_objects: - 0: { l_ost_idx: 1, l_fid: [0x280000400:0x502:0x0] } lcme_id: 2 lcme_mirror_id: 0 lcme_flags: init lcme_extent.e_start: 1048576 lcme_extent.e_end: EOF lmm_stripe_count: 2 lmm_stripe_size: 4194304 lmm_pattern: raid0 lmm_layout_gen: 0 lmm_stripe_offset: 0 lmm_objects: - 0: { l_ost_idx: 0, l_fid: [0x240000400:0x502:0x0] } - 1: { l_ost_idx: 1, l_fid: [0x280000400:0x503:0x0] } Failing mds1 on oleg146-server Stopping /mnt/lustre-mds1 (opts:) on oleg146-server 09:47:22 (1773668842) shut down facet: mds1 facet_host: oleg146-server facet_failover_host: oleg146-server Failover mds1 to oleg146-server mount facets: mds1 Start mds1: mount -t lustre -o localrecov lustre-mdt1/mdt1 /mnt/lustre-mds1 oleg146-server: oleg146-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg146-client: oleg146-server: ssh exited with exit code 1 Started lustre-MDT0000 09:47:40 (1773668860) targets are mounted 09:47:40 
(1773668860) facet_failover done oleg146-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec /mnt/lustre/f132a.replay-single lcm_layout_gen: 4 lcm_mirror_count: 1 lcm_entry_count: 2 lcme_id: 1 lcme_mirror_id: 0 lcme_flags: init lcme_extent.e_start: 0 lcme_extent.e_end: 1048576 lmm_stripe_count: 1 lmm_stripe_size: 1048576 lmm_pattern: raid0 lmm_layout_gen: 65535 lmm_stripe_offset: 1 lmm_objects: - 0: { l_ost_idx: 1, l_fid: [0x280000400:0x502:0x0] } lcme_id: 2 lcme_mirror_id: 0 lcme_flags: init lcme_extent.e_start: 1048576 lcme_extent.e_end: EOF lmm_stripe_count: 2 lmm_stripe_size: 4194304 lmm_pattern: raid0 lmm_layout_gen: 65535 lmm_stripe_offset: 0 lmm_objects: - 0: { l_ost_idx: 0, l_fid: [0x240000400:0x502:0x0] } - 1: { l_ost_idx: 1, l_fid: [0x280000400:0x503:0x0] } PASS 132a (40s) == replay-single test 133: check resend of ongoing requests for lwp during failover ========================================================== 09:47:53 (1773668873) SKIP: replay-single test_133 needs >= 2 MDTs SKIP 133 (2s) == replay-single test 134: replay creation of a file created in a pool ========================================================== 09:47:55 (1773668875) Creating new pool pool_134 oleg146-server: Pool lustre.pool_134 created Adding targets to pool oleg146-server: OST lustre-OST0001_UUID added to pool lustre.pool_134 Waiting 90s for 'lustre-OST0001_UUID ' Updated after 2s: want 'lustre-OST0001_UUID ' got 'lustre-OST0001_UUID ' UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 2210688 3328 2205312 1% /mnt/lustre[MDT:0] lustre-OST0000_UUID 3771392 4096 3765248 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3771392 6144 3763200 1% /mnt/lustre[OST:1] filesystem_summary: 7542784 10240 7528448 1% /mnt/lustre Failing mds1 on oleg146-server Stopping /mnt/lustre-mds1 (opts:) on oleg146-server 09:48:10 (1773668890) shut down facet: mds1 facet_host: 
oleg146-server facet_failover_host: oleg146-server Failover mds1 to oleg146-server mount facets: mds1 Start mds1: mount -t lustre -o localrecov lustre-mdt1/mdt1 /mnt/lustre-mds1 oleg146-server: oleg146-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg146-client: oleg146-server: ssh exited with exit code 1 Started lustre-MDT0000 09:48:33 (1773668913) targets are mounted 09:48:33 (1773668913) facet_failover done oleg146-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec Destroy the created pools: pool_134 lustre.pool_134 oleg146-server: OST lustre-OST0001_UUID removed from pool lustre.pool_134 oleg146-server: Pool lustre.pool_134 destroyed Waiting 90s for 'foo' PASS 134 (59s) == replay-single test 135: Server failure in lock replay phase ========================================================== 09:48:54 (1773668934) UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 2210688 3328 2205312 1% /mnt/lustre[MDT:0] lustre-OST0000_UUID 3771392 4096 3765248 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3771392 6144 3763200 1% /mnt/lustre[OST:1] filesystem_summary: 7542784 10240 7528448 1% /mnt/lustre ldlm.cancel_unused_locks_before_replay=0 debug_mb=100 debug=+info +ha +dlmtrace Stopping /mnt/lustre-ost1 (opts:) on oleg146-server Failover ost1 to oleg146-server oleg146-server.virtnet oleg146-server: oleg146-server.virtnet: executing load_module ../libcfs/libcfs/libcfs fail_loc=0x32d fail_val=20 debug_mb=100 debug=+info +ha +dlmtrace Start ost1: mount -t lustre -o localrecov lustre-ost1/ost1 /mnt/lustre-ost1 seq.cli-lustre-OST0000-super.width=65536 oleg146-server: oleg146-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg146-client: oleg146-server: ssh exited with exit code 1 Started lustre-OST0000 
oleg146-client.virtnet: executing wait_import_state_mount REPLAY_LOCKS osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid in REPLAY_LOCKS state after 0 sec Stopping /mnt/lustre-ost1 (opts:) on oleg146-server Failover ost1 to oleg146-server oleg146-server.virtnet oleg146-server: oleg146-server.virtnet: executing load_module ../libcfs/libcfs/libcfs fail_loc=0 Start ost1: mount -t lustre -o localrecov lustre-ost1/ost1 /mnt/lustre-ost1 seq.cli-lustre-OST0000-super.width=65536 oleg146-server: oleg146-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all End of sync pdsh@oleg146-client: oleg146-server: ssh exited with exit code 1 Started lustre-OST0000 Stopping /mnt/lustre-ost1 (opts:-f) on oleg146-server Stopping /mnt/lustre-ost2 (opts:-f) on oleg146-server Start ost1: mount -t lustre -o localrecov lustre-ost1/ost1 /mnt/lustre-ost1 seq.cli-lustre-OST0000-super.width=65536 oleg146-server: oleg146-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg146-client: oleg146-server: ssh exited with exit code 1 Started lustre-OST0000 Start ost2: mount -t lustre -o localrecov lustre-ost2/ost2 /mnt/lustre-ost2 seq.cli-lustre-OST0001-super.width=65536 oleg146-server: oleg146-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg146-client: oleg146-server: ssh exited with exit code 1 Started lustre-OST0001 pdsh@oleg146-client: oleg146-client: ssh exited with exit code 5 debug_mb=21 debug=super ioctl neterror warning dlmtrace error emerg ha rpctrace vfstrace config console lfsck ldlm.cancel_unused_locks_before_replay=1 PASS 135 (110s) == replay-single test 136: MDS to disconnect all OSPs first, then cleanup ldlm ========================================================== 09:50:45 (1773669045) SKIP: replay-single test_136 needs > 2 MDTs SKIP 136 
(2s) == replay-single test 137a: DNE: create under striped dir, fail MDT1 ========================================================== 09:50:46 (1773669046) SKIP: replay-single test_137a needs >= 2 MDTs SKIP 137a (2s) == replay-single test 137b: DNE: create under striped dir, fail MDT2 ========================================================== 09:50:48 (1773669048) SKIP: replay-single test_137b needs >= 2 MDTs SKIP 137b (2s) == replay-single test 137c: DNE: create under striped dir, fail MDT1/MDT2 ========================================================== 09:50:50 (1773669050) SKIP: replay-single test_137c needs >= 2 MDTs SKIP 137c (2s) == replay-single test 200: Dropping one OBD_PING should not cause disconnect ========================================================== 09:50:52 (1773669052) SKIP: replay-single test_200 Need remote client SKIP 200 (2s) == replay-single test 201: MDT umount cascading disconnects timeouts ========================================================== 09:50:54 (1773669054) SKIP: replay-single test_201 needs >= 2 MDTs SKIP 201 (3s) == replay-single test 202: pfl replay should recovery layout generation ========================================================== 09:50:57 (1773669057) UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 2210688 3328 2205312 1% /mnt/lustre[MDT:0] lustre-OST0000_UUID 3771392 4096 3765248 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3771392 6144 3763200 1% /mnt/lustre[OST:1] filesystem_summary: 7542784 10240 7528448 1% /mnt/lustre Failing mds1 on oleg146-server Stopping /mnt/lustre-mds1 (opts:) on oleg146-server 09:51:04 (1773669064) shut down facet: mds1 facet_host: oleg146-server facet_failover_host: oleg146-server Failover mds1 to oleg146-server mount facets: mds1 Start mds1: mount -t lustre -o localrecov lustre-mdt1/mdt1 /mnt/lustre-mds1 oleg146-server: oleg146-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg146-client: 
oleg146-server: ssh exited with exit code 1 Started lustre-MDT0000 09:51:22 (1773669082) targets are mounted 09:51:22 (1773669082) facet_failover done oleg146-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec PASS 202 (35s) == replay-single test 203: resend can hit original request ========================================================== 09:51:32 (1773669092) Starting client: oleg146-client.virtnet: -o user_xattr,flock 192.168.201.146@tcp:/lustre /mnt/lustre2 fail_loc=0x80002403 fail_val=2 STAT SETSTRIPE File: /mnt/lustre/f203.replay-single Size: 0 Blocks: 1 IO Block: 4194304 regular empty file Device: 2c54f966h/743766374d Inode: 144115574371254273 Links: 1 Access: (0644/-rw-r--r--) Uid: ( 0/ root) Gid: ( 0/ root) Access: 2026-03-16 09:51:35.000000000 -0400 Modify: 2026-03-16 09:51:35.000000000 -0400 Change: 2026-03-16 09:51:35.000000000 -0400 Birth: 2026-03-16 09:51:34.000000000 -0400 fail_loc=0 fail_val=0 192.168.201.146@tcp:/lustre /mnt/lustre2 lustre rw,checksum,encrypt,flock,lazystatfs,lruresize,nolock,statfs_project,nouser_fid2path,user_xattr,verbose 0 0 Stopping client oleg146-client.virtnet /mnt/lustre2 (opts:) PASS 203 (10s) == replay-single test complete, duration 806 sec ========= 09:51:42 (1773669102) === replay-single: start cleanup 09:51:43 (1773669103) === === replay-single: finish cleanup 09:51:46 (1773669106) ===