-----============= acceptance-small: replay-ost-single ============----- Tue Apr 1 03:44:32 EDT 2025
mgs: Rocky Linux release 8.10 (Green Obsidian)
MGS_OS_ID_LIKE=rhel centos fedora rocky
MGS_OS_VERSION_ID=8.10
MGS_OS_ID=rocky
MGS_OS_VERSION_CODE=134873088
mds1: Rocky Linux release 8.10 (Green Obsidian)
MDS1_OS_VERSION_ID=8.10
MDS1_OS_VERSION_CODE=134873088
MDS1_OS_ID_LIKE=rhel centos fedora rocky
MDS1_OS_ID=rocky
ost1: Rocky Linux release 8.10 (Green Obsidian)
OST1_OS_VERSION_CODE=134873088
OST1_OS_ID_LIKE=rhel centos fedora rocky
OST1_OS_VERSION_ID=8.10
OST1_OS_ID=rocky
client: Rocky Linux release 8.10 (Green Obsidian)
CLIENT_OS_ID=rocky
CLIENT_OS_VERSION_CODE=134873088
CLIENT_OS_VERSION_ID=8.10
CLIENT_OS_ID_LIKE=rhel centos fedora rocky
oleg645-server: ls: cannot access '/home/green/git/lustre-release/lustre/tests/except/replay-ost-single.*ex': No such file or directory
excepting tests:
=== replay-ost-single: start setup 03:45:18 (1743493518) ===
oleg645-client.virtnet: executing check_config_client /mnt/lustre
oleg645-client.virtnet: Checking config lustre mounted on /mnt/lustre
Checking servers environments
Checking clients oleg645-client.virtnet environments
Using TIMEOUT=20
osc.lustre-OST0000-osc-ffff890a87821000.idle_timeout=debug
osc.lustre-OST0001-osc-ffff890a87821000.idle_timeout=debug
disable quota as required
oleg645-server: oleg645-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
=== replay-ost-single: finish setup 03:46:27 (1743493587) ===
/mnt/lustre/d0.replay-ost-single
stripe_count: 1 stripe_size: 4194304 pattern: raid0 stripe_offset: 0

== replay-ost-single test 0a: target handle mismatch (bug 5317) ========================================================== 03:46:32 (1743493592)
Stopping client oleg645-client.virtnet /mnt/lustre (opts:-f)
fail_loc=0x80000211
Starting client: oleg645-client.virtnet: -o user_xattr,flock 192.168.206.145@tcp:/lustre /mnt/lustre
UUID                   1K-blocks        Used   Available Use% Mounted on
lustre-MDT0000_UUID      2210688        3200     2205440   1% /mnt/lustre[MDT:0]
lustre-OST0001_UUID      3771392        3072     3766272   1% /mnt/lustre[OST:1]
filesystem_summary:      3771392        3072     3766272   1% /mnt/lustre
PASS 0a (38s)

== replay-ost-single test 0b: empty replay =============== 03:47:10 (1743493630)
Failing ost1 on oleg645-server
Stopping /mnt/lustre-ost1 (opts:) on oleg645-server
03:47:24 (1743493644) shut down
facet: ost1
facet_host: oleg645-server
facet_failover_host: oleg645-server
Failover ost1 to oleg645-server
mount facets: ost1
Starting ost1: -o localrecov lustre-ost1/ost1 /mnt/lustre-ost1
seq.cli-lustre-OST0000-super.width=65536
oleg645-server: oleg645-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
pdsh@oleg645-client: oleg645-server: ssh exited with exit code 1
Started lustre-OST0000
03:48:10 (1743493690) targets are mounted
03:48:10 (1743493690) facet_failover done
oleg645-client.virtnet: executing wait_import_state_mount (FULL|IDLE) osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid
osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid in FULL state after 0 sec
PASS 0b (94s)

== replay-ost-single test 1: touch ======================= 03:48:44 (1743493724)
Failing ost1 on oleg645-server
Stopping /mnt/lustre-ost1 (opts:) on oleg645-server
03:48:59 (1743493739) shut down
facet: ost1
facet_host: oleg645-server
facet_failover_host: oleg645-server
Failover ost1 to oleg645-server
mount facets: ost1
Starting ost1: -o localrecov lustre-ost1/ost1 /mnt/lustre-ost1
seq.cli-lustre-OST0000-super.width=65536
oleg645-server: oleg645-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
pdsh@oleg645-client: oleg645-server: ssh exited with exit code 1
Started lustre-OST0000
03:49:45 (1743493785) targets are mounted
03:49:45 (1743493785) facet_failover done
oleg645-client.virtnet: executing wait_import_state_mount (FULL|IDLE) osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid
osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid in FULL state after 0 sec
/mnt/lustre/d0.replay-ost-single/f1.replay-ost-single has type file OK
PASS 1 (93s)

== replay-ost-single test 2: |x| 10 open(O_CREAT)s ======= 03:50:18 (1743493818)
Failing ost1 on oleg645-server
Stopping /mnt/lustre-ost1 (opts:) on oleg645-server
03:50:32 (1743493832) shut down
facet: ost1
facet_host: oleg645-server
facet_failover_host: oleg645-server
Failover ost1 to oleg645-server
mount facets: ost1
Starting ost1: -o localrecov lustre-ost1/ost1 /mnt/lustre-ost1
seq.cli-lustre-OST0000-super.width=65536
oleg645-server: oleg645-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
pdsh@oleg645-client: oleg645-server: ssh exited with exit code 1
Started lustre-OST0000
03:51:14 (1743493874) targets are mounted
03:51:14 (1743493874) facet_failover done
oleg645-client.virtnet: executing wait_import_state_mount (FULL|IDLE) osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid
osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid in FULL state after 0 sec
PASS 2 (88s)

== replay-ost-single test 3: Fail OST during write, with verification ========================================================== 03:51:46 (1743493906)
Failing ost1 on oleg645-server
1280+0 records in
1280+0 records out
5242880 bytes (5.2 MB, 5.0 MiB) copied, 1.42909 s, 3.7 MB/s
Stopping /mnt/lustre-ost1 (opts:) on oleg645-server
03:52:00 (1743493920) shut down
facet: ost1
facet_host: oleg645-server
facet_failover_host: oleg645-server
Failover ost1 to oleg645-server
mount facets: ost1
Starting ost1: -o localrecov lustre-ost1/ost1 /mnt/lustre-ost1
seq.cli-lustre-OST0000-super.width=65536
oleg645-server: oleg645-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
pdsh@oleg645-client: oleg645-server: ssh exited with exit code 1
Started lustre-OST0000
03:52:41 (1743493961) targets are mounted
03:52:41 (1743493961) facet_failover done
oleg645-client.virtnet: executing wait_import_state_mount (FULL|IDLE) osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid
osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid in FULL state after 0 sec
PASS 3 (88s)

== replay-ost-single test 4: Fail OST during read, with verification ========================================================== 03:53:15 (1743493995)
1280+0 records in
1280+0 records out
5242880 bytes (5.2 MB, 5.0 MiB) copied, 1.86699 s, 2.8 MB/s
Failing ost1 on oleg645-server
Stopping /mnt/lustre-ost1 (opts:) on oleg645-server
03:53:30 (1743494010) shut down
facet: ost1
facet_host: oleg645-server
facet_failover_host: oleg645-server
Failover ost1 to oleg645-server
mount facets: ost1
Starting ost1: -o localrecov lustre-ost1/ost1 /mnt/lustre-ost1
seq.cli-lustre-OST0000-super.width=65536
oleg645-server: oleg645-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
pdsh@oleg645-client: oleg645-server: ssh exited with exit code 1
Started lustre-OST0000
03:54:13 (1743494053) targets are mounted
03:54:13 (1743494053) facet_failover done
oleg645-client.virtnet: executing wait_import_state_mount (FULL|IDLE) osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid
osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid in FULL state after 0 sec
PASS 4 (91s)

== replay-ost-single test 5: Fail OST during iozone ====== 03:54:46 (1743494086)
iozone bg pid=17061
+ iozone -i 0 -i 1 -+d -r 4 -s 1048576 -f /mnt/lustre/d0.replay-ost-single/f5.replay-ost-single
tmppipe=/tmp/replay-ost-single.test_5.pipe
iozone pid=17066
Iozone: Performance Test of File I/O
Version $Revision: 3.483 $
Compiled for 64 bit mode.
Build: linux-AMD64
Contributors:William Norcott, Don Capps, Isom Crawford, Kirby Collins
             Al Slater, Scott Rhine, Mike Wisner, Ken Goss
             Steve Landherr, Brad Smith, Mark Kelly, Dr. Alain CYR,
             Randy Dunlap, Mark Montague, Dan Million, Gavin Brebner,
             Jean-Marc Zucconi, Jeff Blomberg, Benny Halevy, Dave Boone,
             Erik Habbinga, Kris Strecker, Walter Wong, Joshua Root,
             Fabrice Bacchella, Zhenghua Xue, Qin Li, Darren Sawyer,
             Vangel Bojaxhi, Ben England, Vikentsi Lapa,
             Alexey Skidanov, Sudhir Kumar.
Run began: Tue Apr 1 03:54:50 2025
>>> I/O Diagnostic mode enabled. <<<
Performance measurements are invalid in this mode.
Record Size 4 kB
File size set to 1048576 kB
Command line used: iozone -i 0 -i 1 -+d -r 4 -s 1048576 -f /mnt/lustre/d0.replay-ost-single/f5.replay-ost-single
Output is in kBytes/sec
Time Resolution = 0.000001 seconds.
Processor cache size set to 1024 kBytes.
Processor cache line size set to 32 bytes.
File stride size set to 17 * record size.
                                                    random   random     bkwd   record   stride
      kB  reclen    write  rewrite     read   reread     read    write     read  rewrite     read   fwrite frewrite    fread  freread
Failing ost1 on oleg645-server
Stopping /mnt/lustre-ost1 (opts:) on oleg645-server
03:55:17 (1743494117) shut down
facet: ost1
facet_host: oleg645-server
facet_failover_host: oleg645-server
Failover ost1 to oleg645-server
mount facets: ost1
Starting ost1: -o localrecov lustre-ost1/ost1 /mnt/lustre-ost1
seq.cli-lustre-OST0000-super.width=65536
oleg645-server: oleg645-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
pdsh@oleg645-client: oleg645-server: ssh exited with exit code 1
Started lustre-OST0000
03:56:06 (1743494166) targets are mounted
03:56:06 (1743494166) facet_failover done
oleg645-client.virtnet: executing wait_import_state_mount (FULL|IDLE) osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid
osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid in FULL state after 0 sec
 1048576       4     2517     3437    54625    46526
iozone test complete.
iozone rc=0
sleep 5 for ZFS MDS
sleep 5 for ZFS OST
PASS 5 (861s)

== replay-ost-single test 6: Fail OST before obd_destroy ========================================================== 04:09:07 (1743494947)
Waiting for orphan cleanup...
osp.lustre-OST0000-osc-MDT0000.old_sync_processed
osp.lustre-OST0001-osc-MDT0000.old_sync_processed
wait 40 secs maximumly for oleg645-server mds-ost sync done.
Waiting for MDT destroys to complete
1280+0 records in
1280+0 records out
5242880 bytes (5.2 MB, 5.0 MiB) copied, 2.03087 s, 2.6 MB/s
/mnt/lustre/d0.replay-ost-single/f6.replay-ost-single
lmm_stripe_count: 1
lmm_stripe_size: 4194304
lmm_pattern: raid0
lmm_layout_gen: 0
lmm_stripe_offset: 0
    obdidx       objid       objid         group
         0          18        0x12   0x240000400
fail_loc=0x80000119
Waiting for orphan cleanup...
osp.lustre-OST0000-osc-MDT0000.old_sync_processed
osp.lustre-OST0001-osc-MDT0000.old_sync_processed
wait 40 secs maximumly for oleg645-server mds-ost sync done.
before_free: 7536640 after_dd_free: 7531520 took 0 seconds
Failing ost1 on oleg645-server
Stopping /mnt/lustre-ost1 (opts:) on oleg645-server
04:10:18 (1743495018) shut down
facet: ost1
facet_host: oleg645-server
facet_failover_host: oleg645-server
Failover ost1 to oleg645-server
mount facets: ost1
Starting ost1: -o localrecov lustre-ost1/ost1 /mnt/lustre-ost1
seq.cli-lustre-OST0000-super.width=65536
oleg645-server: oleg645-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
pdsh@oleg645-client: oleg645-server: ssh exited with exit code 1
Started lustre-OST0000
04:10:56 (1743495056) targets are mounted
04:10:56 (1743495056) facet_failover done
oleg645-client.virtnet: executing wait_import_state_mount (FULL|IDLE) osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid
osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid in FULL state after 0 sec
affected facets: ost1
oleg645-server: oleg645-server.virtnet: executing _wait_recovery_complete *.lustre-OST0000.recovery_status 1475
oleg645-server: *.lustre-OST0000.recovery_status status: COMPLETE
Can't lstat /mnt/lustre/d0.replay-ost-single/f6.replay-ost-single: No such file or directory
Waiting for orphan cleanup...
osp.lustre-OST0000-osc-MDT0000.old_sync_processed
osp.lustre-OST0001-osc-MDT0000.old_sync_processed
wait 40 secs maximumly for oleg645-server mds-ost sync done.
sleep 5 for ZFS MDS
Waiting for MDT destroys to complete
free_before: 7536640 free_after: 7536640
PASS 6 (165s)

== replay-ost-single test 7: Fail OST before obd_destroy ========================================================== 04:11:51 (1743495111)
Waiting for orphan cleanup...
osp.lustre-OST0000-osc-MDT0000.old_sync_processed
osp.lustre-OST0001-osc-MDT0000.old_sync_processed
wait 40 secs maximumly for oleg645-server mds-ost sync done.
Waiting for MDT destroys to complete
1280+0 records in
1280+0 records out
5242880 bytes (5.2 MB, 5.0 MiB) copied, 2.18908 s, 2.4 MB/s
before: 7536640 after_dd: 7531520 took 4 seconds
UUID                   1K-blocks        Used   Available Use% Mounted on
lustre-MDT0000_UUID      2210688        3200     2205440   1% /mnt/lustre[MDT:0]
lustre-OST0000_UUID      3771392        8192     3761152   1% /mnt/lustre[OST:0]
lustre-OST0001_UUID      3771392        3072     3766272   1% /mnt/lustre[OST:1]
filesystem_summary:      7542784       11264     7527424   1% /mnt/lustre
Failing ost1 on oleg645-server
Stopping /mnt/lustre-ost1 (opts:) on oleg645-server
04:12:39 (1743495159) shut down
facet: ost1
facet_host: oleg645-server
facet_failover_host: oleg645-server
Failover ost1 to oleg645-server
mount facets: ost1
Starting ost1: -o localrecov lustre-ost1/ost1 /mnt/lustre-ost1
seq.cli-lustre-OST0000-super.width=65536
oleg645-server: oleg645-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
pdsh@oleg645-client: oleg645-server: ssh exited with exit code 1
Started lustre-OST0000
04:13:17 (1743495197) targets are mounted
04:13:17 (1743495197) facet_failover done
oleg645-client.virtnet: executing wait_import_state_mount (FULL|IDLE) osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid
osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid in FULL state after 0 sec
affected facets: ost1
oleg645-server: oleg645-server.virtnet: executing _wait_recovery_complete *.lustre-OST0000.recovery_status 1475
oleg645-server: *.lustre-OST0000.recovery_status status: COMPLETE
Can't lstat /mnt/lustre/d0.replay-ost-single/f7.replay-ost-single: No such file or directory
Waiting for orphan cleanup...
osp.lustre-OST0000-osc-MDT0000.old_sync_processed
osp.lustre-OST0001-osc-MDT0000.old_sync_processed
wait 40 secs maximumly for oleg645-server mds-ost sync done.
sleep 5 for ZFS MDS
Waiting for MDT destroys to complete
before: 7536640 after: 7536640
PASS 7 (142s)

== replay-ost-single test 8a: Verify redo io: redo io when get -EINPROGRESS error ========================================================== 04:14:14 (1743495254)
1280+0 records in
1280+0 records out
5242880 bytes (5.2 MB, 5.0 MiB) copied, 0.579395 s, 9.0 MB/s
fail_loc=0x230
fail_loc=0
1280+0 records in
1280+0 records out
5242880 bytes (5.2 MB, 5.0 MiB) copied, 190.982 s, 27.5 kB/s
PASS 8a (210s)

== replay-ost-single test 8b: Verify redo io: redo io should success after recovery ========================================================== 04:17:44 (1743495464)
1280+0 records in
1280+0 records out
5242880 bytes (5.2 MB, 5.0 MiB) copied, 0.450613 s, 11.6 MB/s
fail_loc=0x230
Failing ost1 on oleg645-server
Stopping /mnt/lustre-ost1 (opts:) on oleg645-server
04:18:24 (1743495504) shut down
facet: ost1
facet_host: oleg645-server
facet_failover_host: oleg645-server
Failover ost1 to oleg645-server
mount facets: ost1
Starting ost1: -o localrecov lustre-ost1/ost1 /mnt/lustre-ost1
seq.cli-lustre-OST0000-super.width=65536
oleg645-server: oleg645-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
pdsh@oleg645-client: oleg645-server: ssh exited with exit code 1
Started lustre-OST0000
04:19:02 (1743495542) targets are mounted
04:19:02 (1743495542) facet_failover done
oleg645-client.virtnet: executing wait_import_state_mount (FULL|IDLE) osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid
osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid in FULL state after 0 sec
fail_loc=0
1280+0 records in
1280+0 records out
5242880 bytes (5.2 MB, 5.0 MiB) copied, 271.415 s, 19.3 kB/s
PASS 8b (292s)

== replay-ost-single test 8c: Verify redo io: redo io should fail after eviction ========================================================== 04:22:36 (1743495756)
1280+0 records in
1280+0 records out
5242880 bytes (5.2 MB, 5.0 MiB) copied, 0.295934 s, 17.7 MB/s
fail_loc=0x230
dd: error writing '/mnt/lustre/d0.replay-ost-single/f8c.replay-ost-single': Cannot send after transport endpoint shutdown
1+0 records in
0+0 records out
0 bytes copied, 29.159 s, 0.0 kB/s
fail_loc=0
/tmp/verify-7374 /mnt/lustre/d0.replay-ost-single/f8c.replay-ost-single differ: byte 1, line 1
PASS 8c (67s)

== replay-ost-single test 8d: Verify redo creation on -EINPROGRESS ========================================================== 04:23:44 (1743495824)
fail_loc=0x187
fail_loc=0
  File: /mnt/lustre/d0.replay-ost-single/f8d.replay-ost-single
  Size: 0 Blocks: 0 IO Block: 4194304 regular empty file
Device: 2c54f966h/743766374d Inode: 144115205289279511 Links: 1
Access: (0644/-rw-r--r--) Uid: ( 0/ root) Gid: ( 0/ root)
Access: 2025-04-01 04:23:49.000000000 -0400
Modify: 2025-04-01 04:23:49.000000000 -0400
Change: 2025-04-01 04:23:49.000000000 -0400
 Birth: 2025-04-01 04:24:17.000000000 -0400
fail_loc=0x187
fail_loc=0
Succeed in opening file "/mnt/lustre/d0.replay-ost-single/f8d.replay-ost-single"(flags=O_RDWR)
  File: /mnt/lustre/d0.replay-ost-single/f8d.replay-ost-single
  Size: 0 Blocks: 1 IO Block: 4194304 regular empty file
Device: 2c54f966h/743766374d Inode: 144115205289279512 Links: 1
Access: (0644/-rw-r--r--) Uid: ( 0/ root) Gid: ( 0/ root)
Access: 2025-04-01 04:24:21.000000000 -0400
Modify: 2025-04-01 04:24:21.000000000 -0400
Change: 2025-04-01 04:24:21.000000000 -0400
 Birth: 2025-04-01 04:24:49.000000000 -0400
PASS 8d (78s)

== replay-ost-single test 8e: Verify that ptlrpc resends request on -EINPROGRESS ========================================================== 04:25:01 (1743495901)
fail_loc=0x231
UUID                   1K-blocks        Used   Available Use% Mounted on
lustre-MDT0000_UUID      2210688        3200     2205440   1% /mnt/lustre[MDT:0]
lustre-OST0000_UUID      3771392        4096     3765248   1% /mnt/lustre[OST:0]
lustre-OST0001_UUID      3771392        3072     3766272   1% /mnt/lustre[OST:1]
filesystem_summary:      7542784        7168     7531520   1% /mnt/lustre
PASS 8e (40s)

== replay-ost-single test 9: Verify that no req deadline happened during recovery ========================================================== 04:25:42 (1743495942)
1+0 records in
1+0 records out
1048576 bytes (1.0 MB, 1.0 MiB) copied, 0.394975 s, 2.7 MB/s
UUID                   1K-blocks        Used   Available Use% Mounted on
lustre-MDT0000_UUID      2210688        3200     2205440   1% /mnt/lustre[MDT:0]
lustre-OST0000_UUID      3771392        5120     3764224   1% /mnt/lustre[OST:0]
lustre-OST0001_UUID      3771392        3072     3766272   1% /mnt/lustre[OST:1]
filesystem_summary:      7542784        8192     7530496   1% /mnt/lustre
1+0 records in
1+0 records out
1048576 bytes (1.0 MB, 1.0 MiB) copied, 0.0667183 s, 15.7 MB/s
fail_loc=0x00000714
fail_val=20
Failing ost1 on oleg645-server
Stopping /mnt/lustre-ost1 (opts:) on oleg645-server
04:26:09 (1743495969) shut down
facet: ost1
facet_host: oleg645-server
facet_failover_host: oleg645-server
Failover ost1 to oleg645-server
mount facets: ost1
Starting ost1: -o localrecov lustre-ost1/ost1 /mnt/lustre-ost1
seq.cli-lustre-OST0000-super.width=65536
oleg645-server: oleg645-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
pdsh@oleg645-client: oleg645-server: ssh exited with exit code 1
Started lustre-OST0000
04:26:49 (1743496009) targets are mounted
04:26:49 (1743496009) facet_failover done
oleg645-client.virtnet: executing wait_import_state_mount (FULL|IDLE) osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid
osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid in FULL state after 0 sec
fail_loc=0
PASS 9 (143s)

== replay-ost-single test 10: conflicting PW & PR locks on a client ========================================================== 04:28:06 (1743496086)
10+0 records in
10+0 records out
5120 bytes (5.1 kB, 5.0 KiB) copied, 0.0482269 s, 106 kB/s
fail_val=60
fail_loc=0x414
Failing ost1 on oleg645-server
Stopping /mnt/lustre-ost1 (opts:) on oleg645-server
04:28:21 (1743496101) shut down
facet: ost1
facet_host: oleg645-server
facet_failover_host: oleg645-server
Failover ost1 to oleg645-server
mount facets: ost1
Starting ost1: -o localrecov lustre-ost1/ost1 /mnt/lustre-ost1
seq.cli-lustre-OST0000-super.width=65536
oleg645-server: oleg645-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
pdsh@oleg645-client: oleg645-server: ssh exited with exit code 1
Started lustre-OST0000
04:29:01 (1743496141) targets are mounted
04:29:01 (1743496141) facet_failover done
fail_loc=0x32a
  File: /mnt/lustre/d0.replay-ost-single/f10.replay-ost-single
  Size: 5120 Blocks: 11 IO Block: 4194304 regular file
Device: 2c54f966h/743766374d Inode: 144115205289279517 Links: 1
Access: (0644/-rw-r--r--) Uid: ( 0/ root) Gid: ( 0/ root)
Access: 2025-04-01 04:28:09.000000000 -0400
Modify: 2025-04-01 04:28:09.000000000 -0400
Change: 2025-04-01 04:28:09.000000000 -0400
 Birth: 2025-04-01 04:28:08.000000000 -0400
PASS 10 (77s)

== replay-ost-single test 12a: glimpse after OST failover to a missing object ========================================================== 04:29:23 (1743496163)
UUID                   1K-blocks        Used   Available Use% Mounted on
lustre-MDT0000_UUID      2210688        3200     2205440   1% /mnt/lustre[MDT:0]
lustre-OST0000_UUID      3771392        5120     3764224   1% /mnt/lustre[OST:0]
lustre-OST0001_UUID      3771392        3072     3766272   1% /mnt/lustre[OST:1]
filesystem_summary:      7542784        8192     7530496   1% /mnt/lustre
- open/close 410 (time 1743496189.60 total 10.01 last 40.94)
total: 500 open/close in 12.44 seconds: 40.19 ops/second
- open/close 399 (time 1743496212.64 total 10.02 last 39.84)
total: 500 open/close in 13.07 seconds: 38.25 ops/second
- open/close 429 (time 1743496236.18 total 10.02 last 42.82)
total: 500 open/close in 12.61 seconds: 39.66 ops/second
- open/close 412 (time 1743496257.95 total 10.01 last 41.17)
total: 500 open/close in 12.39 seconds: 40.35 ops/second
- open/close 447 (time 1743496279.14 total 10.01 last 44.66)
total: 500 open/close in 11.62 seconds: 43.01 ops/second
- open/close 391 (time 1743496301.95 total 10.00 last 39.09)
total: 500 open/close in 13.11 seconds: 38.15 ops/second
- open/close 412 (time 1743496324.20 total 10.06 last 40.97)
total: 500 open/close in 12.26 seconds: 40.77 ops/second
- open/close 425 (time 1743496345.72 total 10.00 last 42.49)
total: 500 open/close in 12.09 seconds: 41.34 ops/second
- open/close 426 (time 1743496367.11 total 10.02 last 42.50)
total: 500 open/close in 11.95 seconds: 41.83 ops/second
- open/close 356 (time 1743496388.44 total 10.01 last 35.57)
total: 500 open/close in 13.87 seconds: 36.06 ops/second
Failing ost1 on oleg645-server
Stopping /mnt/lustre-ost1 (opts:) on oleg645-server
04:33:32 (1743496412) shut down
facet: ost1
facet_host: oleg645-server
facet_failover_host: oleg645-server
Failover ost1 to oleg645-server
mount facets: ost1
Starting ost1: -o localrecov lustre-ost1/ost1 /mnt/lustre-ost1
seq.cli-lustre-OST0000-super.width=65536
oleg645-server: oleg645-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
pdsh@oleg645-client: oleg645-server: ssh exited with exit code 1
Started lustre-OST0000
04:34:11 (1743496451) targets are mounted
04:34:11 (1743496451) facet_failover done
starting wait for ls -l
PASS 12a (400s)

== replay-ost-single test 12b: write after OST failover to a missing object ========================================================== 04:36:03 (1743496563)
UUID                   1K-blocks        Used   Available Use% Mounted on
lustre-MDT0000_UUID      2210688        3712     2204928   1% /mnt/lustre[MDT:0]
lustre-OST0000_UUID      3771392        5120     3764224   1% /mnt/lustre[OST:0]
lustre-OST0001_UUID      3771392        3072     3766272   1% /mnt/lustre[OST:1]
filesystem_summary:      7542784        8192     7530496   1% /mnt/lustre
- open/close 424 (time 1743496589.93 total 10.02 last 42.31)
total: 500 open/close in 12.59 seconds: 39.71 ops/second
- open/close 417 (time 1743496611.76 total 10.00 last 41.69)
total: 500 open/close in 12.23 seconds: 40.88 ops/second
- open/close 418 (time 1743496634.05 total 10.00 last 41.79)
total: 500 open/close in 12.00 seconds: 41.67 ops/second
- open/close 414 (time 1743496654.90 total 10.02 last 41.34)
total: 500 open/close in 12.47 seconds: 40.10 ops/second
- open/close 439 (time 1743496677.68 total 10.06 last 43.62)
total: 500 open/close in 11.60 seconds: 43.11 ops/second
- open/close 411 (time 1743496699.50 total 10.01 last 41.05)
total: 500 open/close in 12.57 seconds: 39.79 ops/second
- open/close 406 (time 1743496722.86 total 10.01 last 40.56)
total: 500 open/close in 13.06 seconds: 38.28 ops/second
- open/close 420 (time 1743496745.97 total 10.01 last 41.96)
total: 500 open/close in 12.14 seconds: 41.20 ops/second
- open/close 393 (time 1743496768.68 total 10.00 last 39.29)
total: 500 open/close in 13.37 seconds: 37.40 ops/second
- open/close 338 (time 1743496792.43 total 10.01 last 33.77)
total: 500 open/close in 14.51 seconds: 34.46 ops/second
fail_loc=0x16e
fail_val=10
Failing ost1 on oleg645-server
Stopping /mnt/lustre-ost1 (opts:) on oleg645-server
04:40:14 (1743496814) shut down
facet: ost1
facet_host: oleg645-server
facet_failover_host: oleg645-server
Failover ost1 to oleg645-server
mount facets: ost1
Starting ost1: -o localrecov lustre-ost1/ost1 /mnt/lustre-ost1
seq.cli-lustre-OST0000-super.width=65536
oleg645-server: oleg645-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
pdsh@oleg645-client: oleg645-server: ssh exited with exit code 1
Started lustre-OST0000
04:40:54 (1743496854) targets are mounted
04:40:54 (1743496854) facet_failover done
1+0 records in
1+0 records out
4096 bytes (4.1 kB, 4.0 KiB) copied, 0.010867 s, 377 kB/s
PASS 12b (394s)

== replay-ost-single test complete, duration 3481 sec ==== 04:42:37 (1743496957)
=== replay-ost-single: start cleanup 04:42:41 (1743496961) ===
=== replay-ost-single: finish cleanup 04:42:49 (1743496969) ===