-----============= acceptance-small: replay-ost-single ============----- Mon May 20 21:09:19 EDT 2024
mgs: CentOS Linux release 7.9.2009 (Core)
MGS_OS_VERSION_ID=7
MGS_OS_ID=centos
MGS_OS_VERSION_CODE=117440512
MGS_OS_ID_LIKE=rhel fedora centos
mds1: CentOS Linux release 7.9.2009 (Core)
MDS1_OS_ID_LIKE=rhel fedora centos
MDS1_OS_ID=centos
MDS1_OS_VERSION_ID=7
MDS1_OS_VERSION_CODE=117440512
ost1: CentOS Linux release 7.9.2009 (Core)
OST1_OS_VERSION_CODE=117440512
OST1_OS_VERSION_ID=7
OST1_OS_ID_LIKE=rhel fedora centos
OST1_OS_ID=centos
client: CentOS Linux release 7.9.2009 (Core)
CLIENT_OS_ID=centos
CLIENT_OS_ID_LIKE=rhel fedora centos
CLIENT_OS_VERSION_ID=7
CLIENT_OS_VERSION_CODE=117440512
excepting tests:
=== replay-ost-single: start setup 21:09:29 (1716253769) ===
oleg111-client.virtnet: executing check_config_client /mnt/lustre
oleg111-client.virtnet: Checking config lustre mounted on /mnt/lustre
Checking servers environments
Checking clients oleg111-client.virtnet environments
Using TIMEOUT=20
osc.lustre-OST0000-osc-ffff8800b6356800.idle_timeout=debug
osc.lustre-OST0001-osc-ffff8800b6356800.idle_timeout=debug
disable quota as required
oleg111-server: oleg111-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
osd-ldiskfs.track_declares_assert=1
=== replay-ost-single: finish setup 21:09:46 (1716253786) ===
/mnt/lustre/d0.replay-ost-single
stripe_count: 1 stripe_size: 4194304 pattern: raid0 stripe_offset: 0
debug_raw_pointers=Y
debug_raw_pointers=Y
== replay-ost-single test 0a: target handle mismatch (bug 5317) ========================================================== 21:09:48 (1716253788)
Stopping client oleg111-client.virtnet /mnt/lustre (opts:-f)
fail_loc=0x80000211
Starting client: oleg111-client.virtnet: -o user_xattr,flock oleg111-server@tcp:/lustre /mnt/lustre
UUID                   1K-blocks        Used   Available Use% Mounted on
lustre-MDT0000_UUID      1414116        1776     1285912   1% /mnt/lustre[MDT:0]
lustre-MDT0001_UUID      1414116        1616     1286072   1% /mnt/lustre[MDT:1]
lustre-OST0001_UUID      3833116        1524     3605496   1% /mnt/lustre[OST:1]
filesystem_summary:      3833116        1524     3605496   1% /mnt/lustre
PASS 0a (15s)
debug_raw_pointers=0
debug_raw_pointers=0
debug_raw_pointers=Y
debug_raw_pointers=Y
== replay-ost-single test 0b: empty replay =============== 21:10:07 (1716253807)
Failing ost1 on oleg111-server
Stopping /mnt/lustre-ost1 (opts:) on oleg111-server
21:10:09 (1716253809) shut down
Failover ost1 to oleg111-server
mount facets: ost1
Starting ost1: -o localrecov /dev/mapper/ost1_flakey /mnt/lustre-ost1
seq.cli-lustre-OST0000-super.width=65536
oleg111-server: oleg111-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
pdsh@oleg111-client: oleg111-server: ssh exited with exit code 1
Started lustre-OST0000
21:10:25 (1716253825) targets are mounted
21:10:25 (1716253825) facet_failover done
oleg111-client.virtnet: executing wait_import_state_mount (FULL|IDLE) osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid
osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid in FULL state after 0 sec
PASS 0b (25s)
debug_raw_pointers=0
debug_raw_pointers=0
debug_raw_pointers=Y
debug_raw_pointers=Y
== replay-ost-single test 1: touch ======================= 21:10:34 (1716253834)
Failing ost1 on oleg111-server
Stopping /mnt/lustre-ost1 (opts:) on oleg111-server
21:10:36 (1716253836) shut down
Failover ost1 to oleg111-server
mount facets: ost1
Starting ost1: -o localrecov /dev/mapper/ost1_flakey /mnt/lustre-ost1
seq.cli-lustre-OST0000-super.width=65536
oleg111-server: oleg111-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
pdsh@oleg111-client: oleg111-server: ssh exited with exit code 1
Started lustre-OST0000
21:10:51 (1716253851) targets are mounted
21:10:51 (1716253851) facet_failover done
oleg111-client.virtnet: executing wait_import_state_mount (FULL|IDLE) osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid
osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid in FULL state after 0 sec
/mnt/lustre/d0.replay-ost-single/f1.replay-ost-single has type file OK
PASS 1 (22s)
debug_raw_pointers=0
debug_raw_pointers=0
debug_raw_pointers=Y
debug_raw_pointers=Y
== replay-ost-single test 2: |x| 10 open(O_CREAT)s ======= 21:10:58 (1716253858)
Failing ost1 on oleg111-server
Stopping /mnt/lustre-ost1 (opts:) on oleg111-server
21:10:59 (1716253859) shut down
Failover ost1 to oleg111-server
mount facets: ost1
Starting ost1: -o localrecov /dev/mapper/ost1_flakey /mnt/lustre-ost1
seq.cli-lustre-OST0000-super.width=65536
oleg111-server: oleg111-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
pdsh@oleg111-client: oleg111-server: ssh exited with exit code 1
Started lustre-OST0000
21:11:13 (1716253873) targets are mounted
21:11:13 (1716253873) facet_failover done
oleg111-client.virtnet: executing wait_import_state_mount (FULL|IDLE) osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid
osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid in FULL state after 0 sec
PASS 2 (20s)
debug_raw_pointers=0
debug_raw_pointers=0
debug_raw_pointers=Y
debug_raw_pointers=Y
== replay-ost-single test 3: Fail OST during write, with verification ========================================================== 21:11:19 (1716253879)
Failing ost1 on oleg111-server
1280+0 records in
1280+0 records out
5242880 bytes (5.2 MB) copied, 0.125614 s, 41.7 MB/s
Stopping /mnt/lustre-ost1 (opts:) on oleg111-server
21:11:21 (1716253881) shut down
Failover ost1 to oleg111-server
mount facets: ost1
Starting ost1: -o localrecov /dev/mapper/ost1_flakey /mnt/lustre-ost1
seq.cli-lustre-OST0000-super.width=65536
oleg111-server: oleg111-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
pdsh@oleg111-client: oleg111-server: ssh exited with exit code 1
Started lustre-OST0000
21:11:34 (1716253894) targets are mounted
21:11:34 (1716253894) facet_failover done
oleg111-client.virtnet: executing wait_import_state_mount (FULL|IDLE) osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid
osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid in FULL state after 0 sec
PASS 3 (20s)
debug_raw_pointers=0
debug_raw_pointers=0
debug_raw_pointers=Y
debug_raw_pointers=Y
== replay-ost-single test 4: Fail OST during read, with verification ========================================================== 21:11:40 (1716253900)
1280+0 records in
1280+0 records out
5242880 bytes (5.2 MB) copied, 0.153512 s, 34.2 MB/s
Failing ost1 on oleg111-server
Stopping /mnt/lustre-ost1 (opts:) on oleg111-server
21:11:42 (1716253902) shut down
Failover ost1 to oleg111-server
mount facets: ost1
Starting ost1: -o localrecov /dev/mapper/ost1_flakey /mnt/lustre-ost1
seq.cli-lustre-OST0000-super.width=65536
oleg111-server: oleg111-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
pdsh@oleg111-client: oleg111-server: ssh exited with exit code 1
Started lustre-OST0000
21:11:55 (1716253915) targets are mounted
21:11:55 (1716253915) facet_failover done
oleg111-client.virtnet: executing wait_import_state_mount (FULL|IDLE) osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid
osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid in FULL state after 0 sec
PASS 4 (20s)
debug_raw_pointers=0
debug_raw_pointers=0
debug_raw_pointers=Y
debug_raw_pointers=Y
== replay-ost-single test 5: Fail OST during iozone ====== 21:12:01 (1716253921)
iozone bg pid=18202
+ iozone -i 0 -i 1 -+d -r 4 -s 1048576 -f /mnt/lustre/d0.replay-ost-single/f5.replay-ost-single
tmppipe=/tmp/replay-ost-single.test_5.pipe
iozone pid=18209
Iozone: Performance Test of File I/O
Version $Revision: 3.483 $
Compiled for 64 bit mode.
Build: linux-AMD64
Contributors:William Norcott, Don Capps, Isom Crawford, Kirby Collins
Al Slater, Scott Rhine, Mike Wisner, Ken Goss
Steve Landherr, Brad Smith, Mark Kelly, Dr. Alain CYR,
Randy Dunlap, Mark Montague, Dan Million, Gavin Brebner,
Jean-Marc Zucconi, Jeff Blomberg, Benny Halevy, Dave Boone,
Erik Habbinga, Kris Strecker, Walter Wong, Joshua Root,
Fabrice Bacchella, Zhenghua Xue, Qin Li, Darren Sawyer,
Vangel Bojaxhi, Ben England, Vikentsi Lapa,
Alexey Skidanov, Sudhir Kumar.
Run began: Mon May 20 21:12:02 2024
>>> I/O Diagnostic mode enabled. <<<
Performance measurements are invalid in this mode.
Record Size 4 kB
File size set to 1048576 kB
Command line used: iozone -i 0 -i 1 -+d -r 4 -s 1048576 -f /mnt/lustre/d0.replay-ost-single/f5.replay-ost-single
Output is in kBytes/sec
Time Resolution = 0.000001 seconds.
Processor cache size set to 1024 kBytes.
Processor cache line size set to 32 bytes.
File stride size set to 17 * record size.
                                                         random   random     bkwd   record   stride
       kB   reclen    write  rewrite     read   reread     read    write     read  rewrite     read   fwrite frewrite    fread  freread
Failing ost1 on oleg111-server
Stopping /mnt/lustre-ost1 (opts:) on oleg111-server
21:12:11 (1716253931) shut down
Failover ost1 to oleg111-server
mount facets: ost1
Starting ost1: -o localrecov /dev/mapper/ost1_flakey /mnt/lustre-ost1
seq.cli-lustre-OST0000-super.width=65536
oleg111-server: oleg111-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
pdsh@oleg111-client: oleg111-server: ssh exited with exit code 1
Started lustre-OST0000
21:12:24 (1716253944) targets are mounted
21:12:24 (1716253944) facet_failover done
oleg111-client.virtnet: executing wait_import_state_mount (FULL|IDLE) osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid
osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid in FULL state after 0 sec
  1048576        4    23447    35391   417622   454078
iozone test complete.
iozone rc=0
PASS 5 (89s)
debug_raw_pointers=0
debug_raw_pointers=0
debug_raw_pointers=Y
debug_raw_pointers=Y
== replay-ost-single test 6: Fail OST before obd_destroy ========================================================== 21:13:32 (1716254012)
Waiting for orphan cleanup...
osp.lustre-OST0000-osc-MDT0000.old_sync_processed
osp.lustre-OST0000-osc-MDT0001.old_sync_processed
osp.lustre-OST0001-osc-MDT0000.old_sync_processed
osp.lustre-OST0001-osc-MDT0001.old_sync_processed
wait 40 secs maximumly for oleg111-server mds-ost sync done.
Waiting for MDT destroys to complete
1280+0 records in
1280+0 records out
5242880 bytes (5.2 MB) copied, 0.244157 s, 21.5 MB/s
/mnt/lustre/d0.replay-ost-single/f6.replay-ost-single
lmm_stripe_count: 1
lmm_stripe_size: 4194304
lmm_pattern: raid0
lmm_layout_gen: 0
lmm_stripe_offset: 0
    obdidx       objid       objid       group
         0          18        0x12 0x280000401
fail_loc=0x80000119
Waiting for orphan cleanup...
osp.lustre-OST0000-osc-MDT0000.old_sync_processed
osp.lustre-OST0000-osc-MDT0001.old_sync_processed
osp.lustre-OST0001-osc-MDT0000.old_sync_processed
osp.lustre-OST0001-osc-MDT0001.old_sync_processed
wait 40 secs maximumly for oleg111-server mds-ost sync done.
before_free: 7663168 after_dd_free: 7658048 took 0 seconds
Failing ost1 on oleg111-server
Stopping /mnt/lustre-ost1 (opts:) on oleg111-server
21:14:05 (1716254045) shut down
Failover ost1 to oleg111-server
mount facets: ost1
Starting ost1: -o localrecov /dev/mapper/ost1_flakey /mnt/lustre-ost1
seq.cli-lustre-OST0000-super.width=65536
oleg111-server: oleg111-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
pdsh@oleg111-client: oleg111-server: ssh exited with exit code 1
Started lustre-OST0000
21:14:20 (1716254060) targets are mounted
21:14:20 (1716254060) facet_failover done
oleg111-client.virtnet: executing wait_import_state_mount (FULL|IDLE) osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid
osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid in FULL state after 0 sec
affected facets: ost1
oleg111-server: oleg111-server.virtnet: executing _wait_recovery_complete *.lustre-OST0000.recovery_status 1475
oleg111-server: *.lustre-OST0000.recovery_status status: COMPLETE
Can't lstat /mnt/lustre/d0.replay-ost-single/f6.replay-ost-single: No such file or directory
Waiting for orphan cleanup...
osp.lustre-OST0000-osc-MDT0000.old_sync_processed
osp.lustre-OST0000-osc-MDT0001.old_sync_processed
osp.lustre-OST0001-osc-MDT0000.old_sync_processed
osp.lustre-OST0001-osc-MDT0001.old_sync_processed
wait 40 secs maximumly for oleg111-server mds-ost sync done.
Waiting for MDT destroys to complete
free_before: 7663168 free_after: 7663168
PASS 6 (59s)
debug_raw_pointers=0
debug_raw_pointers=0
debug_raw_pointers=Y
debug_raw_pointers=Y
== replay-ost-single test 7: Fail OST before obd_destroy ========================================================== 21:14:33 (1716254073)
Waiting for orphan cleanup...
osp.lustre-OST0000-osc-MDT0000.old_sync_processed
osp.lustre-OST0000-osc-MDT0001.old_sync_processed
osp.lustre-OST0001-osc-MDT0000.old_sync_processed
osp.lustre-OST0001-osc-MDT0001.old_sync_processed
wait 40 secs maximumly for oleg111-server mds-ost sync done.
Waiting for MDT destroys to complete
1280+0 records in
1280+0 records out
5242880 bytes (5.2 MB) copied, 0.256084 s, 20.5 MB/s
before: 7663168 after_dd: 7658048 took 2 seconds
UUID                   1K-blocks        Used   Available Use% Mounted on
lustre-MDT0000_UUID      1414116        1776     1285912   1% /mnt/lustre[MDT:0]
lustre-MDT0001_UUID      1414116        1616     1286072   1% /mnt/lustre[MDT:1]
lustre-OST0000_UUID      3833116        6660     3600360   1% /mnt/lustre[OST:0]
lustre-OST0001_UUID      3833116        1524     3605496   1% /mnt/lustre[OST:1]
filesystem_summary:      7666232        8184     7205856   1% /mnt/lustre
Failing ost1 on oleg111-server
Stopping /mnt/lustre-ost1 (opts:) on oleg111-server
21:14:49 (1716254089) shut down
Failover ost1 to oleg111-server
mount facets: ost1
Starting ost1: -o localrecov /dev/mapper/ost1_flakey /mnt/lustre-ost1
seq.cli-lustre-OST0000-super.width=65536
oleg111-server: oleg111-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
pdsh@oleg111-client: oleg111-server: ssh exited with exit code 1
Started lustre-OST0000
21:15:05 (1716254105) targets are mounted
21:15:05 (1716254105) facet_failover done
oleg111-client.virtnet: executing wait_import_state_mount (FULL|IDLE) osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid
osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid in FULL state after 0 sec
affected facets: ost1
oleg111-server: oleg111-server.virtnet: executing _wait_recovery_complete *.lustre-OST0000.recovery_status 1475
oleg111-server: *.lustre-OST0000.recovery_status status: COMPLETE
Can't lstat /mnt/lustre/d0.replay-ost-single/f7.replay-ost-single: No such file or directory
Waiting for orphan cleanup...
osp.lustre-OST0000-osc-MDT0000.old_sync_processed
osp.lustre-OST0000-osc-MDT0001.old_sync_processed
osp.lustre-OST0001-osc-MDT0000.old_sync_processed
osp.lustre-OST0001-osc-MDT0001.old_sync_processed
wait 40 secs maximumly for oleg111-server mds-ost sync done.
Waiting for MDT destroys to complete
before: 7663168 after: 7663168
PASS 7 (43s)
debug_raw_pointers=0
debug_raw_pointers=0
debug_raw_pointers=Y
debug_raw_pointers=Y
== replay-ost-single test 8a: Verify redo io: redo io when get -EINPROGRESS error ========================================================== 21:15:18 (1716254118)
1280+0 records in
1280+0 records out
5242880 bytes (5.2 MB) copied, 0.0674575 s, 77.7 MB/s
fail_loc=0x230
fail_loc=0
1280+0 records in
1280+0 records out
5242880 bytes (5.2 MB) copied, 28.4674 s, 184 kB/s
PASS 8a (31s)
debug_raw_pointers=0
debug_raw_pointers=0
debug_raw_pointers=Y
debug_raw_pointers=Y
== replay-ost-single test 8b: Verify redo io: redo io should success after recovery ========================================================== 21:15:52 (1716254152)
1280+0 records in
1280+0 records out
5242880 bytes (5.2 MB) copied, 0.0719906 s, 72.8 MB/s
fail_loc=0x230
Failing ost1 on oleg111-server
Stopping /mnt/lustre-ost1 (opts:) on oleg111-server
21:16:14 (1716254174) shut down
Failover ost1 to oleg111-server
mount facets: ost1
Starting ost1: -o localrecov /dev/mapper/ost1_flakey /mnt/lustre-ost1
seq.cli-lustre-OST0000-super.width=65536
oleg111-server: oleg111-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
pdsh@oleg111-client: oleg111-server: ssh exited with exit code 1
Started lustre-OST0000
21:16:29 (1716254189) targets are mounted
21:16:29 (1716254189) facet_failover done
oleg111-client.virtnet: executing wait_import_state_mount (FULL|IDLE) osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid
osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid in FULL state after 0 sec
fail_loc=0
1280+0 records in
1280+0 records out
5242880 bytes (5.2 MB) copied, 50.3238 s, 104 kB/s
PASS 8b (52s)
debug_raw_pointers=0
debug_raw_pointers=0
debug_raw_pointers=Y
debug_raw_pointers=Y
== replay-ost-single test 8c: Verify redo io: redo io should fail after eviction ========================================================== 21:16:47 (1716254207)
1280+0 records in
1280+0 records out
5242880 bytes (5.2 MB) copied, 0.0630193 s, 83.2 MB/s
fail_loc=0x230
dd: error writing '/mnt/lustre/d0.replay-ost-single/f8c.replay-ost-single': Input/output error
1+0 records in
0+0 records out
0 bytes (0 B) copied, 21.6142 s, 0.0 kB/s
fail_loc=0
cmp: EOF on /mnt/lustre/d0.replay-ost-single/f8c.replay-ost-single
PASS 8c (45s)
debug_raw_pointers=0
debug_raw_pointers=0
debug_raw_pointers=Y
debug_raw_pointers=Y
== replay-ost-single test 8d: Verify redo creation on -EINPROGRESS ========================================================== 21:17:34 (1716254254)
fail_loc=0x187
fail_loc=0
File: '/mnt/lustre/d0.replay-ost-single/f8d.replay-ost-single'
Size: 0 Blocks: 0 IO Block: 4194304 regular empty file
Device: 2c54f966h/743766374d Inode: 144115205306056727 Links: 1
Access: (0644/-rw-r--r--) Uid: ( 0/ root) Gid: ( 0/ root)
Access: 2024-05-20 21:17:35.000000000 -0400
Modify: 2024-05-20 21:17:35.000000000 -0400
Change: 2024-05-20 21:17:35.000000000 -0400
Birth: -
fail_loc=0x187
fail_loc=0
Succeed in opening file "/mnt/lustre/d0.replay-ost-single/f8d.replay-ost-single"(flags=O_RDWR)
File: '/mnt/lustre/d0.replay-ost-single/f8d.replay-ost-single'
Size: 0 Blocks: 0 IO Block: 4194304 regular empty file
Device: 2c54f966h/743766374d Inode: 144115205306056728 Links: 1
Access: (0644/-rw-r--r--) Uid: ( 0/ root) Gid: ( 0/ root)
Access: 2024-05-20 21:17:57.000000000 -0400
Modify: 2024-05-20 21:17:57.000000000 -0400
Change: 2024-05-20 21:17:57.000000000 -0400
Birth: -
PASS 8d (45s)
debug_raw_pointers=0
debug_raw_pointers=0
debug_raw_pointers=Y
debug_raw_pointers=Y
== replay-ost-single test 8e: Verify that ptlrpc resends request on -EINPROGRESS ========================================================== 21:18:22 (1716254302)
fail_loc=0x231
UUID                   1K-blocks        Used   Available Use% Mounted on
lustre-MDT0000_UUID      1414116        1776     1285912   1% /mnt/lustre[MDT:0]
lustre-MDT0001_UUID      1414116        1616     1286072   1% /mnt/lustre[MDT:1]
lustre-OST0000_UUID      3833116        1540     3605480   1% /mnt/lustre[OST:0]
lustre-OST0001_UUID      3833116        1524     3605496   1% /mnt/lustre[OST:1]
filesystem_summary:      7666232        3064     7210976   1% /mnt/lustre
PASS 8e (24s)
debug_raw_pointers=0
debug_raw_pointers=0
debug_raw_pointers=Y
debug_raw_pointers=Y
== replay-ost-single test 9: Verify that no req deadline happened during recovery ========================================================== 21:18:47 (1716254327)
1+0 records in
1+0 records out
1048576 bytes (1.0 MB) copied, 0.0250841 s, 41.8 MB/s
UUID                   1K-blocks        Used   Available Use% Mounted on
lustre-MDT0000_UUID      1414116        1776     1285912   1% /mnt/lustre[MDT:0]
lustre-MDT0001_UUID      1414116        1616     1286072   1% /mnt/lustre[MDT:1]
lustre-OST0000_UUID      3833116        1540     3604432   1% /mnt/lustre[OST:0]
lustre-OST0001_UUID      3833116        1524     3605496   1% /mnt/lustre[OST:1]
filesystem_summary:      7666232        3064     7209928   1% /mnt/lustre
1+0 records in
1+0 records out
1048576 bytes (1.0 MB) copied, 0.0137474 s, 76.3 MB/s
fail_loc=0x00000714
fail_val=20
Failing ost1 on oleg111-server
Stopping /mnt/lustre-ost1 (opts:) on oleg111-server
21:18:52 (1716254332) shut down
Failover ost1 to oleg111-server
mount facets: ost1
Starting ost1: -o localrecov /dev/mapper/ost1_flakey /mnt/lustre-ost1
seq.cli-lustre-OST0000-super.width=65536
oleg111-server: oleg111-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
pdsh@oleg111-client: oleg111-server: ssh exited with exit code 1
Started lustre-OST0000
21:19:08 (1716254348) targets are mounted
21:19:08 (1716254348) facet_failover done
oleg111-client.virtnet: executing wait_import_state_mount (FULL|IDLE) osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid
osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid in FULL state after 0 sec
fail_loc=0
PASS 9 (65s)
debug_raw_pointers=0
debug_raw_pointers=0
debug_raw_pointers=Y
debug_raw_pointers=Y
== replay-ost-single test 10: conflicting PW & PR locks on a client ========================================================== 21:19:54 (1716254394)
10+0 records in
10+0 records out
5120 bytes (5.1 kB) copied, 0.00581737 s, 880 kB/s
fail_val=60
fail_loc=0x414
Failing ost1 on oleg111-server
Stopping /mnt/lustre-ost1 (opts:) on oleg111-server
21:19:58 (1716254398) shut down
Failover ost1 to oleg111-server
mount facets: ost1
Starting ost1: -o localrecov /dev/mapper/ost1_flakey /mnt/lustre-ost1
seq.cli-lustre-OST0000-super.width=65536
oleg111-server: oleg111-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
pdsh@oleg111-client: oleg111-server: ssh exited with exit code 1
Started lustre-OST0000
21:20:13 (1716254413) targets are mounted
21:20:13 (1716254413) facet_failover done
fail_loc=0x32a
File: '/mnt/lustre/d0.replay-ost-single/f10.replay-ost-single'
Size: 5120 Blocks: 0 IO Block: 4194304 regular file
Device: 2c54f966h/743766374d Inode: 144115205306056733 Links: 1
Access: (0644/-rw-r--r--) Uid: ( 0/ root) Gid: ( 0/ root)
Access: 2024-05-20 21:19:55.000000000 -0400
Modify: 2024-05-20 21:19:55.000000000 -0400
Change: 2024-05-20 21:19:55.000000000 -0400
Birth: -
PASS 10 (63s)
debug_raw_pointers=0
debug_raw_pointers=0
debug_raw_pointers=Y
debug_raw_pointers=Y
== replay-ost-single test 12a: glimpse after OST failover to a missing object ========================================================== 21:20:59 (1716254459)
UUID                   1K-blocks        Used   Available Use% Mounted on
lustre-MDT0000_UUID      1414116        1896     1285792   1% /mnt/lustre[MDT:0]
lustre-MDT0001_UUID      1414116        1712     1285976   1% /mnt/lustre[MDT:1]
lustre-OST0000_UUID      3833116        2572     3604448   1% /mnt/lustre[OST:0]
lustre-OST0001_UUID      3833116        1524     3605496   1% /mnt/lustre[OST:1]
filesystem_summary:      7666232        4096     7209944   1% /mnt/lustre
total: 500 open/close in 2.16 seconds: 231.59 ops/second
total: 500 open/close in 2.10 seconds: 238.40 ops/second
total: 500 open/close in 1.99 seconds: 251.38 ops/second
total: 500 open/close in 2.06 seconds: 242.96 ops/second
total: 500 open/close in 2.07 seconds: 241.35 ops/second
total: 500 open/close in 2.12 seconds: 236.00 ops/second
total: 500 open/close in 2.10 seconds: 238.30 ops/second
total: 500 open/close in 2.11 seconds: 236.43 ops/second
total: 500 open/close in 2.10 seconds: 237.95 ops/second
total: 500 open/close in 2.08 seconds: 240.89 ops/second
Failing ost1 on oleg111-server
Stopping /mnt/lustre-ost1 (opts:) on oleg111-server
21:21:39 (1716254499) shut down
Failover ost1 to oleg111-server
mount facets: ost1
Starting ost1: -o localrecov /dev/mapper/ost1_flakey /mnt/lustre-ost1
seq.cli-lustre-OST0000-super.width=65536
oleg111-server: oleg111-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
pdsh@oleg111-client: oleg111-server: ssh exited with exit code 1
Started lustre-OST0000
21:21:55 (1716254515) targets are mounted
21:21:55 (1716254515) facet_failover done
starting wait for ls -l
PASS 12a (72s)
debug_raw_pointers=0
debug_raw_pointers=0
debug_raw_pointers=Y
debug_raw_pointers=Y
== replay-ost-single test 12b: write after OST failover to a missing object ========================================================== 21:22:13 (1716254533)
striped dir -i0 -c2 -H crush /mnt/lustre/d12b.replay-ost-single
UUID                   1K-blocks        Used   Available Use% Mounted on
lustre-MDT0000_UUID      1414116        2232     1285456   1% /mnt/lustre[MDT:0]
lustre-MDT0001_UUID      1414116        2080     1285608   1% /mnt/lustre[MDT:1]
lustre-OST0000_UUID      3833116        2572     3604448   1% /mnt/lustre[OST:0]
lustre-OST0001_UUID      3833116        1524     3605496   1% /mnt/lustre[OST:1]
filesystem_summary:      7666232        4096     7209944   1% /mnt/lustre
total: 500 open/close in 1.96 seconds: 255.35 ops/second
total: 500 open/close in 2.05 seconds: 243.67 ops/second
total: 500 open/close in 2.07 seconds: 242.08 ops/second
total: 500 open/close in 2.10 seconds: 238.10 ops/second
total: 500 open/close in 2.06 seconds: 243.15 ops/second
total: 500 open/close in 2.03 seconds: 246.36 ops/second
total: 500 open/close in 2.00 seconds: 249.75 ops/second
total: 500 open/close in 2.10 seconds: 237.73 ops/second
total: 500 open/close in 1.85 seconds: 270.76 ops/second
total: 500 open/close in 2.04 seconds: 245.59 ops/second
fail_loc=0x16e
fail_val=10
Failing ost1 on oleg111-server
Stopping /mnt/lustre-ost1 (opts:) on oleg111-server
21:22:50 (1716254570) shut down
Failover ost1 to oleg111-server
mount facets: ost1
Starting ost1: -o localrecov /dev/mapper/ost1_flakey /mnt/lustre-ost1
seq.cli-lustre-OST0000-super.width=65536
oleg111-server: oleg111-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
pdsh@oleg111-client: oleg111-server: ssh exited with exit code 1
Started lustre-OST0000
21:23:07 (1716254587) targets are mounted
21:23:07 (1716254587) facet_failover done
1+0 records in
1+0 records out
4096 bytes (4.1 kB) copied, 0.0025012 s, 1.6 MB/s
PASS 12b (69s)
debug_raw_pointers=0
debug_raw_pointers=0
== replay-ost-single test complete, duration 843 sec ===== 21:23:23 (1716254603)
=== replay-ost-single: start cleanup 21:23:23 (1716254603) ===
=== replay-ost-single: finish cleanup 21:23:23 (1716254603) ===