-----============= acceptance-small: replay-dual ============----- Mon Mar 16 09:37:32 EDT 2026
mgs: Rocky Linux release 8.10 (Green Obsidian)
MGS_OS_ID_LIKE=rhel centos fedora rocky
MGS_OS_VERSION_ID=8.10
MGS_OS_ID=rocky
MGS_OS_VERSION_CODE=134873088
mds1: Rocky Linux release 8.10 (Green Obsidian)
MDS1_OS_VERSION_ID=8.10
MDS1_OS_VERSION_CODE=134873088
MDS1_OS_ID_LIKE=rhel centos fedora rocky
MDS1_OS_ID=rocky
ost1: Rocky Linux release 8.10 (Green Obsidian)
OST1_OS_VERSION_CODE=134873088
OST1_OS_ID_LIKE=rhel centos fedora rocky
OST1_OS_VERSION_ID=8.10
OST1_OS_ID=rocky
client: Rocky Linux release 8.10 (Green Obsidian)
CLIENT_OS_ID=rocky
CLIENT_OS_VERSION_CODE=134873088
CLIENT_OS_VERSION_ID=8.10
CLIENT_OS_ID_LIKE=rhel centos fedora rocky
oleg348-server: ls: cannot access '/home/green/git/lustre-release/lustre/tests/except/replay-dual.*ex': No such file or directory
excepting tests: 14b 21b
skipping tests SLOW=no: 21b
=== replay-dual: start setup 09:37:39 (1773668259) ===
Starting client oleg348-client.virtnet: -o user_xattr,flock 192.168.203.148@tcp:/lustre /mnt/lustre2
Mount client oleg348-client.virtnet: mount -t lustre -o user_xattr,flock 192.168.203.148@tcp:/lustre /mnt/lustre2
Started clients oleg348-client.virtnet:
192.168.203.148@tcp:/lustre on /mnt/lustre2 type lustre (rw,checksum,encrypt,flock,lazystatfs,lruresize,nolock,statfs_project,nouser_fid2path,user_xattr,verbose)
oleg348-client.virtnet: executing check_config_client /mnt/lustre
oleg348-client.virtnet: Checking config lustre mounted on /mnt/lustre
Checking servers environments
Checking clients oleg348-client.virtnet environments
Using TIMEOUT=20
osc.lustre-OST0000-osc-ffff9b88d2d9b000.idle_timeout=debug
osc.lustre-OST0000-osc-ffff9b88d36c0000.idle_timeout=debug
osc.lustre-OST0001-osc-ffff9b88d2d9b000.idle_timeout=debug
osc.lustre-OST0001-osc-ffff9b88d36c0000.idle_timeout=debug
disable quota as required
oleg348-server: oleg348-server.virtnet: executing set_default_debug -1 all
osd-ldiskfs.track_declares_assert=1
=== replay-dual: finish setup 09:37:53 (1773668273) ===

== replay-dual test 0a: expired recovery with lost client ========================================================== 09:37:53 (1773668273)
Check file is LU482_FAILED=/tmp/replay-dual.lu482.88Ytz5
UUID 1K-blocks Used Available Use% Mounted on
lustre-MDT0000_UUID 1414116 1708 1269596 1% /mnt/lustre[MDT:0]
lustre-MDT0001_UUID 1414116 1532 1269772 1% /mnt/lustre[MDT:1]
lustre-OST0000_UUID 3833116 1544 3605476 1% /mnt/lustre[OST:0]
lustre-OST0001_UUID 3833116 1544 3605476 1% /mnt/lustre[OST:1]
filesystem_summary: 7666232 3088 7210952 1% /mnt/lustre
total: 50 open/close in 0.73 seconds: 68.85 ops/second
fail_loc=0x80000514
Failing mds1 on oleg348-server
Stopping /mnt/lustre-mds1 (opts:) on oleg348-server
09:37:59 (1773668279) shut down
facet: mds1
facet_host: oleg348-server
facet_failover_host: oleg348-server
Failover mds1 to oleg348-server
mount facets: mds1
Start mds1: mount -t lustre -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
oleg348-server: oleg348-server.virtnet: executing set_default_debug -1 all
pdsh@oleg348-client: oleg348-server: ssh exited with exit code 1
Started lustre-MDT0000
09:38:15 (1773668295) targets are mounted
09:38:15 (1773668295) facet_failover done
Starting client: oleg348-client.virtnet: -o user_xattr,flock 192.168.203.148@tcp:/lustre /mnt/lustre2
- unlinked 0 (time 1773668411 ; total 0 ; last 0)
total: 50 unlinks in 1 seconds: 50.000000 unlinks/second
PASS 0a (141s)

== replay-dual test 0b: lost client during waiting for next transno ========================================================== 09:40:14 (1773668414)
UUID 1K-blocks Used Available Use% Mounted on
lustre-MDT0000_UUID 1414116 1676 1269628 1% /mnt/lustre[MDT:0]
lustre-MDT0001_UUID 1414116 1532 1269772 1% /mnt/lustre[MDT:1]
lustre-OST0000_UUID 3833116 1544 3605476 1% /mnt/lustre[OST:0]
lustre-OST0001_UUID 3833116 1544 3605476 1% /mnt/lustre[OST:1]
filesystem_summary: 7666232 3088 7210952 1% /mnt/lustre
Failing mds1 on oleg348-server
Stopping /mnt/lustre-mds1 (opts:) on oleg348-server
09:40:19 (1773668419) shut down
facet: mds1
facet_host: oleg348-server
facet_failover_host: oleg348-server
Failover mds1 to oleg348-server
mount facets: mds1
Start mds1: mount -t lustre -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
oleg348-server: oleg348-server.virtnet: executing set_default_debug -1 all
pdsh@oleg348-client: oleg348-server: ssh exited with exit code 1
Started lustre-MDT0000
09:40:35 (1773668435) targets are mounted
09:40:35 (1773668435) facet_failover done
Starting client: oleg348-client.virtnet: -o user_xattr,flock 192.168.203.148@tcp:/lustre /mnt/lustre
Starting client: oleg348-client.virtnet: -o user_xattr,flock 192.168.203.148@tcp:/lustre /mnt/lustre2
PASS 0b (101s)

== replay-dual test 1: |X| simple create ================= 09:41:55 (1773668515)
UUID 1K-blocks Used Available Use% Mounted on
lustre-MDT0000_UUID 1414116 1676 1269628 1% /mnt/lustre[MDT:0]
lustre-MDT0001_UUID 1414116 1532 1269772 1% /mnt/lustre[MDT:1]
lustre-OST0000_UUID 3833116 1544 3605476 1% /mnt/lustre[OST:0]
lustre-OST0001_UUID 3833116 1544 3605476 1% /mnt/lustre[OST:1]
filesystem_summary: 7666232 3088 7210952 1% /mnt/lustre
Failing mds1 on oleg348-server
Stopping /mnt/lustre-mds1 (opts:) on oleg348-server
09:42:00 (1773668520) shut down
facet: mds1
facet_host: oleg348-server
facet_failover_host: oleg348-server
Failover mds1 to oleg348-server
mount facets: mds1
Start mds1: mount -t lustre -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
oleg348-server: oleg348-server.virtnet: executing set_default_debug -1 all
pdsh@oleg348-client: oleg348-server: ssh exited with exit code 1
Started lustre-MDT0000
09:42:17 (1773668537) targets are mounted
09:42:17 (1773668537) facet_failover done
oleg348-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid
mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec
PASS 1 (33s)

== replay-dual test 2: |X| mkdir adir ==================== 09:42:28 (1773668548)
UUID 1K-blocks Used Available Use% Mounted on
lustre-MDT0000_UUID 1414116 1676 1269628 1% /mnt/lustre[MDT:0]
lustre-MDT0001_UUID 1414116 1532 1269772 1% /mnt/lustre[MDT:1]
lustre-OST0000_UUID 3833116 1544 3605476 1% /mnt/lustre[OST:0]
lustre-OST0001_UUID 3833116 1544 3605476 1% /mnt/lustre[OST:1]
filesystem_summary: 7666232 3088 7210952 1% /mnt/lustre
Failing mds1 on oleg348-server
Stopping /mnt/lustre-mds1 (opts:) on oleg348-server
09:42:41 (1773668561) shut down
facet: mds1
facet_host: oleg348-server
facet_failover_host: oleg348-server
Failover mds1 to oleg348-server
mount facets: mds1
Start mds1: mount -t lustre -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
oleg348-server: oleg348-server.virtnet: executing set_default_debug -1 all
pdsh@oleg348-client: oleg348-server: ssh exited with exit code 1
Started lustre-MDT0000
09:43:12 (1773668592) targets are mounted
09:43:12 (1773668592) facet_failover done
oleg348-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid
mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec
PASS 2 (61s)

== replay-dual test 3: |X| mkdir adir, mkdir adir/bdir === 09:43:30 (1773668610)
UUID 1K-blocks Used Available Use% Mounted on
lustre-MDT0000_UUID 1414116 1676 1269628 1% /mnt/lustre[MDT:0]
lustre-MDT0001_UUID 1414116 1532 1269772 1% /mnt/lustre[MDT:1]
lustre-OST0000_UUID 3833116 1544 3605476 1% /mnt/lustre[OST:0]
lustre-OST0001_UUID 3833116 1544 3605476 1% /mnt/lustre[OST:1]
filesystem_summary: 7666232 3088 7210952 1% /mnt/lustre
Failing mds1 on oleg348-server
Stopping /mnt/lustre-mds1 (opts:) on oleg348-server
09:43:49 (1773668629) shut down
facet: mds1
facet_host: oleg348-server
facet_failover_host: oleg348-server
Failover mds1 to oleg348-server
mount facets: mds1
Start mds1: mount -t lustre -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
oleg348-server: oleg348-server.virtnet: executing set_default_debug -1 all
pdsh@oleg348-client: oleg348-server: ssh exited with exit code 1
Started lustre-MDT0000
09:44:10 (1773668650) targets are mounted
09:44:10 (1773668650) facet_failover done
oleg348-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid
mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec
PASS 3 (51s)

== replay-dual test 4: |X| mkdir adir (-EEXIST), mkdir adir/bdir ========================================================== 09:44:21 (1773668661)
UUID 1K-blocks Used Available Use% Mounted on
lustre-MDT0000_UUID 1414116 1848 1269456 1% /mnt/lustre[MDT:0]
lustre-MDT0001_UUID 1414116 1740 1269564 1% /mnt/lustre[MDT:1]
lustre-OST0000_UUID 3833116 1544 3605476 1% /mnt/lustre[OST:0]
lustre-OST0001_UUID 3833116 1544 3605476 1% /mnt/lustre[OST:1]
filesystem_summary: 7666232 3088 7210952 1% /mnt/lustre
mkdir: cannot create directory '/mnt/lustre/adir': File exists
Failing mds1 on oleg348-server
Stopping /mnt/lustre-mds1 (opts:) on oleg348-server
09:44:31 (1773668671) shut down
facet: mds1
facet_host: oleg348-server
facet_failover_host: oleg348-server
Failover mds1 to oleg348-server
mount facets: mds1
Start mds1: mount -t lustre -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
oleg348-server: oleg348-server.virtnet: executing set_default_debug -1 all
pdsh@oleg348-client: oleg348-server: ssh exited with exit code 1
Started lustre-MDT0000
09:45:02 (1773668702) targets are mounted
09:45:02 (1773668702) facet_failover done
oleg348-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid
mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec
PASS 4 (54s)

== replay-dual test 5: open, unlink |X| close ============ 09:45:15 (1773668715)
multiop /mnt/lustre2/a vo_tSc
TMPPIPE=/tmp/multiop_open_wait_pipe.8172
UUID 1K-blocks Used Available Use% Mounted on
lustre-MDT0000_UUID 1414116 1820 1269484 1% /mnt/lustre[MDT:0]
lustre-MDT0001_UUID 1414116 1740 1269564 1% /mnt/lustre[MDT:1]
lustre-OST0000_UUID 3833116 1544 3605476 1% /mnt/lustre[OST:0]
lustre-OST0001_UUID 3833116 1544 3605476 1% /mnt/lustre[OST:1]
filesystem_summary: 7666232 3088 7210952 1% /mnt/lustre
Failing mds1 on oleg348-server
Stopping /mnt/lustre-mds1 (opts:) on oleg348-server
09:45:25 (1773668725) shut down
facet: mds1
facet_host: oleg348-server
facet_failover_host: oleg348-server
Failover mds1 to oleg348-server
mount facets: mds1
Start mds1: mount -t lustre -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
oleg348-server: oleg348-server.virtnet: executing set_default_debug -1 all
pdsh@oleg348-client: oleg348-server: ssh exited with exit code 1
Started lustre-MDT0000
09:45:47 (1773668747) targets are mounted
09:45:47 (1773668747) facet_failover done
oleg348-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid
mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec
PASS 5 (42s)

== replay-dual test 6: open1, open2, unlink |X| close1 [fail mds1] close2 ========================================================== 09:45:57 (1773668757)
multiop /mnt/lustre2/a vo_c
TMPPIPE=/tmp/multiop_open_wait_pipe.8172
multiop /mnt/lustre/a vo_c
TMPPIPE=/tmp/multiop_open_wait_pipe.8172
UUID 1K-blocks Used Available Use% Mounted on
lustre-MDT0000_UUID 1414116 1820 1269484 1% /mnt/lustre[MDT:0]
lustre-MDT0001_UUID 1414116 1740 1269564 1% /mnt/lustre[MDT:1]
lustre-OST0000_UUID 3833116 1544 3605476 1% /mnt/lustre[OST:0]
lustre-OST0001_UUID 3833116 1544 3605476 1% /mnt/lustre[OST:1]
filesystem_summary: 7666232 3088 7210952 1% /mnt/lustre
Failing mds1 on oleg348-server
Stopping /mnt/lustre-mds1 (opts:) on oleg348-server
09:46:06 (1773668766) shut down
facet: mds1
facet_host: oleg348-server
facet_failover_host: oleg348-server
Failover mds1 to oleg348-server
mount facets: mds1
Start mds1: mount -t lustre -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
oleg348-server: oleg348-server.virtnet: executing set_default_debug -1 all
pdsh@oleg348-client: oleg348-server: ssh exited with exit code 1
Started lustre-MDT0000
09:46:28 (1773668788) targets are mounted
09:46:28 (1773668788) facet_failover done
oleg348-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid
mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec
PASS 6 (43s)

== replay-dual test 8: replay of resent request ========== 09:46:40 (1773668800)
UUID 1K-blocks Used Available Use% Mounted on
lustre-MDT0000_UUID 1414116 1820 1269484 1% /mnt/lustre[MDT:0]
lustre-MDT0001_UUID 1414116 1740 1269564 1% /mnt/lustre[MDT:1]
lustre-OST0000_UUID 3833116 1544 3605476 1% /mnt/lustre[OST:0]
lustre-OST0001_UUID 3833116 1544 3605476 1% /mnt/lustre[OST:1]
filesystem_summary: 7666232 3088 7210952 1% /mnt/lustre
fail_loc=0x119
fail_loc=0
Failing mds1 on oleg348-server
Stopping /mnt/lustre-mds1 (opts:) on oleg348-server
09:47:06 (1773668826) shut down
facet: mds1
facet_host: oleg348-server
facet_failover_host: oleg348-server
Failover mds1 to oleg348-server
mount facets: mds1
Start mds1: mount -t lustre -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
oleg348-server: oleg348-server.virtnet: executing set_default_debug -1 all
pdsh@oleg348-client: oleg348-server: ssh exited with exit code 1
Started lustre-MDT0000
09:47:27 (1773668847) targets are mounted
09:47:27 (1773668847) facet_failover done
oleg348-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid
mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec
PASS 8 (59s)

== replay-dual test 9: resending a replayed create ======= 09:47:39 (1773668859)
UUID 1K-blocks Used Available Use% Mounted on
lustre-MDT0000_UUID 1414116 1820 1269484 1% /mnt/lustre[MDT:0]
lustre-MDT0001_UUID 1414116 1740 1269564 1% /mnt/lustre[MDT:1]
lustre-OST0000_UUID 3833116 1544 3605476 1% /mnt/lustre[OST:0]
lustre-OST0001_UUID 3833116 1544 3605476 1% /mnt/lustre[OST:1]
filesystem_summary: 7666232 3088 7210952 1% /mnt/lustre
fail_loc=0x80000119
Failing mds1 on oleg348-server
Stopping /mnt/lustre-mds1 (opts:) on oleg348-server
09:47:51 (1773668871) shut down
facet: mds1
facet_host: oleg348-server
facet_failover_host: oleg348-server
Failover mds1 to oleg348-server
mount facets: mds1
Start mds1: mount -t lustre -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
oleg348-server: oleg348-server.virtnet: executing set_default_debug -1 all
pdsh@oleg348-client: oleg348-server: ssh exited with exit code 1
Started lustre-MDT0000
09:48:18 (1773668898) targets are mounted
09:48:18 (1773668898) facet_failover done
oleg348-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid
mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec
fail_loc=0
PASS 9 (62s)

== replay-dual test 10: resending a replayed unlink ====== 09:48:42 (1773668922)
UUID 1K-blocks Used Available Use% Mounted on
lustre-MDT0000_UUID 1414116 1820 1269484 1% /mnt/lustre[MDT:0]
lustre-MDT0001_UUID 1414116 1740 1269564 1% /mnt/lustre[MDT:1]
lustre-OST0000_UUID 3833116 1544 3605476 1% /mnt/lustre[OST:0]
lustre-OST0001_UUID 3833116 1544 3605476 1% /mnt/lustre[OST:1]
filesystem_summary: 7666232 3088 7210952 1% /mnt/lustre
fail_loc=0x80000119
Failing mds1 on oleg348-server
Stopping /mnt/lustre-mds1 (opts:) on oleg348-server
09:48:50 (1773668930) shut down
facet: mds1
facet_host: oleg348-server
facet_failover_host: oleg348-server
Failover mds1 to oleg348-server
mount facets: mds1
Start mds1: mount -t lustre -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
oleg348-server: oleg348-server.virtnet: executing set_default_debug -1 all
pdsh@oleg348-client: oleg348-server: ssh exited with exit code 1
Started lustre-MDT0000
09:49:20 (1773668960) targets are mounted
09:49:20 (1773668960) facet_failover done
oleg348-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid
mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec
fail_loc=0
PASS 10 (60s)

== replay-dual test 11: both clients timeout during replay ========================================================== 09:49:42 (1773668982)
UUID 1K-blocks Used Available Use% Mounted on
lustre-MDT0000_UUID 1414116 1820 1269484 1% /mnt/lustre[MDT:0]
lustre-MDT0001_UUID 1414116 1740 1269564 1% /mnt/lustre[MDT:1]
lustre-OST0000_UUID 3833116 1544 3605476 1% /mnt/lustre[OST:0]
lustre-OST0001_UUID 3833116 1544 3605476 1% /mnt/lustre[OST:1]
filesystem_summary: 7666232 3088 7210952 1% /mnt/lustre
fail_loc=0x0119
Failing mds1 on oleg348-server
Stopping /mnt/lustre-mds1 (opts:) on oleg348-server
09:49:51 (1773668991) shut down
facet: mds1
facet_host: oleg348-server
facet_failover_host: oleg348-server
Failover mds1 to oleg348-server
mount facets: mds1
Start mds1: mount -t lustre -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
oleg348-server: oleg348-server.virtnet: executing set_default_debug -1 all
pdsh@oleg348-client: oleg348-server: ssh exited with exit code 1
Started lustre-MDT0000
09:50:21 (1773669021) targets are mounted
09:50:21 (1773669021) facet_failover done
oleg348-client.virtnet: executing wait_import_state_mount FULL mdc.lustre-MDT0000-mdc-*.mds_server_uuid
mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 12 sec
fail_loc=0
PASS 11 (58s)

== replay-dual test 12: open resend timeout ============== 09:50:40 (1773669040)
UUID 1K-blocks Used Available Use% Mounted on
lustre-MDT0000_UUID 1414116 1820 1269484 1% /mnt/lustre[MDT:0]
lustre-MDT0001_UUID 1414116 1740 1269564 1% /mnt/lustre[MDT:1]
lustre-OST0000_UUID 3833116 1544 3605476 1% /mnt/lustre[OST:0]
lustre-OST0001_UUID 3833116 1544 3605476 1% /mnt/lustre[OST:1]
filesystem_summary: 7666232 3088 7210952 1% /mnt/lustre
multiop /mnt/lustre/f12.replay-dual vmo_c
TMPPIPE=/tmp/multiop_open_wait_pipe.8172
fail_loc=0x80000302
Failing mds1 on oleg348-server
Stopping /mnt/lustre-mds1 (opts:) on oleg348-server
09:50:49 (1773669049) shut down
facet: mds1
facet_host: oleg348-server
facet_failover_host: oleg348-server
Failover mds1 to oleg348-server
mount facets: mds1
Start mds1: mount -t lustre -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
oleg348-server: oleg348-server.virtnet: executing set_default_debug -1 all
pdsh@oleg348-client: oleg348-server: ssh exited with exit code 1
Started lustre-MDT0000
09:51:07 (1773669067) targets are mounted
09:51:07 (1773669067) facet_failover done
fail_loc=0
/mnt/lustre/f12.replay-dual
/mnt/lustre/f12.replay-dual has type file OK
PASS 12 (32s)

== replay-dual test 13: close resend timeout ============= 09:51:12 (1773669072)
multiop /mnt/lustre/f13.replay-dual vmo_c
TMPPIPE=/tmp/multiop_open_wait_pipe.8172
UUID 1K-blocks Used Available Use% Mounted on
lustre-MDT0000_UUID 1414116 1820 1269484 1% /mnt/lustre[MDT:0]
lustre-MDT0001_UUID 1414116 1740 1269564 1% /mnt/lustre[MDT:1]
lustre-OST0000_UUID 3833116 1544 3605476 1% /mnt/lustre[OST:0]
lustre-OST0001_UUID 3833116 1544 3605476 1% /mnt/lustre[OST:1]
filesystem_summary: 7666232 3088 7210952 1% /mnt/lustre
fail_loc=0x80000115
Failing mds1 on oleg348-server
Stopping /mnt/lustre-mds1 (opts:) on oleg348-server
09:51:20 (1773669080) shut down
facet: mds1
facet_host: oleg348-server
facet_failover_host: oleg348-server
Failover mds1 to oleg348-server
mount facets: mds1
Start mds1: mount -t lustre -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
oleg348-server: oleg348-server.virtnet: executing set_default_debug -1 all
pdsh@oleg348-client: oleg348-server: ssh exited with exit code 1
Started lustre-MDT0000
09:51:39 (1773669099) targets are mounted
09:51:39 (1773669099) facet_failover done
fail_loc=0
/mnt/lustre/f13.replay-dual
/mnt/lustre/f13.replay-dual has type file OK
PASS 13 (32s)

SKIP: replay-dual test_14b skipping ALWAYS excluded test 14b

== replay-dual test 15a: timeout waiting for lost client during replay, 1 client completes ========================================================== 09:51:45 (1773669105)
UUID 1K-blocks Used Available Use% Mounted on
lustre-MDT0000_UUID 1414116 1820 1269484 1% /mnt/lustre[MDT:0]
lustre-MDT0001_UUID 1414116 1740 1269564 1% /mnt/lustre[MDT:1]
lustre-OST0000_UUID 3833116 1544 3605476 1% /mnt/lustre[OST:0]
lustre-OST0001_UUID 3833116 1544 3605476 1% /mnt/lustre[OST:1]
filesystem_summary: 7666232 3088 7210952 1% /mnt/lustre
total: 25 open/close in 0.76 seconds: 32.75 ops/second
total: 1 open/close in 0.04 seconds: 27.79 ops/second
Failing mds1 on oleg348-server
Stopping /mnt/lustre-mds1 (opts:) on oleg348-server
09:51:54 (1773669114) shut down
facet: mds1
facet_host: oleg348-server
facet_failover_host: oleg348-server
Failover mds1 to oleg348-server
mount facets: mds1
Start mds1: mount -t lustre -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
oleg348-server: oleg348-server.virtnet: executing set_default_debug -1 all
pdsh@oleg348-client: oleg348-server: ssh exited with exit code 1
Started lustre-MDT0000
09:52:14 (1773669134) targets are mounted
09:52:14 (1773669134) facet_failover done
oleg348-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid
mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec
- unlinked 0 (time 1773669212 ; total 0 ; last 0)
total: 25 unlinks in 1 seconds: 25.000000 unlinks/second
Starting client: oleg348-client.virtnet: -o user_xattr,flock 192.168.203.148@tcp:/lustre /mnt/lustre2
PASS 15a (114s)

== replay-dual test 15c: remove multiple OST orphans ===== 09:53:39 (1773669219)
UUID 1K-blocks Used Available Use% Mounted on
lustre-MDT0000_UUID 1414116 1820 1269484 1% /mnt/lustre[MDT:0]
lustre-MDT0001_UUID 1414116 1740 1269564 1% /mnt/lustre[MDT:1]
lustre-OST0000_UUID 3833116 1544 3605476 1% /mnt/lustre[OST:0]
lustre-OST0001_UUID 3833116 1544 3605476 1% /mnt/lustre[OST:1]
filesystem_summary: 7666232 3088 7210952 1% /mnt/lustre
Failing mds1 on oleg348-server
Stopping /mnt/lustre-mds1 (opts:) on oleg348-server
09:55:31 (1773669331) shut down
facet: mds1
facet_host: oleg348-server
facet_failover_host: oleg348-server
Failover mds1 to oleg348-server
mount facets: mds1
Start mds1: mount -t lustre -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
oleg348-server: oleg348-server.virtnet: executing set_default_debug -1 all
pdsh@oleg348-client: oleg348-server: ssh exited with exit code 1
Started lustre-MDT0000
09:56:00 (1773669360) targets are mounted
09:56:00 (1773669360) facet_failover done
oleg348-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid
mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec
Starting client: oleg348-client.virtnet: -o user_xattr,flock 192.168.203.148@tcp:/lustre /mnt/lustre2
PASS 15c (219s)

== replay-dual test 16: fail MDS during recovery (3571) == 09:57:18 (1773669438)
UUID 1K-blocks Used Available Use% Mounted on
lustre-MDT0000_UUID 1414116 1820 1269484 1% /mnt/lustre[MDT:0]
lustre-MDT0001_UUID 1414116 1740 1269564 1% /mnt/lustre[MDT:1]
lustre-OST0000_UUID 3833116 1544 3605476 1% /mnt/lustre[OST:0]
lustre-OST0001_UUID 3833116 1544 3605476 1% /mnt/lustre[OST:1]
filesystem_summary: 7666232 3088 7210952 1% /mnt/lustre
total: 25 open/close in 0.70 seconds: 35.85 ops/second
total: 1 open/close in 0.04 seconds: 24.33 ops/second
Failing mds1 on oleg348-server
Stopping /mnt/lustre-mds1 (opts:) on oleg348-server
09:57:29 (1773669449) shut down
facet: mds1
facet_host: oleg348-server
facet_failover_host: oleg348-server
Failover mds1 to oleg348-server
mount facets: mds1
Start mds1: mount -t lustre -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
oleg348-server: oleg348-server.virtnet: executing set_default_debug -1 all
pdsh@oleg348-client: oleg348-server: ssh exited with exit code 1
Started lustre-MDT0000
09:57:58 (1773669478) targets are mounted
09:57:58 (1773669478) facet_failover done
Failing mds1 on oleg348-server
Stopping /mnt/lustre-mds1 (opts:) on oleg348-server
09:58:22 (1773669502) shut down
facet: mds1
facet_host: oleg348-server
facet_failover_host: oleg348-server
Failover mds1 to oleg348-server
mount facets: mds1
Start mds1: mount -t lustre -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
oleg348-server: oleg348-server.virtnet: executing set_default_debug -1 all
pdsh@oleg348-client: oleg348-server: ssh exited with exit code 1
Started lustre-MDT0000
09:58:41 (1773669521) targets are mounted
09:58:41 (1773669521) facet_failover done
oleg348-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid
mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec
- unlinked 0 (time 1773669595 ; total 0 ; last 0)
total: 25 unlinks in 0 seconds: inf unlinks/second
Starting client: oleg348-client.virtnet: -o user_xattr,flock 192.168.203.148@tcp:/lustre /mnt/lustre2
PASS 16 (160s)

== replay-dual test 17: fail OST during recovery (3571) == 09:59:58 (1773669598)
total: 25 open/close in 0.40 seconds: 62.18 ops/second
total: 1 open/close in 0.03 seconds: 31.10 ops/second
UUID 1K-blocks Used Available Use% Mounted on
lustre-MDT0000_UUID 1414116 1916 1269388 1% /mnt/lustre[MDT:0]
lustre-MDT0001_UUID 1414116 1740 1269564 1% /mnt/lustre[MDT:1]
lustre-OST0000_UUID 3833116 1544 3605476 1% /mnt/lustre[OST:0]
lustre-OST0001_UUID 3833116 1544 3605476 1% /mnt/lustre[OST:1]
filesystem_summary: 7666232 3088 7210952 1% /mnt/lustre
Failing ost1 on oleg348-server
Stopping /mnt/lustre-ost1 (opts:) on oleg348-server
10:00:05 (1773669605) shut down
facet: ost1
facet_host: oleg348-server
facet_failover_host: oleg348-server
Failover ost1 to oleg348-server
mount facets: ost1
Start ost1: mount -t lustre -o localrecov /dev/mapper/ost1_flakey /mnt/lustre-ost1
seq.cli-lustre-OST0000-super.width=65536
oleg348-server: oleg348-server.virtnet: executing set_default_debug -1 all
pdsh@oleg348-client: oleg348-server: ssh exited with exit code 1
Started lustre-OST0000
10:00:25 (1773669625) targets are mounted
10:00:25 (1773669625) facet_failover done
Failing ost1 on oleg348-server
Stopping /mnt/lustre-ost1 (opts:) on oleg348-server
10:00:47 (1773669647) shut down
facet: ost1
facet_host: oleg348-server
facet_failover_host: oleg348-server
Failover ost1 to oleg348-server
mount facets: ost1
Start ost1: mount -t lustre -o localrecov /dev/mapper/ost1_flakey /mnt/lustre-ost1
seq.cli-lustre-OST0000-super.width=65536
oleg348-server: oleg348-server.virtnet: executing set_default_debug -1 all
pdsh@oleg348-client: oleg348-server: ssh exited with exit code 1
Started lustre-OST0000
10:01:03 (1773669663) targets are mounted
10:01:03 (1773669663) facet_failover done
oleg348-client.virtnet: executing wait_import_state_mount (FULL|IDLE) osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid
osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid in FULL state after 0 sec
- unlinked 0 (time 1773669735 ; total 0 ; last 0)
total: 25 unlinks in 0 seconds: inf unlinks/second
Starting client: oleg348-client.virtnet: -o user_xattr,flock 192.168.203.148@tcp:/lustre /mnt/lustre2
PASS 17 (139s)

== replay-dual test 18: ldlm_handle_enqueue succeeds on evicted export (3822) ========================================================== 10:02:17 (1773669737)
debug=+dlmtrace
fail_loc=0x8000030b
using seed 3009303585
running for 500 iterations
total: 500 stats in 0 seconds: inf stats/second
ldlm.namespaces.MGC192.168.203.148@tcp.early_lock_cancel=0
ldlm.namespaces.lustre-MDT0000-mdc-ffff9b88c5be5000.early_lock_cancel=0
ldlm.namespaces.lustre-MDT0000-mdc-ffff9b88d2de4000.early_lock_cancel=0
ldlm.namespaces.lustre-MDT0001-mdc-ffff9b88c5be5000.early_lock_cancel=0
ldlm.namespaces.lustre-MDT0001-mdc-ffff9b88d2de4000.early_lock_cancel=0
ldlm.namespaces.lustre-OST0000-osc-ffff9b88c5be5000.early_lock_cancel=0
ldlm.namespaces.lustre-OST0000-osc-ffff9b88d2de4000.early_lock_cancel=0
ldlm.namespaces.lustre-OST0001-osc-ffff9b88c5be5000.early_lock_cancel=0
ldlm.namespaces.lustre-OST0001-osc-ffff9b88d2de4000.early_lock_cancel=0
fail_loc=0x80000305
Error in opening file "/mnt/lustre2/d18.replay-dual/f18.replay-dual"(flags=O_RDONLY) 2: No such file or directory
ldlm.namespaces.MGC192.168.203.148@tcp.early_lock_cancel=1
ldlm.namespaces.lustre-MDT0000-mdc-ffff9b88c5be5000.early_lock_cancel=1
ldlm.namespaces.lustre-MDT0000-mdc-ffff9b88d2de4000.early_lock_cancel=1
ldlm.namespaces.lustre-MDT0001-mdc-ffff9b88c5be5000.early_lock_cancel=1
ldlm.namespaces.lustre-MDT0001-mdc-ffff9b88d2de4000.early_lock_cancel=1
ldlm.namespaces.lustre-OST0000-osc-ffff9b88c5be5000.early_lock_cancel=1
ldlm.namespaces.lustre-OST0000-osc-ffff9b88d2de4000.early_lock_cancel=1
ldlm.namespaces.lustre-OST0001-osc-ffff9b88c5be5000.early_lock_cancel=1
ldlm.namespaces.lustre-OST0001-osc-ffff9b88d2de4000.early_lock_cancel=1
fail_loc=0
fail_loc=0
PASS 18 (47s)

== replay-dual test 19: resend of open request =========== 10:03:04 (1773669784)
UUID 1K-blocks Used Available Use% Mounted on
lustre-MDT0000_UUID 1414116 1920 1269384 1% /mnt/lustre[MDT:0]
lustre-MDT0001_UUID 1414116 1740 1269564 1% /mnt/lustre[MDT:1]
lustre-OST0000_UUID 3833116 1560 3605460 1% /mnt/lustre[OST:0]
lustre-OST0001_UUID 3833116 1544 3605476 1% /mnt/lustre[OST:1]
filesystem_summary: 7666232 3104 7210936 1% /mnt/lustre
fail_loc=0x157
- open/close 0 (time 1773669874.18 total 86.36 last 0.00)
total: 1 open/close in 86.36 seconds: 0.01 ops/second
fail_loc=0
Failing mds1 on oleg348-server
Stopping /mnt/lustre-mds1 (opts:) on oleg348-server
10:04:36 (1773669876) shut down
facet: mds1
facet_host: oleg348-server
facet_failover_host: oleg348-server
Failover mds1 to oleg348-server
mount facets: mds1
Start mds1: mount -t lustre -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
oleg348-server: oleg348-server.virtnet: executing set_default_debug -1 all
pdsh@oleg348-client: oleg348-server: ssh exited with exit code 1
Started lustre-MDT0000
10:04:51 (1773669891) targets are mounted
10:04:51 (1773669891) facet_failover done
oleg348-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid
mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec
PASS 19 (115s)

== replay-dual test 20: recovery time is not increasing == 10:04:59 (1773669899)
UUID 1K-blocks Used Available Use% Mounted on
lustre-MDT0000_UUID 1414116 1896 1269408 1% /mnt/lustre[MDT:0]
lustre-MDT0001_UUID 1414116 1740 1269564 1% /mnt/lustre[MDT:1]
lustre-OST0000_UUID 3833116 1560 3605460 1% /mnt/lustre[OST:0]
lustre-OST0001_UUID 3833116 1544 3605476 1% /mnt/lustre[OST:1]
filesystem_summary: 7666232 3104 7210936 1% /mnt/lustre
Failing mds1 on oleg348-server
Stopping /mnt/lustre-mds1 (opts:) on oleg348-server
10:05:03 (1773669903) shut down
facet: mds1
facet_host: oleg348-server
facet_failover_host: oleg348-server
Failover mds1 to oleg348-server
mount facets: mds1
Start mds1: mount -t lustre -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
oleg348-server: oleg348-server.virtnet: executing set_default_debug -1 all
pdsh@oleg348-client: oleg348-server: ssh exited with exit code 1
Started lustre-MDT0000
10:05:18 (1773669918) targets are mounted
10:05:18 (1773669918) facet_failover done
oleg348-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid
mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec
Starting client: oleg348-client.virtnet: -o user_xattr,flock 192.168.203.148@tcp:/lustre /mnt/lustre2
UUID 1K-blocks Used Available Use% Mounted on
lustre-MDT0000_UUID 1414116 1896 1269408 1% /mnt/lustre[MDT:0]
lustre-MDT0001_UUID 1414116 1740 1269564 1% /mnt/lustre[MDT:1]
lustre-OST0000_UUID 3833116 1560 3605460 1% /mnt/lustre[OST:0]
lustre-OST0001_UUID 3833116 1544 3605476 1% /mnt/lustre[OST:1]
filesystem_summary: 7666232 3104 7210936 1% /mnt/lustre
Failing mds1 on oleg348-server
Stopping /mnt/lustre-mds1 (opts:) on oleg348-server
10:07:48 (1773670068) shut down
facet: mds1
facet_host: oleg348-server
facet_failover_host: oleg348-server
Failover mds1 to oleg348-server
mount facets: mds1
Start mds1: mount -t lustre -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
oleg348-server: oleg348-server.virtnet: executing set_default_debug -1 all
pdsh@oleg348-client: oleg348-server: ssh exited with exit code 1
Started lustre-MDT0000
10:08:03 (1773670083) targets are mounted
10:08:03 (1773670083) facet_failover done
oleg348-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid
mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec
Starting client: oleg348-client.virtnet: -o user_xattr,flock 192.168.203.148@tcp:/lustre /mnt/lustre2
PASS 20 (332s)

== replay-dual test 21a: commit on sharing =============== 10:10:31 (1773670231)
mdt.lustre-MDT0000.commit_on_sharing=1
mdt.lustre-MDT0001.commit_on_sharing=1
Replay barrier on lustre-MDT0000
Failing mds1 on oleg348-server
Stopping /mnt/lustre-mds1 (opts:) on oleg348-server
10:10:36 (1773670236) shut down
facet: mds1
facet_host: oleg348-server
facet_failover_host: oleg348-server
Failover mds1 to oleg348-server
mount facets: mds1
Start mds1: mount -t lustre -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
oleg348-server: oleg348-server.virtnet: executing set_default_debug -1 all
pdsh@oleg348-client: oleg348-server: ssh exited with exit code 1
Started lustre-MDT0000
10:10:51 (1773670251) targets are mounted
10:10:51 (1773670251) facet_failover done
Starting client: oleg348-client.virtnet: -o user_xattr,flock 192.168.203.148@tcp:/lustre /mnt/lustre2
mdt.lustre-MDT0000.commit_on_sharing=0
mdt.lustre-MDT0001.commit_on_sharing=0
PASS 21a (163s)

SKIP: replay-dual test_21b skipping SLOW test 21b

== replay-dual test 22a: c1 lfs mkdir -i 1 dir1, M1 drop reply & fail, c2 mkdir dir1/dir ========================================================== 10:13:14 (1773670394)
fail_loc=0x119
Failing mds2 on oleg348-server
Stopping /mnt/lustre-mds2 (opts:) on oleg348-server
10:13:17 (1773670397) shut down
facet: mds2
facet_host: oleg348-server
facet_failover_host: oleg348-server
Failover mds2 to oleg348-server
mount facets: mds2
Start mds2: mount -t lustre -o localrecov /dev/mapper/mds2_flakey /mnt/lustre-mds2
oleg348-server: oleg348-server.virtnet: executing set_default_debug -1 all
pdsh@oleg348-client: oleg348-server: ssh exited with exit code 1
Started lustre-MDT0001
10:13:32 (1773670412) targets are mounted
10:13:32 (1773670412) facet_failover done
oleg348-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0001-mdc-*.mds_server_uuid
mdc.lustre-MDT0001-mdc-*.mds_server_uuid in FULL state after 0 sec
UUID 1K-blocks Used Available Use% Mounted on
lustre-MDT0000_UUID 1414116 1904 1269400 1% /mnt/lustre[MDT:0]
lustre-MDT0001_UUID 1414116 1744 1269560 1% /mnt/lustre[MDT:1]
lustre-OST0000_UUID 3833116 1560 3605460 1% /mnt/lustre[OST:0]
lustre-OST0001_UUID 3833116 1544 3605476 1% /mnt/lustre[OST:1]
filesystem_summary: 7666232 3104 7210936 1% /mnt/lustre
total: 2 open/close in 0.03 seconds: 72.85 ops/second
total: 2 open/close in 0.01 seconds: 135.24 ops/second
Failing mds1 on oleg348-server
Stopping /mnt/lustre-mds1 (opts:) on oleg348-server
10:13:43 (1773670423) shut down
facet: mds1
facet_host: oleg348-server
facet_failover_host: oleg348-server
Failover mds1 to oleg348-server
mount facets: mds1
Start mds1: mount -t lustre -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
oleg348-server: oleg348-server.virtnet: executing set_default_debug -1 all
pdsh@oleg348-client: oleg348-server: ssh exited with exit code 1
Started lustre-MDT0000
10:13:58 (1773670438) targets are mounted
10:13:58 (1773670438) facet_failover done
oleg348-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid
mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec
PASS 22a (52s)

== replay-dual test 22b: c1 lfs mkdir -i 1 d1, M1 drop reply & fail M0/M1, c2 mkdir d1/dir ========================================================== 10:14:06 (1773670446)
fail_loc=0x119
Failing mds1 on oleg348-server
Stopping /mnt/lustre-mds1 (opts:) on oleg348-server
Failing mds2 on oleg348-server
Stopping /mnt/lustre-mds2 (opts:) on oleg348-server
10:14:11 (1773670451) shut down
facet: mds1
facet_host: oleg348-server
facet_failover_host: oleg348-server
facet: mds2
facet_host: oleg348-server
facet_failover_host: oleg348-server
Failover mds1 to oleg348-server
mount facets: mds1
Failover mds2 to oleg348-server
mount facets: mds2
Start mds1: mount -t lustre -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
Start mds2: mount -t lustre -o localrecov /dev/mapper/mds2_flakey /mnt/lustre-mds2
oleg348-server: oleg348-server.virtnet: executing set_default_debug -1 all
oleg348-server: oleg348-server.virtnet: executing set_default_debug -1 all
pdsh@oleg348-client: oleg348-server: ssh exited with exit code 1
pdsh@oleg348-client: oleg348-server: ssh exited with exit code 1
Started lustre-MDT0001
Started lustre-MDT0000
10:14:38 (1773670478) targets are mounted
10:14:38 (1773670478) facet_failover done
oleg348-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid,mdc.lustre-MDT0001-mdc-*.mds_server_uuid
mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec
mdc.lustre-MDT0001-mdc-*.mds_server_uuid in FULL state after 0 sec
UUID 1K-blocks Used Available Use% Mounted on
lustre-MDT0000_UUID 1414116 1904 1269400 1% /mnt/lustre[MDT:0]
lustre-MDT0001_UUID 1414116 1744 1269560 1% /mnt/lustre[MDT:1]
lustre-OST0000_UUID 3833116 1560 3605460 1% /mnt/lustre[OST:0]
lustre-OST0001_UUID 3833116 1544 3605476 1% /mnt/lustre[OST:1]
filesystem_summary: 7666232 3104 7210936 1% /mnt/lustre
total: 2 open/close in 0.03 seconds: 64.04 ops/second
total: 2 open/close in 0.01 seconds: 146.45 ops/second
Failing mds1 on oleg348-server
Stopping /mnt/lustre-mds1 (opts:) on oleg348-server
10:14:49 (1773670489) shut down
facet:
mds1 facet_host: oleg348-server facet_failover_host: oleg348-server Failover mds1 to oleg348-server mount facets: mds1 Start mds1: mount -t lustre -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg348-server: oleg348-server.virtnet: executing set_default_debug -1 all pdsh@oleg348-client: oleg348-server: ssh exited with exit code 1 Started lustre-MDT0000 10:15:05 (1773670505) targets are mounted 10:15:05 (1773670505) facet_failover done oleg348-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec PASS 22b (67s) == replay-dual test 22c: c1 lfs mkdir -i 1 d1, M1 drop update & fail M1, c2 mkdir d1/dir ========================================================== 10:15:13 (1773670513) fail_loc=0x1701 fail_loc=0 Failing mds1 on oleg348-server Stopping /mnt/lustre-mds1 (opts:) on oleg348-server 10:15:17 (1773670517) shut down facet: mds1 facet_host: oleg348-server facet_failover_host: oleg348-server Failover mds1 to oleg348-server mount facets: mds1 Start mds1: mount -t lustre -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg348-server: oleg348-server.virtnet: executing set_default_debug -1 all pdsh@oleg348-client: oleg348-server: ssh exited with exit code 1 Started lustre-MDT0000 10:15:31 (1773670531) targets are mounted 10:15:31 (1773670531) facet_failover done oleg348-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 1904 1269400 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 1756 1269548 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 1560 3605460 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1544 3605476 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 3104 7210936 1% /mnt/lustre total: 2 open/close in 0.03 seconds: 74.54 ops/second 
total: 2 open/close in 0.01 seconds: 139.26 ops/second Failing mds1 on oleg348-server Stopping /mnt/lustre-mds1 (opts:) on oleg348-server 10:15:42 (1773670542) shut down facet: mds1 facet_host: oleg348-server facet_failover_host: oleg348-server Failover mds1 to oleg348-server mount facets: mds1 Start mds1: mount -t lustre -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg348-server: oleg348-server.virtnet: executing set_default_debug -1 all pdsh@oleg348-client: oleg348-server: ssh exited with exit code 1 Started lustre-MDT0000 10:15:57 (1773670557) targets are mounted 10:15:57 (1773670557) facet_failover done oleg348-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec PASS 22c (52s) == replay-dual test 22d: c1 lfs mkdir -i 1 d1, M1 drop update & fail M0/M1,c2 mkdir d1/dir ========================================================== 10:16:05 (1773670565) fail_loc=0x1701 fail_loc=0 Failing mds1 on oleg348-server Stopping /mnt/lustre-mds1 (opts:) on oleg348-server Failing mds2 on oleg348-server Stopping /mnt/lustre-mds2 (opts:) on oleg348-server 10:16:18 (1773670578) shut down facet: mds1 facet_host: oleg348-server facet_failover_host: oleg348-server facet: mds2 facet_host: oleg348-server facet_failover_host: oleg348-server Failover mds1 to oleg348-server mount facets: mds1 Failover mds2 to oleg348-server mount facets: mds2 Start mds1: mount -t lustre -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 Start mds2: mount -t lustre -o localrecov /dev/mapper/mds2_flakey /mnt/lustre-mds2 oleg348-server: oleg348-server.virtnet: executing set_default_debug -1 all oleg348-server: oleg348-server.virtnet: executing set_default_debug -1 all pdsh@oleg348-client: oleg348-server: ssh exited with exit code 1 pdsh@oleg348-client: oleg348-server: ssh exited with exit code 1 Started lustre-MDT0001 Started lustre-MDT0000 10:16:40 (1773670600) targets are 
mounted 10:16:40 (1773670600) facet_failover done oleg348-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid,mdc.lustre-MDT0001-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec mdc.lustre-MDT0001-mdc-*.mds_server_uuid in FULL state after 0 sec UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 1868 1269436 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 1708 1269596 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 1560 3605460 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1544 3605476 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 3104 7210936 1% /mnt/lustre total: 2 open/close in 0.03 seconds: 61.10 ops/second total: 2 open/close in 0.01 seconds: 141.48 ops/second Failing mds1 on oleg348-server Stopping /mnt/lustre-mds1 (opts:) on oleg348-server 10:16:49 (1773670609) shut down facet: mds1 facet_host: oleg348-server facet_failover_host: oleg348-server Failover mds1 to oleg348-server mount facets: mds1 Start mds1: mount -t lustre -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg348-server: oleg348-server.virtnet: executing set_default_debug -1 all pdsh@oleg348-client: oleg348-server: ssh exited with exit code 1 Started lustre-MDT0000 10:17:05 (1773670625) targets are mounted 10:17:05 (1773670625) facet_failover done oleg348-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec PASS 22d (68s) == replay-dual test 23a: c1 rmdir d1, M1 drop reply and fail, client2 mkdir d1 ========================================================== 10:17:13 (1773670633) fail_loc=0x119 fail_loc=0 Failing mds2 on oleg348-server Stopping /mnt/lustre-mds2 (opts:) on oleg348-server 10:17:22 (1773670642) shut down facet: mds2 facet_host: oleg348-server facet_failover_host: oleg348-server Failover mds2 to oleg348-server mount facets: mds2 Start 
mds2: mount -t lustre -o localrecov /dev/mapper/mds2_flakey /mnt/lustre-mds2 oleg348-server: oleg348-server.virtnet: executing set_default_debug -1 all pdsh@oleg348-client: oleg348-server: ssh exited with exit code 1 Started lustre-MDT0001 10:17:36 (1773670656) targets are mounted 10:17:36 (1773670656) facet_failover done oleg348-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0001-mdc-*.mds_server_uuid mdc.lustre-MDT0001-mdc-*.mds_server_uuid in FULL state after 0 sec UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 1900 1269404 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 1740 1269564 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 1560 3605460 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1544 3605476 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 3104 7210936 1% /mnt/lustre total: 2 open/close in 0.02 seconds: 96.87 ops/second Failing mds1 on oleg348-server Stopping /mnt/lustre-mds1 (opts:) on oleg348-server 10:17:47 (1773670667) shut down facet: mds1 facet_host: oleg348-server facet_failover_host: oleg348-server Failover mds1 to oleg348-server mount facets: mds1 Start mds1: mount -t lustre -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg348-server: oleg348-server.virtnet: executing set_default_debug -1 all pdsh@oleg348-client: oleg348-server: ssh exited with exit code 1 Started lustre-MDT0000 10:18:02 (1773670682) targets are mounted 10:18:02 (1773670682) facet_failover done oleg348-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec PASS 23a (57s) == replay-dual test 23b: c1 rmdir d1, M1 drop reply and fail M0/M1, c2 mkdir d1 ========================================================== 10:18:10 (1773670690) fail_loc=0x119 fail_loc=0 Failing mds1 on oleg348-server Stopping /mnt/lustre-mds1 (opts:) on oleg348-server Failing mds2 on oleg348-server Stopping 
/mnt/lustre-mds2 (opts:) on oleg348-server 10:18:20 (1773670700) shut down facet: mds1 facet_host: oleg348-server facet_failover_host: oleg348-server facet: mds2 facet_host: oleg348-server facet_failover_host: oleg348-server Failover mds1 to oleg348-server mount facets: mds1 Failover mds2 to oleg348-server mount facets: mds2 Start mds2: mount -t lustre -o localrecov /dev/mapper/mds2_flakey /mnt/lustre-mds2 Start mds1: mount -t lustre -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg348-server: oleg348-server.virtnet: executing set_default_debug -1 all oleg348-server: oleg348-server.virtnet: executing set_default_debug -1 all pdsh@oleg348-client: oleg348-server: ssh exited with exit code 1 pdsh@oleg348-client: oleg348-server: ssh exited with exit code 1 Started lustre-MDT0001 Started lustre-MDT0000 10:18:42 (1773670722) targets are mounted 10:18:42 (1773670722) facet_failover done oleg348-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid,mdc.lustre-MDT0001-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec mdc.lustre-MDT0001-mdc-*.mds_server_uuid in FULL state after 0 sec UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 1900 1269404 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 1740 1269564 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 1560 3605460 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1544 3605476 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 3104 7210936 1% /mnt/lustre total: 2 open/close in 0.02 seconds: 81.47 ops/second Failing mds1 on oleg348-server Stopping /mnt/lustre-mds1 (opts:) on oleg348-server 10:18:51 (1773670731) shut down facet: mds1 facet_host: oleg348-server facet_failover_host: oleg348-server Failover mds1 to oleg348-server mount facets: mds1 Start mds1: mount -t lustre -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg348-server: oleg348-server.virtnet: executing set_default_debug -1 all 
pdsh@oleg348-client: oleg348-server: ssh exited with exit code 1 Started lustre-MDT0000 10:19:07 (1773670747) targets are mounted 10:19:07 (1773670747) facet_failover done oleg348-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec PASS 23b (65s) == replay-dual test 23c: c1 rmdir d1, M0 drop update reply and fail M0, c2 mkdir d1 ========================================================== 10:19:15 (1773670755) fail_loc=0x1701 fail_loc=0 Failing mds1 on oleg348-server Stopping /mnt/lustre-mds1 (opts:) on oleg348-server 10:19:20 (1773670760) shut down facet: mds1 facet_host: oleg348-server facet_failover_host: oleg348-server Failover mds1 to oleg348-server mount facets: mds1 Start mds1: mount -t lustre -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg348-server: oleg348-server.virtnet: executing set_default_debug -1 all pdsh@oleg348-client: oleg348-server: ssh exited with exit code 1 Started lustre-MDT0000 10:19:34 (1773670774) targets are mounted 10:19:34 (1773670774) facet_failover done oleg348-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 1900 1269404 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 1752 1269552 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 1560 3605460 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1544 3605476 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 3104 7210936 1% /mnt/lustre total: 2 open/close in 0.02 seconds: 107.41 ops/second Failing mds1 on oleg348-server Stopping /mnt/lustre-mds1 (opts:) on oleg348-server 10:19:44 (1773670784) shut down facet: mds1 facet_host: oleg348-server facet_failover_host: oleg348-server Failover mds1 to oleg348-server mount facets: mds1 Start mds1: mount -t lustre -o localrecov 
/dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg348-server: oleg348-server.virtnet: executing set_default_debug -1 all pdsh@oleg348-client: oleg348-server: ssh exited with exit code 1 Started lustre-MDT0000 10:20:00 (1773670800) targets are mounted 10:20:00 (1773670800) facet_failover done oleg348-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec PASS 23c (53s) == replay-dual test 23d: c1 rmdir d1, M0 drop update reply and fail M0/M1, c2 mkdir d1 ========================================================== 10:20:08 (1773670808) fail_loc=0x1701 fail_loc=0 Failing mds1 on oleg348-server Stopping /mnt/lustre-mds1 (opts:) on oleg348-server Failing mds2 on oleg348-server Stopping /mnt/lustre-mds2 (opts:) on oleg348-server 10:20:21 (1773670821) shut down facet: mds1 facet_host: oleg348-server facet_failover_host: oleg348-server facet: mds2 facet_host: oleg348-server facet_failover_host: oleg348-server Failover mds1 to oleg348-server mount facets: mds1 Failover mds2 to oleg348-server mount facets: mds2 Start mds2: mount -t lustre -o localrecov /dev/mapper/mds2_flakey /mnt/lustre-mds2 Start mds1: mount -t lustre -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg348-server: oleg348-server.virtnet: executing set_default_debug -1 all oleg348-server: oleg348-server.virtnet: executing set_default_debug -1 all pdsh@oleg348-client: oleg348-server: ssh exited with exit code 1 pdsh@oleg348-client: oleg348-server: ssh exited with exit code 1 Started lustre-MDT0001 Started lustre-MDT0000 10:20:42 (1773670842) targets are mounted 10:20:42 (1773670842) facet_failover done oleg348-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid,mdc.lustre-MDT0001-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec mdc.lustre-MDT0001-mdc-*.mds_server_uuid in FULL state after 0 sec UUID 
1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 1864 1269440 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 1704 1269600 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 1560 3605460 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1544 3605476 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 3104 7210936 1% /mnt/lustre total: 2 open/close in 0.02 seconds: 105.70 ops/second Failing mds1 on oleg348-server Stopping /mnt/lustre-mds1 (opts:) on oleg348-server 10:20:53 (1773670853) shut down facet: mds1 facet_host: oleg348-server facet_failover_host: oleg348-server Failover mds1 to oleg348-server mount facets: mds1 Start mds1: mount -t lustre -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg348-server: oleg348-server.virtnet: executing set_default_debug -1 all pdsh@oleg348-client: oleg348-server: ssh exited with exit code 1 Started lustre-MDT0000 10:21:09 (1773670869) targets are mounted 10:21:09 (1773670869) facet_failover done oleg348-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec PASS 23d (69s) == replay-dual test 24: reconstruct on non-existing object ========================================================== 10:21:17 (1773670877) fail_loc=0x119 fail_loc=0 truncate: cannot truncate '/mnt/lustre/f24.replay-dual' to length 100: No such file or directory PASS 24 (88s) == replay-dual test 25: replay|resend ==================== 10:22:45 (1773670965) 1+0 records in 1+0 records out 512 bytes copied, 0.00354436 s, 144 kB/s fail_loc=0x304 fail_loc=0x80000325 Failing ost1 on oleg348-server Stopping /mnt/lustre-ost1 (opts:) on oleg348-server 10:22:48 (1773670968) shut down facet: ost1 facet_host: oleg348-server facet_failover_host: oleg348-server Failover ost1 to oleg348-server mount facets: ost1 Start ost1: mount -t lustre -o localrecov /dev/mapper/ost1_flakey /mnt/lustre-ost1 seq.cli-lustre-OST0000-super.width=65536 
oleg348-server: oleg348-server.virtnet: executing set_default_debug -1 all pdsh@oleg348-client: oleg348-server: ssh exited with exit code 1 Started lustre-OST0000 10:23:03 (1773670983) targets are mounted 10:23:03 (1773670983) facet_failover done oleg348-client.virtnet: executing wait_import_state_mount (FULL|IDLE) osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid in FULL state after 0 sec /home/green/git/lustre-release/lustre/tests/test-framework.sh: line 6872: 63880 Terminated LUSTRE="/home/green/git/lustre-release/lustre" bash -c "multiop /mnt/lustre2/f25.replay-dual Ow512" fail_loc=0 PASS 25 (25s) == replay-dual test 26: dbench and tar with mds failover ========================================================== 10:23:10 (1773670990) Starting client oleg348-client.virtnet: -o user_xattr,flock 192.168.203.148@tcp:/lustre /mnt/lustre Started clients oleg348-client.virtnet: 192.168.203.148@tcp:/lustre on /mnt/lustre type lustre (rw,checksum,encrypt,flock,lazystatfs,lruresize,nolock,statfs_project,nouser_fid2path,user_xattr,verbose) Started tar loop with pid 65299 Started dbench loop with 65300 striped dir -i0 -c2 -H all_char /mnt/lustre2/d26.replay-dual/run_dbench striped dir -i0 -c2 -H crush2 /mnt/lustre/d26.replay-dual/run_tar looking for dbench program /usr/bin/dbench found dbench client file /usr/share/dbench/client.txt '/usr/share/dbench/client.txt' -> 'client.txt' running 'dbench 1 -D /mnt/lustre2/d26.replay-dual/run_dbench -t 100' on /mnt/lustre at Mon Mar 16 10:23:12 EDT 2026 waiting for dbench pid 65343 dbench version 4.00 - Copyright Andrew Tridgell 1999-2004 Running for 100 seconds with load 'client.txt' and minimum warmup 20 secs failed to create barrier semaphore 0 of 1 processes prepared for launch 0 sec 1 of 1 processes prepared for launch 0 sec releasing clients UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 1992 1269312 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 
1784 1269520 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 26140 3579624 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1544 3605476 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 27684 7185100 1% /mnt/lustre 1 106 4.98 MB/sec warmup 1 sec latency 32.493 ms 1 197 3.92 MB/sec warmup 2 sec latency 172.434 ms 1 289 3.57 MB/sec warmup 3 sec latency 74.858 ms 1 405 3.59 MB/sec warmup 4 sec latency 42.474 ms 1 514 3.55 MB/sec warmup 5 sec latency 53.876 ms test_26 fail mds1 1 times 1 710 3.53 MB/sec warmup 6 sec latency 36.703 ms Failing mds1 on oleg348-server Stopping /mnt/lustre-mds1 (opts:) on oleg348-server 1 780 3.05 MB/sec warmup 7 sec latency 258.532 ms 10:23:20 (1773671000) shut down facet: mds1 facet_host: oleg348-server facet_failover_host: oleg348-server 1 870 2.71 MB/sec warmup 8 sec latency 351.694 ms 1 903 2.41 MB/sec warmup 9 sec latency 551.687 ms 1 995 2.19 MB/sec warmup 10 sec latency 437.475 ms 1 995 1.99 MB/sec warmup 11 sec latency 1437.630 ms 1 995 1.83 MB/sec warmup 12 sec latency 2437.845 ms 1 995 1.69 MB/sec warmup 13 sec latency 3438.009 ms 1 995 1.56 MB/sec warmup 14 sec latency 4438.152 ms 1 995 1.46 MB/sec warmup 15 sec latency 5438.417 ms 1 995 1.37 MB/sec warmup 16 sec latency 6438.648 ms 1 995 1.29 MB/sec warmup 17 sec latency 7438.801 ms 1 995 1.22 MB/sec warmup 18 sec latency 8439.006 ms Failover mds1 to oleg348-server mount facets: mds1 1 995 1.15 MB/sec warmup 19 sec latency 9439.251 ms 1 995 0.00 MB/sec execute 1 sec latency 11439.565 ms 1 995 0.00 MB/sec execute 2 sec latency 12439.866 ms Start mds1: mount -t lustre -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 1 995 0.00 MB/sec execute 3 sec latency 13440.090 ms 1 995 0.00 MB/sec execute 4 sec latency 14440.330 ms oleg348-server: oleg348-server.virtnet: executing set_default_debug -1 all pdsh@oleg348-client: oleg348-server: ssh exited with exit code 1 Started lustre-MDT0000 10:23:37 (1773671017) targets are mounted 10:23:37 (1773671017) facet_failover done 1 995 
0.00 MB/sec execute 5 sec latency 15440.533 ms 1 995 0.00 MB/sec execute 6 sec latency 16440.676 ms 1 995 0.00 MB/sec execute 7 sec latency 17440.840 ms 1 995 0.00 MB/sec execute 8 sec latency 18440.994 ms 1 1010 0.05 MB/sec execute 9 sec latency 19107.585 ms oleg348-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid 1 1159 0.11 MB/sec execute 10 sec latency 82.212 ms mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec 1 1379 0.37 MB/sec execute 11 sec latency 29.183 ms 1 1516 0.35 MB/sec execute 12 sec latency 118.361 ms 1 1696 0.33 MB/sec execute 13 sec latency 37.010 ms 1 1859 0.32 MB/sec execute 14 sec latency 34.052 ms UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 2416 1268888 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 2128 1269176 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 43104 3560684 2% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 16536 3588012 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 59640 7148696 1% /mnt/lustre 1 2002 0.31 MB/sec execute 15 sec latency 52.903 ms 1 2154 0.30 MB/sec execute 16 sec latency 30.846 ms 1 2348 0.35 MB/sec execute 17 sec latency 33.724 ms 1 2449 0.34 MB/sec execute 18 sec latency 49.802 ms test_26 fail mds2 2 times 1 2622 0.38 MB/sec execute 19 sec latency 33.272 ms Failing mds2 on oleg348-server Stopping /mnt/lustre-mds2 (opts:) on oleg348-server 1 2957 0.58 MB/sec execute 20 sec latency 37.104 ms 10:23:53 (1773671033) shut down facet: mds2 facet_host: oleg348-server facet_failover_host: oleg348-server 1 3360 0.64 MB/sec execute 21 sec latency 31.713 ms 1 3695 0.77 MB/sec execute 22 sec latency 17.458 ms 1 3926 0.79 MB/sec execute 23 sec latency 18.866 ms 1 4083 0.76 MB/sec execute 24 sec latency 169.914 ms 1 4083 0.73 MB/sec execute 25 sec latency 1170.085 ms 1 4083 0.70 MB/sec execute 26 sec latency 2170.249 ms 1 4083 0.67 MB/sec execute 27 sec latency 3170.423 ms 1 4083 0.65 MB/sec execute 28 sec latency 
4170.616 ms 1 4083 0.63 MB/sec execute 29 sec latency 5170.743 ms 1 4083 0.61 MB/sec execute 30 sec latency 6170.872 ms 1 4083 0.59 MB/sec execute 31 sec latency 7171.022 ms Failover mds2 to oleg348-server mount facets: mds2 1 4083 0.57 MB/sec execute 32 sec latency 8171.235 ms 1 4083 0.55 MB/sec execute 33 sec latency 9171.360 ms 1 4083 0.53 MB/sec execute 34 sec latency 10171.469 ms Start mds2: mount -t lustre -o localrecov /dev/mapper/mds2_flakey /mnt/lustre-mds2 1 4083 0.52 MB/sec execute 35 sec latency 11171.612 ms oleg348-server: oleg348-server.virtnet: executing set_default_debug -1 all 1 4083 0.50 MB/sec execute 36 sec latency 12171.773 ms pdsh@oleg348-client: oleg348-server: ssh exited with exit code 1 Started lustre-MDT0001 10:24:08 (1773671048) targets are mounted 10:24:08 (1773671048) facet_failover done 1 4083 0.49 MB/sec execute 37 sec latency 13171.935 ms 1 4083 0.48 MB/sec execute 38 sec latency 14172.160 ms 1 4083 0.47 MB/sec execute 39 sec latency 15172.335 ms 1 4083 0.45 MB/sec execute 40 sec latency 16172.499 ms oleg348-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0001-mdc-*.mds_server_uuid 1 4179 0.44 MB/sec execute 41 sec latency 16297.931 ms mdc.lustre-MDT0001-mdc-*.mds_server_uuid in FULL state after 0 sec 1 4311 0.44 MB/sec execute 42 sec latency 44.381 ms 1 4455 0.44 MB/sec execute 43 sec latency 31.528 ms 1 4562 0.43 MB/sec execute 44 sec latency 41.565 ms 1 4744 0.45 MB/sec execute 45 sec latency 29.711 ms UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 2868 1268436 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 2576 1268728 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 51800 3551148 2% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 16632 3588688 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 68432 7139836 1% /mnt/lustre 1 4935 0.50 MB/sec execute 46 sec latency 44.859 ms 1 5063 0.49 MB/sec execute 47 sec latency 44.473 ms 1 5225 0.49 MB/sec execute 48 sec latency 26.852 
ms 1 5346 0.48 MB/sec execute 49 sec latency 40.173 ms 1 5458 0.47 MB/sec execute 50 sec latency 41.099 ms test_26 fail mds1 3 times Failing mds1 on oleg348-server 1 5627 0.47 MB/sec execute 51 sec latency 35.765 ms Stopping /mnt/lustre-mds1 (opts:) on oleg348-server 1 5820 0.46 MB/sec execute 52 sec latency 47.429 ms 10:24:24 (1773671064) shut down facet: mds1 facet_host: oleg348-server facet_failover_host: oleg348-server 1 5820 0.45 MB/sec execute 53 sec latency 1024.865 ms 1 5820 0.44 MB/sec execute 54 sec latency 2025.017 ms 1 5820 0.44 MB/sec execute 55 sec latency 3025.153 ms 1 5820 0.43 MB/sec execute 56 sec latency 4025.371 ms 1 5820 0.42 MB/sec execute 57 sec latency 5025.514 ms 1 5820 0.41 MB/sec execute 58 sec latency 6025.653 ms 1 5820 0.41 MB/sec execute 59 sec latency 7025.832 ms 1 5820 0.40 MB/sec execute 60 sec latency 8025.947 ms 1 5820 0.39 MB/sec execute 61 sec latency 9026.090 ms 1 5820 0.39 MB/sec execute 62 sec latency 10026.220 ms Failover mds1 to oleg348-server mount facets: mds1 1 5820 0.38 MB/sec execute 63 sec latency 11026.358 ms 1 5820 0.37 MB/sec execute 64 sec latency 12026.477 ms 1 5820 0.37 MB/sec execute 65 sec latency 13026.588 ms Start mds1: mount -t lustre -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 1 5820 0.36 MB/sec execute 66 sec latency 14026.737 ms 1 5820 0.36 MB/sec execute 67 sec latency 15026.870 ms oleg348-server: oleg348-server.virtnet: executing set_default_debug -1 all pdsh@oleg348-client: oleg348-server: ssh exited with exit code 1 Started lustre-MDT0000 1 5820 0.35 MB/sec execute 68 sec latency 16027.037 ms 10:24:40 (1773671080) targets are mounted 10:24:40 (1773671080) facet_failover done 1 5820 0.35 MB/sec execute 69 sec latency 17027.153 ms 1 5820 0.34 MB/sec execute 70 sec latency 18027.268 ms 1 5820 0.34 MB/sec execute 71 sec latency 19027.407 ms 1 5820 0.33 MB/sec execute 72 sec latency 20027.529 ms oleg348-client.virtnet: executing wait_import_state_mount (FULL|IDLE) 
mdc.lustre-MDT0000-mdc-*.mds_server_uuid 1 5977 0.35 MB/sec execute 73 sec latency 20080.272 ms mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec 1 6148 0.35 MB/sec execute 74 sec latency 39.383 ms 1 6484 0.41 MB/sec execute 75 sec latency 38.796 ms 1 6832 0.43 MB/sec execute 76 sec latency 31.149 ms 1 7111 0.46 MB/sec execute 77 sec latency 30.636 ms UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 3320 1267984 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 3096 1268208 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 53804 3548080 2% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 16856 3584840 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 70660 7132920 1% /mnt/lustre 1 7274 0.46 MB/sec execute 78 sec latency 47.900 ms 1 7424 0.47 MB/sec execute 79 sec latency 31.401 ms 1 7572 0.47 MB/sec execute 80 sec latency 40.490 ms 1 7713 0.46 MB/sec execute 81 sec latency 35.461 ms test_26 fail mds2 4 times 1 7859 0.46 MB/sec execute 82 sec latency 47.692 ms Failing mds2 on oleg348-server Stopping /mnt/lustre-mds2 (opts:) on oleg348-server 1 7980 0.46 MB/sec execute 83 sec latency 63.888 ms 10:24:56 (1773671096) shut down 1 7988 0.45 MB/sec execute 84 sec latency 933.488 ms facet: mds2 facet_host: oleg348-server facet_failover_host: oleg348-server 1 7988 0.45 MB/sec execute 85 sec latency 1933.703 ms 1 7988 0.44 MB/sec execute 86 sec latency 2933.924 ms 1 7988 0.44 MB/sec execute 87 sec latency 3934.122 ms 1 7988 0.43 MB/sec execute 88 sec latency 4934.336 ms 1 7988 0.43 MB/sec execute 89 sec latency 5934.577 ms 1 7988 0.42 MB/sec execute 90 sec latency 6934.747 ms 1 7988 0.42 MB/sec execute 91 sec latency 7934.903 ms 1 7988 0.41 MB/sec execute 92 sec latency 8935.107 ms 1 7988 0.41 MB/sec execute 93 sec latency 9935.289 ms 1 7988 0.40 MB/sec execute 94 sec latency 10935.460 ms Failover mds2 to oleg348-server mount facets: mds2 1 7988 0.40 MB/sec execute 95 sec latency 11935.634 ms 1 7988 0.40 MB/sec execute 96 sec latency 
12935.841 ms 1 7988 0.39 MB/sec execute 97 sec latency 13936.010 ms Start mds2: mount -t lustre -o localrecov /dev/mapper/mds2_flakey /mnt/lustre-mds2 1 7988 0.39 MB/sec execute 98 sec latency 14936.111 ms oleg348-server: oleg348-server.virtnet: executing set_default_debug -1 all 1 7988 0.38 MB/sec execute 99 sec latency 15936.280 ms pdsh@oleg348-client: oleg348-server: ssh exited with exit code 1 Started lustre-MDT0001 10:25:11 (1773671111) targets are mounted 10:25:11 (1773671111) facet_failover done 1 cleanup 100 sec 1 cleanup 101 sec 1 cleanup 102 sec 1 cleanup 103 sec oleg348-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0001-mdc-*.mds_server_uuid 1 cleanup 104 sec 0 cleanup 104 sec Operation Count AvgLat MaxLat ---------------------------------------- NTCreateX 1200 47.896 20080.264 Close 876 2.619 17.667 Rename 50 25.719 41.394 Unlink 244 9.522 38.195 Qpathinfo 1122 19.283 16297.909 Qfileinfo 192 1.110 5.929 Qfsinfo 202 0.496 5.209 Sfileinfo 94 17.007 35.466 Find 423 5.094 25.894 WriteX 594 4.282 29.792 ReadX 1907 0.279 82.198 LockX 4 2.695 3.960 UnlockX 4 3.040 3.985 Flush 82 260.051 20172.716 Throughput 0.383715 MB/sec 1 clients 1 procs max_latency=20080.272 ms stopping dbench on /mnt/lustre at Mon Mar 16 10:25:16 EDT 2026 with return code 0 mdc.lustre-MDT0001-mdc-*.mds_server_uuid in FULL state after 0 sec clean dbench files on /mnt/lustre /mnt/lustre /mnt/lustre removed 'client.txt' /mnt/lustre dbench successfully finished striped dir -i0 -c2 -H crush /mnt/lustre2/d26.replay-dual/run_dbench looking for dbench program /usr/bin/dbench found dbench client file /usr/share/dbench/client.txt '/usr/share/dbench/client.txt' -> 'client.txt' running 'dbench 1 -D /mnt/lustre2/d26.replay-dual/run_dbench -t 100' on /mnt/lustre at Mon Mar 16 10:25:19 EDT 2026 waiting for dbench pid 69540 dbench version 4.00 - Copyright Andrew Tridgell 1999-2004 Running for 100 seconds with load 'client.txt' and minimum warmup 20 secs 0 of 1 processes 
prepared for launch 0 sec 1 of 1 processes prepared for launch 0 sec releasing clients 1 144 6.15 MB/sec warmup 1 sec latency 39.426 ms UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 3892 1267412 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 3724 1267580 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 24312 3581816 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 28540 3576288 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 52852 7158104 1% /mnt/lustre 1 219 4.21 MB/sec warmup 2 sec latency 156.660 ms 1 349 4.17 MB/sec warmup 3 sec latency 37.255 ms 1 493 4.30 MB/sec warmup 4 sec latency 30.142 ms 1 673 4.23 MB/sec warmup 5 sec latency 92.816 ms 1 799 3.58 MB/sec warmup 6 sec latency 158.707 ms test_26 fail mds1 5 times Failing mds1 on oleg348-server 1 884 3.10 MB/sec warmup 7 sec latency 475.642 ms Stopping /mnt/lustre-mds1 (opts:) on oleg348-server 1 916 2.72 MB/sec warmup 8 sec latency 591.763 ms 10:25:27 (1773671127) shut down facet: mds1 facet_host: oleg348-server facet_failover_host: oleg348-server 1 916 2.42 MB/sec warmup 9 sec latency 1554.188 ms 1 916 2.18 MB/sec warmup 10 sec latency 2554.411 ms 1 916 1.98 MB/sec warmup 11 sec latency 3554.567 ms 1 916 1.81 MB/sec warmup 12 sec latency 4554.723 ms 1 916 1.67 MB/sec warmup 13 sec latency 5554.924 ms 1 916 1.55 MB/sec warmup 14 sec latency 6555.084 ms 1 916 1.45 MB/sec warmup 15 sec latency 7555.244 ms 1 916 1.36 MB/sec warmup 16 sec latency 8555.420 ms 1 916 1.28 MB/sec warmup 17 sec latency 9555.610 ms 1 916 1.21 MB/sec warmup 18 sec latency 10555.858 ms Failover mds1 to oleg348-server mount facets: mds1 1 916 1.15 MB/sec warmup 19 sec latency 11556.025 ms 1 916 0.00 MB/sec execute 1 sec latency 13556.264 ms Start mds1: mount -t lustre -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 1 916 0.00 MB/sec execute 2 sec latency 14556.421 ms 1 916 0.00 MB/sec execute 3 sec latency 15556.557 ms oleg348-server: oleg348-server.virtnet: executing set_default_debug -1 all 
pdsh@oleg348-client: oleg348-server: ssh exited with exit code 1 1 916 0.00 MB/sec execute 4 sec latency 16556.693 ms Started lustre-MDT0000 10:25:43 (1773671143) targets are mounted 10:25:43 (1773671143) facet_failover done 1 916 0.00 MB/sec execute 5 sec latency 17556.878 ms 1 916 0.00 MB/sec execute 6 sec latency 18557.066 ms 1 916 0.00 MB/sec execute 7 sec latency 19557.316 ms 1 916 0.00 MB/sec execute 8 sec latency 20557.506 ms 1 983 0.01 MB/sec execute 9 sec latency 20914.280 ms oleg348-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec 1 1128 0.12 MB/sec execute 10 sec latency 38.707 ms 1 1239 0.12 MB/sec execute 11 sec latency 33.330 ms 1 1405 0.35 MB/sec execute 12 sec latency 70.836 ms 1 1480 0.33 MB/sec execute 13 sec latency 166.183 ms 1 1568 0.31 MB/sec execute 14 sec latency 155.588 ms UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 3532 1267772 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 3496 1267808 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 34768 3571888 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 38936 3567196 2% /mnt/lustre[OST:1] filesystem_summary: 7666232 73704 7139084 2% /mnt/lustre 1 1684 0.30 MB/sec execute 15 sec latency 44.474 ms 1 1837 0.29 MB/sec execute 16 sec latency 51.155 ms 1 1980 0.28 MB/sec execute 17 sec latency 69.787 ms 1 2086 0.27 MB/sec execute 18 sec latency 45.847 ms test_26 fail mds2 6 times 1 2204 0.26 MB/sec execute 19 sec latency 46.479 ms Failing mds2 on oleg348-server Stopping /mnt/lustre-mds2 (opts:) on oleg348-server 1 2354 0.31 MB/sec execute 20 sec latency 163.890 ms 10:26:00 (1773671160) shut down facet: mds2 facet_host: oleg348-server facet_failover_host: oleg348-server 1 2354 0.29 MB/sec execute 21 sec latency 1164.005 ms 1 2354 0.28 MB/sec execute 22 sec latency 2164.139 ms 1 2354 0.27 MB/sec execute 23 sec latency 3164.335 ms 1 2354 0.26 MB/sec 
execute 24 sec latency 4164.534 ms 1 2354 0.25 MB/sec execute 25 sec latency 5164.758 ms 1 2354 0.24 MB/sec execute 26 sec latency 6165.099 ms 1 2354 0.23 MB/sec execute 27 sec latency 7165.258 ms 1 2354 0.22 MB/sec execute 28 sec latency 8165.461 ms 1 2354 0.21 MB/sec execute 29 sec latency 9165.627 ms 1 2354 0.20 MB/sec execute 30 sec latency 10165.906 ms 1 2354 0.20 MB/sec execute 31 sec latency 11166.081 ms Failover mds2 to oleg348-server mount facets: mds2 1 2354 0.19 MB/sec execute 32 sec latency 12166.248 ms 1 2354 0.19 MB/sec execute 33 sec latency 13166.419 ms 1 2354 0.18 MB/sec execute 34 sec latency 14166.626 ms Start mds2: mount -t lustre -o localrecov /dev/mapper/mds2_flakey /mnt/lustre-mds2 1 2354 0.18 MB/sec execute 35 sec latency 15166.805 ms 1 2354 0.17 MB/sec execute 36 sec latency 16166.980 ms 1 2354 0.17 MB/sec execute 37 sec latency 17167.135 ms oleg348-server: oleg348-server.virtnet: executing set_default_debug -1 all pdsh@oleg348-client: oleg348-server: ssh exited with exit code 1 1 2354 0.16 MB/sec execute 38 sec latency 18167.296 ms Started lustre-MDT0001 10:26:17 (1773671177) targets are mounted 10:26:17 (1773671177) facet_failover done 1 2354 0.16 MB/sec execute 39 sec latency 19167.501 ms 1 2354 0.15 MB/sec execute 40 sec latency 20167.750 ms 1 2354 0.15 MB/sec execute 41 sec latency 21168.007 ms 1 2373 0.15 MB/sec execute 42 sec latency 21953.376 ms oleg348-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0001-mdc-*.mds_server_uuid 1 2485 0.16 MB/sec execute 43 sec latency 61.148 ms mdc.lustre-MDT0001-mdc-*.mds_server_uuid in FULL state after 0 sec 1 2768 0.23 MB/sec execute 44 sec latency 40.282 ms 1 2969 0.26 MB/sec execute 45 sec latency 86.933 ms 1 3237 0.28 MB/sec execute 46 sec latency 40.679 ms 1 3399 0.29 MB/sec execute 47 sec latency 80.168 ms UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 2940 1268364 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 2628 1268676 1% 
/mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 23324 3583252 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 38240 3566772 2% /mnt/lustre[OST:1] filesystem_summary: 7666232 61564 7150024 1% /mnt/lustre 1 3636 0.35 MB/sec execute 48 sec latency 126.204 ms 1 3792 0.37 MB/sec execute 49 sec latency 29.578 ms 1 3926 0.37 MB/sec execute 50 sec latency 29.845 ms 1 4049 0.36 MB/sec execute 51 sec latency 28.936 ms test_26 fail mds1 7 times 1 4157 0.35 MB/sec execute 52 sec latency 54.918 ms Failing mds1 on oleg348-server 1 4236 0.35 MB/sec execute 53 sec latency 165.101 ms Stopping /mnt/lustre-mds1 (opts:) on oleg348-server 1 4289 0.34 MB/sec execute 54 sec latency 461.987 ms 10:26:33 (1773671193) shut down facet: mds1 facet_host: oleg348-server facet_failover_host: oleg348-server 1 4289 0.34 MB/sec execute 55 sec latency 1462.174 ms 1 4289 0.33 MB/sec execute 56 sec latency 2462.326 ms 1 4289 0.33 MB/sec execute 57 sec latency 3462.568 ms 1 4289 0.32 MB/sec execute 58 sec latency 4462.828 ms 1 4289 0.32 MB/sec execute 59 sec latency 5463.036 ms 1 4289 0.31 MB/sec execute 60 sec latency 6463.219 ms 1 4289 0.31 MB/sec execute 61 sec latency 7463.528 ms 1 4289 0.30 MB/sec execute 62 sec latency 8463.731 ms 1 4289 0.30 MB/sec execute 63 sec latency 9463.938 ms 1 4289 0.29 MB/sec execute 64 sec latency 10464.141 ms Failover mds1 to oleg348-server mount facets: mds1 1 4289 0.29 MB/sec execute 65 sec latency 11464.299 ms 1 4289 0.28 MB/sec execute 66 sec latency 12464.533 ms 1 4289 0.28 MB/sec execute 67 sec latency 13464.669 ms 1 4289 0.27 MB/sec execute 68 sec latency 14464.775 ms Start mds1: mount -t lustre -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 1 4289 0.27 MB/sec execute 69 sec latency 15464.980 ms oleg348-server: oleg348-server.virtnet: executing set_default_debug -1 all 1 4289 0.27 MB/sec execute 70 sec latency 16465.099 ms pdsh@oleg348-client: oleg348-server: ssh exited with exit code 1 Started lustre-MDT0000 10:26:49 (1773671209) targets are mounted 
10:26:50 (1773671210) facet_failover done 1 4289 0.26 MB/sec execute 71 sec latency 17465.229 ms 1 4289 0.26 MB/sec execute 72 sec latency 18465.437 ms 1 4289 0.26 MB/sec execute 73 sec latency 19465.657 ms 1 4289 0.25 MB/sec execute 74 sec latency 20465.837 ms 1 4289 0.25 MB/sec execute 75 sec latency 21466.022 ms oleg348-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid 1 4370 0.25 MB/sec execute 76 sec latency 21595.983 ms mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec 1 4484 0.25 MB/sec execute 77 sec latency 71.600 ms 1 4590 0.26 MB/sec execute 78 sec latency 70.452 ms striped dir -i0 -c2 -H all_char /mnt/lustre/d26.replay-dual/run_tar 1 4714 0.26 MB/sec execute 79 sec latency 121.816 ms dbench killed by signal 15 stopping dbench on /mnt/lustre at Mon Mar 16 10:26:58 EDT 2026 with return code 0 69540 pts/0 S+ 0:00 dbench -c client.txt 1 -D /mnt/lustre2/d26.replay-dual/run_dbench -t 100 killed dbench main pid 69540 clean dbench files on /mnt/lustre /mnt/lustre /mnt/lustre removed 'client.txt' /mnt/lustre dbench successfully finished PASS 26 (231s) == replay-dual test 28: lock replay should be ordered: waiting after granted ========================================================== 10:27:01 (1773671221) Waiting for MDT destroys to complete 1+0 records in 1+0 records out 4096 bytes (4.1 kB, 4.0 KiB) copied, 0.00431605 s, 949 kB/s fail_loc=0x80000324 fail_loc=0x32a Failing ost1 on oleg348-server Stopping /mnt/lustre-ost1 (opts:) on oleg348-server 10:27:13 (1773671233) shut down facet: ost1 facet_host: oleg348-server facet_failover_host: oleg348-server Failover ost1 to oleg348-server mount facets: ost1 Start ost1: mount -t lustre -o localrecov /dev/mapper/ost1_flakey /mnt/lustre-ost1 seq.cli-lustre-OST0000-super.width=65536 1+0 records in 1+0 records out 4096 bytes (4.1 kB, 4.0 KiB) copied, 0.00362624 s, 1.1 MB/s oleg348-server: oleg348-server.virtnet: executing set_default_debug -1 all 
pdsh@oleg348-client: oleg348-server: ssh exited with exit code 1 Started lustre-OST0000 10:27:28 (1773671248) targets are mounted 10:27:28 (1773671248) facet_failover done oleg348-client.virtnet: executing wait_import_state_mount (FULL|IDLE) osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid in FULL state after 0 sec PASS 28 (35s) == replay-dual test 29: replay vs update with the same xid ========================================================== 10:27:36 (1773671256) SKIP: replay-dual test_29 needs >= 2 clients SKIP 29 (2s) == replay-dual test 30: layout lock replay is not blocked on IO ========================================================== 10:27:38 (1773671258) 10+0 records in 10+0 records out 40960 bytes (41 kB, 40 KiB) copied, 0.0112145 s, 3.7 MB/s 10+0 records in 10+0 records out 40960 bytes (41 kB, 40 KiB) copied, 0.0104213 s, 3.9 MB/s fail_loc=0x32e fail_val=4 Failing mds1 on oleg348-server Stopping /mnt/lustre-mds1 (opts:) on oleg348-server 10:27:41 (1773671261) shut down facet: mds1 facet_host: oleg348-server facet_failover_host: oleg348-server Failover mds1 to oleg348-server mount facets: mds1 Start mds1: mount -t lustre -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg348-server: oleg348-server.virtnet: executing set_default_debug -1 all pdsh@oleg348-client: oleg348-server: ssh exited with exit code 1 Started lustre-MDT0000 10:27:56 (1773671276) targets are mounted 10:27:56 (1773671276) facet_failover done 160+0 records in 160+0 records out 81920 bytes (82 kB, 80 KiB) copied, 23.4372 s, 3.5 kB/s oleg348-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec PASS 30 (29s) == replay-dual test 31: deadlock on file_remove_privs and occupied mod rpc slots ========================================================== 10:28:07 (1773671287) Failing ost1 on oleg348-server Stopping 
/mnt/lustre-ost1 (opts:) on oleg348-server 10:28:09 (1773671289) shut down facet: ost1 facet_host: oleg348-server facet_failover_host: oleg348-server Creating to objid 2625 on ost lustre-OST0000... total: 32 open/close in 0.23 seconds: 139.88 ops/second at_max=0 fail_loc=0x80001420 file /mnt/lustre2/d31.replay-dual/mdtdir/f31.replay-dual is ready Failover ost1 to oleg348-server mount facets: ost1 Start ost1: mount -t lustre -o localrecov /dev/mapper/ost1_flakey /mnt/lustre-ost1 seq.cli-lustre-OST0000-super.width=65536 oleg348-server: oleg348-server.virtnet: executing set_default_debug -1 all pdsh@oleg348-client: oleg348-server: ssh exited with exit code 1 Started lustre-OST0000 10:28:24 (1773671304) targets are mounted 10:28:24 (1773671304) facet_failover done oleg348-client.virtnet: executing wait_import_state_mount (FULL|IDLE) osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid in FULL IDLE state after 0 sec pids: 76978 76979 76984 76985 76986 76987 76988 76989 76990 UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 2168 1269136 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 1904 1269400 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 1568 3605452 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1544 3605476 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 3112 7210928 1% /mnt/lustre UUID Inodes IUsed IFree IUse% Mounted on lustre-MDT0000_UUID 1024000 376 1023624 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1024000 289 1023711 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 262144 458 261686 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 262144 391 261753 1% /mnt/lustre[OST:1] filesystem_summary: 524104 665 523439 1% /mnt/lustre at_max=600 PASS 31 (23s) == replay-dual test 32: gap in update llog shouldn't break recovery ========================================================== 10:28:30 (1773671310) fail_loc=0x0000131d fail_val=10 fail_loc=0x726 Stopping /mnt/lustre-mds2 (opts:) on oleg348-server 
Stopping /mnt/lustre-mds1 (opts:) on oleg348-server fail_loc=0 Start mds1: mount -t lustre -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg348-server: oleg348-server.virtnet: executing set_default_debug -1 all pdsh@oleg348-client: oleg348-server: ssh exited with exit code 1 Started lustre-MDT0000 Start mds2: mount -t lustre -o localrecov /dev/mapper/mds2_flakey /mnt/lustre-mds2 oleg348-server: oleg348-server.virtnet: executing set_default_debug -1 all pdsh@oleg348-client: oleg348-server: ssh exited with exit code 1 Started lustre-MDT0001 UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 2272 1269032 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 1964 1269340 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 1568 3605448 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1544 3605472 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 3112 7210920 1% /mnt/lustre PASS 32 (20s) == replay-dual test 33: Check for OBD_INCOMPAT_MULTI_RPCS in last_rcvd after abort_recovery ========================================================== 10:28:50 (1773671330) at_min=60 Stopping /mnt/lustre-mds2 (opts:) on oleg348-server last_rcvd in /dev/mapper/mds2_flakey: incompat = 0x61c Start mds2: mount -t lustre -o localrecov /dev/mapper/mds2_flakey /mnt/lustre-mds2 oleg348-server: oleg348-server.virtnet: executing set_default_debug -1 all pdsh@oleg348-client: oleg348-server: ssh exited with exit code 1 Started lustre-MDT0001 oleg348-client.virtnet: executing wait_import_state_mount REPLAY_WAIT mdc.lustre-MDT0001-mdc-*.mds_server_uuid mdc.lustre-MDT0001-mdc-*.mds_server_uuid in REPLAY_WAIT state after 0 sec oleg348-client.virtnet: executing wait_import_state_mount FULL mdc.lustre-MDT0001-mdc-*.mds_server_uuid mdc.lustre-MDT0001-mdc-*.mds_server_uuid in FULL state after 0 sec affected facets: mds2 oleg348-server: oleg348-server.virtnet: executing _wait_recovery_complete *.lustre-MDT0001.recovery_status 1475 oleg348-server: *.lustre-MDT0001.recovery_status 
status: COMPLETE Stopping /mnt/lustre-mds2 (opts:) on oleg348-server last_rcvd in /dev/mapper/mds2_flakey: incompat = 0x61c Start mds2: mount -t lustre -o localrecov /dev/mapper/mds2_flakey /mnt/lustre-mds2 oleg348-server: oleg348-server.virtnet: executing set_default_debug -1 all pdsh@oleg348-client: oleg348-server: ssh exited with exit code 1 Started lustre-MDT0001 Starting client: oleg348-client.virtnet: -o user_xattr,flock 192.168.203.148@tcp:/lustre /mnt/lustre2 oleg348-client.virtnet: executing wait_import_state_mount FULL mdc.lustre-MDT0001-mdc-*.mds_server_uuid mdc.lustre-MDT0001-mdc-*.mds_server_uuid in FULL DISCONN state after 2 sec affected facets: mds2 oleg348-server: oleg348-server.virtnet: executing _wait_recovery_complete *.lustre-MDT0001.recovery_status 1475 oleg348-server: *.lustre-MDT0001.recovery_status status: COMPLETE at_min=5 PASS 33 (40s) == replay-dual test complete, duration 3117 sec ========== 10:29:30 (1773671370) === replay-dual: start cleanup 10:29:31 (1773671371) === Stopping clients: oleg348-client.virtnet /mnt/lustre2 (opts:) Stopping client oleg348-client.virtnet /mnt/lustre2 opts: === replay-dual: finish cleanup 10:29:34 (1773671374) ===