-----============= acceptance-small: replay-single ============----- Mon Mar 16 09:37:47 EDT 2026
mgs: Rocky Linux release 8.10 (Green Obsidian)
MGS_OS_ID_LIKE=rhel centos fedora rocky
MGS_OS_VERSION_ID=8.10
MGS_OS_ID=rocky
MGS_OS_VERSION_CODE=134873088
mds1: Rocky Linux release 8.10 (Green Obsidian)
MDS1_OS_VERSION_ID=8.10
MDS1_OS_VERSION_CODE=134873088
MDS1_OS_ID_LIKE=rhel centos fedora rocky
MDS1_OS_ID=rocky
ost1: Rocky Linux release 8.10 (Green Obsidian)
OST1_OS_VERSION_CODE=134873088
OST1_OS_ID_LIKE=rhel centos fedora rocky
OST1_OS_VERSION_ID=8.10
OST1_OS_ID=rocky
client: Rocky Linux release 8.10 (Green Obsidian)
CLIENT_OS_ID=rocky
CLIENT_OS_VERSION_CODE=134873088
CLIENT_OS_VERSION_ID=8.10
CLIENT_OS_ID_LIKE=rhel centos fedora rocky
oleg436-server: ls: cannot access '/home/green/git/lustre-release/lustre/tests/except/replay-single.*ex': No such file or directory
excepting tests: 110f 131b 59 36
=== replay-single: start setup 09:37:54 (1773668274) ===
oleg436-client.virtnet: executing check_config_client /mnt/lustre
oleg436-client.virtnet: Checking config lustre mounted on /mnt/lustre
Checking servers environments
Checking clients oleg436-client.virtnet environments
Using TIMEOUT=20
osc.lustre-OST0000-osc-ffff8d9748868000.idle_timeout=debug
osc.lustre-OST0001-osc-ffff8d9748868000.idle_timeout=debug
disable quota as required
oleg436-server: oleg436-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
osd-ldiskfs.track_declares_assert=1
=== replay-single: finish setup 09:38:08 (1773668288) ===
Cleaning up AT ...
== replay-single test 100a: DNE: create striped dir, drop update rep from MDT1, fail MDT1 ========================================================== 09:38:09 (1773668289)
fail_loc=0x1701
Failing mds2 on oleg436-server
Stopping /mnt/lustre-mds2 (opts:) on oleg436-server
09:38:11 (1773668291) shut down
facet: mds2
facet_host: oleg436-server
facet_failover_host: oleg436-server
Failover mds2 to oleg436-server
mount facets: mds2
Start mds2: mount -t lustre -o localrecov /dev/mapper/mds2_flakey /mnt/lustre-mds2
oleg436-server: oleg436-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
pdsh@oleg436-client: oleg436-server: ssh exited with exit code 1
Started lustre-MDT0001
09:38:26 (1773668306) targets are mounted
09:38:26 (1773668306) facet_failover done
oleg436-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0001-mdc-*.mds_server_uuid
mdc.lustre-MDT0001-mdc-*.mds_server_uuid in FULL state after 0 sec
lmv_stripe_count: 2 lmv_stripe_offset: 0 lmv_hash_type: crush
mdtidx       FID[seq:oid:ver]
     0       [0x200000400:0x2:0x0]
     1       [0x240000401:0x2:0x0]
total: 20 open/close in 0.09 seconds: 226.55 ops/second
PASS 100a (26s)
== replay-single test 100b: DNE: create striped dir, fail MDT0 ========================================================== 09:38:35 (1773668315)
fail_loc=0x119
Failing mds1 on oleg436-server
Stopping /mnt/lustre-mds1 (opts:) on oleg436-server
09:38:38 (1773668318) shut down
facet: mds1
facet_host: oleg436-server
facet_failover_host: oleg436-server
Failover mds1 to oleg436-server
mount facets: mds1
Start mds1: mount -t lustre -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
oleg436-server: oleg436-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
pdsh@oleg436-client: oleg436-server: ssh exited with exit code 1
Started lustre-MDT0000
09:38:53 (1773668333) targets are mounted
09:38:53 (1773668333) facet_failover done
oleg436-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid
mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec
lmv_stripe_count: 2 lmv_stripe_offset: 0 lmv_hash_type: crush
mdtidx       FID[seq:oid:ver]
     0       [0x200000400:0x5:0x0]
     1       [0x240000401:0x5:0x0]
total: 20 open/close in 0.10 seconds: 202.31 ops/second
PASS 100b (26s)
== replay-single test 100c: DNE: create striped dir, abort_recov_mdt mds2 ========================================================== 09:39:01 (1773668341)
UUID                  1K-blocks        Used   Available Use% Mounted on
lustre-MDT0000_UUID     1414116        1884     1269420   1% /mnt/lustre[MDT:0]
lustre-MDT0001_UUID     1414116        1760     1269544   1% /mnt/lustre[MDT:1]
lustre-OST0000_UUID     3833116        1544     3605476   1% /mnt/lustre[OST:0]
lustre-OST0001_UUID     3833116        1544     3605476   1% /mnt/lustre[OST:1]
filesystem_summary:     7666232        3088     7210952   1% /mnt/lustre
Stopping /mnt/lustre-mds2 (opts:) on oleg436-server
Failover mds2 to oleg436-server
oleg436-server.virtnet
Start mds2: mount -t lustre -o localrecov -o abort_recov_mdt /dev/mapper/mds2_flakey /mnt/lustre-mds2
oleg436-server: oleg436-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
pdsh@oleg436-client: oleg436-server: ssh exited with exit code 1
Started lustre-MDT0001
total: 20 open/close in 0.07 seconds: 273.66 ops/second
Failing mds2 on oleg436-server
Stopping /mnt/lustre-mds2 (opts:) on oleg436-server
09:39:24 (1773668364) shut down
facet: mds2
facet_host: oleg436-server
facet_failover_host: oleg436-server
Failover mds2 to oleg436-server
mount facets: mds2
Start mds2: mount -t lustre -o localrecov /dev/mapper/mds2_flakey /mnt/lustre-mds2
oleg436-server: oleg436-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
pdsh@oleg436-client: oleg436-server: ssh exited with exit code 1
Started lustre-MDT0001
09:39:39 (1773668379) targets are mounted
09:39:39 (1773668379) facet_failover done
oleg436-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0001-mdc-*.mds_server_uuid
mdc.lustre-MDT0001-mdc-*.mds_server_uuid in FULL state after 0 sec
lmv_stripe_count: 2 lmv_stripe_offset: 1 lmv_hash_type: crush
mdtidx       FID[seq:oid:ver]
     1       [0x240000bd0:0x3:0x0]
     0       [0x200000403:0x3:0x0]
total: 20 open/close in 0.08 seconds: 251.95 ops/second
PASS 100c (46s)
== replay-single test 100d: DNE: cancel update logs upon recovery abort ========================================================== 09:39:47 (1773668387)
striped dir -i0 -c2 -H crush2 /mnt/lustre/d100d.replay-single
total: 100 mkdir in 0.36 seconds: 280.34 ops/second
lustre-MDT0001-osd [catalog]: [0x2400013a0:0x3:0x0]
[index]: 00001 [logid]: [0x240001b70:0x2:0x0]
lustre-MDT0000-osp-MDT0001 [catalog]: [0x200000bd1:0x3:0x0]
[index]: 00001 [logid]: [0x200000bd2:0x2:0x0]
[index]: 00002 [logid]: [0x200000bd2:0x3:0x0]
Stopping /mnt/lustre-mds2 (opts:) on oleg436-server
Failover mds2 to oleg436-server
oleg436-server.virtnet
Start mds2: mount -t lustre -o localrecov -o abort_recovery /dev/mapper/mds2_flakey /mnt/lustre-mds2
oleg436-server: oleg436-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
pdsh@oleg436-client: oleg436-server: ssh exited with exit code 1
Started lustre-MDT0001
pdsh@oleg436-client: oleg436-server: ssh exited with exit code 1
find: '/mnt/lustre/d100d.replay-single': No such file or directory
find: '/mnt/lustre/d100d.replay-single': No such file or directory
PASS 100d (23s)
== replay-single test 100e: DNE: create striped dir on MDT0 and MDT1, fail MDT0, MDT1 ========================================================== 09:40:10 (1773668410)
UUID                  1K-blocks        Used   Available Use% Mounted on
lustre-MDT0000_UUID     1414116        2292     1269012   1% /mnt/lustre[MDT:0]
lustre-MDT0001_UUID     1414116        2128     1269176   1% /mnt/lustre[MDT:1]
lustre-OST0000_UUID     3833116        1544     3605476   1% /mnt/lustre[OST:0]
lustre-OST0001_UUID     3833116        1544     3605476   1% /mnt/lustre[OST:1]
filesystem_summary:     7666232        3088     7210952   1% /mnt/lustre
UUID                  1K-blocks        Used   Available Use% Mounted on
lustre-MDT0000_UUID     1414116        2292     1269012   1% /mnt/lustre[MDT:0]
lustre-MDT0001_UUID     1414116        2128     1269176   1% /mnt/lustre[MDT:1]
lustre-OST0000_UUID     3833116        1544     3605476   1% /mnt/lustre[OST:0]
lustre-OST0001_UUID     3833116        1544     3605476   1% /mnt/lustre[OST:1]
filesystem_summary:     7666232        3088     7210952   1% /mnt/lustre
lmv_stripe_count: 2 lmv_stripe_offset: 0 lmv_hash_type: crush
mdtidx       FID[seq:oid:ver]
     0       [0x200000bd0:0x3a:0x0]
     1       [0x240000bd1:0x3a:0x0]
Failing mds1 on oleg436-server
Stopping /mnt/lustre-mds1 (opts:) on oleg436-server
Failing mds2 on oleg436-server
Stopping /mnt/lustre-mds2 (opts:) on oleg436-server
09:40:19 (1773668419) shut down
facet: mds1
facet_host: oleg436-server
facet_failover_host: oleg436-server
facet: mds2
facet_host: oleg436-server
facet_failover_host: oleg436-server
Failover mds1 to oleg436-server
mount facets: mds1
Failover mds2 to oleg436-server
mount facets: mds2
Start mds2: mount -t lustre -o localrecov /dev/mapper/mds2_flakey /mnt/lustre-mds2
Start mds1: mount -t lustre -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
oleg436-server: oleg436-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
oleg436-server: oleg436-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
pdsh@oleg436-client: oleg436-server: ssh exited with exit code 1
pdsh@oleg436-client: oleg436-server: ssh exited with exit code 1
Started lustre-MDT0001
Started lustre-MDT0000
09:40:46 (1773668446) targets are mounted
09:40:46 (1773668446) facet_failover done
oleg436-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid,mdc.lustre-MDT0001-mdc-*.mds_server_uuid
mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec
mdc.lustre-MDT0001-mdc-*.mds_server_uuid in FULL state after 0 sec
lmv_stripe_count: 2 lmv_stripe_offset: 0 lmv_hash_type: crush
mdtidx       FID[seq:oid:ver]
     0       [0x200000bd0:0x3a:0x0]
     1       [0x240000bd1:0x3a:0x0]
PASS 100e (46s)
== replay-single test 101: Shouldn't reassign precreated objs to other files after recovery ========================================================== 09:40:56 (1773668456)
UUID                  1K-blocks        Used   Available Use% Mounted on
lustre-MDT0000_UUID     1414116        2344     1268960   1% /mnt/lustre[MDT:0]
lustre-MDT0001_UUID     1414116        2164     1269140   1% /mnt/lustre[MDT:1]
lustre-OST0000_UUID     3833116        1544     3605476   1% /mnt/lustre[OST:0]
lustre-OST0001_UUID     3833116        1544     3605476   1% /mnt/lustre[OST:1]
filesystem_summary:     7666232        3088     7210952   1% /mnt/lustre
Stopping /mnt/lustre-mds1 (opts:) on oleg436-server
Failover mds1 to oleg436-server
oleg436-server.virtnet
Start mds1: mount -t lustre -o localrecov -o abort_recovery /dev/mapper/mds1_flakey /mnt/lustre-mds1
oleg436-server: oleg436-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
pdsh@oleg436-client: oleg436-server: ssh exited with exit code 1
Started lustre-MDT0000
pdsh@oleg436-client: oleg436-client: ssh exited with exit code 5
first stat failed: 5
PASS 101 (54s)
== replay-single test 102a: check resend (request lost) with multiple modify RPCs in flight ========================================================== 09:41:50 (1773668510)
creating 7 files ...
fail_loc=0x159
launch 7 chmod in parallel (09:41:51) ...
fail_loc=0
done (09:42:46)
/mnt/lustre/d102a.replay-single/file-1 has perms 0600 OK
/mnt/lustre/d102a.replay-single/file-2 has perms 0600 OK
/mnt/lustre/d102a.replay-single/file-3 has perms 0600 OK
/mnt/lustre/d102a.replay-single/file-4 has perms 0600 OK
/mnt/lustre/d102a.replay-single/file-5 has perms 0600 OK
/mnt/lustre/d102a.replay-single/file-6 has perms 0600 OK
/mnt/lustre/d102a.replay-single/file-7 has perms 0600 OK
PASS 102a (62s)
== replay-single test 102b: check resend (reply lost) with multiple modify RPCs in flight ========================================================== 09:42:52 (1773668572)
creating 7 files ...
fail_loc=0x15a
launch 7 chmod in parallel (09:42:55) ...
fail_loc=0
done (09:43:50)
/mnt/lustre/d102b.replay-single/file-1 has perms 0600 OK
/mnt/lustre/d102b.replay-single/file-2 has perms 0600 OK
/mnt/lustre/d102b.replay-single/file-3 has perms 0600 OK
/mnt/lustre/d102b.replay-single/file-4 has perms 0600 OK
/mnt/lustre/d102b.replay-single/file-5 has perms 0600 OK
/mnt/lustre/d102b.replay-single/file-6 has perms 0600 OK
/mnt/lustre/d102b.replay-single/file-7 has perms 0600 OK
PASS 102b (62s)
== replay-single test 102c: check replay w/o reconstruction with multiple mod RPCs in flight ========================================================== 09:43:54 (1773668634)
creating 7 files ...
UUID                  1K-blocks        Used   Available Use% Mounted on
lustre-MDT0000_UUID     1414116        2504     1268800   1% /mnt/lustre[MDT:0]
lustre-MDT0001_UUID     1414116        2264     1269040   1% /mnt/lustre[MDT:1]
lustre-OST0000_UUID     3833116        1544     3599708   1% /mnt/lustre[OST:0]
lustre-OST0001_UUID     3833116        1544     3599792   1% /mnt/lustre[OST:1]
filesystem_summary:     7666232        3088     7199500   1% /mnt/lustre
fail_loc=0x15a
launch 7 chmod in parallel (09:44:00) ...
fail_loc=0
Failing mds2 on oleg436-server
Stopping /mnt/lustre-mds2 (opts:) on oleg436-server
09:44:05 (1773668645) shut down
facet: mds2
facet_host: oleg436-server
facet_failover_host: oleg436-server
Failover mds2 to oleg436-server
mount facets: mds2
Start mds2: mount -t lustre -o localrecov /dev/mapper/mds2_flakey /mnt/lustre-mds2
oleg436-server: oleg436-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
pdsh@oleg436-client: oleg436-server: ssh exited with exit code 1
Started lustre-MDT0001
09:44:25 (1773668665) targets are mounted
09:44:25 (1773668665) facet_failover done
oleg436-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0001-mdc-*.mds_server_uuid
mdc.lustre-MDT0001-mdc-*.mds_server_uuid in FULL state after 0 sec
done (09:44:31)
/mnt/lustre/d102c.replay-single/file-1 has perms 0600 OK
/mnt/lustre/d102c.replay-single/file-2 has perms 0600 OK
/mnt/lustre/d102c.replay-single/file-3 has perms 0600 OK
/mnt/lustre/d102c.replay-single/file-4 has perms 0600 OK
/mnt/lustre/d102c.replay-single/file-5 has perms 0600 OK
/mnt/lustre/d102c.replay-single/file-6 has perms 0600 OK
/mnt/lustre/d102c.replay-single/file-7 has perms 0600 OK
PASS 102c (40s)
== replay-single test 102d: check replay & reconstruction with multiple mod RPCs in flight ========================================================== 09:44:34 (1773668674)
creating 7 files ...
fail_loc=0x15a
launch 7 chmod in parallel (09:44:36) ...
fail_loc=0
Failing mds1 on oleg436-server
Stopping /mnt/lustre-mds1 (opts:) on oleg436-server
09:44:40 (1773668680) shut down
facet: mds1
facet_host: oleg436-server
facet_failover_host: oleg436-server
Failover mds1 to oleg436-server
mount facets: mds1
Start mds1: mount -t lustre -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
oleg436-server: oleg436-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
pdsh@oleg436-client: oleg436-server: ssh exited with exit code 1
Started lustre-MDT0000
09:44:56 (1773668696) targets are mounted
09:44:56 (1773668696) facet_failover done
oleg436-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid
mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec
done (09:45:03)
/mnt/lustre/d102d.replay-single/file-1 has perms 0600 OK
/mnt/lustre/d102d.replay-single/file-2 has perms 0600 OK
/mnt/lustre/d102d.replay-single/file-3 has perms 0600 OK
/mnt/lustre/d102d.replay-single/file-4 has perms 0600 OK
/mnt/lustre/d102d.replay-single/file-5 has perms 0600 OK
/mnt/lustre/d102d.replay-single/file-6 has perms 0600 OK
/mnt/lustre/d102d.replay-single/file-7 has perms 0600 OK
PASS 102d (32s)
== replay-single test 103: Check otr_next_id overflow ==== 09:45:06 (1773668706)
fail_loc=0x80000162
total: 30 open/close in 0.28 seconds: 108.50 ops/second
Failing mds1 on oleg436-server
Stopping /mnt/lustre-mds1 (opts:) on oleg436-server
09:45:10 (1773668710) shut down
facet: mds1
facet_host: oleg436-server
facet_failover_host: oleg436-server
Failover mds1 to oleg436-server
mount facets: mds1
Start mds1: mount -t lustre -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
oleg436-server: oleg436-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
pdsh@oleg436-client: oleg436-server: ssh exited with exit code 1
Started lustre-MDT0000
09:45:27 (1773668727) targets are mounted
09:45:27 (1773668727) facet_failover done
oleg436-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid
mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec
PASS 103 (30s)
== replay-single test 110a: DNE: create striped dir, fail MDT1 ========================================================== 09:45:36 (1773668736)
UUID                  1K-blocks        Used   Available Use% Mounted on
lustre-MDT0000_UUID     1414116        2444     1268860   1% /mnt/lustre[MDT:0]
lustre-MDT0001_UUID     1414116        2256     1269048   1% /mnt/lustre[MDT:1]
lustre-OST0000_UUID     3833116        1544     3599708   1% /mnt/lustre[OST:0]
lustre-OST0001_UUID     3833116        1544     3599792   1% /mnt/lustre[OST:1]
filesystem_summary:     7666232        3088     7199500   1% /mnt/lustre
Failing mds1 on oleg436-server
Stopping /mnt/lustre-mds1 (opts:) on oleg436-server
09:45:43 (1773668743) shut down
facet: mds1
facet_host: oleg436-server
facet_failover_host: oleg436-server
Failover mds1 to oleg436-server
mount facets: mds1
Start mds1: mount -t lustre -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
oleg436-server: oleg436-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
pdsh@oleg436-client: oleg436-server: ssh exited with exit code 1
Started lustre-MDT0000
09:46:01 (1773668761) targets are mounted
09:46:01 (1773668761) facet_failover done
oleg436-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid
mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec
/mnt/lustre/d110a.replay-single/striped_dir has type dir OK
PASS 110a (34s)
== replay-single test 110b: DNE: create striped dir, fail MDT1 and client ========================================================== 09:46:10 (1773668770)
UUID                  1K-blocks        Used   Available Use% Mounted on
lustre-MDT0000_UUID     1414116        2448     1268856   1% /mnt/lustre[MDT:0]
lustre-MDT0001_UUID     1414116        2256     1269048   1% /mnt/lustre[MDT:1]
lustre-OST0000_UUID     3833116        1544     3605476   1% /mnt/lustre[OST:0]
lustre-OST0001_UUID     3833116        1544     3605476   1% /mnt/lustre[OST:1]
filesystem_summary:     7666232        3088     7210952   1% /mnt/lustre
Failing mds1 on oleg436-server
Stopping /mnt/lustre-mds1 (opts:) on oleg436-server
09:46:18 (1773668778) shut down
facet: mds1
facet_host: oleg436-server
facet_failover_host: oleg436-server
Failover mds1 to oleg436-server
mount facets: mds1
Start mds1: mount -t lustre -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
oleg436-server: oleg436-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
pdsh@oleg436-client: oleg436-server: ssh exited with exit code 1
Started lustre-MDT0000
09:46:48 (1773668808) targets are mounted
09:46:48 (1773668808) facet_failover done
pdsh@oleg436-client: oleg436-client: ssh exited with exit code 95
oleg436-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid
pdsh@oleg436-client: oleg436-client: ssh exited with exit code 95
Starting client: oleg436-client.virtnet: -o user_xattr,flock 192.168.204.136@tcp:/lustre /mnt/lustre
/mnt/lustre/d110b.replay-single/striped_dir has type dir OK
PASS 110b (120s)
== replay-single test 110c: DNE: create striped dir, fail MDT2 ========================================================== 09:48:10 (1773668890)
UUID                  1K-blocks        Used   Available Use% Mounted on
lustre-MDT0000_UUID     1414116        2448     1268856   1% /mnt/lustre[MDT:0]
lustre-MDT0001_UUID     1414116        2252     1269052   1% /mnt/lustre[MDT:1]
lustre-OST0000_UUID     3833116        1544     3605476   1% /mnt/lustre[OST:0]
lustre-OST0001_UUID     3833116        1544     3605476   1% /mnt/lustre[OST:1]
filesystem_summary:     7666232        3088     7210952   1% /mnt/lustre
Failing mds2 on oleg436-server
Stopping /mnt/lustre-mds2 (opts:) on oleg436-server
09:48:17 (1773668897) shut down
facet: mds2
facet_host: oleg436-server
facet_failover_host: oleg436-server
Failover mds2 to oleg436-server
mount facets: mds2
Start mds2: mount -t lustre -o localrecov /dev/mapper/mds2_flakey /mnt/lustre-mds2
oleg436-server: oleg436-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
pdsh@oleg436-client: oleg436-server: ssh exited with exit code 1
Started lustre-MDT0001
09:48:35 (1773668915) targets are mounted
09:48:35 (1773668915) facet_failover done
oleg436-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0001-mdc-*.mds_server_uuid
mdc.lustre-MDT0001-mdc-*.mds_server_uuid in FULL state after 0 sec
/mnt/lustre/d110c.replay-single/striped_dir has type dir OK
PASS 110c (34s)
== replay-single test 110d: DNE: create striped dir, fail MDT2 and client ========================================================== 09:48:44 (1773668924)
UUID                  1K-blocks        Used   Available Use% Mounted on
lustre-MDT0000_UUID     1414116        2492     1268812   1% /mnt/lustre[MDT:0]
lustre-MDT0001_UUID     1414116        2300     1269004   1% /mnt/lustre[MDT:1]
lustre-OST0000_UUID     3833116        1544     3605476   1% /mnt/lustre[OST:0]
lustre-OST0001_UUID     3833116        1544     3605476   1% /mnt/lustre[OST:1]
filesystem_summary:     7666232        3088     7210952   1% /mnt/lustre
Failing mds2 on oleg436-server
Stopping /mnt/lustre-mds2 (opts:) on oleg436-server
09:48:51 (1773668931) shut down
facet: mds2
facet_host: oleg436-server
facet_failover_host: oleg436-server
Failover mds2 to oleg436-server
mount facets: mds2
Start mds2: mount -t lustre -o localrecov /dev/mapper/mds2_flakey /mnt/lustre-mds2
oleg436-server: oleg436-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
pdsh@oleg436-client: oleg436-server: ssh exited with exit code 1
Started lustre-MDT0001
09:49:09 (1773668949) targets are mounted
09:49:09 (1773668949) facet_failover done
pdsh@oleg436-client: oleg436-client: ssh exited with exit code 95
oleg436-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0001-mdc-*.mds_server_uuid
pdsh@oleg436-client: oleg436-client: ssh exited with exit code 95
Starting client: oleg436-client.virtnet: -o user_xattr,flock 192.168.204.136@tcp:/lustre /mnt/lustre
/mnt/lustre/d110d.replay-single/striped_dir has type dir OK
PASS 110d (100s)
== replay-single test 110e: DNE: create striped dir, uncommit on MDT2, fail client/MDT1/MDT2 ========================================================== 09:50:24 (1773669024)
UUID                  1K-blocks        Used   Available Use% Mounted on
lustre-MDT0000_UUID     1414116        2528     1268776   1% /mnt/lustre[MDT:0]
lustre-MDT0001_UUID     1414116        2336     1268968   1% /mnt/lustre[MDT:1]
lustre-OST0000_UUID     3833116        1544     3605476   1% /mnt/lustre[OST:0]
lustre-OST0001_UUID     3833116        1544     3605476   1% /mnt/lustre[OST:1]
filesystem_summary:     7666232        3088     7210952   1% /mnt/lustre
df:/mnt/lustre Not a Lustre filesystem
Failing mds1 on oleg436-server
Stopping /mnt/lustre-mds1 (opts:) on oleg436-server
Failing mds2 on oleg436-server
Stopping /mnt/lustre-mds2 (opts:) on oleg436-server
09:50:36 (1773669036) shut down
facet: mds1
facet_host: oleg436-server
facet_failover_host: oleg436-server
facet: mds2
facet_host: oleg436-server
facet_failover_host: oleg436-server
Failover mds1 to oleg436-server
mount facets: mds1
Failover mds2 to oleg436-server
mount facets: mds2
Start mds2: mount -t lustre -o localrecov /dev/mapper/mds2_flakey /mnt/lustre-mds2
Start mds1: mount -t lustre -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
oleg436-server: oleg436-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
oleg436-server: oleg436-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
pdsh@oleg436-client: oleg436-server: ssh exited with exit code 1
pdsh@oleg436-client: oleg436-server: ssh exited with exit code 1
Started lustre-MDT0001
Started lustre-MDT0000
09:51:04 (1773669064) targets are mounted
09:51:04 (1773669064) facet_failover done
pdsh@oleg436-client: oleg436-client: ssh exited with exit code 95
oleg436-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid,mdc.lustre-MDT0001-mdc-*.mds_server_uuid
pdsh@oleg436-client: oleg436-client: ssh exited with exit code 95
Starting client: oleg436-client.virtnet: -o user_xattr,flock 192.168.204.136@tcp:/lustre /mnt/lustre
/mnt/lustre/d110e.replay-single/striped_dir has type dir OK
PASS 110e (119s)
SKIP: replay-single test_110f skipping excluded test 110f
== replay-single test 110g: DNE: create striped dir, uncommit on MDT1, fail client/MDT1/MDT2 ========================================================== 09:52:24 (1773669144)
UUID                  1K-blocks        Used   Available Use% Mounted on
lustre-MDT0000_UUID     1414116        2572     1268732   1% /mnt/lustre[MDT:0]
lustre-MDT0001_UUID     1414116        2372     1268932   1% /mnt/lustre[MDT:1]
lustre-OST0000_UUID     3833116        1544     3605476   1% /mnt/lustre[OST:0]
lustre-OST0001_UUID     3833116        1544     3605476   1% /mnt/lustre[OST:1]
filesystem_summary:     7666232        3088     7210952   1% /mnt/lustre
df:/mnt/lustre Not a Lustre filesystem
Failing mds1 on oleg436-server
Stopping /mnt/lustre-mds1 (opts:) on oleg436-server
Failing mds2 on oleg436-server
Stopping /mnt/lustre-mds2 (opts:) on oleg436-server
09:52:41 (1773669161) shut down
facet: mds1
facet_host: oleg436-server
facet_failover_host: oleg436-server
facet: mds2
facet_host: oleg436-server
facet_failover_host: oleg436-server
Failover mds1 to oleg436-server
mount facets: mds1
Failover mds2 to oleg436-server
mount facets: mds2
Start mds2: mount -t lustre -o localrecov /dev/mapper/mds2_flakey /mnt/lustre-mds2
Start mds1: mount -t lustre -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
oleg436-server: oleg436-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
oleg436-server: oleg436-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
pdsh@oleg436-client: oleg436-server: ssh exited with exit code 1
pdsh@oleg436-client: oleg436-server: ssh exited with exit code 1
Started lustre-MDT0001
Started lustre-MDT0000
09:53:08 (1773669188) targets are mounted
09:53:08 (1773669188) facet_failover done
pdsh@oleg436-client: oleg436-client: ssh exited with exit code 95
oleg436-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid,mdc.lustre-MDT0001-mdc-*.mds_server_uuid
pdsh@oleg436-client: oleg436-client: ssh exited with exit code 95
Starting client: oleg436-client.virtnet: -o user_xattr,flock 192.168.204.136@tcp:/lustre /mnt/lustre
/mnt/lustre/d110g.replay-single/striped_dir has type dir OK
PASS 110g (123s)
== replay-single test 111a: DNE: unlink striped dir, fail MDT1 ========================================================== 09:54:27 (1773669267)
UUID                  1K-blocks        Used   Available Use% Mounted on
lustre-MDT0000_UUID     1414116        2456     1268848   1% /mnt/lustre[MDT:0]
lustre-MDT0001_UUID     1414116        2264     1269040   1% /mnt/lustre[MDT:1]
lustre-OST0000_UUID     3833116        1544     3605476   1% /mnt/lustre[OST:0]
lustre-OST0001_UUID     3833116        1544     3605476   1% /mnt/lustre[OST:1]
filesystem_summary:     7666232        3088     7210952   1% /mnt/lustre
Failing mds1 on oleg436-server
Stopping /mnt/lustre-mds1 (opts:) on oleg436-server
09:54:42 (1773669282) shut down
facet: mds1
facet_host: oleg436-server
facet_failover_host: oleg436-server
Failover mds1 to oleg436-server
mount facets: mds1
Start mds1: mount -t lustre -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
oleg436-server: oleg436-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
pdsh@oleg436-client: oleg436-server: ssh exited with exit code 1
Started lustre-MDT0000
09:55:16 (1773669316) targets are mounted
09:55:16 (1773669316) facet_failover done
oleg436-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid
mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec
Can't lstat /mnt/lustre/d111a.replay-single/striped_dir: No such file or directory
PASS 111a (62s)
== replay-single test 111b: DNE: unlink striped dir, fail MDT2 ========================================================== 09:55:29 (1773669329)
UUID                  1K-blocks        Used   Available Use% Mounted on
lustre-MDT0000_UUID     1414116        2460     1268844   1% /mnt/lustre[MDT:0]
lustre-MDT0001_UUID     1414116        2268     1269036   1% /mnt/lustre[MDT:1]
lustre-OST0000_UUID     3833116        1544     3605476   1% /mnt/lustre[OST:0]
lustre-OST0001_UUID     3833116        1544     3605476   1% /mnt/lustre[OST:1]
filesystem_summary:     7666232        3088     7210952   1% /mnt/lustre
Failing mds2 on oleg436-server
Stopping /mnt/lustre-mds2 (opts:) on oleg436-server
09:55:43 (1773669343) shut down
facet: mds2
facet_host: oleg436-server
facet_failover_host: oleg436-server
Failover mds2 to oleg436-server
mount facets: mds2
Start mds2: mount -t lustre -o localrecov /dev/mapper/mds2_flakey /mnt/lustre-mds2
oleg436-server: oleg436-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
pdsh@oleg436-client: oleg436-server: ssh exited with exit code 1
Started lustre-MDT0001
09:56:10 (1773669370) targets are mounted
09:56:10 (1773669370) facet_failover done
pdsh@oleg436-client: oleg436-client: ssh exited with exit code 95
oleg436-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0001-mdc-*.mds_server_uuid
pdsh@oleg436-client: oleg436-client: ssh exited with exit code 95
Starting client: oleg436-client.virtnet: -o user_xattr,flock 192.168.204.136@tcp:/lustre /mnt/lustre
Can't lstat /mnt/lustre/d111b.replay-single/striped_dir: No such file or directory
PASS 111b (115s)
== replay-single test 111c: DNE: unlink striped dir, uncommit on MDT1, fail client/MDT1/MDT2 ========================================================== 09:57:24 (1773669444)
UUID                  1K-blocks        Used   Available Use% Mounted on
lustre-MDT0000_UUID     1414116        2504     1268800   1% /mnt/lustre[MDT:0]
lustre-MDT0001_UUID     1414116        2268     1269036   1% /mnt/lustre[MDT:1]
lustre-OST0000_UUID     3833116        1544     3605476   1% /mnt/lustre[OST:0]
lustre-OST0001_UUID     3833116        1544     3605476   1% /mnt/lustre[OST:1]
filesystem_summary:     7666232        3088     7210952   1% /mnt/lustre
df:/mnt/lustre Not a Lustre filesystem
Failing mds1 on oleg436-server
Stopping /mnt/lustre-mds1 (opts:) on oleg436-server
Failing mds2 on oleg436-server
Stopping /mnt/lustre-mds2 (opts:) on oleg436-server
09:57:49 (1773669469) shut down
facet: mds1
facet_host: oleg436-server
facet_failover_host: oleg436-server
facet: mds2
facet_host: oleg436-server
facet_failover_host: oleg436-server
Failover mds1 to oleg436-server
mount facets: mds1
Failover mds2 to oleg436-server
mount facets: mds2
Start mds2: mount -t lustre -o localrecov /dev/mapper/mds2_flakey /mnt/lustre-mds2
Start mds1: mount -t lustre -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
oleg436-server: oleg436-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
oleg436-server: oleg436-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
pdsh@oleg436-client: oleg436-server: ssh exited with exit code 1
pdsh@oleg436-client: oleg436-server: ssh exited with exit code 1
Started lustre-MDT0001
Started lustre-MDT0000
09:58:34 (1773669514) targets are mounted
09:58:34 (1773669514) facet_failover done
pdsh@oleg436-client: oleg436-client: ssh exited with exit code 95
oleg436-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid,mdc.lustre-MDT0001-mdc-*.mds_server_uuid
pdsh@oleg436-client: oleg436-client: ssh exited with exit code 95
Starting client: oleg436-client.virtnet: -o user_xattr,flock 192.168.204.136@tcp:/lustre /mnt/lustre
Can't lstat /mnt/lustre/d111c.replay-single/striped_dir: No such file or directory
PASS 111c (146s)
== replay-single test 111d: DNE: unlink striped dir, uncommit on MDT2, fail client/MDT1/MDT2 ========================================================== 09:59:50 (1773669590)
UUID                  1K-blocks        Used   Available Use% Mounted on
lustre-MDT0000_UUID 1414116 2464 1268840 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 2276 1269028 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 1544 3605476 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1544 3605476 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 3088 7210952 1% /mnt/lustre df:/mnt/lustre Not a Lustre filesystem Failing mds1 on oleg436-server Stopping /mnt/lustre-mds1 (opts:) on oleg436-server Failing mds2 on oleg436-server Stopping /mnt/lustre-mds2 (opts:) on oleg436-server 10:00:13 (1773669613) shut down facet: mds1 facet_host: oleg436-server facet_failover_host: oleg436-server facet: mds2 facet_host: oleg436-server facet_failover_host: oleg436-server Failover mds1 to oleg436-server mount facets: mds1 Failover mds2 to oleg436-server mount facets: mds2 Start mds2: mount -t lustre -o localrecov /dev/mapper/mds2_flakey /mnt/lustre-mds2 Start mds1: mount -t lustre -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg436-server: oleg436-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all oleg436-server: oleg436-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg436-client: oleg436-server: ssh exited with exit code 1 pdsh@oleg436-client: oleg436-server: ssh exited with exit code 1 Started lustre-MDT0001 Started lustre-MDT0000 10:00:41 (1773669641) targets are mounted 10:00:41 (1773669641) facet_failover done pdsh@oleg436-client: oleg436-client: ssh exited with exit code 95 oleg436-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid,mdc.lustre-MDT0001-mdc-*.mds_server_uuid pdsh@oleg436-client: oleg436-client: ssh exited with exit code 95 Starting client: oleg436-client.virtnet: -o user_xattr,flock 192.168.204.136@tcp:/lustre /mnt/lustre Can't lstat /mnt/lustre/d111d.replay-single/striped_dir: No such file or directory PASS 111d (126s) == replay-single 
test 111e: DNE: unlink striped dir, uncommit on MDT2, fail MDT1/MDT2 ========================================================== 10:01:56 (1773669716) UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 2512 1268792 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 2312 1268992 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 1544 3605476 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1544 3605476 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 3088 7210952 1% /mnt/lustre UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 2508 1268796 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 2312 1268992 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 1544 3605476 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1544 3605476 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 3088 7210952 1% /mnt/lustre Failing mds1 on oleg436-server Stopping /mnt/lustre-mds1 (opts:) on oleg436-server Failing mds2 on oleg436-server Stopping /mnt/lustre-mds2 (opts:) on oleg436-server 10:02:11 (1773669731) shut down facet: mds1 facet_host: oleg436-server facet_failover_host: oleg436-server facet: mds2 facet_host: oleg436-server facet_failover_host: oleg436-server Failover mds1 to oleg436-server mount facets: mds1 Failover mds2 to oleg436-server mount facets: mds2 Start mds2: mount -t lustre -o localrecov /dev/mapper/mds2_flakey /mnt/lustre-mds2 Start mds1: mount -t lustre -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg436-server: oleg436-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all oleg436-server: oleg436-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg436-client: oleg436-server: ssh exited with exit code 1 pdsh@oleg436-client: oleg436-server: ssh exited with exit code 1 Started lustre-MDT0001 Started lustre-MDT0000 10:02:38 (1773669758) targets are mounted 10:02:38 (1773669758) 
facet_failover done oleg436-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid,mdc.lustre-MDT0001-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec mdc.lustre-MDT0001-mdc-*.mds_server_uuid in FULL state after 0 sec Can't lstat /mnt/lustre/d111e.replay-single/striped_dir: No such file or directory PASS 111e (51s) == replay-single test 111f: DNE: unlink striped dir, uncommit on MDT1, fail MDT1/MDT2 ========================================================== 10:02:47 (1773669767) UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 2480 1268824 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 2276 1269028 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 1544 3605476 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1544 3605476 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 3088 7210952 1% /mnt/lustre UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 2480 1268824 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 2276 1269028 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 1544 3605476 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1544 3605476 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 3088 7210952 1% /mnt/lustre Failing mds1 on oleg436-server Stopping /mnt/lustre-mds1 (opts:) on oleg436-server Failing mds2 on oleg436-server Stopping /mnt/lustre-mds2 (opts:) on oleg436-server 10:02:59 (1773669779) shut down facet: mds1 facet_host: oleg436-server facet_failover_host: oleg436-server facet: mds2 facet_host: oleg436-server facet_failover_host: oleg436-server Failover mds1 to oleg436-server mount facets: mds1 Failover mds2 to oleg436-server mount facets: mds2 Start mds2: mount -t lustre -o localrecov /dev/mapper/mds2_flakey /mnt/lustre-mds2 Start mds1: mount -t lustre -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg436-server: oleg436-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror 
ha config ioctl super lfsck all oleg436-server: oleg436-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg436-client: oleg436-server: ssh exited with exit code 1 pdsh@oleg436-client: oleg436-server: ssh exited with exit code 1 Started lustre-MDT0001 Started lustre-MDT0000 10:03:26 (1773669806) targets are mounted 10:03:26 (1773669806) facet_failover done oleg436-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid,mdc.lustre-MDT0001-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec mdc.lustre-MDT0001-mdc-*.mds_server_uuid in FULL state after 0 sec Can't lstat /mnt/lustre/d111f.replay-single/striped_dir: No such file or directory PASS 111f (48s) == replay-single test 111g: DNE: unlink striped dir, fail MDT1/MDT2 ========================================================== 10:03:35 (1773669815) UUID Inodes IUsed IFree IUse% Mounted on lustre-MDT0000_UUID 1024000 316 1023684 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1024000 287 1023713 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 262144 410 261734 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 262144 410 261734 1% /mnt/lustre[OST:1] filesystem_summary: 524071 603 523468 1% /mnt/lustre UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 2520 1268784 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 2324 1268980 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 1544 3605476 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1544 3605476 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 3088 7210952 1% /mnt/lustre UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 2484 1268820 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 2284 1269020 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 1544 3605476 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1544 3605476 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 3088 7210952 1% 
/mnt/lustre Failing mds1 on oleg436-server Stopping /mnt/lustre-mds1 (opts:) on oleg436-server Failing mds2 on oleg436-server Stopping /mnt/lustre-mds2 (opts:) on oleg436-server 10:03:45 (1773669825) shut down facet: mds1 facet_host: oleg436-server facet_failover_host: oleg436-server facet: mds2 facet_host: oleg436-server facet_failover_host: oleg436-server Failover mds1 to oleg436-server mount facets: mds1 Failover mds2 to oleg436-server mount facets: mds2 Start mds1: mount -t lustre -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 Start mds2: mount -t lustre -o localrecov /dev/mapper/mds2_flakey /mnt/lustre-mds2 oleg436-server: oleg436-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all oleg436-server: oleg436-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg436-client: oleg436-server: ssh exited with exit code 1 pdsh@oleg436-client: oleg436-server: ssh exited with exit code 1 Started lustre-MDT0000 Started lustre-MDT0001 10:04:11 (1773669851) targets are mounted 10:04:11 (1773669851) facet_failover done oleg436-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid,mdc.lustre-MDT0001-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec mdc.lustre-MDT0001-mdc-*.mds_server_uuid in FULL state after 0 sec Can't lstat /mnt/lustre/d111g.replay-single/striped_dir: No such file or directory PASS 111g (43s) == replay-single test 112a: DNE: cross MDT rename, fail MDT1 ========================================================== 10:04:18 (1773669858) SKIP: replay-single test_112a needs >= 4 MDTs SKIP 112a (1s) == replay-single test 112b: DNE: cross MDT rename, fail MDT2 ========================================================== 10:04:19 (1773669859) SKIP: replay-single test_112b needs >= 4 MDTs SKIP 112b (1s) == replay-single test 112c: DNE: cross MDT 
rename, fail MDT3 ========================================================== 10:04:20 (1773669860)
SKIP: replay-single test_112c needs >= 4 MDTs
SKIP 112c (1s)
== replay-single test 112d: DNE: cross MDT rename, fail MDT4 ========================================================== 10:04:21 (1773669861)
SKIP: replay-single test_112d needs >= 4 MDTs
SKIP 112d (1s)
== replay-single test 112e: DNE: cross MDT rename, fail MDT1 and MDT2 ========================================================== 10:04:22 (1773669862)
SKIP: replay-single test_112e needs >= 4 MDTs
SKIP 112e (2s)
== replay-single test 112f: DNE: cross MDT rename, fail MDT1 and MDT3 ========================================================== 10:04:24 (1773669864)
SKIP: replay-single test_112f needs >= 4 MDTs
SKIP 112f (1s)
== replay-single test 112g: DNE: cross MDT rename, fail MDT1 and MDT4 ========================================================== 10:04:25 (1773669865)
SKIP: replay-single test_112g needs >= 4 MDTs
SKIP 112g (1s)
== replay-single test 112h: DNE: cross MDT rename, fail MDT2 and MDT3 ========================================================== 10:04:26 (1773669866)
SKIP: replay-single test_112h needs >= 4 MDTs
SKIP 112h (1s)
== replay-single test 112i: DNE: cross MDT rename, fail MDT2 and MDT4 ========================================================== 10:04:27 (1773669867)
SKIP: replay-single test_112i needs >= 4 MDTs
SKIP 112i (1s)
== replay-single test 112j: DNE: cross MDT rename, fail MDT3 and MDT4 ========================================================== 10:04:28 (1773669868)
SKIP: replay-single test_112j needs >= 4 MDTs
SKIP 112j (1s)
== replay-single test 112k: DNE: cross MDT rename, fail MDT1,MDT2,MDT3 ========================================================== 10:04:29 (1773669869)
SKIP: replay-single test_112k needs >= 4 MDTs
SKIP 112k (1s)
== replay-single test 112l: DNE: cross MDT rename, fail MDT1,MDT2,MDT4 ========================================================== 10:04:30 (1773669870)
SKIP: replay-single test_112l needs >= 4 MDTs
SKIP 112l (2s)
== replay-single test 112m: DNE: cross MDT rename, fail MDT1,MDT3,MDT4 ========================================================== 10:04:32 (1773669872)
SKIP: replay-single test_112m needs >= 4 MDTs
SKIP 112m (1s)
== replay-single test 112n: DNE: cross MDT rename, fail MDT2,MDT3,MDT4 ========================================================== 10:04:33 (1773669873)
SKIP: replay-single test_112n needs >= 4 MDTs
SKIP 112n (1s)
== replay-single test 115: failover for create/unlink striped directory ========================================================== 10:04:34 (1773669874)
UUID 1K-blocks Used Available Use% Mounted on
lustre-MDT0000_UUID 1414116 2480 1268824 1% /mnt/lustre[MDT:0]
lustre-MDT0001_UUID 1414116 2272 1269032 1% /mnt/lustre[MDT:1]
lustre-OST0000_UUID 3833116 1544 3605476 1% /mnt/lustre[OST:0]
lustre-OST0001_UUID 3833116 1544 3605476 1% /mnt/lustre[OST:1]
filesystem_summary: 7666232 3088 7210952 1% /mnt/lustre
striped dir -i1 -c2 -H crush /mnt/lustre/d115.replay-single/test_0
striped dir -i1 -c2 -H fnv_1a_64 /mnt/lustre/d115.replay-single/test_1
striped dir -i1 -c2 -H crush2 /mnt/lustre/d115.replay-single/test_2
striped dir -i1 -c2 -H crush /mnt/lustre/d115.replay-single/test_3
striped dir -i1 -c2 -H all_char /mnt/lustre/d115.replay-single/test_4
Failing mds2 on oleg436-server
Stopping /mnt/lustre-mds2 (opts:) on oleg436-server
10:04:39 (1773669879) shut down
facet: mds2
facet_host: oleg436-server
facet_failover_host: oleg436-server
Failover mds2 to oleg436-server
mount facets: mds2
Start mds2: mount -t lustre -o localrecov /dev/mapper/mds2_flakey /mnt/lustre-mds2
oleg436-server: oleg436-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
pdsh@oleg436-client: oleg436-server: ssh exited with exit code 1
Started lustre-MDT0001
10:04:55 (1773669895) targets are mounted
10:04:55 (1773669895) facet_failover done
oleg436-client.virtnet:
executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0001-mdc-*.mds_server_uuid mdc.lustre-MDT0001-mdc-*.mds_server_uuid in FULL state after 0 sec UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 2532 1268772 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 2332 1268972 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 1544 3605476 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1544 3605476 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 3088 7210952 1% /mnt/lustre striped dir -i0 -c2 -H fnv_1a_64 /mnt/lustre/d115.replay-single/test_0 striped dir -i0 -c2 -H crush /mnt/lustre/d115.replay-single/test_1 striped dir -i0 -c2 -H all_char /mnt/lustre/d115.replay-single/test_2 striped dir -i0 -c2 -H fnv_1a_64 /mnt/lustre/d115.replay-single/test_3 striped dir -i0 -c2 -H crush2 /mnt/lustre/d115.replay-single/test_4 Failing mds1 on oleg436-server Stopping /mnt/lustre-mds1 (opts:) on oleg436-server 10:05:05 (1773669905) shut down facet: mds1 facet_host: oleg436-server facet_failover_host: oleg436-server Failover mds1 to oleg436-server mount facets: mds1 Start mds1: mount -t lustre -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg436-server: oleg436-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg436-client: oleg436-server: ssh exited with exit code 1 Started lustre-MDT0000 10:05:21 (1773669921) targets are mounted 10:05:21 (1773669921) facet_failover done oleg436-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec PASS 115 (55s) == replay-single test 116a: large update log master MDT recovery ========================================================== 10:05:29 (1773669929) UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 2604 1268700 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 2392 1268912 1% /mnt/lustre[MDT:1] 
lustre-OST0000_UUID 3833116 1544 3605476 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1544 3605476 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 3088 7210952 1% /mnt/lustre fail_loc=0x80001702 Failing mds1 on oleg436-server Stopping /mnt/lustre-mds1 (opts:) on oleg436-server 10:05:35 (1773669935) shut down facet: mds1 facet_host: oleg436-server facet_failover_host: oleg436-server Failover mds1 to oleg436-server mount facets: mds1 Start mds1: mount -t lustre -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg436-server: oleg436-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg436-client: oleg436-server: ssh exited with exit code 1 Started lustre-MDT0000 10:05:51 (1773669951) targets are mounted 10:05:51 (1773669951) facet_failover done oleg436-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec /mnt/lustre/d116a.replay-single/striped_dir has type dir OK PASS 116a (30s) == replay-single test 116b: large update log slave MDT recovery ========================================================== 10:05:59 (1773669959) UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 2656 1268648 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 2560 1268744 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 1544 3605476 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1544 3605476 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 3088 7210952 1% /mnt/lustre fail_loc=0x80001702 Failing mds2 on oleg436-server Stopping /mnt/lustre-mds2 (opts:) on oleg436-server 10:06:04 (1773669964) shut down facet: mds2 facet_host: oleg436-server facet_failover_host: oleg436-server Failover mds2 to oleg436-server mount facets: mds2 Start mds2: mount -t lustre -o localrecov /dev/mapper/mds2_flakey /mnt/lustre-mds2 oleg436-server: oleg436-server.virtnet: executing set_default_debug 
vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg436-client: oleg436-server: ssh exited with exit code 1 Started lustre-MDT0001 10:06:20 (1773669980) targets are mounted 10:06:20 (1773669980) facet_failover done oleg436-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0001-mdc-*.mds_server_uuid mdc.lustre-MDT0001-mdc-*.mds_server_uuid in FULL state after 0 sec /mnt/lustre/d116b.replay-single/striped_dir has type dir OK PASS 116b (29s) == replay-single test 117: DNE: cross MDT unlink, fail MDT1 and MDT2 ========================================================== 10:06:28 (1773669988) SKIP: replay-single test_117 needs >= 4 MDTs SKIP 117 (1s) == replay-single test 118: invalidate osp update will not cause update log corruption ========================================================== 10:06:29 (1773669989) fail_loc=0x1705 lfs setdirstripe: dirstripe error on '/mnt/lustre/d118.replay-single/striped_dir': Input/output error lfs setdirstripe: cannot create dir '/mnt/lustre/d118.replay-single/striped_dir': Input/output error lfs setdirstripe: dirstripe error on '/mnt/lustre/d118.replay-single/striped_dir1': Input/output error lfs setdirstripe: cannot create dir '/mnt/lustre/d118.replay-single/striped_dir1': Input/output error fail_loc=0x0 UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 2748 1268556 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 2444 1268860 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 1544 3605476 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1544 3605476 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 3088 7210952 1% /mnt/lustre Failing mds1 on oleg436-server Stopping /mnt/lustre-mds1 (opts:) on oleg436-server 10:06:34 (1773669994) shut down facet: mds1 facet_host: oleg436-server facet_failover_host: oleg436-server Failover mds1 to oleg436-server mount facets: mds1 Start mds1: mount -t lustre -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 
oleg436-server: oleg436-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg436-client: oleg436-server: ssh exited with exit code 1 Started lustre-MDT0000 10:06:50 (1773670010) targets are mounted 10:06:50 (1773670010) facet_failover done oleg436-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec PASS 118 (29s) == replay-single test 119: timeout of normal replay does not cause DNE replay fails ========================================================== 10:06:58 (1773670018) UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 2828 1268476 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 2500 1268804 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 1544 3605476 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1544 3605476 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 3088 7210952 1% /mnt/lustre Stopping /mnt/lustre-mds1 (opts:) on oleg436-server Failover mds1 to oleg436-server oleg436-server.virtnet fail_loc=0x80000714 fail_val=65 Start mds1: mount -t lustre -o localrecov -o recovery_time_hard=60 /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg436-server: oleg436-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg436-client: oleg436-server: ssh exited with exit code 1 Started lustre-MDT0000 oleg436-client.virtnet: executing wait_import_state_mount FULL mdc.lustre-MDT0000-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 68 sec mdt.lustre-MDT0000.recovery_time_hard=180 PASS 119 (89s) == replay-single test 120: DNE fail abort should stop both normal and DNE replay ========================================================== 10:08:27 (1773670107) Replay barrier on lustre-MDT0000 Stopping /mnt/lustre-mds1 (opts:) on oleg436-server Failover mds1 to oleg436-server 
oleg436-server.virtnet Start mds1: mount -t lustre -o localrecov -o abort_recovery /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg436-server: oleg436-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg436-client: oleg436-server: ssh exited with exit code 1 Started lustre-MDT0000 find: '/mnt/lustre/d120.replay-single': No such file or directory find: '/mnt/lustre/d120.replay-single': No such file or directory PASS 120 (28s) == replay-single test 121: lock replay timed out and race ========================================================== 10:08:55 (1773670135) multiop /mnt/lustre/f121.replay-single vs_s TMPPIPE=/tmp/multiop_open_wait_pipe.8192 Stopping /mnt/lustre-mds1 (opts:) on oleg436-server Failover mds1 to oleg436-server oleg436-server.virtnet fail_loc=0x721 fail_val=0 at_max=0 Start mds1: mount -t lustre -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg436-server: oleg436-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg436-client: oleg436-server: ssh exited with exit code 1 Started lustre-MDT0000 pdsh@oleg436-client: oleg436-client: ssh exited with exit code 5 fail_loc=0x0 at_max=600 PASS 121 (202s) == replay-single test 130a: DoM file create (setstripe) replay ========================================================== 10:12:17 (1773670337) UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 3208 1268096 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 2804 1268500 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 1544 3605476 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1544 3605476 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 3088 7210952 1% /mnt/lustre Failing mds1 on oleg436-server Stopping /mnt/lustre-mds1 (opts:) on oleg436-server 10:12:22 (1773670342) shut down facet: mds1 facet_host: oleg436-server facet_failover_host: oleg436-server Failover mds1 to oleg436-server mount 
facets: mds1 Start mds1: mount -t lustre -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg436-server: oleg436-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg436-client: oleg436-server: ssh exited with exit code 1 Started lustre-MDT0000 10:12:37 (1773670357) targets are mounted 10:12:37 (1773670357) facet_failover done oleg436-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec PASS 130a (28s) == replay-single test 130b: DoM file create (inherited) replay ========================================================== 10:12:45 (1773670365) UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 3208 1268096 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 2804 1268500 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 1544 3605476 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1544 3605476 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 3088 7210952 1% /mnt/lustre Failing mds1 on oleg436-server Stopping /mnt/lustre-mds1 (opts:) on oleg436-server 10:12:50 (1773670370) shut down facet: mds1 facet_host: oleg436-server facet_failover_host: oleg436-server Failover mds1 to oleg436-server mount facets: mds1 Start mds1: mount -t lustre -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg436-server: oleg436-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg436-client: oleg436-server: ssh exited with exit code 1 Started lustre-MDT0000 10:13:06 (1773670386) targets are mounted 10:13:06 (1773670386) facet_failover done oleg436-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec PASS 130b (29s) == replay-single test 131a: DoM file write lock replay === 10:13:14 
(1773670394) UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 3212 1268092 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 2804 1268500 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 1544 3605476 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1544 3605476 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 3088 7210952 1% /mnt/lustre 1+0 records in 1+0 records out 8 bytes copied, 0.00136028 s, 5.9 kB/s Failing mds1 on oleg436-server Stopping /mnt/lustre-mds1 (opts:) on oleg436-server 10:13:19 (1773670399) shut down facet: mds1 facet_host: oleg436-server facet_failover_host: oleg436-server Failover mds1 to oleg436-server mount facets: mds1 Start mds1: mount -t lustre -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg436-server: oleg436-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg436-client: oleg436-server: ssh exited with exit code 1 Started lustre-MDT0000 10:13:34 (1773670414) targets are mounted 10:13:34 (1773670414) facet_failover done oleg436-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec PASS 131a (28s) SKIP: replay-single test_131b skipping excluded test 131b == replay-single test 132a: PFL new component instantiate replay ========================================================== 10:13:43 (1773670423) UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 3216 1268088 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 2804 1268500 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 1544 3605476 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1544 3605476 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 3088 7210952 1% /mnt/lustre 1+0 records in 1+0 records out 1048576 bytes (1.0 MB, 1.0 MiB) copied, 0.0196655 s, 53.3 MB/s /mnt/lustre/f132a.replay-single lcm_layout_gen: 3 lcm_mirror_count: 1 
lcm_entry_count: 2 lcme_id: 1 lcme_mirror_id: 0 lcme_flags: init lcme_extent.e_start: 0 lcme_extent.e_end: 1048576 lmm_stripe_count: 1 lmm_stripe_size: 1048576 lmm_pattern: raid0 lmm_layout_gen: 0 lmm_stripe_offset: 0 lmm_objects: - 0: { l_ost_idx: 0, l_fid: [0x280000401:0x702:0x0] } lcme_id: 2 lcme_mirror_id: 0 lcme_flags: init lcme_extent.e_start: 1048576 lcme_extent.e_end: EOF lmm_stripe_count: 2 lmm_stripe_size: 4194304 lmm_pattern: raid0 lmm_layout_gen: 0 lmm_stripe_offset: 1 lmm_objects: - 0: { l_ost_idx: 1, l_fid: [0x2c0000401:0x702:0x0] } - 1: { l_ost_idx: 0, l_fid: [0x280000401:0x703:0x0] } Failing mds1 on oleg436-server Stopping /mnt/lustre-mds1 (opts:) on oleg436-server 10:13:47 (1773670427) shut down facet: mds1 facet_host: oleg436-server facet_failover_host: oleg436-server Failover mds1 to oleg436-server mount facets: mds1 Start mds1: mount -t lustre -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg436-server: oleg436-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg436-client: oleg436-server: ssh exited with exit code 1 Started lustre-MDT0000 10:14:03 (1773670443) targets are mounted 10:14:03 (1773670443) facet_failover done oleg436-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec /mnt/lustre/f132a.replay-single lcm_layout_gen: 4 lcm_mirror_count: 1 lcm_entry_count: 2 lcme_id: 1 lcme_mirror_id: 0 lcme_flags: init lcme_extent.e_start: 0 lcme_extent.e_end: 1048576 lmm_stripe_count: 1 lmm_stripe_size: 1048576 lmm_pattern: raid0 lmm_layout_gen: 65535 lmm_stripe_offset: 0 lmm_objects: - 0: { l_ost_idx: 0, l_fid: [0x280000401:0x702:0x0] } lcme_id: 2 lcme_mirror_id: 0 lcme_flags: init lcme_extent.e_start: 1048576 lcme_extent.e_end: EOF lmm_stripe_count: 2 lmm_stripe_size: 4194304 lmm_pattern: raid0 lmm_layout_gen: 65535 lmm_stripe_offset: 1 lmm_objects: - 0: 
{ l_ost_idx: 1, l_fid: [0x2c0000401:0x702:0x0] } - 1: { l_ost_idx: 0, l_fid: [0x280000401:0x703:0x0] } PASS 132a (27s) == replay-single test 133: check resend of ongoing requests for lwp during failover ========================================================== 10:14:11 (1773670451) seq.srv-lustre-MDT0001.space=clear Starting client: oleg436-client.virtnet: -o user_xattr,flock 192.168.204.136@tcp:/lustre /mnt/lustre fail_val=700 fail_loc=0x80000123 Failing mds1 on oleg436-server Stopping /mnt/lustre-mds1 (opts:) on oleg436-server 10:14:15 (1773670455) shut down facet: mds1 facet_host: oleg436-server facet_failover_host: oleg436-server Failover mds1 to oleg436-server mount facets: mds1 Start mds1: mount -t lustre -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg436-server: oleg436-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg436-client: oleg436-server: ssh exited with exit code 1 Started lustre-MDT0000 10:14:29 (1773670469) targets are mounted 10:14:29 (1773670469) facet_failover done PASS 133 (23s) == replay-single test 134: replay creation of a file created in a pool ========================================================== 10:14:34 (1773670474) Creating new pool pool_134 oleg436-server: Pool lustre.pool_134 created Adding targets to pool oleg436-server: OST lustre-OST0001_UUID added to pool lustre.pool_134 UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 3324 1267980 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 2860 1268444 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 1544 3605476 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 2568 3604452 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 4112 7209928 1% /mnt/lustre Failing mds1 on oleg436-server Stopping /mnt/lustre-mds1 (opts:) on oleg436-server 10:14:43 (1773670483) shut down facet: mds1 facet_host: oleg436-server facet_failover_host: oleg436-server Failover mds1 to oleg436-server mount 
facets: mds1 Start mds1: mount -t lustre -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg436-server: oleg436-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg436-client: oleg436-server: ssh exited with exit code 1 Started lustre-MDT0000 10:14:58 (1773670498) targets are mounted 10:14:58 (1773670498) facet_failover done oleg436-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec Destroy the created pools: pool_134 lustre.pool_134 oleg436-server: OST lustre-OST0001_UUID removed from pool lustre.pool_134 oleg436-server: Pool lustre.pool_134 destroyed PASS 134 (37s) == replay-single test 135: Server failure in lock replay phase ========================================================== 10:15:11 (1773670511) UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 3280 1268024 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 2860 1268444 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 1544 3605476 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 2568 3604452 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 4112 7209928 1% /mnt/lustre ldlm.cancel_unused_locks_before_replay=0 debug_mb=100 debug=+info +ha +dlmtrace Stopping /mnt/lustre-ost1 (opts:) on oleg436-server Failover ost1 to oleg436-server oleg436-server.virtnet oleg436-server: oleg436-server.virtnet: executing load_module ../libcfs/libcfs/libcfs fail_loc=0x32d fail_val=20 debug_mb=100 debug=+info +ha +dlmtrace Start ost1: mount -t lustre -o localrecov /dev/mapper/ost1_flakey /mnt/lustre-ost1 seq.cli-lustre-OST0000-super.width=65536 oleg436-server: oleg436-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg436-client: oleg436-server: ssh exited with exit code 1 Started lustre-OST0000 oleg436-client.virtnet: executing 
wait_import_state_mount REPLAY_LOCKS osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid in REPLAY_LOCKS state after 0 sec Stopping /mnt/lustre-ost1 (opts:) on oleg436-server Failover ost1 to oleg436-server oleg436-server.virtnet oleg436-server: oleg436-server.virtnet: executing load_module ../libcfs/libcfs/libcfs fail_loc=0 Start ost1: mount -t lustre -o localrecov /dev/mapper/ost1_flakey /mnt/lustre-ost1 seq.cli-lustre-OST0000-super.width=65536 oleg436-server: oleg436-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all End of sync pdsh@oleg436-client: oleg436-server: ssh exited with exit code 1 Started lustre-OST0000 Stopping /mnt/lustre-ost1 (opts:-f) on oleg436-server Stopping /mnt/lustre-ost2 (opts:-f) on oleg436-server Start ost1: mount -t lustre -o localrecov /dev/mapper/ost1_flakey /mnt/lustre-ost1 seq.cli-lustre-OST0000-super.width=65536 oleg436-server: oleg436-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg436-client: oleg436-server: ssh exited with exit code 1 Started lustre-OST0000 Start ost2: mount -t lustre -o localrecov /dev/mapper/ost2_flakey /mnt/lustre-ost2 seq.cli-lustre-OST0001-super.width=65536 oleg436-server: oleg436-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg436-client: oleg436-server: ssh exited with exit code 1 Started lustre-OST0001 debug_mb=21 debug=super ioctl neterror warning dlmtrace error emerg ha rpctrace vfstrace config console lfsck ldlm.cancel_unused_locks_before_replay=1 PASS 135 (58s) == replay-single test 136: MDS to disconnect all OSPs first, then cleanup ldlm ========================================================== 10:16:09 (1773670569) SKIP: replay-single test_136 needs > 2 MDTs SKIP 136 (1s) == replay-single test 137a: DNE: create under striped dir, fail MDT1 
========================================================== 10:16:10 (1773670570) llite.lustre-ffff8d9744972800.intent_mkdir=1 UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 3288 1268016 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 2876 1268428 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 1640 3605380 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 2584 3604436 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 4224 7209816 1% /mnt/lustre Failing mds1 on oleg436-server Stopping /mnt/lustre-mds1 (opts:) on oleg436-server 10:16:14 (1773670574) shut down facet: mds1 facet_host: oleg436-server facet_failover_host: oleg436-server Failover mds1 to oleg436-server mount facets: mds1 Start mds1: mount -t lustre -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg436-server: oleg436-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg436-client: oleg436-server: ssh exited with exit code 1 Started lustre-MDT0000 10:16:29 (1773670589) targets are mounted 10:16:29 (1773670589) facet_failover done oleg436-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec /mnt/lustre/d137a.replay-single/striped_dir/dir0 has type dir OK lmv_stripe_count: 0 lmv_stripe_offset: 0 lmv_hash_type: none /mnt/lustre/d137a.replay-single/striped_dir/dir1 has type dir OK lmv_stripe_count: 0 lmv_stripe_offset: 1 lmv_hash_type: none PASS 137a (27s) == replay-single test 137b: DNE: create under striped dir, fail MDT2 ========================================================== 10:16:37 (1773670597) llite.lustre-ffff8d9744972800.intent_mkdir=1 UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 3284 1268020 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 2860 1268444 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 1560 3605460 1% /mnt/lustre[OST:0] 
lustre-OST0001_UUID 3833116 2584 3604436 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 4144 7209896 1% /mnt/lustre Failing mds2 on oleg436-server Stopping /mnt/lustre-mds2 (opts:) on oleg436-server 10:16:42 (1773670602) shut down facet: mds2 facet_host: oleg436-server facet_failover_host: oleg436-server Failover mds2 to oleg436-server mount facets: mds2 Start mds2: mount -t lustre -o localrecov /dev/mapper/mds2_flakey /mnt/lustre-mds2 oleg436-server: oleg436-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg436-client: oleg436-server: ssh exited with exit code 1 Started lustre-MDT0001 10:16:57 (1773670617) targets are mounted 10:16:57 (1773670617) facet_failover done oleg436-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0001-mdc-*.mds_server_uuid mdc.lustre-MDT0001-mdc-*.mds_server_uuid in FULL state after 0 sec /mnt/lustre/d137b.replay-single/striped_dir/dir0 has type dir OK lmv_stripe_count: 0 lmv_stripe_offset: 0 lmv_hash_type: none /mnt/lustre/d137b.replay-single/striped_dir/dir1 has type dir OK lmv_stripe_count: 0 lmv_stripe_offset: 1 lmv_hash_type: none PASS 137b (28s) == replay-single test 137c: DNE: create under striped dir, fail MDT1/MDT2 ========================================================== 10:17:05 (1773670625) llite.lustre-ffff8d9744972800.intent_mkdir=1 UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 3292 1268012 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 2888 1268416 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 1560 3605460 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 2584 3604436 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 4144 7209896 1% /mnt/lustre UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 3292 1268012 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 2888 1268416 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 1560 3605460 1% /mnt/lustre[OST:0] 
lustre-OST0001_UUID 3833116 2584 3604436 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 4144 7209896 1% /mnt/lustre Failing mds2 on oleg436-server Stopping /mnt/lustre-mds2 (opts:) on oleg436-server Failing mds1 on oleg436-server Stopping /mnt/lustre-mds1 (opts:) on oleg436-server 10:17:13 (1773670633) shut down facet: mds2 facet_host: oleg436-server facet_failover_host: oleg436-server facet: mds1 facet_host: oleg436-server facet_failover_host: oleg436-server Failover mds1 to oleg436-server Failover mds2 to oleg436-server mount facets: mds2 mount facets: mds1 Start mds2: mount -t lustre -o localrecov /dev/mapper/mds2_flakey /mnt/lustre-mds2 Start mds1: mount -t lustre -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg436-server: oleg436-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all oleg436-server: oleg436-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg436-client: oleg436-server: ssh exited with exit code 1 pdsh@oleg436-client: oleg436-server: ssh exited with exit code 1 Started lustre-MDT0000 Started lustre-MDT0001 10:17:29 (1773670649) targets are mounted 10:17:29 (1773670649) facet_failover done oleg436-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid,mdc.lustre-MDT0001-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec mdc.lustre-MDT0001-mdc-*.mds_server_uuid in FULL state after 0 sec /mnt/lustre/d137c.replay-single/striped_dir/dir0 has type dir OK lmv_stripe_count: 0 lmv_stripe_offset: 0 lmv_hash_type: none /mnt/lustre/d137c.replay-single/striped_dir/dir1 has type dir OK lmv_stripe_count: 0 lmv_stripe_offset: 1 lmv_hash_type: none PASS 137c (33s) == replay-single test 200: Dropping one OBD_PING should not cause disconnect ========================================================== 10:17:38 (1773670658) SKIP: replay-single 
test_200 Need remote client SKIP 200 (1s) == replay-single test 201: MDT umount cascading disconnects timeouts ========================================================== 10:17:39 (1773670659) fail_loc=0x245 fail_val=8 fail_loc=0x245 fail_val=8 Stopping /mnt/lustre-mds2 (opts:) on oleg436-server Start mds2: mount -t lustre -o localrecov /dev/mapper/mds2_flakey /mnt/lustre-mds2 oleg436-server: oleg436-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg436-client: oleg436-server: ssh exited with exit code 1 Started lustre-MDT0001 Umount took 18 seconds PASS 201 (24s) == replay-single test 202: pfl replay should recovery layout generation ========================================================== 10:18:03 (1773670683) UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 3320 1267984 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 2900 1268404 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 1560 3605460 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 2584 3604436 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 4144 7209896 1% /mnt/lustre Failing mds1 on oleg436-server Stopping /mnt/lustre-mds1 (opts:) on oleg436-server 10:18:08 (1773670688) shut down facet: mds1 facet_host: oleg436-server facet_failover_host: oleg436-server Failover mds1 to oleg436-server mount facets: mds1 Start mds1: mount -t lustre -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg436-server: oleg436-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg436-client: oleg436-server: ssh exited with exit code 1 Started lustre-MDT0000 10:18:24 (1773670704) targets are mounted 10:18:24 (1773670704) facet_failover done oleg436-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec PASS 202 (29s) == replay-single test 203: 
resend can hit original request ========================================================== 10:18:32 (1773670712) Starting client: oleg436-client.virtnet: -o user_xattr,flock 192.168.204.136@tcp:/lustre /mnt/lustre2 fail_loc=0x80002403 fail_val=2 STAT SETSTRIPE File: /mnt/lustre/f203.replay-single Size: 0 Blocks: 0 IO Block: 4194304 regular empty file Device: 2c54f966h/743766374d Inode: 144116211905462273 Links: 1 Access: (0644/-rw-r--r--) Uid: ( 0/ root) Gid: ( 0/ root) Access: 2026-03-16 10:18:34.000000000 -0400 Modify: 2026-03-16 10:18:34.000000000 -0400 Change: 2026-03-16 10:18:34.000000000 -0400 Birth: 2026-03-16 10:18:34.000000000 -0400 fail_loc=0 fail_val=0 192.168.204.136@tcp:/lustre /mnt/lustre2 lustre rw,checksum,encrypt,flock,lazystatfs,lruresize,nolock,statfs_project,nouser_fid2path,user_xattr,verbose 0 0 Stopping client oleg436-client.virtnet /mnt/lustre2 (opts:) PASS 203 (5s) == replay-single test complete, duration 2450 sec ======== 10:18:37 (1773670717) === replay-single: start cleanup 10:18:38 (1773670718) === === replay-single: finish cleanup 10:18:40 (1773670720) ===