-----============= acceptance-small: replay-single ============----- Tue Apr 1 03:45:02 EDT 2025
mgs: Rocky Linux release 8.10 (Green Obsidian)
MGS_OS_ID_LIKE=rhel centos fedora rocky
MGS_OS_VERSION_ID=8.10
MGS_OS_ID=rocky
MGS_OS_VERSION_CODE=134873088
mds1: Rocky Linux release 8.10 (Green Obsidian)
MDS1_OS_VERSION_ID=8.10
MDS1_OS_VERSION_CODE=134873088
MDS1_OS_ID_LIKE=rhel centos fedora rocky
MDS1_OS_ID=rocky
ost1: Rocky Linux release 8.10 (Green Obsidian)
OST1_OS_VERSION_CODE=134873088
OST1_OS_ID_LIKE=rhel centos fedora rocky
OST1_OS_VERSION_ID=8.10
OST1_OS_ID=rocky
client: Rocky Linux release 8.10 (Green Obsidian)
CLIENT_OS_ID=rocky
CLIENT_OS_VERSION_CODE=134873088
CLIENT_OS_VERSION_ID=8.10
CLIENT_OS_ID_LIKE=rhel centos fedora rocky
oleg651-server: ls: cannot access '/home/green/git/lustre-release/lustre/tests/except/replay-single.*ex': No such file or directory
excepting tests: 110f 131b 59 36
=== replay-single: start setup 03:45:40 (1743493540) ===
oleg651-client.virtnet: executing check_config_client /mnt/lustre
oleg651-client.virtnet: Checking config lustre mounted on /mnt/lustre
Checking servers environments
Checking clients oleg651-client.virtnet environments
Using TIMEOUT=20
osc.lustre-OST0000-osc-ffff95e586fe4000.idle_timeout=debug
osc.lustre-OST0001-osc-ffff95e586fe4000.idle_timeout=debug
disable quota as required
oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
osd-ldiskfs.track_declares_assert=1
=== replay-single: finish setup 03:46:51 (1743493611) ===
== replay-single test 0a: empty replay =================== 03:46:56 (1743493616)
UUID 1K-blocks Used Available Use% Mounted on
lustre-MDT0000_UUID 1414116 1776 1285912 1% /mnt/lustre[MDT:0]
lustre-MDT0001_UUID 1414116 1616 1286072 1% /mnt/lustre[MDT:1]
lustre-OST0000_UUID 3833116 1536 3605484 1% /mnt/lustre[OST:0]
lustre-OST0001_UUID 3833116 1536 3605484 1% /mnt/lustre[OST:1]
filesystem_summary: 7666232 3072 7210968 1% /mnt/lustre
Failing mds1 on oleg651-server
Stopping /mnt/lustre-mds1 (opts:) on oleg651-server
03:47:19 (1743493639) shut down
facet: mds1
facet_host: oleg651-server
facet_failover_host: oleg651-server
Failover mds1 to oleg651-server
mount facets: mds1
Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1
Started lustre-MDT0000
03:48:09 (1743493689) targets are mounted
03:48:09 (1743493689) facet_failover done
oleg651-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid
mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec
PASS 0a (104s)
== replay-single test 0b: ensure object created after recover exists. (3284) ========================================================== 03:48:40 (1743493720)
Failing ost1 on oleg651-server
Stopping /mnt/lustre-ost1 (opts:) on oleg651-server
03:48:52 (1743493732) shut down
facet: ost1
facet_host: oleg651-server
facet_failover_host: oleg651-server
Failover ost1 to oleg651-server
mount facets: ost1
Starting ost1: -o localrecov /dev/mapper/ost1_flakey /mnt/lustre-ost1
seq.cli-lustre-OST0000-super.width=65536
oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1
Started lustre-OST0000
03:49:30 (1743493770) targets are mounted
03:49:30 (1743493770) facet_failover done
oleg651-client.virtnet: executing wait_import_state_mount (FULL|IDLE) osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid
osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid in FULL state after 0 sec
total: 20 open/close in 0.56 seconds: 35.57 ops/second
- unlinked 0 (time 1743493790 ; total 0 ; last 0)
total: 20 unlinks in 0 seconds: inf unlinks/second
PASS 0b (83s)
== replay-single test 0c: check replay-barrier =========== 03:50:03 (1743493803)
UUID 1K-blocks Used Available Use% Mounted on
lustre-MDT0000_UUID 1414116 1772 1285916 1% /mnt/lustre[MDT:0]
lustre-MDT0001_UUID 1414116 1616 1286072 1% /mnt/lustre[MDT:1]
lustre-OST0000_UUID 3833116 1540 3605480 1% /mnt/lustre[OST:0]
lustre-OST0001_UUID 3833116 1536 3605484 1% /mnt/lustre[OST:1]
filesystem_summary: 7666232 3076 7210964 1% /mnt/lustre
Failing mds1 on oleg651-server
Stopping /mnt/lustre-mds1 (opts:) on oleg651-server
03:50:28 (1743493828) shut down
facet: mds1
facet_host: oleg651-server
facet_failover_host: oleg651-server
Failover mds1 to oleg651-server
mount facets: mds1
Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1
Started lustre-MDT0000
03:51:20 (1743493880) targets are mounted
03:51:20 (1743493880) facet_failover done
Starting client: oleg651-client.virtnet: -o user_xattr,flock 192.168.206.151@tcp:/lustre /mnt/lustre
rm: cannot remove '/mnt/lustre/f0c.replay-single': No such file or directory
PASS 0c (162s)
== replay-single test 0d: expired recovery with no clients ========================================================== 03:52:45 (1743493965)
UUID 1K-blocks Used Available Use% Mounted on
lustre-MDT0000_UUID 1414116 1772 1285916 1% /mnt/lustre[MDT:0]
lustre-MDT0001_UUID 1414116 1616 1286072 1% /mnt/lustre[MDT:1]
lustre-OST0000_UUID 3833116 1540 3605480 1% /mnt/lustre[OST:0]
lustre-OST0001_UUID 3833116 1536 3605484 1% /mnt/lustre[OST:1]
filesystem_summary: 7666232 3076 7210964 1% /mnt/lustre
Failing mds1 on oleg651-server
Stopping /mnt/lustre-mds1 (opts:) on oleg651-server
03:53:09 (1743493989) shut down
facet: mds1
facet_host: oleg651-server
facet_failover_host: oleg651-server
Failover mds1 to oleg651-server
mount facets: mds1
Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1
Started lustre-MDT0000
03:53:59 (1743494039) targets are mounted
03:53:59 (1743494039) facet_failover done
Starting client: oleg651-client.virtnet: -o user_xattr,flock 192.168.206.151@tcp:/lustre /mnt/lustre
PASS 0d (158s)
== replay-single test 1: simple create =================== 03:55:23 (1743494123)
UUID 1K-blocks Used Available Use% Mounted on
lustre-MDT0000_UUID 1414116 1772 1285916 1% /mnt/lustre[MDT:0]
lustre-MDT0001_UUID 1414116 1616 1286072 1% /mnt/lustre[MDT:1]
lustre-OST0000_UUID 3833116 1540 3605480 1% /mnt/lustre[OST:0]
lustre-OST0001_UUID 3833116 1536 3605484 1% /mnt/lustre[OST:1]
filesystem_summary: 7666232 3076 7210964 1% /mnt/lustre
Failing mds1 on oleg651-server
Stopping /mnt/lustre-mds1 (opts:) on oleg651-server
03:55:47 (1743494147) shut down
facet: mds1
facet_host: oleg651-server
facet_failover_host: oleg651-server
Failover mds1 to oleg651-server
mount facets: mds1
Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1
Started lustre-MDT0000
03:56:41 (1743494201) targets are mounted
03:56:41 (1743494201) facet_failover done
oleg651-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid
mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec
/mnt/lustre/f1.replay-single has type file OK
PASS 1 (108s)
== replay-single test 2a: touch ========================== 03:57:12 (1743494232)
UUID 1K-blocks Used Available Use% Mounted on
lustre-MDT0000_UUID 1414116 1772 1285916 1% /mnt/lustre[MDT:0]
lustre-MDT0001_UUID 1414116 1616 1286072 1% /mnt/lustre[MDT:1]
lustre-OST0000_UUID 3833116 1540 3605480 1% /mnt/lustre[OST:0]
lustre-OST0001_UUID 3833116 1536 3605484 1% /mnt/lustre[OST:1]
filesystem_summary: 7666232 3076 7210964 1% /mnt/lustre
Failing mds1 on oleg651-server
Stopping /mnt/lustre-mds1 (opts:) on oleg651-server
03:57:35 (1743494255) shut down
facet: mds1
facet_host: oleg651-server
facet_failover_host: oleg651-server
Failover mds1 to oleg651-server
mount facets: mds1
Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1
Started lustre-MDT0000
03:58:29 (1743494309) targets are mounted
03:58:29 (1743494309) facet_failover done
oleg651-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid
mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec
/mnt/lustre/f2a.replay-single has type file OK
PASS 2a (109s)
== replay-single test 2b: touch ========================== 03:59:01 (1743494341)
UUID 1K-blocks Used Available Use% Mounted on
lustre-MDT0000_UUID 1414116 1772 1285916 1% /mnt/lustre[MDT:0]
lustre-MDT0001_UUID 1414116 1616 1286072 1% /mnt/lustre[MDT:1]
lustre-OST0000_UUID 3833116 1540 3605480 1% /mnt/lustre[OST:0]
lustre-OST0001_UUID 3833116 1536 3605484 1% /mnt/lustre[OST:1]
filesystem_summary: 7666232 3076 7210964 1% /mnt/lustre
Failing mds1 on oleg651-server
Stopping /mnt/lustre-mds1 (opts:) on oleg651-server
03:59:23 (1743494363) shut down
facet: mds1
facet_host: oleg651-server
facet_failover_host: oleg651-server
Failover mds1 to oleg651-server
mount facets: mds1
Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1
Started lustre-MDT0000
04:00:01 (1743494401) targets are mounted
04:00:01 (1743494401) facet_failover done
oleg651-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid
mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec
/mnt/lustre/f2b.replay-single has type file OK
PASS 2b (91s)
== replay-single test 2c: setstripe replay =============== 04:00:32 (1743494432)
UUID 1K-blocks Used Available Use% Mounted on
lustre-MDT0000_UUID 1414116 1772 1285916 1% /mnt/lustre[MDT:0]
lustre-MDT0001_UUID 1414116 1616 1286072 1% /mnt/lustre[MDT:1]
lustre-OST0000_UUID 3833116 1540 3605480 1% /mnt/lustre[OST:0]
lustre-OST0001_UUID 3833116 1536 3605484 1% /mnt/lustre[OST:1]
filesystem_summary: 7666232 3076 7210964 1% /mnt/lustre
Failing mds1 on oleg651-server
Stopping /mnt/lustre-mds1 (opts:) on oleg651-server
04:00:55 (1743494455) shut down
facet: mds1
facet_host: oleg651-server
facet_failover_host: oleg651-server
Failover mds1 to oleg651-server
mount facets: mds1
Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1
Started lustre-MDT0000
04:01:45 (1743494505) targets are mounted
04:01:45 (1743494505) facet_failover done
oleg651-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid
mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec
/mnt/lustre/f2c.replay-single has type file OK
PASS 2c (103s)
== replay-single test 2d: setdirstripe replay ============ 04:02:15 (1743494535)
UUID 1K-blocks Used Available Use% Mounted on
lustre-MDT0000_UUID 1414116 1772 1285916 1% /mnt/lustre[MDT:0]
lustre-MDT0001_UUID 1414116 1616 1286072 1% /mnt/lustre[MDT:1]
lustre-OST0000_UUID 3833116 1540 3605480 1% /mnt/lustre[OST:0]
lustre-OST0001_UUID 3833116 1536 3605484 1% /mnt/lustre[OST:1]
filesystem_summary: 7666232 3076 7210964 1% /mnt/lustre
Failing mds1 on oleg651-server
Stopping /mnt/lustre-mds1 (opts:) on oleg651-server
04:02:38 (1743494558) shut down
facet: mds1
facet_host: oleg651-server
facet_failover_host: oleg651-server
Failover mds1 to oleg651-server
mount facets: mds1
Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1
Started lustre-MDT0000
04:03:27 (1743494607) targets are mounted
04:03:28 (1743494608) facet_failover done
oleg651-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid
mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec
/mnt/lustre/d2d.replay-single has type dir OK
PASS 2d (102s)
== replay-single test 2e: O_CREAT|O_EXCL create replay === 04:03:58 (1743494638)
fail_loc=0x8000013b
UUID 1K-blocks Used Available Use% Mounted on
lustre-MDT0000_UUID 1414116 1856 1285832 1% /mnt/lustre[MDT:0]
lustre-MDT0001_UUID 1414116 1692 1285996 1% /mnt/lustre[MDT:1]
lustre-OST0000_UUID 3833116 1540 3605480 1% /mnt/lustre[OST:0]
lustre-OST0001_UUID 3833116 1536 3605484 1% /mnt/lustre[OST:1]
filesystem_summary: 7666232 3076 7210964 1% /mnt/lustre
Failing mds1 on oleg651-server
Stopping /mnt/lustre-mds1 (opts:) on oleg651-server
Succeed in opening file "/mnt/lustre/f2e.replay-single"(flags=O_CREAT)
04:04:25 (1743494665) shut down
facet: mds1
facet_host: oleg651-server
facet_failover_host: oleg651-server
Failover mds1 to oleg651-server
mount facets: mds1
Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1
Started lustre-MDT0000
04:05:16 (1743494716) targets are mounted
04:05:16 (1743494716) facet_failover done
oleg651-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid
mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec
/mnt/lustre/f2e.replay-single has type file OK
PASS 2e (112s)
== replay-single test 3a: replay failed open(O_DIRECTORY) ========================================================== 04:05:49 (1743494749)
UUID 1K-blocks Used Available Use% Mounted on
lustre-MDT0000_UUID 1414116 1856 1285832 1% /mnt/lustre[MDT:0]
lustre-MDT0001_UUID 1414116 1692 1285996 1% /mnt/lustre[MDT:1]
lustre-OST0000_UUID 3833116 1540 3605480 1% /mnt/lustre[OST:0]
lustre-OST0001_UUID 3833116 1536 3605484 1% /mnt/lustre[OST:1]
filesystem_summary: 7666232 3076 7210964 1% /mnt/lustre
Error in opening file "/mnt/lustre/f3a.replay-single"(flags=O_DIRECTORY) 20: Not a directory
Failing mds1 on oleg651-server
Stopping /mnt/lustre-mds1 (opts:) on oleg651-server
04:06:12 (1743494772) shut down
facet: mds1
facet_host: oleg651-server
facet_failover_host: oleg651-server
Failover mds1 to oleg651-server
mount facets: mds1
Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1
Started lustre-MDT0000
04:07:04 (1743494824) targets are mounted
04:07:04 (1743494824) facet_failover done
oleg651-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid
mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec
/mnt/lustre/f3a.replay-single has type file OK
PASS 3a (105s)
== replay-single test 3b: replay failed open -ENOMEM ===== 04:07:34 (1743494854)
UUID 1K-blocks Used Available Use% Mounted on
lustre-MDT0000_UUID 1414116 1856 1285832 1% /mnt/lustre[MDT:0]
lustre-MDT0001_UUID 1414116 1692 1285996 1% /mnt/lustre[MDT:1]
lustre-OST0000_UUID 3833116 1540 3605480 1% /mnt/lustre[OST:0]
lustre-OST0001_UUID 3833116 1536 3605484 1% /mnt/lustre[OST:1]
filesystem_summary: 7666232 3076 7210964 1% /mnt/lustre
fail_loc=0x80000114
touch: cannot touch '/mnt/lustre/f3b.replay-single': Cannot allocate memory
fail_loc=0
Failing mds1 on oleg651-server
Stopping /mnt/lustre-mds1 (opts:) on oleg651-server
04:08:02 (1743494882) shut down
facet: mds1
facet_host: oleg651-server
facet_failover_host: oleg651-server
Failover mds1 to oleg651-server
mount facets: mds1
Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1
Started lustre-MDT0000
04:08:51 (1743494931) targets are mounted
04:08:51 (1743494931) facet_failover done
oleg651-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid
mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec
Can't lstat /mnt/lustre/f3b.replay-single: No such file or directory
PASS 3b (108s)
== replay-single test 3c: replay failed open -ENOMEM ===== 04:09:22 (1743494962)
UUID 1K-blocks Used Available Use% Mounted on
lustre-MDT0000_UUID 1414116 1856 1285832 1% /mnt/lustre[MDT:0]
lustre-MDT0001_UUID 1414116 1692 1285996 1% /mnt/lustre[MDT:1]
lustre-OST0000_UUID 3833116 1540 3605480 1% /mnt/lustre[OST:0]
lustre-OST0001_UUID 3833116 1536 3605484 1% /mnt/lustre[OST:1]
filesystem_summary: 7666232 3076 7210964 1% /mnt/lustre
fail_loc=0x80000128
touch: cannot touch '/mnt/lustre/f3c.replay-single': Cannot allocate memory
fail_loc=0
Failing mds1 on oleg651-server
Stopping /mnt/lustre-mds1 (opts:) on oleg651-server
04:09:48 (1743494988) shut down
facet: mds1
facet_host: oleg651-server
facet_failover_host: oleg651-server
Failover mds1 to oleg651-server
mount facets: mds1
Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1
Started lustre-MDT0000
04:10:40 (1743495040) targets are mounted
04:10:40 (1743495040) facet_failover done
oleg651-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid
mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec
Can't lstat /mnt/lustre/f3c.replay-single: No such file or directory
PASS 3c (106s)
== replay-single test 4a: |x| 10 open(O_CREAT)s ========== 04:11:08 (1743495068)
UUID 1K-blocks Used Available Use% Mounted on
lustre-MDT0000_UUID 1414116 1856 1285832 1% /mnt/lustre[MDT:0]
lustre-MDT0001_UUID 1414116 1692 1285996 1% /mnt/lustre[MDT:1]
lustre-OST0000_UUID 3833116 1540 3605480 1% /mnt/lustre[OST:0]
lustre-OST0001_UUID 3833116 1536 3605484 1% /mnt/lustre[OST:1]
filesystem_summary: 7666232 3076 7210964 1% /mnt/lustre
Failing mds1 on oleg651-server
Stopping /mnt/lustre-mds1 (opts:) on oleg651-server
04:11:30 (1743495090) shut down
facet: mds1
facet_host: oleg651-server
facet_failover_host: oleg651-server
Failover mds1 to oleg651-server
mount facets: mds1
Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1
Started lustre-MDT0000
04:12:19 (1743495139) targets are mounted
04:12:19 (1743495139) facet_failover done
oleg651-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid
mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec
PASS 4a (98s)
== replay-single test 4b: |x| rm 10 files ================ 04:12:47 (1743495167)
UUID 1K-blocks Used Available Use% Mounted on
lustre-MDT0000_UUID 1414116 1856 1285832 1% /mnt/lustre[MDT:0]
lustre-MDT0001_UUID 1414116 1692 1285996 1% /mnt/lustre[MDT:1]
lustre-OST0000_UUID 3833116 1560 3605460 1% /mnt/lustre[OST:0]
lustre-OST0001_UUID 3833116 1556 3605464 1% /mnt/lustre[OST:1]
filesystem_summary: 7666232 3116 7210924 1% /mnt/lustre
Failing mds1 on oleg651-server
Stopping /mnt/lustre-mds1 (opts:) on oleg651-server
04:13:10 (1743495190) shut down
facet: mds1
facet_host: oleg651-server
facet_failover_host: oleg651-server
Failover mds1 to oleg651-server
mount facets: mds1
Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1
Started lustre-MDT0000
04:14:02 (1743495242) targets are mounted
04:14:02 (1743495242) facet_failover done
oleg651-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid
mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec
Can't lstat /mnt/lustre/f4b.replay-single-*: No such file or directory
PASS 4b (105s)
== replay-single test 5: |x| 220 open(O_CREAT) =========== 04:14:33 (1743495273)
UUID 1K-blocks Used Available Use% Mounted on
lustre-MDT0000_UUID 1414116 1856 1285832 1% /mnt/lustre[MDT:0]
lustre-MDT0001_UUID 1414116 1692 1285996 1% /mnt/lustre[MDT:1]
lustre-OST0000_UUID 3833116 1560 3605460 1% /mnt/lustre[OST:0]
lustre-OST0001_UUID 3833116 1556 3605464 1% /mnt/lustre[OST:1]
filesystem_summary: 7666232 3116 7210924 1% /mnt/lustre
Failing mds1 on oleg651-server
Stopping /mnt/lustre-mds1 (opts:) on oleg651-server
04:15:03 (1743495303) shut down
facet: mds1
facet_host: oleg651-server
facet_failover_host: oleg651-server
Failover mds1 to oleg651-server
mount facets: mds1
Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1
Started lustre-MDT0000
04:15:55 (1743495355) targets are mounted
04:15:55 (1743495355) facet_failover done
oleg651-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid
mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec
PASS 5 (131s)
== replay-single test 6a: mkdir + contained create ======= 04:16:44 (1743495404)
UUID 1K-blocks Used Available Use% Mounted on
lustre-MDT0000_UUID 1414116 1888 1285800 1% /mnt/lustre[MDT:0]
lustre-MDT0001_UUID 1414116 1692 1285996 1% /mnt/lustre[MDT:1]
lustre-OST0000_UUID 3833116 1560 3605460 1% /mnt/lustre[OST:0]
lustre-OST0001_UUID 3833116 1556 3605464 1% /mnt/lustre[OST:1]
filesystem_summary: 7666232 3116 7210924 1% /mnt/lustre
Failing mds1 on oleg651-server
Stopping /mnt/lustre-mds1 (opts:) on oleg651-server
04:17:06 (1743495426) shut down
facet: mds1
facet_host: oleg651-server
facet_failover_host: oleg651-server
Failover mds1 to oleg651-server
mount facets: mds1
Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1
Started lustre-MDT0000
04:17:55 (1743495475) targets are mounted
04:17:55 (1743495475) facet_failover done
oleg651-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid
mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec
/mnt/lustre/d6a.replay-single has type dir OK
/mnt/lustre/d6a.replay-single/f6a.replay-single has type file OK
PASS 6a (101s)
== replay-single test 6b: |X| rmdir ====================== 04:18:25 (1743495505)
UUID 1K-blocks Used Available Use% Mounted on
lustre-MDT0000_UUID 1414116 1884 1285804 1% /mnt/lustre[MDT:0]
lustre-MDT0001_UUID 1414116 1692 1285996 1% /mnt/lustre[MDT:1]
lustre-OST0000_UUID 3833116 1560 3605460 1% /mnt/lustre[OST:0]
lustre-OST0001_UUID 3833116 1556 3605464 1% /mnt/lustre[OST:1]
filesystem_summary: 7666232 3116 7210924 1% /mnt/lustre
Failing mds1 on oleg651-server
Stopping /mnt/lustre-mds1 (opts:) on oleg651-server
04:18:48 (1743495528) shut down
facet: mds1
facet_host: oleg651-server
facet_failover_host: oleg651-server
Failover mds1 to oleg651-server
mount facets: mds1
Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1
Started lustre-MDT0000
04:19:38 (1743495578) targets are mounted
04:19:38 (1743495578) facet_failover done
oleg651-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid
mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec
Can't lstat /mnt/lustre/d6b.replay-single: No such file or directory
PASS 6b (102s)
== replay-single test 7: mkdir |X| contained create ====== 04:20:08 (1743495608)
UUID 1K-blocks Used Available Use% Mounted on
lustre-MDT0000_UUID 1414116 1884 1285804 1% /mnt/lustre[MDT:0]
lustre-MDT0001_UUID 1414116 1692 1285996 1% /mnt/lustre[MDT:1]
lustre-OST0000_UUID 3833116 1560 3605460 1% /mnt/lustre[OST:0]
lustre-OST0001_UUID 3833116 1556 3605464 1% /mnt/lustre[OST:1]
filesystem_summary: 7666232 3116 7210924 1% /mnt/lustre
Failing mds1 on oleg651-server
Stopping /mnt/lustre-mds1 (opts:) on oleg651-server
04:20:32 (1743495632) shut down
facet: mds1
facet_host: oleg651-server
facet_failover_host: oleg651-server
Failover mds1 to oleg651-server
mount facets: mds1
Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1
Started lustre-MDT0000
04:21:22 (1743495682) targets are mounted
04:21:22 (1743495682) facet_failover done
oleg651-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid
mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec
/mnt/lustre/d7.replay-single has type dir OK
/mnt/lustre/d7.replay-single/f7.replay-single has type file OK
PASS 7 (105s)
== replay-single test 8: creat open |X| close ============ 04:21:52 (1743495712)
UUID 1K-blocks Used Available Use% Mounted on
lustre-MDT0000_UUID 1414116 1880 1285808 1% /mnt/lustre[MDT:0]
lustre-MDT0001_UUID 1414116 1692 1285996 1% /mnt/lustre[MDT:1]
lustre-OST0000_UUID 3833116 1560 3605460 1% /mnt/lustre[OST:0]
lustre-OST0001_UUID 3833116 1556 3605464 1% /mnt/lustre[OST:1]
filesystem_summary: 7666232 3116 7210924 1% /mnt/lustre
multiop /mnt/lustre/f8.replay-single vmo_c
TMPPIPE=/tmp/multiop_open_wait_pipe.7954
Failing mds1 on oleg651-server
Stopping /mnt/lustre-mds1 (opts:) on oleg651-server
04:22:14 (1743495734) shut down
facet: mds1
facet_host: oleg651-server
facet_failover_host: oleg651-server
Failover mds1 to oleg651-server
mount facets: mds1
Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1
Started lustre-MDT0000
04:23:07 (1743495787) targets are mounted
04:23:07 (1743495787) facet_failover done
oleg651-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid
mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec
/mnt/lustre/f8.replay-single
/mnt/lustre/f8.replay-single has type file OK
PASS 8 (105s)
== replay-single test 9: |X| create (same inum/gen) ====== 04:23:37 (1743495817)
UUID 1K-blocks Used Available Use% Mounted on
lustre-MDT0000_UUID 1414116 1880 1285808 1% /mnt/lustre[MDT:0]
lustre-MDT0001_UUID 1414116 1692 1285996 1% /mnt/lustre[MDT:1]
lustre-OST0000_UUID 3833116 1560 3605460 1% /mnt/lustre[OST:0]
lustre-OST0001_UUID 3833116 1556 3605464 1% /mnt/lustre[OST:1]
filesystem_summary: 7666232 3116 7210924 1% /mnt/lustre
Failing mds1 on oleg651-server
Stopping /mnt/lustre-mds1 (opts:) on oleg651-server
04:24:01 (1743495841) shut down
facet: mds1
facet_host: oleg651-server
facet_failover_host: oleg651-server
Failover mds1 to oleg651-server
mount facets: mds1
Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1
Started lustre-MDT0000
04:24:54 (1743495894) targets are mounted
04:24:54 (1743495894) facet_failover done
oleg651-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid
mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec
old_inum == 144115305935798546, new_inum == 144115305935798546
old_inum and new_inum match
PASS 9 (108s)
== replay-single test 10: create |X| rename unlink ======= 04:25:26 (1743495926)
UUID 1K-blocks Used Available Use% Mounted on
lustre-MDT0000_UUID 1414116 1880 1285808 1% /mnt/lustre[MDT:0]
lustre-MDT0001_UUID 1414116 1692 1285996 1% /mnt/lustre[MDT:1]
lustre-OST0000_UUID 3833116 1560 3605460 1% /mnt/lustre[OST:0]
lustre-OST0001_UUID 3833116 1556 3605464 1% /mnt/lustre[OST:1]
filesystem_summary: 7666232 3116 7210924 1% /mnt/lustre
Failing mds1 on oleg651-server
Stopping /mnt/lustre-mds1 (opts:) on oleg651-server
04:25:50 (1743495950) shut down
facet: mds1
facet_host: oleg651-server
facet_failover_host: oleg651-server
Failover mds1 to oleg651-server
mount facets: mds1
Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1
Started lustre-MDT0000
04:26:42 (1743496002) targets are mounted
04:26:42 (1743496002) facet_failover done
oleg651-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid
mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec
Can't lstat /mnt/lustre/f10.replay-single: No such file or directory
PASS 10 (106s)
== replay-single test 11: create open write rename |X| create-old-name read ========================================================== 04:27:11 (1743496031)
UUID 1K-blocks Used Available Use% Mounted on
lustre-MDT0000_UUID 1414116 1880 1285808 1% /mnt/lustre[MDT:0]
lustre-MDT0001_UUID 1414116 1692 1285996 1% /mnt/lustre[MDT:1]
lustre-OST0000_UUID 3833116 1560 3605460 1% /mnt/lustre[OST:0]
lustre-OST0001_UUID 3833116 1556 3605464 1% /mnt/lustre[OST:1]
filesystem_summary: 7666232 3116 7210924 1% /mnt/lustre
new
old
Failing mds1 on oleg651-server
Stopping /mnt/lustre-mds1 (opts:) on oleg651-server
04:27:33 (1743496053) shut down
facet: mds1
facet_host: oleg651-server
facet_failover_host: oleg651-server
Failover mds1 to oleg651-server
mount facets: mds1
Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1
Started lustre-MDT0000
04:28:09 (1743496089) targets are mounted
04:28:09 (1743496089) facet_failover done
oleg651-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid
mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec
new
old
PASS 11 (86s)
== replay-single test 12: open, unlink |X| close ========= 04:28:38 (1743496118)
multiop /mnt/lustre/f12.replay-single vo_tSc
TMPPIPE=/tmp/multiop_open_wait_pipe.7954
UUID 1K-blocks Used Available Use% Mounted on
lustre-MDT0000_UUID 1414116 1880 1285808 1% /mnt/lustre[MDT:0]
lustre-MDT0001_UUID 1414116 1692 1285996 1% /mnt/lustre[MDT:1]
lustre-OST0000_UUID 3833116 1564 3605456 1% /mnt/lustre[OST:0]
lustre-OST0001_UUID 3833116 1560 3605460 1% /mnt/lustre[OST:1]
filesystem_summary: 7666232 3124 7210916 1% /mnt/lustre
Failing mds1 on oleg651-server
Stopping /mnt/lustre-mds1 (opts:) on oleg651-server
04:29:00 (1743496140) shut down
facet: mds1
facet_host: oleg651-server
facet_failover_host: oleg651-server
Failover mds1 to oleg651-server
mount facets: mds1
Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1
Started lustre-MDT0000
04:29:37 (1743496177) targets are mounted
04:29:37 (1743496177) facet_failover done
oleg651-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid
mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec
PASS 12 (88s)
== replay-single test 13: open chmod 0 |x| write close === 04:30:07 (1743496207)
multiop /mnt/lustre/f13.replay-single vO_wc
TMPPIPE=/tmp/multiop_open_wait_pipe.7954
/mnt/lustre/f13.replay-single has perms 00 OK
UUID 1K-blocks Used Available Use% Mounted on
lustre-MDT0000_UUID 1414116 1880 1285808 1% /mnt/lustre[MDT:0]
lustre-MDT0001_UUID 1414116 1692 1285996 1% /mnt/lustre[MDT:1]
lustre-OST0000_UUID 3833116 1564 3605456 1% /mnt/lustre[OST:0]
lustre-OST0001_UUID 3833116 1560 3605460 1% /mnt/lustre[OST:1]
filesystem_summary: 7666232 3124 7210916 1% /mnt/lustre
Failing mds1 on oleg651-server
Stopping /mnt/lustre-mds1 (opts:) on oleg651-server
04:30:33 (1743496233) shut down
facet: mds1
facet_host: oleg651-server
facet_failover_host: oleg651-server
Failover mds1 to oleg651-server
mount facets: mds1
Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1
Started lustre-MDT0000
04:31:27 (1743496287) targets are mounted
04:31:27 (1743496287) facet_failover done
oleg651-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid
mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec
/mnt/lustre/f13.replay-single has perms 00 OK
/mnt/lustre/f13.replay-single has size 1 OK
PASS 13 (109s)
== replay-single test 14: open(O_CREAT), unlink |X| close ========================================================== 04:31:56 (1743496316)
multiop /mnt/lustre/f14.replay-single vO_tSc
TMPPIPE=/tmp/multiop_open_wait_pipe.7954
UUID 1K-blocks Used Available Use% Mounted on
lustre-MDT0000_UUID 1414116 1880 1285808 1% /mnt/lustre[MDT:0]
lustre-MDT0001_UUID 1414116 1692 1285996 1% /mnt/lustre[MDT:1]
lustre-OST0000_UUID 3833116 1564 3605456 1% /mnt/lustre[OST:0]
lustre-OST0001_UUID 3833116 1560 3605460 1% /mnt/lustre[OST:1]
filesystem_summary: 7666232 3124 7210916 1% /mnt/lustre
Failing mds1 on oleg651-server
Stopping /mnt/lustre-mds1 (opts:) on oleg651-server
04:32:18 (1743496338) shut down
facet: mds1
facet_host: oleg651-server
facet_failover_host: oleg651-server
Failover mds1 to oleg651-server
mount facets: mds1
Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1
Started lustre-MDT0000
04:33:09 (1743496389) targets are mounted
04:33:09 (1743496389) facet_failover done
oleg651-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid
mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec
PASS 14 (105s)
== replay-single test 15: open(O_CREAT), unlink |X| touch new, close ========================================================== 04:33:41 (1743496421)
multiop /mnt/lustre/f15.replay-single vO_tSc
TMPPIPE=/tmp/multiop_open_wait_pipe.7954
UUID 1K-blocks Used Available Use% Mounted on
lustre-MDT0000_UUID 1414116 1880 1285808 1% /mnt/lustre[MDT:0]
lustre-MDT0001_UUID 1414116 1692 1285996 1% /mnt/lustre[MDT:1]
lustre-OST0000_UUID 3833116 1564 3605456 1% /mnt/lustre[OST:0]
lustre-OST0001_UUID 3833116 1560 3605460 1% /mnt/lustre[OST:1]
filesystem_summary: 7666232 3124 7210916 1% /mnt/lustre
Failing mds1 on oleg651-server
Stopping /mnt/lustre-mds1 (opts:) on oleg651-server
04:34:04 (1743496444) shut down
facet: mds1
facet_host: oleg651-server
facet_failover_host: oleg651-server
Failover mds1 to oleg651-server
mount facets: mds1
Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1
Started lustre-MDT0000
04:34:43 (1743496483) targets are mounted
04:34:43 (1743496483) facet_failover done
oleg651-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid
mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec
PASS 15 (90s)
== replay-single test 16: |X| open(O_CREAT), unlink, touch new, unlink new ========================================================== 04:35:11 (1743496511)
UUID 1K-blocks Used Available Use% Mounted on
lustre-MDT0000_UUID 1414116 1880 1285808 1% /mnt/lustre[MDT:0]
lustre-MDT0001_UUID 1414116 1692 1285996 1% /mnt/lustre[MDT:1]
lustre-OST0000_UUID 3833116 1564 3605456 1% /mnt/lustre[OST:0]
lustre-OST0001_UUID 3833116 1560 3605460 1%
/mnt/lustre[OST:1] filesystem_summary: 7666232 3124 7210916 1% /mnt/lustre Failing mds1 on oleg651-server Stopping /mnt/lustre-mds1 (opts:) on oleg651-server 04:35:33 (1743496533) shut down facet: mds1 facet_host: oleg651-server facet_failover_host: oleg651-server Failover mds1 to oleg651-server mount facets: mds1 Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1 Started lustre-MDT0000 04:36:25 (1743496585) targets are mounted 04:36:25 (1743496585) facet_failover done oleg651-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec PASS 16 (102s) == replay-single test 17: |X| open(O_CREAT), |replay| close ========================================================== 04:36:54 (1743496614) UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 1880 1285808 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 1692 1285996 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 1564 3605456 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1560 3605460 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 3124 7210916 1% /mnt/lustre multiop /mnt/lustre/f17.replay-single vO_c TMPPIPE=/tmp/multiop_open_wait_pipe.7954 Failing mds1 on oleg651-server Stopping /mnt/lustre-mds1 (opts:) on oleg651-server 04:37:16 (1743496636) shut down facet: mds1 facet_host: oleg651-server facet_failover_host: oleg651-server Failover mds1 to oleg651-server mount facets: mds1 Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1 Started 
lustre-MDT0000 04:38:09 (1743496689) targets are mounted 04:38:09 (1743496689) facet_failover done oleg651-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec /mnt/lustre/f17.replay-single has type file OK PASS 17 (106s) == replay-single test 18: open(O_CREAT), unlink, touch new, close, touch, unlink ========================================================== 04:38:40 (1743496720) UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 1880 1285808 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 1692 1285996 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 1564 3605456 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1560 3605460 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 3124 7210916 1% /mnt/lustre multiop /mnt/lustre/f18.replay-single vO_tSc TMPPIPE=/tmp/multiop_open_wait_pipe.7954 pid: 52353 will close Failing mds1 on oleg651-server Stopping /mnt/lustre-mds1 (opts:) on oleg651-server 04:39:05 (1743496745) shut down facet: mds1 facet_host: oleg651-server facet_failover_host: oleg651-server Failover mds1 to oleg651-server mount facets: mds1 Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1 Started lustre-MDT0000 04:39:59 (1743496799) targets are mounted 04:39:59 (1743496799) facet_failover done oleg651-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec PASS 18 (108s) == replay-single test 19: mcreate, open, write, rename === 04:40:29 (1743496829) UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 1880 1285808 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 1692 
1285996 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 1564 3605456 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1560 3605460 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 3124 7210916 1% /mnt/lustre old Failing mds1 on oleg651-server Stopping /mnt/lustre-mds1 (opts:) on oleg651-server 04:40:51 (1743496851) shut down facet: mds1 facet_host: oleg651-server facet_failover_host: oleg651-server Failover mds1 to oleg651-server mount facets: mds1 Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1 Started lustre-MDT0000 04:41:43 (1743496903) targets are mounted 04:41:43 (1743496903) facet_failover done oleg651-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec old PASS 19 (105s) == replay-single test 20a: |X| open(O_CREAT), unlink, replay, close (test mds_cleanup_orphans) ========================================================== 04:42:13 (1743496933) UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 1880 1285808 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 1692 1285996 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 1564 3605456 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1564 3605456 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 3128 7210912 1% /mnt/lustre multiop /mnt/lustre/f20a.replay-single vO_tSc TMPPIPE=/tmp/multiop_open_wait_pipe.7954 Failing mds1 on oleg651-server Stopping /mnt/lustre-mds1 (opts:) on oleg651-server 04:42:34 (1743496954) shut down facet: mds1 facet_host: oleg651-server facet_failover_host: oleg651-server Failover mds1 to oleg651-server mount facets: mds1 Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg651-server: oleg651-server.virtnet: 
executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1 Started lustre-MDT0000 04:43:25 (1743497005) targets are mounted 04:43:25 (1743497005) facet_failover done oleg651-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec PASS 20a (99s) == replay-single test 20b: write, unlink, eviction, replay (test mds_cleanup_orphans) ========================================================== 04:43:53 (1743497033) /mnt/lustre/f20b.replay-single lmm_stripe_count: 1 lmm_stripe_size: 4194304 lmm_pattern: raid0 lmm_layout_gen: 0 lmm_stripe_offset: 0 obdidx objid objid group 0 1090 0x442 0x280000401 pdsh@oleg651-client: oleg651-client: ssh exited with exit code 5 Failing mds1 on oleg651-server Stopping /mnt/lustre-mds1 (opts:) on oleg651-server 04:44:17 (1743497057) shut down facet: mds1 facet_host: oleg651-server facet_failover_host: oleg651-server Failover mds1 to oleg651-server mount facets: mds1 Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 dd: error writing '/mnt/lustre/f20b.replay-single': No such file or directory 4150+0 records in 4149+0 records out 16994304 bytes (17 MB, 16 MiB) copied, 46.57 s, 365 kB/s oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1 Started lustre-MDT0000 04:44:52 (1743497092) targets are mounted 04:44:52 (1743497092) facet_failover done oleg651-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec affected facets: mds1 oleg651-server: oleg651-server.virtnet: executing _wait_recovery_complete *.lustre-MDT0000.recovery_status 1475 oleg651-server: 
*.lustre-MDT0000.recovery_status status: COMPLETE Waiting for MDT destroys to complete /mnt/lustre-ost1: 119 MiB (124817408 bytes) trimmed /mnt/lustre-ost2: 0 B (0 bytes) trimmed before 3128, after 3128 PASS 20b (106s) == replay-single test 20c: check that client eviction does not affect file content ========================================================== 04:45:40 (1743497140) multiop /mnt/lustre/f20c.replay-single vOw_c TMPPIPE=/tmp/multiop_open_wait_pipe.7954 -rw-r--r-- 1 root root 1 Apr 1 04:45 /mnt/lustre/f20c.replay-single PASS 20c (19s) == replay-single test 21: |X| open(O_CREAT), unlink touch new, replay, close (test mds_cleanup_orphans) ========================================================== 04:45:59 (1743497159) UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 1880 1285808 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 1692 1285996 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 1568 3603104 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1564 3605456 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 3132 7208560 1% /mnt/lustre multiop /mnt/lustre/f21.replay-single vO_tSc TMPPIPE=/tmp/multiop_open_wait_pipe.7954 Failing mds1 on oleg651-server Stopping /mnt/lustre-mds1 (opts:) on oleg651-server 04:46:20 (1743497180) shut down facet: mds1 facet_host: oleg651-server facet_failover_host: oleg651-server Failover mds1 to oleg651-server mount facets: mds1 Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1 Started lustre-MDT0000 04:47:11 (1743497231) targets are mounted 04:47:11 (1743497231) facet_failover done oleg651-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec PASS 21 (100s) == 
replay-single test 22: open(O_CREAT), |X| unlink, replay, close (test mds_cleanup_orphans) ========================================================== 04:47:38 (1743497258) multiop /mnt/lustre/f22.replay-single vO_tSc TMPPIPE=/tmp/multiop_open_wait_pipe.7954 UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 1880 1285808 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 1692 1285996 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 1568 3603104 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1564 3605456 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 3132 7208560 1% /mnt/lustre Failing mds1 on oleg651-server Stopping /mnt/lustre-mds1 (opts:) on oleg651-server 04:48:00 (1743497280) shut down facet: mds1 facet_host: oleg651-server facet_failover_host: oleg651-server Failover mds1 to oleg651-server mount facets: mds1 Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1 Started lustre-MDT0000 04:48:50 (1743497330) targets are mounted 04:48:50 (1743497330) facet_failover done oleg651-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec PASS 22 (99s) == replay-single test 23: open(O_CREAT), |X| unlink touch new, replay, close (test mds_cleanup_orphans) ========================================================== 04:49:18 (1743497358) multiop /mnt/lustre/f23.replay-single vO_tSc TMPPIPE=/tmp/multiop_open_wait_pipe.7954 UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 1880 1285808 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 1692 1285996 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 1568 3603104 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1564 3605456 1% /mnt/lustre[OST:1] 
filesystem_summary: 7666232 3132 7208560 1% /mnt/lustre Failing mds1 on oleg651-server Stopping /mnt/lustre-mds1 (opts:) on oleg651-server 04:49:39 (1743497379) shut down facet: mds1 facet_host: oleg651-server facet_failover_host: oleg651-server Failover mds1 to oleg651-server mount facets: mds1 Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1 Started lustre-MDT0000 04:50:29 (1743497429) targets are mounted 04:50:29 (1743497429) facet_failover done oleg651-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec PASS 23 (99s) == replay-single test 24: open(O_CREAT), replay, unlink, close (test mds_cleanup_orphans) ========================================================== 04:50:57 (1743497457) multiop /mnt/lustre/f24.replay-single vO_tSc TMPPIPE=/tmp/multiop_open_wait_pipe.7954 UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 1880 1285808 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 1692 1285996 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 1568 3603104 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1564 3605456 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 3132 7208560 1% /mnt/lustre Failing mds1 on oleg651-server Stopping /mnt/lustre-mds1 (opts:) on oleg651-server 04:51:17 (1743497477) shut down facet: mds1 facet_host: oleg651-server facet_failover_host: oleg651-server Failover mds1 to oleg651-server mount facets: mds1 Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1 Started 
lustre-MDT0000 04:52:07 (1743497527) targets are mounted 04:52:07 (1743497527) facet_failover done oleg651-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec PASS 24 (94s) == replay-single test 25: open(O_CREAT), unlink, replay, close (test mds_cleanup_orphans) ========================================================== 04:52:31 (1743497551) multiop /mnt/lustre/f25.replay-single vO_tSc TMPPIPE=/tmp/multiop_open_wait_pipe.7954 UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 1880 1285808 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 1692 1285996 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 1568 3603104 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1564 3605456 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 3132 7208560 1% /mnt/lustre Failing mds1 on oleg651-server Stopping /mnt/lustre-mds1 (opts:) on oleg651-server 04:52:51 (1743497571) shut down facet: mds1 facet_host: oleg651-server facet_failover_host: oleg651-server Failover mds1 to oleg651-server mount facets: mds1 Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1 Started lustre-MDT0000 04:53:39 (1743497619) targets are mounted 04:53:39 (1743497619) facet_failover done oleg651-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec PASS 25 (93s) == replay-single test 26: |X| open(O_CREAT), unlink two, close one, replay, close one (test mds_cleanup_orphans) ========================================================== 04:54:05 (1743497645) UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 1880 1285808 1% 
/mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 1692 1285996 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 1568 3603104 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1564 3605456 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 3132 7208560 1% /mnt/lustre multiop /mnt/lustre/f26.replay-single-1 vO_tSc TMPPIPE=/tmp/multiop_open_wait_pipe.7954 multiop /mnt/lustre/f26.replay-single-2 vO_tSc TMPPIPE=/tmp/multiop_open_wait_pipe.7954 Failing mds1 on oleg651-server Stopping /mnt/lustre-mds1 (opts:) on oleg651-server 04:54:25 (1743497665) shut down facet: mds1 facet_host: oleg651-server facet_failover_host: oleg651-server Failover mds1 to oleg651-server mount facets: mds1 Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1 Started lustre-MDT0000 04:55:02 (1743497702) targets are mounted 04:55:02 (1743497702) facet_failover done oleg651-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec PASS 26 (83s) == replay-single test 27: |X| open(O_CREAT), unlink two, replay, close two (test mds_cleanup_orphans) ========================================================== 04:55:27 (1743497727) UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 1880 1285808 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 1692 1285996 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 1568 3603104 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1564 3605456 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 3132 7208560 1% /mnt/lustre multiop /mnt/lustre/f27.replay-single-1 vO_tSc TMPPIPE=/tmp/multiop_open_wait_pipe.7954 multiop /mnt/lustre/f27.replay-single-2 vO_tSc TMPPIPE=/tmp/multiop_open_wait_pipe.7954 Failing mds1 on oleg651-server 
Stopping /mnt/lustre-mds1 (opts:) on oleg651-server 04:55:45 (1743497745) shut down facet: mds1 facet_host: oleg651-server facet_failover_host: oleg651-server Failover mds1 to oleg651-server mount facets: mds1 Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1 Started lustre-MDT0000 04:56:19 (1743497779) targets are mounted 04:56:19 (1743497779) facet_failover done oleg651-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec PASS 27 (75s) == replay-single test 28: open(O_CREAT), |X| unlink two, close one, replay, close one (test mds_cleanup_orphans) ========================================================== 04:56:42 (1743497802) multiop /mnt/lustre/f28.replay-single-1 vO_tSc TMPPIPE=/tmp/multiop_open_wait_pipe.7954 multiop /mnt/lustre/f28.replay-single-2 vO_tSc TMPPIPE=/tmp/multiop_open_wait_pipe.7954 UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 1880 1285808 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 1692 1285996 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 1568 3603104 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1564 3605456 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 3132 7208560 1% /mnt/lustre Failing mds1 on oleg651-server Stopping /mnt/lustre-mds1 (opts:) on oleg651-server 04:57:01 (1743497821) shut down facet: mds1 facet_host: oleg651-server facet_failover_host: oleg651-server Failover mds1 to oleg651-server mount facets: mds1 Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg651-client: oleg651-server: ssh exited 
with exit code 1 Started lustre-MDT0000 04:57:37 (1743497857) targets are mounted 04:57:37 (1743497857) facet_failover done oleg651-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec PASS 28 (79s) == replay-single test 29: open(O_CREAT), |X| unlink two, replay, close two (test mds_cleanup_orphans) ========================================================== 04:58:01 (1743497881) multiop /mnt/lustre/f29.replay-single-1 vO_tSc TMPPIPE=/tmp/multiop_open_wait_pipe.7954 multiop /mnt/lustre/f29.replay-single-2 vO_tSc TMPPIPE=/tmp/multiop_open_wait_pipe.7954 UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 1880 1285808 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 1692 1285996 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 1568 3603104 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1564 3605456 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 3132 7208560 1% /mnt/lustre Failing mds1 on oleg651-server Stopping /mnt/lustre-mds1 (opts:) on oleg651-server 04:58:19 (1743497899) shut down facet: mds1 facet_host: oleg651-server facet_failover_host: oleg651-server Failover mds1 to oleg651-server mount facets: mds1 Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1 Started lustre-MDT0000 04:58:56 (1743497936) targets are mounted 04:58:56 (1743497936) facet_failover done oleg651-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec PASS 29 (78s) == replay-single test 30: open(O_CREAT) two, unlink two, replay, close two (test mds_cleanup_orphans) 
========================================================== 04:59:19 (1743497959) multiop /mnt/lustre/f30.replay-single-1 vO_tSc TMPPIPE=/tmp/multiop_open_wait_pipe.7954 multiop /mnt/lustre/f30.replay-single-2 vO_tSc TMPPIPE=/tmp/multiop_open_wait_pipe.7954 UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 1880 1285808 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 1692 1285996 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 1568 3603104 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1564 3605456 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 3132 7208560 1% /mnt/lustre Failing mds1 on oleg651-server Stopping /mnt/lustre-mds1 (opts:) on oleg651-server 04:59:39 (1743497979) shut down facet: mds1 facet_host: oleg651-server facet_failover_host: oleg651-server Failover mds1 to oleg651-server mount facets: mds1 Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1 Started lustre-MDT0000 05:00:13 (1743498013) targets are mounted 05:00:13 (1743498013) facet_failover done oleg651-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec PASS 30 (76s) == replay-single test 31: open(O_CREAT) two, unlink one, |X| unlink one, close two (test mds_cleanup_orphans) ========================================================== 05:00:36 (1743498036) multiop /mnt/lustre/f31.replay-single-1 vO_tSc TMPPIPE=/tmp/multiop_open_wait_pipe.7954 multiop /mnt/lustre/f31.replay-single-2 vO_tSc TMPPIPE=/tmp/multiop_open_wait_pipe.7954 UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 1880 1285808 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 1692 1285996 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 1568 
3603104 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1564 3605456 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 3132 7208560 1% /mnt/lustre Failing mds1 on oleg651-server Stopping /mnt/lustre-mds1 (opts:) on oleg651-server 05:00:53 (1743498053) shut down facet: mds1 facet_host: oleg651-server facet_failover_host: oleg651-server Failover mds1 to oleg651-server mount facets: mds1 Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1 Started lustre-MDT0000 05:01:24 (1743498084) targets are mounted 05:01:24 (1743498084) facet_failover done oleg651-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec PASS 31 (69s) == replay-single test 32: close() notices client eviction; close() after client eviction ========================================================== 05:01:45 (1743498105) multiop /mnt/lustre/f32.replay-single vO_c TMPPIPE=/tmp/multiop_open_wait_pipe.7954 multiop /mnt/lustre/f32.replay-single vO_c TMPPIPE=/tmp/multiop_open_wait_pipe.7954 pdsh@oleg651-client: oleg651-client: ssh exited with exit code 5 PASS 32 (17s) == replay-single test 33a: fid seq shouldn't be reused after abort recovery ========================================================== 05:02:02 (1743498122) total: 10 open/close in 0.32 seconds: 31.39 ops/second Replay barrier on lustre-MDT0000 Stopping /mnt/lustre-mds1 (opts:) on oleg651-server Failover mds1 to oleg651-server oleg651-server.virtnet Starting mds1: -o localrecov -o abort_recovery /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg651-client: oleg651-server: ssh 
exited with exit code 1 Started lustre-MDT0000 total: 10 open/close in 0.34 seconds: 29.42 ops/second PASS 33a (59s) == replay-single test 33b: test fid seq allocation ======= 05:03:01 (1743498181) fail_loc=0x1311 total: 10 open/close in 0.27 seconds: 37.24 ops/second Replay barrier on lustre-MDT0000 Stopping /mnt/lustre-mds1 (opts:) on oleg651-server Failover mds1 to oleg651-server oleg651-server.virtnet Starting mds1: -o localrecov -o abort_recovery /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1 Started lustre-MDT0000 total: 10 open/close in 0.28 seconds: 35.16 ops/second PASS 33b (58s) == replay-single test 34: abort recovery before client does replay (test mds_cleanup_orphans) ========================================================== 05:03:59 (1743498239) multiop /mnt/lustre/f34.replay-single vO_c TMPPIPE=/tmp/multiop_open_wait_pipe.7954 UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 1912 1285776 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 1724 1285964 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 1568 3603104 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1564 3605456 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 3132 7208560 1% /mnt/lustre Stopping /mnt/lustre-mds1 (opts:) on oleg651-server Failover mds1 to oleg651-server oleg651-server.virtnet Starting mds1: -o localrecov -o abort_recovery /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1 Started lustre-MDT0000 PASS 34 (59s) == replay-single test 35: test recovery from llog for unlink op ========================================================== 05:04:58 (1743498298) 
fail_loc=0x80000119 Stopping /mnt/lustre-mds1 (opts:) on oleg651-server Failover mds1 to oleg651-server oleg651-server.virtnet Starting mds1: -o localrecov -o abort_recovery /dev/mapper/mds1_flakey /mnt/lustre-mds1 rm: cannot remove '/mnt/lustre/f35.replay-single': Input/output error oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1 Started lustre-MDT0000 Can't lstat /mnt/lustre/f35.replay-single: No such file or directory PASS 35 (48s) SKIP: replay-single test_36 skipping ALWAYS excluded test 36 == replay-single test 37: abort recovery before client does replay (test mds_cleanup_orphans for directories) ========================================================== 05:05:49 (1743498349) multiop /mnt/lustre/d37.replay-single/f37.replay-single vdD_c TMPPIPE=/tmp/multiop_open_wait_pipe.7954 UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 1984 1285704 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 1788 1285900 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 1568 3603104 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1564 3605456 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 3132 7208560 1% /mnt/lustre Stopping /mnt/lustre-mds1 (opts:) on oleg651-server Failover mds1 to oleg651-server oleg651-server.virtnet Starting mds1: -o localrecov -o abort_recovery /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1 Started lustre-MDT0000 PASS 37 (61s) == replay-single test 38: test recovery from unlink llog (test llog_gen_rec) ========================================================== 05:06:49 (1743498409) - open/close 471 (time 1743498424.62 total 10.00 last 47.08) total: 800 open/close in 17.25 seconds: 46.37 
ops/second - unlinked 0 (time 1743498437 ; total 0 ; last 0) total: 400 unlinks in 6 seconds: 66.666664 unlinks/second UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 2116 1285572 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 1820 1285868 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 1568 3603104 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1564 3605456 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 3132 7208560 1% /mnt/lustre Failing mds1 on oleg651-server Stopping /mnt/lustre-mds1 (opts:) on oleg651-server 05:07:38 (1743498458) shut down facet: mds1 facet_host: oleg651-server facet_failover_host: oleg651-server Failover mds1 to oleg651-server mount facets: mds1 Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1 Started lustre-MDT0000 05:08:14 (1743498494) targets are mounted 05:08:14 (1743498494) facet_failover done oleg651-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec - unlinked 0 (time 1743498511 ; total 1 ; last 1) total: 400 unlinks in 9 seconds: 44.444443 unlinks/second Can't lstat /mnt/lustre/f38.replay-single-*: No such file or directory PASS 38 (124s) == replay-single test 39: test recovery from unlink llog (test llog_gen_rec) ========================================================== 05:08:53 (1743498533) - open/close 437 (time 1743498549.32 total 10.01 last 43.66) total: 800 open/close in 18.08 seconds: 44.25 ops/second UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 2116 1285572 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 1820 1285868 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 1568 3603104 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 
3833116 1564 3605456 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 3132 7208560 1% /mnt/lustre - unlinked 0 (time 1743498572 ; total 0 ; last 0) total: 400 unlinks in 6 seconds: 66.666664 unlinks/second Failing mds1 on oleg651-server Stopping /mnt/lustre-mds1 (opts:) on oleg651-server 05:09:45 (1743498585) shut down facet: mds1 facet_host: oleg651-server facet_failover_host: oleg651-server Failover mds1 to oleg651-server mount facets: mds1 Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1 Started lustre-MDT0000 05:10:19 (1743498619) targets are mounted 05:10:19 (1743498619) facet_failover done oleg651-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec - unlinked 0 (time 1743498635 ; total 0 ; last 0) total: 400 unlinks in 9 seconds: 44.444443 unlinks/second Can't lstat /mnt/lustre/f39.replay-single-*: No such file or directory PASS 39 (123s) == replay-single test 41: read from a valid osc while other oscs are invalid ========================================================== 05:10:56 (1743498656) 1+0 records in 1+0 records out 4096 bytes (4.1 kB, 4.0 KiB) copied, 0.00759568 s, 539 kB/s 1+0 records in 1+0 records out 4096 bytes (4.1 kB, 4.0 KiB) copied, 0.0363948 s, 113 kB/s PASS 41 (14s) == replay-single test 42: recovery after ost failure ===== 05:11:10 (1743498670) - open/close 436 (time 1743498686.94 total 10.02 last 43.53) total: 800 open/close in 17.97 seconds: 44.52 ops/second UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 2140 1285548 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 1820 1285868 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 1572 3603100 1% /mnt/lustre[OST:0] 
lustre-OST0001_UUID 3833116 1564 3605456 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 3136 7208556 1% /mnt/lustre - unlinked 0 (time 1743498709 ; total 0 ; last 0) total: 400 unlinks in 6 seconds: 66.666664 unlinks/second debug=-1 Failing ost1 on oleg651-server Stopping /mnt/lustre-ost1 (opts:) on oleg651-server 05:12:02 (1743498722) shut down facet: ost1 facet_host: oleg651-server facet_failover_host: oleg651-server Failover ost1 to oleg651-server mount facets: ost1 Starting ost1: -o localrecov /dev/mapper/ost1_flakey /mnt/lustre-ost1 seq.cli-lustre-OST0000-super.width=65536 oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1 Started lustre-OST0000 05:12:33 (1743498753) targets are mounted 05:12:33 (1743498753) facet_failover done wait for MDS to timeout and recover - unlinked 0 (time 1743498798 ; total 0 ; last 0) total: 400 unlinks in 6 seconds: 66.666664 unlinks/second Can't lstat /mnt/lustre/f42.replay-single-*: No such file or directory PASS 42 (143s) == replay-single test 43: mds osc import failure during recovery; don't LBUG ========================================================== 05:13:34 (1743498814) UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 2196 1285492 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 1820 1285868 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 1572 3605448 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1564 3605456 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 3136 7210904 1% /mnt/lustre fail_loc=0x80000204 Failing mds1 on oleg651-server Stopping /mnt/lustre-mds1 (opts:) on oleg651-server 05:13:50 (1743498830) shut down facet: mds1 facet_host: oleg651-server facet_failover_host: oleg651-server Failover mds1 to oleg651-server mount facets: mds1 Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg651-server: 
oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1 Started lustre-MDT0000 05:14:23 (1743498863) targets are mounted 05:14:23 (1743498863) facet_failover done oleg651-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec PASS 43 (80s) == replay-single test 44a: race in target handle connect ========================================================== 05:14:54 (1743498894) at_max=40 1 of 10 (1743498901) service : cur 5 worst 5 (at 1743493331, 5571s ago) 1 4 4 4 fail_loc=0x80000701 UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 2092 1285596 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 1820 1285868 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 1572 3605448 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1564 3605456 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 3136 7210904 1% /mnt/lustre 2 of 10 (1743498908) service : cur 5 worst 5 (at 1743493331, 5578s ago) 5 4 4 4 fail_loc=0x80000701 UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 2092 1285596 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 1820 1285868 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 1572 3605448 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1564 3605456 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 3136 7210904 1% /mnt/lustre 3 of 10 (1743498915) service : cur 5 worst 5 (at 1743493331, 5585s ago) 5 4 4 4 fail_loc=0x80000701 UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 2092 1285596 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 1820 1285868 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 1572 3605448 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1564 3605456 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 3136 7210904 1% /mnt/lustre 4 of 
10 (1743498922) service : cur 6 worst 6 (at 1743498923, 0s ago) 6 4 4 4 fail_loc=0x80000701 UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 2092 1285596 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 1820 1285868 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 1572 3605448 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1564 3605456 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 3136 7210904 1% /mnt/lustre 5 of 10 (1743498929) service : cur 6 worst 6 (at 1743498923, 8s ago) 6 4 4 4 fail_loc=0x80000701 UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 2092 1285596 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 1820 1285868 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 1572 3605448 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1564 3605456 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 3136 7210904 1% /mnt/lustre 6 of 10 (1743498937) service : cur 6 worst 6 (at 1743498923, 15s ago) 6 4 4 4 fail_loc=0x80000701 UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 2092 1285596 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 1820 1285868 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 1572 3605448 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1564 3605456 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 3136 7210904 1% /mnt/lustre 7 of 10 (1743498944) service : cur 6 worst 6 (at 1743498923, 22s ago) 6 4 4 4 fail_loc=0x80000701 UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 2092 1285596 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 1820 1285868 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 1572 3605448 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1564 3605456 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 3136 7210904 1% /mnt/lustre 8 of 10 (1743498951) service : cur 6 worst 6 (at 1743498923, 30s ago) 6 4 4 4 fail_loc=0x80000701 UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 2092 1285596 1% 
/mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 1820 1285868 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 1572 3605448 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1564 3605456 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 3136 7210904 1% /mnt/lustre 9 of 10 (1743498959) service : cur 6 worst 6 (at 1743498923, 37s ago) 6 4 4 4 fail_loc=0x80000701 UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 2092 1285596 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 1820 1285868 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 1572 3605448 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1564 3605456 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 3136 7210904 1% /mnt/lustre 10 of 10 (1743498966) service : cur 6 worst 6 (at 1743498923, 44s ago) 6 4 4 4 fail_loc=0x80000701 UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 2092 1285596 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 1820 1285868 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 1572 3605448 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1564 3605456 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 3136 7210904 1% /mnt/lustre fail_loc=0 at_max=600 PASS 44a (90s) == replay-single test 44b: race in target handle connect ========================================================== 05:16:24 (1743498984) 1 of 10 (1743498987) service : cur 6 worst 6 (at 1743498923, 65s ago) 6 4 4 4 fail_loc=0x80000704 error: recover: Connection timed out Filesystem 1K-blocks Used Available Use% Mounted on 192.168.206.151@tcp:/lustre 7666232 3136 7210904 1% /mnt/lustre 2 of 10 (1743499009) service : cur 6 worst 6 (at 1743498923, 88s ago) 6 4 4 4 fail_loc=0x80000704 error: recover: Connection timed out Filesystem 1K-blocks Used Available Use% Mounted on 192.168.206.151@tcp:/lustre 7666232 3136 7210904 1% /mnt/lustre 3 of 10 (1743499031) service : cur 40 worst 40 (at 1743499029, 5s ago) 40 6 4 4 fail_loc=0x80000704 error: recover: Connection timed out Filesystem 
1K-blocks Used Available Use% Mounted on 192.168.206.151@tcp:/lustre 7666232 3136 7210904 1% /mnt/lustre 4 of 10 (1743499055) service : cur 40 worst 40 (at 1743499029, 28s ago) 40 6 4 4 fail_loc=0x80000704 error: recover: Connection timed out Filesystem 1K-blocks Used Available Use% Mounted on 192.168.206.151@tcp:/lustre 7666232 3136 7210904 1% /mnt/lustre 5 of 10 (1743499078) service : cur 41 worst 41 (at 1743499073, 7s ago) 41 6 4 4 fail_loc=0x80000704 error: recover: Connection timed out Filesystem 1K-blocks Used Available Use% Mounted on 192.168.206.151@tcp:/lustre 7666232 3136 7210904 1% /mnt/lustre 6 of 10 (1743499102) service : cur 41 worst 41 (at 1743499073, 31s ago) 41 6 4 4 fail_loc=0x80000704 error: recover: Connection timed out Filesystem 1K-blocks Used Available Use% Mounted on 192.168.206.151@tcp:/lustre 7666232 3136 7210904 1% /mnt/lustre 7 of 10 (1743499125) service : cur 41 worst 41 (at 1743499073, 54s ago) 41 6 4 4 fail_loc=0x80000704 error: recover: Connection timed out Filesystem 1K-blocks Used Available Use% Mounted on 192.168.206.151@tcp:/lustre 7666232 3136 7210904 1% /mnt/lustre 8 of 10 (1743499148) service : cur 41 worst 41 (at 1743499073, 77s ago) 41 6 4 4 fail_loc=0x80000704 error: recover: Connection timed out Filesystem 1K-blocks Used Available Use% Mounted on 192.168.206.151@tcp:/lustre 7666232 3136 7210904 1% /mnt/lustre 9 of 10 (1743499171) service : cur 41 worst 41 (at 1743499073, 100s ago) 41 6 4 4 fail_loc=0x80000704 error: recover: Connection timed out Filesystem 1K-blocks Used Available Use% Mounted on 192.168.206.151@tcp:/lustre 7666232 3136 7210904 1% /mnt/lustre 10 of 10 (1743499194) service : cur 41 worst 41 (at 1743499073, 123s ago) 1 41 6 4 fail_loc=0x80000704 error: recover: Connection timed out Filesystem 1K-blocks Used Available Use% Mounted on 192.168.206.151@tcp:/lustre 7666232 3136 7210904 1% /mnt/lustre PASS 44b (240s) == replay-single test 44c: race in target handle connect 
========================================================== 05:20:24 (1743499224) UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 2092 1285596 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 1820 1285868 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 1572 3605448 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1564 3605456 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 3136 7210904 1% /mnt/lustre total: 100 create in 1.59 seconds: 62.76 ops/second fail_loc=0x80000712 Stopping /mnt/lustre-mds1 (opts:) on oleg651-server Failover mds1 to oleg651-server oleg651-server.virtnet Starting mds1: -o localrecov -o abort_recovery /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1 Started lustre-MDT0000 unlink(/mnt/lustre/f44c.replay-single-0) error: No such file or directory total: 0 unlinks in 0 seconds: -nan unlinks/second Failing mds1 on oleg651-server Stopping /mnt/lustre-mds1 (opts:) on oleg651-server 05:21:14 (1743499274) shut down facet: mds1 facet_host: oleg651-server facet_failover_host: oleg651-server Failover mds1 to oleg651-server mount facets: mds1 Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1 Started lustre-MDT0000 05:21:45 (1743499305) targets are mounted 05:21:45 (1743499305) facet_failover done oleg651-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec unlink(/mnt/lustre/f44c.replay-single-0) error: No such file or directory total: 0 unlinks in 0 seconds: -nan unlinks/second PASS 44c (99s) == 
replay-single test 45: Handle failed close ============ 05:22:04 (1743499324) multiop /mnt/lustre/f45.replay-single vO_c TMPPIPE=/tmp/multiop_open_wait_pipe.7954 /mnt/lustre/f45.replay-single has type file OK PASS 45 (11s) == replay-single test 46: Don't leak file handle after open resend (3325) ========================================================== 05:22:14 (1743499334) fail_loc=0x122 fail_loc=0 Failing mds1 on oleg651-server Stopping /mnt/lustre-mds1 (opts:) on oleg651-server 05:22:40 (1743499360) shut down facet: mds1 facet_host: oleg651-server facet_failover_host: oleg651-server Failover mds1 to oleg651-server mount facets: mds1 Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1 Started lustre-MDT0000 05:23:02 (1743499382) targets are mounted 05:23:02 (1743499382) facet_failover done oleg651-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec lfs path2fid: cannot get fid for 'f46.replay-single': No such file or directory PASS 46 (67s) == replay-single test 47: MDS->OSC failure during precreate cleanup (2824) ========================================================== 05:23:21 (1743499401) total: 20 open/close in 0.47 seconds: 42.49 ops/second Failing ost1 on oleg651-server Stopping /mnt/lustre-ost1 (opts:) on oleg651-server 05:23:28 (1743499408) shut down facet: ost1 facet_host: oleg651-server facet_failover_host: oleg651-server Failover ost1 to oleg651-server mount facets: ost1 Starting ost1: -o localrecov /dev/mapper/ost1_flakey /mnt/lustre-ost1 seq.cli-lustre-OST0000-super.width=65536 oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all 
pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1 Started lustre-OST0000 05:23:51 (1743499431) targets are mounted 05:23:51 (1743499431) facet_failover done oleg651-client.virtnet: executing wait_import_state_mount (FULL|IDLE) osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid in FULL state after 0 sec fail_loc=0x80000204 total: 20 open/close in 0.38 seconds: 52.03 ops/second - unlinked 0 (time 1743499505 ; total 0 ; last 0) total: 20 unlinks in 0 seconds: inf unlinks/second PASS 47 (111s) == replay-single test 48: MDS->OSC failure during precreate cleanup (2824) ========================================================== 05:25:12 (1743499512) UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 2124 1285564 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 1852 1285836 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 1572 3605448 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1564 3605456 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 3136 7210904 1% /mnt/lustre total: 20 open/close in 0.42 seconds: 47.59 ops/second Failing mds1 on oleg651-server Stopping /mnt/lustre-mds1 (opts:) on oleg651-server 05:25:26 (1743499526) shut down facet: mds1 facet_host: oleg651-server facet_failover_host: oleg651-server Failover mds1 to oleg651-server mount facets: mds1 Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1 Started lustre-MDT0000 05:25:58 (1743499558) targets are mounted 05:25:58 (1743499558) facet_failover done fail_loc=0x80000216 total: 20 open/close in 0.43 seconds: 46.70 ops/second - unlinked 0 (time 1743499621 ; total 0 ; last 0) total: 40 unlinks in 1 seconds: 40.000000 unlinks/second PASS 48 (118s) == replay-single test 50: Double OSC recovery, don't 
LASSERT (3812) ========================================================== 05:27:10 (1743499630) PASS 50 (18s) == replay-single test 52: time out lock replay (3764) ==== 05:27:28 (1743499648) multiop /mnt/lustre/f52.replay-single vs_s TMPPIPE=/tmp/multiop_open_wait_pipe.7954 fail_loc=0x80000157 Failing mds1 on oleg651-server Stopping /mnt/lustre-mds1 (opts:) on oleg651-server 05:27:35 (1743499655) shut down facet: mds1 facet_host: oleg651-server facet_failover_host: oleg651-server Failover mds1 to oleg651-server mount facets: mds1 Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1 Started lustre-MDT0000 05:28:07 (1743499687) targets are mounted 05:28:07 (1743499687) facet_failover done oleg651-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec fail_loc=0x0 fail_loc=0x0 PASS 52 (112s) == replay-single test 53a: |X| close request while two MDC requests in flight ========================================================== 05:29:20 (1743499760) fail_loc=0x80000115 fail_loc=0 Replay barrier on lustre-MDT0000 Failing mds1 on oleg651-server Stopping /mnt/lustre-mds1 (opts:) on oleg651-server 05:29:36 (1743499776) shut down facet: mds1 facet_host: oleg651-server facet_failover_host: oleg651-server Failover mds1 to oleg651-server mount facets: mds1 Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1 Started lustre-MDT0000 05:30:11 (1743499811) targets are mounted 05:30:11 (1743499811) facet_failover done oleg651-client.virtnet: 
executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec /mnt/lustre/d53a.replay-single-1/f has type file OK /mnt/lustre/d53a.replay-single-2/f has type file OK PASS 53a (70s) == replay-single test 53b: |X| open request while two MDC requests in flight ========================================================== 05:30:30 (1743499830) multiop /mnt/lustre/d53b.replay-single-1/f vO_c TMPPIPE=/tmp/multiop_open_wait_pipe.7954 fail_loc=0x80000107 fail_loc=0 Replay barrier on lustre-MDT0000 Failing mds1 on oleg651-server Stopping /mnt/lustre-mds1 (opts:) on oleg651-server 05:30:46 (1743499846) shut down facet: mds1 facet_host: oleg651-server facet_failover_host: oleg651-server Failover mds1 to oleg651-server mount facets: mds1 Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1 Started lustre-MDT0000 05:31:20 (1743499880) targets are mounted 05:31:20 (1743499880) facet_failover done oleg651-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec /mnt/lustre/d53b.replay-single-1/f has type file OK /mnt/lustre/d53b.replay-single-2/f has type file OK PASS 53b (69s) == replay-single test 53c: |X| open request and close request while two MDC requests in flight ========================================================== 05:31:39 (1743499899) fail_loc=0x80000107 fail_loc=0x80000115 Replay barrier on lustre-MDT0000 Failing mds1 on oleg651-server Stopping /mnt/lustre-mds1 (opts:) on oleg651-server 05:31:55 (1743499915) shut down facet: mds1 facet_host: oleg651-server facet_failover_host: oleg651-server Failover mds1 to oleg651-server mount facets: mds1 Starting 
mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1 Started lustre-MDT0000 05:32:28 (1743499948) targets are mounted 05:32:28 (1743499948) facet_failover done fail_loc=0 /mnt/lustre/d53c.replay-single-1/f has type file OK /mnt/lustre/d53c.replay-single-2/f has type file OK PASS 53c (60s) == replay-single test 53d: close reply while two MDC requests in flight ========================================================== 05:32:39 (1743499959) fail_loc=0x8000013b fail_loc=0 Failing mds1 on oleg651-server Stopping /mnt/lustre-mds1 (opts:) on oleg651-server 05:32:49 (1743499969) shut down facet: mds1 facet_host: oleg651-server facet_failover_host: oleg651-server Failover mds1 to oleg651-server mount facets: mds1 Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1 Started lustre-MDT0000 05:33:20 (1743500000) targets are mounted 05:33:20 (1743500000) facet_failover done oleg651-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec /mnt/lustre/d53d.replay-single-1/f has type file OK /mnt/lustre/d53d.replay-single-2/f has type file OK PASS 53d (60s) == replay-single test 53e: |X| open reply while two MDC requests in flight ========================================================== 05:33:39 (1743500019) fail_loc=0x119 fail_loc=0 Replay barrier on lustre-MDT0000 Failing mds1 on oleg651-server Stopping /mnt/lustre-mds1 (opts:) on oleg651-server 05:33:55 (1743500035) shut down facet: mds1 facet_host: oleg651-server facet_failover_host: 
oleg651-server Failover mds1 to oleg651-server mount facets: mds1 Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1 Started lustre-MDT0000 05:34:28 (1743500068) targets are mounted 05:34:28 (1743500068) facet_failover done oleg651-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec /mnt/lustre/d53e.replay-single-1/f has type file OK /mnt/lustre/d53e.replay-single-2/f has type file OK PASS 53e (68s) == replay-single test 53f: |X| open reply and close reply while two MDC requests in flight ========================================================== 05:34:47 (1743500087) fail_loc=0x119 fail_loc=0x8000013b Replay barrier on lustre-MDT0000 Failing mds1 on oleg651-server Stopping /mnt/lustre-mds1 (opts:) on oleg651-server 05:35:03 (1743500103) shut down facet: mds1 facet_host: oleg651-server facet_failover_host: oleg651-server Failover mds1 to oleg651-server mount facets: mds1 Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1 Started lustre-MDT0000 05:35:36 (1743500136) targets are mounted 05:35:36 (1743500136) facet_failover done fail_loc=0 /mnt/lustre/d53f.replay-single-1/f has type file OK /mnt/lustre/d53f.replay-single-2/f has type file OK PASS 53f (60s) == replay-single test 53g: |X| drop open reply and close request while close and open are both in flight ========================================================== 05:35:47 (1743500147) fail_loc=0x119 fail_loc=0x80000115 fail_loc=0 Replay barrier on lustre-MDT0000 
Failing mds1 on oleg651-server Stopping /mnt/lustre-mds1 (opts:) on oleg651-server 05:36:04 (1743500164) shut down facet: mds1 facet_host: oleg651-server facet_failover_host: oleg651-server Failover mds1 to oleg651-server mount facets: mds1 Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1 Started lustre-MDT0000 05:36:35 (1743500195) targets are mounted 05:36:35 (1743500195) facet_failover done /mnt/lustre/d53g.replay-single-1/f has type file OK /mnt/lustre/d53g.replay-single-2/f has type file OK PASS 53g (57s) == replay-single test 53h: open request and close reply while two MDC requests in flight ========================================================== 05:36:44 (1743500204) fail_loc=0x80000107 fail_loc=0x8000013b Replay barrier on lustre-MDT0000 Failing mds1 on oleg651-server Stopping /mnt/lustre-mds1 (opts:) on oleg651-server 05:37:01 (1743500221) shut down facet: mds1 facet_host: oleg651-server facet_failover_host: oleg651-server Failover mds1 to oleg651-server mount facets: mds1 Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1 Started lustre-MDT0000 05:37:31 (1743500251) targets are mounted 05:37:31 (1743500251) facet_failover done fail_loc=0 /mnt/lustre/d53h.replay-single-1/f has type file OK /mnt/lustre/d53h.replay-single-2/f has type file OK PASS 53h (57s) == replay-single test 55: let MDS_CHECK_RESENT return the original return code instead of 0 ========================================================== 05:37:41 (1743500261) fail_loc=0x8000012b fail_loc=0x0 touch: cannot touch '/mnt/lustre/f55.replay-single': No such 
file or directory rm: cannot remove '/mnt/lustre/f55.replay-single': No such file or directory PASS 55 (70s) == replay-single test 56: don't replay a symlink open request (3440) ========================================================== 05:38:51 (1743500331) UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 2132 1285556 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 1852 1285836 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 1572 3605448 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1564 3605456 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 3136 7210904 1% /mnt/lustre Failing mds1 on oleg651-server Stopping /mnt/lustre-mds1 (opts:) on oleg651-server 05:39:05 (1743500345) shut down facet: mds1 facet_host: oleg651-server facet_failover_host: oleg651-server Failover mds1 to oleg651-server mount facets: mds1 Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1 Started lustre-MDT0000 05:39:36 (1743500376) targets are mounted 05:39:36 (1743500376) facet_failover done oleg651-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec PASS 56 (73s) == replay-single test 57: test recovery from llog for setattr op ========================================================== 05:40:05 (1743500405) fail_loc=0x8000012c UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 2132 1285556 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 1852 1285836 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 1572 3605448 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1564 3605456 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 3136 7210904 1% /mnt/lustre Failing mds1 on oleg651-server Stopping 
/mnt/lustre-mds1 (opts:) on oleg651-server 05:40:18 (1743500418) shut down facet: mds1 facet_host: oleg651-server facet_failover_host: oleg651-server Failover mds1 to oleg651-server mount facets: mds1 Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1 Started lustre-MDT0000 05:40:53 (1743500453) targets are mounted 05:40:53 (1743500453) facet_failover done oleg651-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec affected facets: mds1 oleg651-server: oleg651-server.virtnet: executing _wait_recovery_complete *.lustre-MDT0000.recovery_status 1475 oleg651-server: *.lustre-MDT0000.recovery_status status: COMPLETE Waiting for orphan cleanup... osp.lustre-OST0000-osc-MDT0000.old_sync_processed osp.lustre-OST0000-osc-MDT0001.old_sync_processed osp.lustre-OST0001-osc-MDT0000.old_sync_processed osp.lustre-OST0001-osc-MDT0001.old_sync_processed wait 40 secs maximumly for oleg651-server mds-ost sync done. 
/mnt/lustre/f57.replay-single has type file OK fail_loc=0x0 PASS 57 (72s) == replay-single test 58a: test recovery from llog for setattr op (test llog_gen_rec) ========================================================== 05:41:18 (1743500478) fail_loc=0x8000012c - open/close 766 (time 1743500493.87 total 10.01 last 76.52) - open/close 1666 (time 1743500503.88 total 20.02 last 89.96) - open/close 2405 (time 1743500513.88 total 30.02 last 73.88) total: 2500 open/close in 31.32 seconds: 79.82 ops/second UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 2428 1285260 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 1852 1285836 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 1572 3605448 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1564 3605456 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 3136 7210904 1% /mnt/lustre Failing mds1 on oleg651-server Stopping /mnt/lustre-mds1 (opts:) on oleg651-server 05:42:07 (1743500527) shut down facet: mds1 facet_host: oleg651-server facet_failover_host: oleg651-server Failover mds1 to oleg651-server mount facets: mds1 Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1 Started lustre-MDT0000 05:42:40 (1743500560) targets are mounted 05:42:40 (1743500560) facet_failover done oleg651-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec fail_loc=0x0 - unlinked 0 (time 1743500590 ; total 0 ; last 0) total: 2500 unlinks in 19 seconds: 131.578949 unlinks/second PASS 58a (141s) == replay-single test 58b: test replay of setxattr op ==== 05:43:39 (1743500619) Starting client: oleg651-client.virtnet: -o user_xattr,flock 192.168.206.151@tcp:/lustre /mnt/lustre2 UUID 1K-blocks Used 
Available Use% Mounted on lustre-MDT0000_UUID 1414116 2412 1285276 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 1852 1285836 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 1572 3605448 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1564 3605456 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 3136 7210904 1% /mnt/lustre Failing mds1 on oleg651-server Stopping /mnt/lustre-mds1 (opts:) on oleg651-server 05:43:55 (1743500635) shut down facet: mds1 facet_host: oleg651-server facet_failover_host: oleg651-server Failover mds1 to oleg651-server mount facets: mds1 Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1 Started lustre-MDT0000 05:44:29 (1743500669) targets are mounted 05:44:29 (1743500669) facet_failover done oleg651-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec Stopping client oleg651-client.virtnet /mnt/lustre2 (opts:) oleg651-client.virtnet: executing wait_import_state_mount FULL mgc.*.mgs_server_uuid mgc.*.mgs_server_uuid in FULL state after 0 sec PASS 58b (75s) == replay-single test 58c: resend/reconstruct setxattr op ========================================================== 05:44:54 (1743500694) Starting client: oleg651-client.virtnet: -o user_xattr,flock 192.168.206.151@tcp:/lustre /mnt/lustre2 fail_val=0 fail_loc=0x123 fail_loc=0 fail_loc=0x119 fail_loc=0 Stopping client oleg651-client.virtnet /mnt/lustre2 (opts:) PASS 58c (143s) SKIP: replay-single test_59 skipping ALWAYS excluded test 59 == replay-single test 60: test llog post recovery init vs llog unlink ========================================================== 05:47:19 (1743500839) total: 200 open/close in 2.37 seconds: 84.22 ops/second UUID 
1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 2268 1285420 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 1852 1285836 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 1572 3605448 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1564 3605456 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 3136 7210904 1% /mnt/lustre - unlinked 0 (time 1743500855 ; total 0 ; last 0) total: 100 unlinks in 1 seconds: 100.000000 unlinks/second Failing mds1 on oleg651-server Stopping /mnt/lustre-mds1 (opts:) on oleg651-server 05:47:40 (1743500860) shut down facet: mds1 facet_host: oleg651-server facet_failover_host: oleg651-server Failover mds1 to oleg651-server mount facets: mds1 Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1 Started lustre-MDT0000 05:48:10 (1743500890) targets are mounted 05:48:10 (1743500890) facet_failover done oleg651-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec - unlinked 0 (time 1743500900 ; total 0 ; last 0) total: 100 unlinks in 2 seconds: 50.000000 unlinks/second PASS 60 (69s) == replay-single test 61a: test race llog recovery vs llog cleanup ========================================================== 05:48:29 (1743500909) - open/close 792 (time 1743500924.69 total 10.02 last 79.06) total: 800 open/close in 10.09 seconds: 79.27 ops/second UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 2352 1285336 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 2024 1285664 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 1572 3605448 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1564 3605456 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 3136 7210904 1% /mnt/lustre - 
unlinked 0 (time 1743500936 ; total 0 ; last 0) total: 800 unlinks in 5 seconds: 160.000000 unlinks/second fail_val=0 fail_loc=0x80000221 Failing ost1 on oleg651-server Stopping /mnt/lustre-ost1 (opts:) on oleg651-server 05:49:06 (1743500946) shut down facet: ost1 facet_host: oleg651-server facet_failover_host: oleg651-server Failover ost1 to oleg651-server mount facets: ost1 Starting ost1: -o localrecov /dev/mapper/ost1_flakey /mnt/lustre-ost1 seq.cli-lustre-OST0000-super.width=65536 oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1 Started lustre-OST0000 05:49:34 (1743500974) targets are mounted 05:49:34 (1743500974) facet_failover done Failing ost1 on oleg651-server Stopping /mnt/lustre-ost1 (opts:) on oleg651-server 05:49:48 (1743500988) shut down facet: ost1 facet_host: oleg651-server facet_failover_host: oleg651-server Failover ost1 to oleg651-server mount facets: ost1 Starting ost1: -o localrecov /dev/mapper/ost1_flakey /mnt/lustre-ost1 seq.cli-lustre-OST0000-super.width=65536 oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1 Started lustre-OST0000 05:50:11 (1743501011) targets are mounted 05:50:11 (1743501011) facet_failover done oleg651-client.virtnet: executing wait_import_state_mount (FULL|IDLE) osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid in FULL state after 0 sec fail_val=0 fail_loc=0x0 Can't lstat /mnt/lustre/d61a.replay-single/f61a.replay-single-*: No such file or directory PASS 61a (153s) == replay-single test 61b: test race mds llog sync vs llog cleanup ========================================================== 05:51:02 (1743501062) fail_loc=0x8000013a Failing mds1 on oleg651-server Stopping 
/mnt/lustre-mds1 (opts:) on oleg651-server 05:51:08 (1743501068) shut down facet: mds1 facet_host: oleg651-server facet_failover_host: oleg651-server Failover mds1 to oleg651-server mount facets: mds1 Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1 Started lustre-MDT0000 05:51:30 (1743501090) targets are mounted 05:51:30 (1743501090) facet_failover done Failing mds1 on oleg651-server Stopping /mnt/lustre-mds1 (opts:) on oleg651-server 05:51:44 (1743501104) shut down facet: mds1 facet_host: oleg651-server facet_failover_host: oleg651-server Failover mds1 to oleg651-server mount facets: mds1 Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1 Started lustre-MDT0000 05:52:05 (1743501125) targets are mounted 05:52:06 (1743501126) facet_failover done oleg651-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec 1+0 records in 1+0 records out 4096 bytes (4.1 kB, 4.0 KiB) copied, 0.0105241 s, 389 kB/s PASS 61b (80s) == replay-single test 61c: test race mds llog sync vs llog cleanup ========================================================== 05:52:23 (1743501143) fail_val=0 fail_loc=0x80000222 Failing ost1 on oleg651-server Stopping /mnt/lustre-ost1 (opts:) on oleg651-server 05:52:41 (1743501161) shut down facet: ost1 facet_host: oleg651-server facet_failover_host: oleg651-server Failover ost1 to oleg651-server mount facets: ost1 Starting ost1: -o localrecov /dev/mapper/ost1_flakey /mnt/lustre-ost1 
seq.cli-lustre-OST0000-super.width=65536 oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1 Started lustre-OST0000 05:53:02 (1743501182) targets are mounted 05:53:02 (1743501182) facet_failover done oleg651-client.virtnet: executing wait_import_state_mount (FULL|IDLE) osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid in FULL state after 0 sec fail_val=0 fail_loc=0x0 PASS 61c (56s) == replay-single test 61d: error in llog_setup should cleanup the llog context correctly ========================================================== 05:53:19 (1743501199) Stopping /mnt/lustre-mds1 (opts:) on oleg651-server fail_loc=0x80000605 Starting mgs: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg651-server: mount.lustre: mount /dev/mapper/mds1_flakey at /mnt/lustre-mds1 failed: Operation not supported pdsh@oleg651-client: oleg651-server: ssh exited with exit code 95 Start of /dev/mapper/mds1_flakey on mgs failed 95 fail_loc=0 Starting mgs: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1 Started lustre-MDT0000 PASS 61d (28s) == replay-single test 62: don't mis-drop resent replay === 05:53:47 (1743501227) UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 2344 1285344 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 2008 1285680 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 1572 3605448 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1568 3605452 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 3140 7210900 1% /mnt/lustre total: 25 open/close in 0.25 seconds: 100.62 ops/second fail_loc=0x80000707 Failing mds1 on oleg651-server Stopping 
/mnt/lustre-mds1 (opts:) on oleg651-server 05:53:58 (1743501238) shut down facet: mds1 facet_host: oleg651-server facet_failover_host: oleg651-server Failover mds1 to oleg651-server mount facets: mds1 Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1 Started lustre-MDT0000 05:54:29 (1743501269) targets are mounted 05:54:29 (1743501269) facet_failover done oleg651-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec fail_loc=0 - unlinked 0 (time 1743501335 ; total 0 ; last 0) total: 25 unlinks in 0 seconds: inf unlinks/second PASS 62 (114s) == replay-single test 65a: AT: verify early replies ====== 05:55:41 (1743501341) at_history=8 at_history=8 debug=other fail_val=11000 fail_loc=0x8000050a 00000100:00001000:1.0:1743501378.120950:0:127304:0:(client.c:536:ptlrpc_at_recv_early_reply()) @@@ Early reply #1, new deadline in 34s (25s) req@ffff95e59837dc40 x1828185032215936/t0(0) o101->lustre-MDT0000-mdc-ffff95e5981e1000@192.168.206.151@tcp:12/10 lens 664/66320 e 1 to 0 dl 1743501412 ref 2 fl Rpc:PQr/200/ffffffff rc 0/-1 job:'createmany.0' uid:0 gid:0 portal 12 : cur 36 worst 41 (at 1743499102, 2283s ago) 40 40 40 40 portal 29 : cur 5 worst 5 (at 1743494107, 7278s ago) 5 0 0 5 portal 23 : cur 5 worst 5 (at 1743494107, 7278s ago) 5 5 5 0 portal 30 : cur 5 worst 5 (at 1743494129, 7256s ago) 5 0 0 0 portal 17 : cur 5 worst 5 (at 1743494448, 6937s ago) 5 5 5 5 portal 24 : cur 5 worst 5 (at 1743495156, 6229s ago) 5 0 0 0 portal 13 : cur 5 worst 5 (at 1743495942, 5443s ago) 5 0 0 0 portal 12 : cur 5 worst 41 (at 1743499102, 2292s ago) 40 40 40 40 portal 29 : cur 5 worst 5 (at 1743494107, 7287s ago) 5 0 0 5 portal 23 : cur 5 worst 5 (at 
1743494107, 7287s ago) 5 5 5 0 portal 30 : cur 5 worst 5 (at 1743494129, 7265s ago) 5 0 0 0 portal 17 : cur 5 worst 5 (at 1743494448, 6946s ago) 5 5 5 5 portal 24 : cur 5 worst 5 (at 1743495156, 6238s ago) 5 0 0 0 portal 13 : cur 5 worst 5 (at 1743495942, 5452s ago) 5 0 0 0 PASS 65a (59s) == replay-single test 65b: AT: verify early replies on packed reply / bulk ========================================================== 05:56:41 (1743501401) at_history=8 at_history=8 debug=other trace fail_val=11 fail_loc=0x224 fail_loc=0 00000100:00001000:0.0:1743501438.536426:0:2438:0:(client.c:536:ptlrpc_at_recv_early_reply()) @@@ Early reply #1, new deadline in 36s (26s) req@ffff95e586171d00 x1828185032237184/t0(0) o4->lustre-OST0000-osc-ffff95e5981e1000@192.168.206.151@tcp:6/4 lens 4584/448 e 1 to 0 dl 1743501474 ref 2 fl Rpc:Qr/200/ffffffff rc 0/-1 job:'multiop.0' uid:0 gid:0 portal 28 : cur 5 worst 5 (at 1743494107, 7339s ago) 5 5 5 5 portal 7 : cur 5 worst 5 (at 1743494110, 7336s ago) 5 5 5 5 portal 17 : cur 5 worst 5 (at 1743494334, 7112s ago) 5 0 5 0 portal 6 : cur 37 worst 37 (at 1743501443, 3s ago) 37 0 0 0 PASS 65b (51s) == replay-single test 66a: AT: verify MDT service time adjusts with no early replies ========================================================== 05:57:32 (1743501452) at_history=8 at_history=8 portal 12 : cur 5 worst 41 (at 1743499102, 2377s ago) 5 40 40 40 fail_val=5000 fail_loc=0x8000050a portal 12 : cur 5 worst 41 (at 1743499102, 2385s ago) 5 40 40 40 fail_val=10000 fail_loc=0x8000050a portal 12 : cur 36 worst 41 (at 1743499102, 2397s ago) 36 40 40 40 fail_loc=0 portal 12 : cur 5 worst 41 (at 1743499102, 2408s ago) 36 40 40 40 Current MDT timeout 5, worst 41 PASS 66a (64s) == replay-single test 66b: AT: verify net latency adjusts ========================================================== 05:58:36 (1743501516) at_history=8 at_history=8 fail_val=10 fail_loc=0x50c fail_loc=0 network timeout orig 5, cur 11, worst 11 PASS 66b (94s) == replay-single test 
67a: AT: verify slow request processing doesn't induce reconnects ========================================================== 06:00:10 (1743501610) at_history=8 at_history=8 fail_val=400 fail_loc=0x50a fail_loc=0 0 osc reconnect attempts on gradual slow PASS 67a (77s) == replay-single test 67b: AT: verify instant slowdown doesn't induce reconnects ========================================================== 06:01:27 (1743501687) at_history=8 at_history=8 Creating to objid 5057 on ost lustre-OST0000... fail_val=20000 fail_loc=0x80000223 total: 18 open/close in 0.14 seconds: 131.14 ops/second Connected clients: oleg651-client.virtnet oleg651-client.virtnet service : cur 5 worst 5 (at 1743493387, 8331s ago) 1 1 1 0 phase 2 0 osc reconnect attempts on instant slow fail_loc=0x80000223 fail_loc=0 Connected clients: oleg651-client.virtnet oleg651-client.virtnet service : cur 5 worst 5 (at 1743493387, 8334s ago) 1 1 1 1 0 osc reconnect attempts on 2nd slow PASS 67b (40s) == replay-single test 68: AT: verify slowing locks ======= 06:02:07 (1743501727) at_history=8 at_history=8 /home/green/git/lustre-release/lustre/tests/replay-single.sh: line 1975: $ldlm_enqueue_min: ambiguous redirect fail_val=19 fail_loc=0x80000312 fail_val=25 fail_loc=0x80000312 fail_loc=0 /home/green/git/lustre-release/lustre/tests/replay-single.sh: line 1990: $ldlm_enqueue_min: ambiguous redirect PASS 68 (82s) Cleaning up AT ... 
== replay-single test 70a: check multi client t-f ======== 06:03:29 (1743501809) SKIP: replay-single test_70a Need two or more clients, have 1 SKIP 70a (4s) == replay-single test 70b: dbench 2mdts recovery; 1 clients ========================================================== 06:03:33 (1743501813) Starting client oleg651-client.virtnet: -o user_xattr,flock 192.168.206.151@tcp:/lustre /mnt/lustre Started clients oleg651-client.virtnet: 192.168.206.151@tcp:/lustre on /mnt/lustre type lustre (rw,checksum,flock,user_xattr,lruresize,lazystatfs,nouser_fid2path,verbose,encrypt,statfs_project) striped dir -i0 -c2 -H crush2 /mnt/lustre/d70b.replay-single + MISSING_DBENCH_OK= + PATH=/opt/iozone/bin:/opt/iozone/bin:/home/green/git/lustre-release/lustre/tests/mpi:/home/green/git/lustre-release/lustre/tests/racer:/home/green/git/lustre-release/lustre/../lustre-iokit/sgpdd-survey:/home/green/git/lustre-release/lustre/tests:/home/green/git/lustre-release/lustre/utils/gss:/home/green/git/lustre-release/lustre/utils:/opt/iozone/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin::/home/green/git/lustre-release/lustre/utils:/home/green/git/lustre-release/lustre/tests:/sbin:/usr/sbin:/home/green/git/lustre-release/lustre/utils:/home/green/git/lustre-release/lustre/tests/: + DBENCH_LIB= + TESTSUITE=replay-single + TESTNAME=test_70b + MOUNT=/mnt/lustre ++ hostname + DIR=/mnt/lustre/d70b.replay-single/oleg651-client.virtnet + LCTL=/home/green/git/lustre-release/lustre/utils/lctl + rundbench 1 -t 300 dbench: no process found Started rundbench load pid=132958 ... 
UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 2468 1285220 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 2164 1285524 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 1576 3593592 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 26148 3580872 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 27724 7174464 1% /mnt/lustre test_70b fail mds1 1 times Failing mds1 on oleg651-server Stopping /mnt/lustre-mds1 (opts:) on oleg651-server 06:03:54 (1743501834) shut down facet: mds1 facet_host: oleg651-server facet_failover_host: oleg651-server Failover mds1 to oleg651-server mount facets: mds1 Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg651-client.virtnet: looking for dbench program oleg651-client.virtnet: /usr/bin/dbench oleg651-client.virtnet: creating output directory /mnt/lustre/d70b.replay-single/oleg651-client.virtnet oleg651-client.virtnet: mkdir: created directory '/mnt/lustre/d70b.replay-single/oleg651-client.virtnet' oleg651-client.virtnet: found dbench client file /usr/share/dbench/client.txt oleg651-client.virtnet: '/usr/share/dbench/client.txt' -> 'client.txt' oleg651-client.virtnet: running 'dbench 1 -t 300' on /mnt/lustre/d70b.replay-single/oleg651-client.virtnet at Tue Apr 1 06:03:37 EDT 2025 oleg651-client.virtnet: waiting for dbench pid 132985 oleg651-client.virtnet: dbench version 4.00 - Copyright Andrew Tridgell 1999-2004 oleg651-client.virtnet: oleg651-client.virtnet: Running for 300 seconds with load 'client.txt' and minimum warmup 60 secs oleg651-client.virtnet: failed to create barrier semaphore oleg651-client.virtnet: 0 of 1 processes prepared for launch 0 sec oleg651-client.virtnet: 1 of 1 processes prepared for launch 0 sec oleg651-client.virtnet: releasing clients oleg651-client.virtnet: 1 139 6.02 MB/sec warmup 1 sec latency 45.799 ms oleg651-client.virtnet: 1 279 5.22 MB/sec warmup 2 sec latency 39.462 ms oleg651-client.virtnet: 1 411 4.85 MB/sec warmup 3 sec latency 46.647 ms 
oleg651-client.virtnet: 1 664 5.29 MB/sec warmup 4 sec latency 85.597 ms oleg651-client.virtnet: 1 873 4.33 MB/sec warmup 5 sec latency 26.574 ms oleg651-client.virtnet: 1 1051 3.81 MB/sec warmup 6 sec latency 34.273 ms oleg651-client.virtnet: 1 1245 3.30 MB/sec warmup 7 sec latency 26.960 ms oleg651-client.virtnet: 1 1512 3.26 MB/sec warmup 8 sec latency 33.523 ms oleg651-client.virtnet: 1 1729 2.92 MB/sec warmup 9 sec latency 29.768 ms oleg651-client.virtnet: 1 1996 2.65 MB/sec warmup 10 sec latency 28.471 ms oleg651-client.virtnet: 1 2327 2.53 MB/sec warmup 11 sec latency 33.550 ms oleg651-client.virtnet: 1 2455 2.34 MB/sec warmup 12 sec latency 39.935 ms oleg651-client.virtnet: 1 2758 2.46 MB/sec warmup 13 sec latency 29.001 ms oleg651-client.virtnet: 1 3026 2.39 MB/sec warmup 14 sec latency 35.087 ms oleg651-client.virtnet: 1 3543 2.56 MB/sec warmup 15 sec latency 77.298 ms oleg651-client.virtnet: 1 3793 2.48 MB/sec warmup 16 sec latency 29.649 ms oleg651-client.virtnet: 1 3979 2.36 MB/sec warmup 17 sec latency 58.253 ms oleg651-client.virtnet: 1 4246 2.24 MB/sec warmup 18 sec latency 69.853 ms oleg651-client.virtnet: 1 4483 2.15 MB/sec warmup 19 sec latency 21.771 ms oleg651-client.virtnet: 1 4758 2.11 MB/sec warmup 20 sec latency 19.098 ms oleg651-client.virtnet: 1 5063 2.15 MB/sec warmup 21 sec latency 26.056 ms oleg651-client.virtnet: 1 5284 2.06 MB/sec warmup 22 sec latency 25.313 ms oleg651-client.virtnet: 1 5561 1.98 MB/sec warmup 23 sec latency 19.620 ms oleg651-client.virtnet: 1 5907 1.96 MB/sec warmup 24 sec latency 21.725 ms oleg651-client.virtnet: 1 6191 1.94 MB/sec warmup 25 sec latency 19.274 ms oleg651-client.virtnet: 1 6792 2.06 MB/sec warmup 26 sec latency 22.477 ms oleg651-client.virtnet: 1 7210 2.14 MB/sec warmup 27 sec latency 24.860 ms oleg651-client.virtnet: 1 7396 2.10 MB/sec warmup 28 sec latency 27.471 ms oleg651-client.virtnet: 1 7589 2.04 MB/sec warmup 29 sec latency 33.737 ms oleg651-client.virtnet: 1 7805 1.98 MB/sec warmup 30 sec 
latency 63.997 ms oleg651-client.virtnet: 1 8012 1.93 MB/sec warmup 31 sec latency 29.438 ms oleg651-client.virtnet: 1 8233 1.91 MB/sec warmup 32 sec latency 31.273 ms oleg651-client.virtnet: 1 8512 1.95 MB/sec warmup 33 sec latency 24.651 ms oleg651-client.virtnet: 1 8690 oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1 Started lustre-MDT0000 06:04:25 (1743501865) targets are mounted 06:04:25 (1743501865) facet_failover done oleg651-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 2428 1285260 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 2164 1285524 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 12072 3593060 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 37416 3569016 2% /mnt/lustre[OST:1] filesystem_summary: 7666232 49488 7162076 1% /mnt/lustre test_70b fail mds2 2 times Failing mds2 on oleg651-server Stopping /mnt/lustre-mds2 (opts:) on oleg651-server 06:04:49 (1743501889) shut down facet: mds2 facet_host: oleg651-server facet_failover_host: oleg651-server 1.89 MB/sec warmup 34 sec latency 34.191 ms oleg651-client.virtnet: 1 8910 1.84 MB/sec warmup 35 sec latency 27.819 ms oleg651-client.virtnet: 1 9188 1.80 MB/sec warmup 36 sec latency 43.063 ms oleg651-client.virtnet: 1 9471 1.79 MB/sec warmup 37 sec latency 29.224 ms oleg651-client.virtnet: 1 9709 1.77 MB/sec warmup 38 sec latency 25.478 ms oleg651-client.virtnet: 1 10285 1.86 MB/sec warmup 39 sec latency 30.484 ms oleg651-client.virtnet: 1 10709 1.92 MB/sec warmup 40 sec latency 21.367 ms oleg651-client.virtnet: 1 10938 1.90 MB/sec warmup 41 sec latency 23.125 ms oleg651-client.virtnet: 1 11151 1.86 MB/sec warmup 42 sec latency 24.415 ms 
oleg651-client.virtnet: 1 11371 1.83 MB/sec warmup 43 sec latency 70.699 ms oleg651-client.virtnet: 1 11547 1.80 MB/sec warmup 44 sec latency 28.530 ms oleg651-client.virtnet: 1 11782 1.78 MB/sec warmup 45 sec latency 30.343 ms oleg651-client.virtnet: 1 12065 1.81 MB/sec warmup 46 sec latency 26.173 ms oleg651-client.virtnet: 1 12280 1.78 MB/sec warmup 47 sec latency 43.839 ms oleg651-client.virtnet: 1 12486 1.74 MB/sec warmup 48 sec latency 29.054 ms oleg651-client.virtnet: 1 12701 1.71 MB/sec warmup 49 sec latency 29.764 ms oleg651-client.virtnet: 1 12858 1.68 MB/sec warmup 50 sec latency 57.386 ms oleg651-client.virtnet: 1 13080 1.68 MB/sec warmup 51 sec latency 37.459 ms oleg651-client.virtnet: 1 13274 1.66 MB/sec warmup 52 sec latency 28.594 ms oleg651-client.virtnet: 1 13628 1.71 MB/sec warmup 53 sec latency 53.627 ms oleg651-client.virtnet: 1 13978 1.72 MB/sec warmup 54 sec latency 34.775 ms oleg651-client.virtnet: 1 14217 1.74 MB/sec warmup 55 sec latency 32.457 ms oleg651-client.virtnet: 1 14369 1.72 MB/sec warmup 56 sec latency 38.762 ms oleg651-client.virtnet: 1 14565 1.71 MB/sec warmup 57 sec latency 37.820 ms oleg651-client.virtnet: 1 14748 1.68 MB/sec warmup 58 sec latency 28.685 ms oleg651-client.virtnet: 1 14933 1.66 MB/sec warmup 59 sec latency 46.800 ms oleg651-client.virtnet: 1 15334 1.22 MB/sec execute 1 sec latency 40.122 ms oleg651-client.virtnet: 1 15589 2.12 MB/sec execute 2 sec latency 30.684 ms oleg651-client.virtnet: 1 15773 1.45 MB/sec execute 3 sec latency 34.180 ms oleg651-client.virtnet: 1 15833 1.10 MB/sec execute 4 sec latency 658.476 ms oleg651-client.virtnet: 1 15963 0.90 MB/sec execute 5 sec latency 985.447 ms oleg651-client.virtnet: 1 16205 0.79 MB/sec execute 6 sec latency 31.272 ms oleg651-client.virtnet: 1 16379 0.71 MB/sec execute 7 sec latency 28.307 ms oleg651-client.virtnet: 1 16582 0.77 MB/sec execute 8 sec latency 35.996 ms oleg651-client.virtnet: 1 16744 0.79 MB/sec execute 9 sec latency 34.473 ms 
oleg651-client.virtnet: 1 17122 1.17 MB/sec execute 10 sec latency 26.489 ms oleg651-client.virtnet: 1 17135 1.07 MB/sec execute 11 sec latency 933.433 ms oleg651-client.virtnet: 1 17135 0.98 MB/sec execute 12 sec latency 1933.731 ms oleg651-client.virtnet: 1 17135 0.90 MB/sec execute 13 sec latency 2934.080 ms oleg651-client.virtnet: 1 17135 0.84 MB/sec execute 14 sec latency 3934.547 ms oleg651-client.virtnet: 1 17135 0.78 MB/sec execute 15 sec latency 4934.803 ms oleg651-client.virtnet: 1 17135 0.73 MB/sec execute 16 sec latency 5935.055 ms oleg651-client.virtnet: 1 17135 0.69 MB/sec execute 17 sec latency 6936.556 ms oleg651-client.virtnet: 1 17135 0.65 MB/sec execute 18 sec latency 7936.815 ms oleg651-client.virtnet: 1 17135 0.62 MB/sec execute 19 sec latency 8937.110 ms oleg651-client.virtnet: 1 17135 0.59Failover mds2 to oleg651-server mount facets: mds2 Starting mds2: -o localrecov /dev/mapper/mds2_flakey /mnt/lustre-mds2 oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1 Started lustre-MDT0001 06:05:10 (1743501910) targets are mounted 06:05:10 (1743501910) facet_failover done oleg651-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0001-mdc-*.mds_server_uuid mdc.lustre-MDT0001-mdc-*.mds_server_uuid in FULL state after 0 sec UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 2388 1285300 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 2060 1285628 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 11916 3594304 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 37576 3568880 2% /mnt/lustre[OST:1] filesystem_summary: 7666232 49492 7163184 1% /mnt/lustre test_70b fail mds1 3 times Failing mds1 on oleg651-server Stopping /mnt/lustre-mds1 (opts:) on oleg651-server 06:05:35 (1743501935) shut down facet: mds1 facet_host: oleg651-server facet_failover_host: 
oleg651-server MB/sec execute 20 sec latency 9937.323 ms oleg651-client.virtnet: 1 17135 0.56 MB/sec execute 21 sec latency 10937.552 ms oleg651-client.virtnet: 1 17135 0.53 MB/sec execute 22 sec latency 11937.770 ms oleg651-client.virtnet: 1 17135 0.51 MB/sec execute 23 sec latency 12938.033 ms oleg651-client.virtnet: 1 17135 0.49 MB/sec execute 24 sec latency 13938.358 ms oleg651-client.virtnet: 1 17135 0.47 MB/sec execute 25 sec latency 14938.561 ms oleg651-client.virtnet: 1 17135 0.45 MB/sec execute 26 sec latency 15938.839 ms oleg651-client.virtnet: 1 17135 0.43 MB/sec execute 27 sec latency 16939.267 ms oleg651-client.virtnet: 1 17135 0.42 MB/sec execute 28 sec latency 17939.528 ms oleg651-client.virtnet: 1 17135 0.40 MB/sec execute 29 sec latency 18939.827 ms oleg651-client.virtnet: 1 17135 0.39 MB/sec execute 30 sec latency 19940.105 ms oleg651-client.virtnet: 1 17135 0.38 MB/sec execute 31 sec latency 20940.446 ms oleg651-client.virtnet: 1 17135 0.37 MB/sec execute 32 sec latency 21940.807 ms oleg651-client.virtnet: 1 17135 0.36 MB/sec execute 33 sec latency 22941.078 ms oleg651-client.virtnet: 1 17135 0.34 MB/sec execute 34 sec latency 23941.286 ms oleg651-client.virtnet: 1 17402 0.36 MB/sec execute 35 sec latency 24053.388 ms oleg651-client.virtnet: 1 17758 0.46 MB/sec execute 36 sec latency 47.143 ms oleg651-client.virtnet: 1 17943 0.49 MB/sec execute 37 sec latency 38.027 ms oleg651-client.virtnet: 1 18090 0.48 MB/sec execute 38 sec latency 30.656 ms oleg651-client.virtnet: 1 18218 0.47 MB/sec execute 39 sec latency 36.929 ms oleg651-client.virtnet: 1 18363 0.46 MB/sec execute 40 sec latency 70.795 ms oleg651-client.virtnet: 1 18531 0.46 MB/sec execute 41 sec latency 37.775 ms oleg651-client.virtnet: 1 18724 0.46 MB/sec execute 42 sec latency 67.105 ms oleg651-client.virtnet: 1 18957 0.47 MB/sec execute 43 sec latency 34.755 ms oleg651-client.virtnet: 1 19190 0.53 MB/sec execute 44 sec latency 45.738 ms oleg651-client.virtnet: 1 19384 0.52 MB/sec 
execute 45 sec latency 34.927 ms oleg651-client.virtnet: 1 19549 0.51 MB/sec execute 46 sec latency 39.572 ms oleg651-client.virtnet: 1 19761 0.51 MB/sec execute 47 sec latency 34.104 ms oleg651-client.virtnet: 1 19957 0.50 MB/sec execute 48 sec latency 31.382 ms oleg651-client.virtnet: 1 20197 0.52 MB/sec execute 49 sec latency 29.721 ms oleg651-client.virtnet: 1 20543 0.59 MB/sec execute 50 sec latency 35.955 ms oleg651-client.virtnet: 1 21015 0.62 MB/sec execute 51 sec latency 26.134 ms oleg651-client.virtnet: 1 21356 0.69 MB/sec execute 52 sec latency 25.964 ms oleg651-client.virtnet: 1 21529 0.70 MB/sec execute 53 sec latency 32.977 ms oleg651-client.virtnet: 1 21654 0.69 MB/sec execute 54 sec latency 40.358 ms oleg651-client.virtnet: 1 21817 0.68 MB/sec execute 55 sec latency 45.327 ms oleg651-client.virtnet: 1 22035 0.67 MB/sec execute 56 sec latency 63.817 ms oleg651-client.virtnet: 1 22239 0.67 MB/sec execute 57 sec latency 27.076 ms oleg651-client.virtnet: 1 22478 0.68 MB/sec execute 58 sec latency 27.871 ms oleg651-client.virtnet: 1 22788 0.72 MB/sec execute 59 sec latency 22.729 ms oleg651-client.virtnet: 1 23003 0.71 MB/sec execute 60 sec latency 33.093 ms oleg651-client.virtnet: 1 23254 0.70 MB/sec execute 61 sec latency 25.390 ms oleg651-client.virtnet: 1 23493 0.70 MB/sec execute 62 sec latency 26.628 ms oleg651-client.virtnet: 1 23773 0.71 MB/sec execute 63 sec latency 25.317 ms oleg651-client.virtnet: 1 24205 0.76 MB/sec execute 64 sec latency 28.348 ms oFailover mds1 to oleg651-server mount facets: mds1 Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1 Started lustre-MDT0000 06:05:57 (1743501957) targets are mounted 06:05:57 (1743501957) facet_failover done oleg651-client.virtnet: executing wait_import_state_mount (FULL|IDLE) 
mdc.lustre-MDT0000-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 2388 1285300 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 2088 1285600 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 12028 3593956 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 37588 3566756 2% /mnt/lustre[OST:1] filesystem_summary: 7666232 49616 7160712 1% /mnt/lustre test_70b fail mds2 4 times Failing mds2 on oleg651-server Stopping /mnt/lustre-mds2 (opts:) on oleg651-server 06:06:22 (1743501982) shut down facet: mds2 facet_host: oleg651-server facet_failover_host: oleg651-server leg651-client.virtnet: 1 24616 0.80 MB/sec execute 65 sec latency 34.607 ms oleg651-client.virtnet: 1 24983 0.84 MB/sec execute 66 sec latency 18.195 ms oleg651-client.virtnet: 1 25221 0.84 MB/sec execute 67 sec latency 25.931 ms oleg651-client.virtnet: 1 25432 0.83 MB/sec execute 68 sec latency 30.993 ms oleg651-client.virtnet: 1 25659 0.83 MB/sec execute 69 sec latency 73.209 ms oleg651-client.virtnet: 1 25864 0.82 MB/sec execute 70 sec latency 33.052 ms oleg651-client.virtnet: 1 26114 0.83 MB/sec execute 71 sec latency 27.856 ms oleg651-client.virtnet: 1 26414 0.86 MB/sec execute 72 sec latency 40.988 ms oleg651-client.virtnet: 1 26606 0.85 MB/sec execute 73 sec latency 25.930 ms oleg651-client.virtnet: 1 26805 0.84 MB/sec execute 74 sec latency 41.292 ms oleg651-client.virtnet: 1 27034 0.83 MB/sec execute 75 sec latency 32.907 ms oleg651-client.virtnet: 1 27264 0.84 MB/sec execute 76 sec latency 54.890 ms oleg651-client.virtnet: 1 27482 0.84 MB/sec execute 77 sec latency 34.707 ms oleg651-client.virtnet: 1 28069 0.90 MB/sec execute 78 sec latency 23.556 ms oleg651-client.virtnet: 1 28452 0.94 MB/sec execute 79 sec latency 30.265 ms oleg651-client.virtnet: 1 28702 0.94 MB/sec execute 80 sec latency 21.951 ms oleg651-client.virtnet: 1 28967 0.93 MB/sec execute 81 sec latency 16.712 ms 
oleg651-client.virtnet: 1 29164 0.93 MB/sec execute 82 sec latency 71.730 ms oleg651-client.virtnet: 1 29327 0.92 MB/sec execute 83 sec latency 71.235 ms oleg651-client.virtnet: 1 29509 0.92 MB/sec execute 84 sec latency 30.048 ms oleg651-client.virtnet: 1 29691 0.92 MB/sec execute 85 sec latency 26.637 ms oleg651-client.virtnet: 1 29936 0.94 MB/sec execute 86 sec latency 35.931 ms oleg651-client.virtnet: 1 30097 0.93 MB/sec execute 87 sec latency 35.700 ms oleg651-client.virtnet: 1 30228 0.92 MB/sec execute 88 sec latency 47.352 ms oleg651-client.virtnet: 1 30410 0.91 MB/sec execute 89 sec latency 34.993 ms oleg651-client.virtnet: 1 30630 0.91 MB/sec execute 90 sec latency 37.070 ms oleg651-client.virtnet: 1 30860 0.91 MB/sec execute 91 sec latency 36.681 ms oleg651-client.virtnet: 1 31290 0.94 MB/sec execute 92 sec latency 29.888 ms oleg651-client.virtnet: 1 31686 0.97 MB/sec execute 93 sec latency 27.703 ms oleg651-client.virtnet: 1 31992 0.99 MB/sec execute 94 sec latency 26.569 ms oleg651-client.virtnet: 1 32207 0.99 MB/sec execute 95 sec latency 33.882 ms oleg651-client.virtnet: 1 32416 0.99 MB/sec execute 96 sec latency 22.037 ms oleg651-client.virtnet: 1 32641 0.98 MB/sec execute 97 sec latency 51.787 ms oleg651-client.virtnet: 1 32772 0.97 MB/sec execute 98 sec latency 279.515 ms oleg651-client.virtnet: 1 32957 0.97 MB/sec execute 99 sec latency 25.060 ms oleg651-client.virtnet: 1 33128 0.97 MB/sec execute 100 sec latency 53.713 ms oleg651-client.virtnet: 1 33362 0.99 MB/sec execute 101 sec latency 30.398 ms oleg651-client.virtnet: 1 33534 0.98 MB/sec execute 102 sec latency 36.581 ms oleg651-client.virtnet: 1 33689 0.97 MB/sec execute 103 sec latency 177.048 ms oleg651-client.virtnet: 1 33689 0.96 MB/sec execute 104 sec latency 1177.330 ms oleg651-client.virtnet: 1 33689 0.95 MB/sec execute 105 sec latency 2177.624 ms oleg651-client.virtnet: 1 33689 0.94 MB/sec execute 106 sec latency 3177.882 ms oleg651-client.virtnet: 1 33689 0.94 MB/sec execute 107 sec 
latency 4178.190 ms oleg651-client.virtnet: 1 33689 0.93 MB/sec execute 108 sec latency 5178.529 ms oleg651-client.virtnet: 1 33689 0.92 MB/sec execute 109 sec latency Failover mds2 to oleg651-server mount facets: mds2 Starting mds2: -o localrecov /dev/mapper/mds2_flakey /mnt/lustre-mds2 oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1 Started lustre-MDT0001 06:06:43 (1743502003) targets are mounted 06:06:43 (1743502003) facet_failover done oleg651-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0001-mdc-*.mds_server_uuid mdc.lustre-MDT0001-mdc-*.mds_server_uuid in FULL state after 0 sec UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 2388 1285300 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 2064 1285624 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 11912 3594200 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 37640 3568652 2% /mnt/lustre[OST:1] filesystem_summary: 7666232 49552 7162852 1% /mnt/lustre test_70b fail mds1 5 times Failing mds1 on oleg651-server Stopping /mnt/lustre-mds1 (opts:) on oleg651-server 06:07:11 (1743502031) shut down facet: mds1 facet_host: oleg651-server facet_failover_host: oleg651-server 6178.781 ms oleg651-client.virtnet: 1 33689 0.91 MB/sec execute 110 sec latency 7179.019 ms oleg651-client.virtnet: 1 33689 0.90 MB/sec execute 111 sec latency 8179.338 ms oleg651-client.virtnet: 1 33689 0.89 MB/sec execute 112 sec latency 9179.634 ms oleg651-client.virtnet: 1 33689 0.89 MB/sec execute 113 sec latency 10179.965 ms oleg651-client.virtnet: 1 33689 0.88 MB/sec execute 114 sec latency 11181.265 ms oleg651-client.virtnet: 1 33689 0.87 MB/sec execute 115 sec latency 12181.617 ms oleg651-client.virtnet: 1 33689 0.86 MB/sec execute 116 sec latency 13181.905 ms oleg651-client.virtnet: 1 33689 0.86 MB/sec execute 117 sec latency 
14182.159 ms oleg651-client.virtnet: 1 33689 0.85 MB/sec execute 118 sec latency 15182.427 ms oleg651-client.virtnet: 1 33689 0.84 MB/sec execute 119 sec latency 16182.643 ms oleg651-client.virtnet: 1 33689 0.83 MB/sec execute 120 sec latency 17182.840 ms oleg651-client.virtnet: 1 33689 0.83 MB/sec execute 121 sec latency 18183.161 ms oleg651-client.virtnet: 1 33689 0.82 MB/sec execute 122 sec latency 19183.442 ms oleg651-client.virtnet: 1 33689 0.81 MB/sec execute 123 sec latency 20183.667 ms oleg651-client.virtnet: 1 33689 0.81 MB/sec execute 124 sec latency 21183.975 ms oleg651-client.virtnet: 1 33689 0.80 MB/sec execute 125 sec latency 22184.343 ms oleg651-client.virtnet: 1 33689 0.79 MB/sec execute 126 sec latency 23184.590 ms oleg651-client.virtnet: 1 33689 0.79 MB/sec execute 127 sec latency 24184.801 ms oleg651-client.virtnet: 1 33742 0.78 MB/sec execute 128 sec latency 24563.371 ms oleg651-client.virtnet: 1 33904 0.78 MB/sec execute 129 sec latency 51.951 ms oleg651-client.virtnet: 1 34065 0.77 MB/sec execute 130 sec latency 41.198 ms oleg651-client.virtnet: 1 34249 0.77 MB/sec execute 131 sec latency 26.774 ms oleg651-client.virtnet: 1 34390 0.77 MB/sec execute 132 sec latency 40.059 ms oleg651-client.virtnet: 1 34560 0.77 MB/sec execute 133 sec latency 34.891 ms oleg651-client.virtnet: 1 34876 0.80 MB/sec execute 134 sec latency 48.723 ms oleg651-client.virtnet: 1 35152 0.80 MB/sec execute 135 sec latency 33.681 ms oleg651-client.virtnet: 1 35467 0.83 MB/sec execute 136 sec latency 38.048 ms oleg651-client.virtnet: 1 35643 0.82 MB/sec execute 137 sec latency 41.335 ms oleg651-client.virtnet: 1 35831 0.82 MB/sec execute 138 sec latency 30.981 ms oleg651-client.virtnet: 1 35953 0.82 MB/sec execute 139 sec latency 51.158 ms oleg651-client.virtnet: 1 36156 0.81 MB/sec execute 140 sec latency 25.269 ms oleg651-client.virtnet: 1 36257 0.81 MB/sec execute 141 sec latency 41.807 ms oleg651-client.virtnet: 1 36394 0.81 MB/sec execute 142 sec latency 51.025 ms 
oleg651-client.virtnet: 1 36568 0.81 MB/sec execute 143 sec latency 33.999 ms oleg651-client.virtnet: 1 36767 0.81 MB/sec execute 144 sec latency 25.234 ms oleg651-client.virtnet: 1 37053 0.82 MB/sec execute 145 sec latency 27.475 ms oleg651-client.virtnet: 1 37214 0.82 MB/sec execute 146 sec latency 40.491 ms oleg651-client.virtnet: 1 37406 0.81 MB/sec execute 147 sec latency 27.097 ms oleg651-client.virtnet: 1 37538 0.81 MB/sec execute 148 sec latency 44.281 ms oleg651-client.virtnet: 1 37681 0.80 MB/sec execute 149 sec latency 33.101 ms oleg651-client.virtnet: 1 37856 0.81 MB/sec execute 150 sec latency 49.746 ms oleg651-client.virtnet: 1 37962 0.80 MB/sec execute 151 sec latency 42.744 ms oleg651-client.virtnet: 1 38275 0.82 MB/sec execute 152 sec latency 36.062 ms oleg651-client.virtnet: 1 38701 0.83 MB/sec execute 153 sec latency 28.566 ms oleg651-client.virtnet: Failover mds1 to oleg651-server mount facets: mds1 Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1 Started lustre-MDT0000 06:07:41 (1743502061) targets are mounted 06:07:41 (1743502061) facet_failover done oleg651-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 2388 1285300 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 2088 1285600 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 11912 3593956 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 37632 3568628 2% /mnt/lustre[OST:1] filesystem_summary: 7666232 49544 7162584 1% /mnt/lustre 1 39018 0.85 MB/sec execute 154 sec latency 33.542 ms oleg651-client.virtnet: 1 39268 0.86 MB/sec execute 155 sec latency 20.889 ms 
oleg651-client.virtnet: 1 39476 0.85 MB/sec execute 156 sec latency 22.045 ms oleg651-client.virtnet: 1 39723 0.85 MB/sec execute 157 sec latency 21.570 ms oleg651-client.virtnet: 1 39911 0.85 MB/sec execute 158 sec latency 79.159 ms oleg651-client.virtnet: 1 40140 0.85 MB/sec execute 159 sec latency 21.088 ms oleg651-client.virtnet: 1 40457 0.86 MB/sec execute 160 sec latency 21.282 ms oleg651-client.virtnet: 1 40671 0.86 MB/sec execute 161 sec latency 26.362 ms oleg651-client.virtnet: 1 40864 0.85 MB/sec execute 162 sec latency 23.842 ms oleg651-client.virtnet: 1 41123 0.85 MB/sec execute 163 sec latency 35.981 ms oleg651-client.virtnet: 1 41412 0.85 MB/sec execute 164 sec latency 31.784 ms oleg651-client.virtnet: 1 41549 0.85 MB/sec execute 165 sec latency 47.093 ms oleg651-client.virtnet: 1 41976 0.88 MB/sec execute 166 sec latency 28.055 ms oleg651-client.virtnet: 1 42397 0.88 MB/sec execute 167 sec latency 32.862 ms oleg651-client.virtnet: 1 42678 0.90 MB/sec execute 168 sec latency 34.200 ms oleg651-client.virtnet: 1 42867 0.90 MB/sec execute 169 sec latency 32.128 ms oleg651-client.virtnet: 1 43044 0.89 MB/sec execute 170 sec latency 28.020 ms oleg651-client.virtnet: 1 43222 0.89 MB/sec execute 171 sec latency 30.072 ms oleg651-client.virtnet: 1 43371 0.89 MB/sec execute 172 sec latency 72.918 ms oleg651-client.virtnet: 1 43577 0.88 MB/sec execute 173 sec latency 36.518 ms oleg651-client.virtnet: 1 43815 0.89 MB/sec execute 174 sec latency 22.499 ms oleg651-client.virtnet: 1 44093 0.90 MB/sec execute 175 sec latency 35.873 ms oleg651-client.virtnet: 1 44299 0.89 MB/sec execute 176 sec latency 25.123 ms oleg651-client.virtnet: 1 44459 0.89 MB/sec execute 177 sec latency 33.723 ms oleg651-client.virtnet: 1 44695 0.89 MB/sec execute 178 sec latency 37.424 ms oleg651-client.virtnet: 1 44944 0.89 MB/sec execute 179 sec latency 29.237 ms oleg651-client.virtnet: 1 45076 0.89 MB/sec execute 180 sec latency 37.655 ms oleg651-client.virtnet: 1 45498 0.91 MB/sec 
execute 181 sec latency 47.050 ms oleg651-client.virtnet: 1 45915 0.92 MB/sec execute 182 sec latency 26.351 ms oleg651-client.virtnet: 1 46216 0.93 MB/sec execute 183 sec latency 23.994 ms oleg651-client.virtnet: 1 46439 0.93 MB/sec execute 184 sec latency 24.590 ms oleg651-client.virtnet: 1 46610 0.93 MB/sec execute 185 sec latency 32.762 ms oleg651-client.virtnet: 1 46772 0.92 MB/sec execute 186 sec latency 48.015 ms oleg651-client.virtnet: 1 46935 0.92 MB/sec execute 187 sec latency 68.836 ms oleg651-client.virtnet: 1 47061 0.91 MB/sec execute 188 sec latency 62.419 ms oleg651-client.virtnet: 1 47213 0.92 MB/sec execute 189 sec latency 31.786 ms oleg651-client.virtnet: 1 47383 0.91 MB/sec execute 190 sec latency 28.679 ms oleg651-client.virtnet: 1 47589 0.92 MB/sec execute 191 sec latency 36.704 ms oleg651-client.virtnet: 1 47725 0.92 MB/sec execute 192 sec latency 54.948 ms oleg651-client.virtnet: 1 47859 0.91 MB/sec execute 193 sec latency 36.897 ms oleg651-client.virtnet: 1 48093 0.91 MB/sec execute 194 sec latency 19.577 ms oleg651-client.virtnet: 1 48300 0.91 MB/sec execute 195 sec latency 31.923 ms oleg651-client.virtnet: 1 48582 0.91 MB/sec execute 196 sec latency 32.704 ms oleg651-client.virtnet: 1 48909 0.93 MB/sec execute 197 sec latency 29.379 ms oleg651-client.virtnet: 1 49332 0.93 MB/sec execute 198 sec latency 34.883 ms oleg651-client.virtnet: test_70b fail mds2 6 times Failing mds2 on oleg651-server Stopping /mnt/lustre-mds2 (opts:) on oleg651-server 06:08:06 (1743502086) shut down facet: mds2 facet_host: oleg651-server facet_failover_host: oleg651-server Failover mds2 to oleg651-server mount facets: mds2 Starting mds2: -o localrecov /dev/mapper/mds2_flakey /mnt/lustre-mds2 oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1 Started lustre-MDT0001 06:08:26 (1743502106) targets are mounted 06:08:26 
(1743502106) facet_failover done oleg651-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0001-mdc-*.mds_server_uuid mdc.lustre-MDT0001-mdc-*.mds_server_uuid in FULL state after 0 sec 1 49709 0.95 MB/sec execute 199 sec latency 26.229 ms oleg651-client.virtnet: 1 49923 0.95 MB/sec execute 200 sec latency 28.432 ms oleg651-client.virtnet: 1 50122 0.95 MB/sec execute 201 sec latency 29.767 ms oleg651-client.virtnet: 1 50309 0.94 MB/sec execute 202 sec latency 32.494 ms oleg651-client.virtnet: 1 50476 0.94 MB/sec execute 203 sec latency 72.269 ms oleg651-client.virtnet: 1 50613 0.94 MB/sec execute 204 sec latency 33.505 ms oleg651-client.virtnet: 1 50771 0.94 MB/sec execute 205 sec latency 38.870 ms oleg651-client.virtnet: 1 50926 0.93 MB/sec execute 206 sec latency 31.661 ms oleg651-client.virtnet: 1 51150 0.94 MB/sec execute 207 sec latency 229.492 ms oleg651-client.virtnet: 1 51150 0.94 MB/sec execute 208 sec latency 1229.784 ms oleg651-client.virtnet: 1 51150 0.94 MB/sec execute 209 sec latency 2230.063 ms oleg651-client.virtnet: 1 51150 0.93 MB/sec execute 210 sec latency 3230.390 ms oleg651-client.virtnet: 1 51150 0.93 MB/sec execute 211 sec latency 4230.720 ms oleg651-client.virtnet: 1 51150 0.92 MB/sec execute 212 sec latency 5231.033 ms oleg651-client.virtnet: 1 51150 0.92 MB/sec execute 213 sec latency 6231.364 ms oleg651-client.virtnet: 1 51150 0.91 MB/sec execute 214 sec latency 7231.670 ms oleg651-client.virtnet: 1 51150 0.91 MB/sec execute 215 sec latency 8231.883 ms oleg651-client.virtnet: 1 51150 0.90 MB/sec execute 216 sec latency 9232.159 ms oleg651-client.virtnet: 1 51150 0.90 MB/sec execute 217 sec latency 10232.445 ms oleg651-client.virtnet: 1 51150 0.90 MB/sec execute 218 sec latency 11232.689 ms oleg651-client.virtnet: 1 51150 0.89 MB/sec execute 219 sec latency 12232.911 ms oleg651-client.virtnet: 1 51150 0.89 MB/sec execute 220 sec latency 13233.181 ms oleg651-client.virtnet: 1 51150 0.88 MB/sec execute 221 sec 
latency 14233.458 ms oleg651-client.virtnet: 1 51150 0.88 MB/sec execute 222 sec latency 15233.735 ms oleg651-client.virtnet: 1 51150 0.88 MB/sec execute 223 sec latency 16233.974 ms oleg651-client.virtnet: 1 51150 0.87 MB/sec execute 224 sec latency 17234.280 ms oleg651-client.virtnet: 1 51150 0.87 MB/sec execute 225 sec latency 18234.543 ms oleg651-client.virtnet: 1 51150 0.86 MB/sec execute 226 sec latency 19234.857 ms oleg651-client.virtnet: 1 51150 0.86 MB/sec execute 227 sec latency 20235.063 ms oleg651-client.virtnet: 1 51150 0.86 MB/sec execute 228 sec latency 21235.283 ms oleg651-client.virtnet: 1 51150 0.85 MB/sec execute 229 sec latency 22235.534 ms oleg651-client.virtnet: 1 51150 0.85 MB/sec execute 230 sec latency 23235.745 ms oleg651-client.virtnet: 1 51150 0.85 MB/sec execute 231 sec latency 24235.947 ms oleg651-client.virtnet: 1 51256 0.84 MB/sec execute 232 sec latency 24439.404 ms oleg651-client.virtnet: 1 51407 0.84 MB/sec execute 233 sec latency 48.133 ms oleg651-client.virtnet: 1 51562 0.84 MB/sec execute 234 sec latency 39.410 ms oleg651-client.virtnet: 1 51754 0.83 MB/sec execute 235 sec latency 29.786 ms oleg651-client.virtnet: 1 51944 0.83 MB/sec execute 236 sec latency 25.679 ms oleg651-client.virtnet: 1 52118 0.83 MB/sec execute 237 sec latency 41.574 ms oleg651-client.virtnet: 1 52287 0.83 MB/sec execute 238 sec latency 33.268 ms oleg651-client.virtnet: 1 52677 0.85 MB/sec execute 239 sec latency 36.238 ms oleg651-client.virtnet: 1 53189 0.87 MB/sec execute 240 sec latency 24.800 ms oleg651-client.virtnet: 1 53386 0.86 MB/sec execute 241 sec latency 26.102 ms oleg651-client.virtnet: 1 53629 0.87 MB/sec execute 242 sec latency 31.107 ms oleg651-client.virtnet: 1 53807 0.86 MB/sec execute 243 sec latency 27.500 ms oleg651-client.virtnet: 1 53995 0.86 MB/sec execute 244 sec latency 78.870 ms oleg651-client.virtnet: 1 54198 0.86 MB/sec execute 245 sec latency 24.338 ms oleg651-client.virtnet: 1 54465 0.86 MB/sec execute 246 sec latency 
23.991 ms oleg651-client.virtnet: 1 54759 0.87 MB/sec execute 247 sec latency 36.507 ms oleg651-client.virtnet: 1 54967 0.87 MB/sec execute 248 sec latency 24.275 ms oleg651-client.virtnet: 1 55192 0.86 MB/sec execute 249 sec latency 41.703 ms oleg651-client.virtnet: 1 55537 0.87 MB/sec execute 250 sec latency 15.833 ms oleg651-client.virtnet: 1 55821 0.87 MB/sec execute 251 sec latency 32.773 ms oleg651-client.virtnet: 1 56411 0.88 MB/sec execute 252 sec latency 25.450 ms oleg651-client.virtnet: 1 56800 0.90 MB/sec execute 253 sec latency 24.979 ms oleg651-client.virtnet: 1 57064 0.90 MB/sec execute 254 sec latency 19.090 ms oleg651-client.virtnet: 1 57303 0.90 MB/sec execute 255 sec latency 19.960 ms oleg651-client.virtnet: 1 57534 0.89 MB/sec execute 256 sec latency 65.320 ms oleg651-client.virtnet: 1 57775 0.89 MB/sec execute 257 sec latency 23.571 ms oleg651-client.virtnet: 1 58020 0.89 MB/sec execute 258 sec latency 21.847 ms oleg651-client.virtnet: 1 58325 0.90 MB/sec execute 259 sec latency 35.899 ms oleg651-client.virtnet: 1 58545 0.90 MB/sec execute 260 sec latency 21.989 ms oleg651-client.virtnet: 1 58790 0.90 MB/sec execute 261 sec latency 25.662 ms oleg651-client.virtnet: 1 59108 0.90 MB/sec execute 262 sec latency 20.476 ms oleg651-client.virtnet: 1 59345 0.90 MB/sec execute 263 sec latency 34.855 ms oleg651-client.virtnet: 1 59751 0.91 MB/sec execute 264 sec latency 33.708 ms oleg651-client.virtnet: 1 60160 0.92 MB/sec execute 265 sec latency 41.470 ms oleg651-client.virtnet: 1 60458 0.93 MB/sec execute 266 sec latency 26.244 ms oleg651-client.virtnet: 1 60678 0.93 MB/sec execute 267 sec latency 29.494 ms oleg651-client.virtnet: 1 60980 0.92 MB/sec execute 268 sec latency 17.763 ms oleg651-client.virtnet: 1 61164 0.92 MB/sec execute 269 sec latency 76.739 ms oleg651-client.virtnet: 1 61401 0.92 MB/sec execute 270 sec latency 34.654 ms oleg651-client.virtnet: 1 61739 0.93 MB/sec execute 271 sec latency 22.037 ms oleg651-client.virtnet: 1 61940 0.93 
MB/sec execute 272 sec latency 56.139 ms oleg651-client.virtnet: 1 62132 0.93 MB/sec execute 273 sec latency 35.724 ms oleg651-client.virtnet: 1 62375 0.92 MB/sec execute 274 sec latency 30.270 ms oleg651-client.virtnet: 1 62683 0.93 MB/sec execute 275 sec latency 28.761 ms oleg651-client.virtnet: 1 62896 0.93 MB/sec execute 276 sec latency 27.478 ms oleg651-client.virtnet: 1 63258 0.94 MB/sec execute 277 sec latency 38.404 ms oleg651-client.virtnet: 1 63689 0.94 MB/sec execute 278 sec latency 27.428 ms oleg651-client.virtnet: 1 64013 0.95 MB/sec execute 279 sec latency 19.988 ms oleg651-client.virtnet: 1 64217 0.95 MB/sec execute 280 sec latency 31.611 ms oleg651-client.virtnet: 1 64428 0.95 MB/sec execute 281 sec latency 29.326 ms oleg651-client.virtnet: 1 64675 0.95 MB/sec execute 282 sec latency 61.589 ms oleg651-client.virtnet: 1 64938 0.95 MB/sec execute 283 sec latency 21.804 ms oleg651-client.virtnet: 1 65265 0.96 MB/sec execute 284 sec latency 22.570 ms oleg651-client.virtnet: 1 65469 0.95 MB/sec execute 285 sec latency 32.759 ms oleg651-client.virtnet: 1 65721 0.95 MB/sec execute 286 sec latency 20.265 ms oleg651-client.virtnet: 1 66019 0.95 MB/sec execute 287 sec latency 21.689 ms oleg651-client.virtnet: 1 66300 0.95 MB/sec execute 288 sec latency 23.843 ms oleg651-client.virtnet: 1 66642 0.96 MB/sec execute 289 sec latency 35.274 ms oleg651-client.virtnet: 1 67173 0.97 MB/sec execute 290 sec latency 28.278 ms oleg651-client.virtnet: 1 67515 0.98 MB/sec execute 291 sec latency 22.340 ms oleg651-client.virtnet: 1 67763 0.98 MB/sec execute 292 sec latency 22.088 ms oleg651-client.virtnet: 1 67973 0.98 MB/sec execute 293 sec latency 21.358 ms oleg651-client.virtnet: 1 68167 0.97 MB/sec execute 294 sec latency 77.174 ms oleg651-client.virtnet: 1 68352 0.97 MB/sec execute 295 sec latency 26.173 ms oleg651-client.virtnet: 1 68616 0.97 MB/sec execute 296 sec latency 26.308 ms oleg651-client.virtnet: 1 68929 0.98 MB/sec execute 297 sec latency 30.328 ms 
oleg651-client.virtnet: 1 69151 0.98 MB/sec execute 298 sec latency 23.782 ms
oleg651-client.virtnet: 1 69400 0.97 MB/sec execute 299 sec latency 26.404 ms
oleg651-client.virtnet: 1 cleanup 300 sec
oleg651-client.virtnet: 0 cleanup 301 sec
oleg651-client.virtnet:
oleg651-client.virtnet:  Operation      Count    AvgLat    MaxLat
oleg651-client.virtnet:  ----------------------------------------
oleg651-client.virtnet:  NTCreateX       9468    18.098 24563.332
oleg651-client.virtnet:  Close           6976     2.466    26.119
oleg651-client.virtnet:  Rename           398    19.037    62.291
oleg651-client.virtnet:  Unlink          1881     5.280    30.031
oleg651-client.virtnet:  Qpathinfo       8609     6.198 24053.355
oleg651-client.virtnet:  Qfileinfo       1498     0.484     5.770
oleg651-client.virtnet:  Qfsinfo         1548     0.225    16.726
oleg651-client.virtnet:  Sfileinfo        771    12.724    40.080
oleg651-client.virtnet:  Find            3299     1.651    70.764
oleg651-client.virtnet:  WriteX          4656     2.333    42.368
oleg651-client.virtnet:  ReadX          14721     0.056    11.520
oleg651-client.virtnet:  LockX             30     2.912     7.726
oleg651-client.virtnet:  UnlockX           30     2.469     4.582
oleg651-client.virtnet:  Flush            654    17.807   985.428
oleg651-client.virtnet:
oleg651-client.virtnet: Throughput 0.974758 MB/sec 1 clients 1 procs max_latency=24563.371 ms
oleg651-client.virtnet: stopping dbench on /mnt/lustre/d70b.replay-single/oleg651-client.virtnet at Tue Apr 1 06:09:38 EDT 2025 with return code 0
oleg651-client.virtnet: clean dbench files on /mnt/lustre/d70b.replay-single/oleg651-client.virtnet
oleg651-client.virtnet: /mnt/lustre/d70b.replay-single/oleg651-client.virtnet /mnt/lustre/d70b.replay-single/oleg651-client.virtnet
oleg651-client.virtnet: removed directory 'clients/client0/~dmtmp/COREL'
oleg651-client.virtnet: removed directory 'clients/client0/~dmtmp/SEED'
oleg651-client.virtnet: removed directory 'clients/client0/~dmtmp/WORD'
oleg651-client.virtnet: removed directory 'clients/client0/~dmtmp/EXCEL'
oleg651-client.virtnet: removed directory 'clients/client0/~dmtmp/PARADOX'
oleg651-client.virtnet: removed directory 'clients/client0/~dmtmp/WORDPRO'
oleg651-client.virtnet: removed directory 'clients/client0/~dmtmp/PM'
oleg651-client.virtnet: removed directory 'clients/client0/~dmtmp/PWRPNT'
oleg651-client.virtnet: removed directory 'clients/client0/~dmtmp/ACCESS'
oleg651-client.virtnet: removed directory 'clients/client0/~dmtmp'
oleg651-client.virtnet: removed directory 'clients/client0'
oleg651-client.virtnet: removed directory 'clients'
oleg651-client.virtnet: removed 'client.txt'
oleg651-client.virtnet: /mnt/lustre/d70b.replay-single/oleg651-client.virtnet
oleg651-client.virtnet: dbench successfully finished
PASS 70b (372s)
== replay-single test 70c: tar 2mdts recovery ============ 06:09:45 (1743502185)
Starting client oleg651-client.virtnet: -o user_xattr,flock 192.168.206.151@tcp:/lustre /mnt/lustre
Started clients oleg651-client.virtnet:
192.168.206.151@tcp:/lustre on /mnt/lustre type lustre (rw,checksum,flock,user_xattr,lruresize,lazystatfs,nouser_fid2path,verbose,encrypt,statfs_project)
Started tar 139902
striped dir -i0 -c2 -H crush /mnt/lustre/d70c.replay-single
tar: Removing leading `/' from member names
tar: Removing leading `/' from hard link targets
tar: Removing leading `/' from member names
tar: Removing leading `/' from hard link targets
tar: Removing leading `/' from member names
tar: Removing leading `/' from hard link targets
tar: Removing leading `/' from member names
tar: Removing leading `/' from hard link targets
UUID                   1K-blocks        Used   Available Use% Mounted on
lustre-MDT0000_UUID      1414116        5324     1282364   1% /mnt/lustre[MDT:0]
lustre-MDT0001_UUID      1414116        4980     1282708   1% /mnt/lustre[MDT:1]
lustre-OST0000_UUID      3833116       10412     3587416   1% /mnt/lustre[OST:0]
lustre-OST0001_UUID      3833116       11572     3587316   1% /mnt/lustre[OST:1]
filesystem_summary:      7666232       21984     7174732   1% /mnt/lustre
test_70c fail mds1 1 times
Failing mds1 on oleg651-server
Stopping /mnt/lustre-mds1 (opts:) on oleg651-server
06:12:09 (1743502329) shut down
facet: mds1
facet_host: oleg651-server
facet_failover_host: oleg651-server
Failover mds1 to oleg651-server
mount facets: mds1
Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1
Started lustre-MDT0000
06:12:31 (1743502351) targets are mounted
06:12:31 (1743502351) facet_failover done
tar: Removing leading `/' from member names
tar: Removing leading `/' from hard link targets
oleg651-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid
mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec
tar: Removing leading `/' from member names
tar: Removing leading `/' from hard link targets
tar: Removing leading `/' from member names
tar: Removing leading `/' from hard link targets
tar: Removing leading `/' from member names
tar: Removing leading `/' from hard link targets
UUID                   1K-blocks        Used   Available Use% Mounted on
lustre-MDT0000_UUID      1414116        5592     1282096   1% /mnt/lustre[MDT:0]
lustre-MDT0001_UUID      1414116        5232     1282456   1% /mnt/lustre[MDT:1]
lustre-OST0000_UUID      3833116        9768     3587564   1% /mnt/lustre[OST:0]
lustre-OST0001_UUID      3833116        9764     3588072   1% /mnt/lustre[OST:1]
filesystem_summary:      7666232       19532     7175636   1% /mnt/lustre
tar: Removing leading `/' from member names
tar: Removing leading `/' from hard link targets
test_70c fail mds1 2 times
Failing mds1 on oleg651-server
Stopping /mnt/lustre-mds1 (opts:) on oleg651-server
06:15:21 (1743502521) shut down
facet: mds1
facet_host: oleg651-server
facet_failover_host: oleg651-server
Failover mds1 to oleg651-server
mount facets: mds1
Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1
Started lustre-MDT0000
06:15:51 (1743502551) targets are mounted
06:15:51 (1743502551) facet_failover done
oleg651-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid
mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec
PASS 70c (409s)
== replay-single test 70d: mkdir/rmdir striped dir 2mdts recovery ========================================================== 06:16:34 (1743502594)
Starting client oleg651-client.virtnet: -o user_xattr,flock 192.168.206.151@tcp:/lustre /mnt/lustre
Started clients oleg651-client.virtnet:
192.168.206.151@tcp:/lustre on /mnt/lustre type lustre (rw,checksum,flock,user_xattr,lruresize,lazystatfs,nouser_fid2path,verbose,encrypt,statfs_project)
Started 142808
UUID                   1K-blocks        Used   Available Use% Mounted on
lustre-MDT0000_UUID      1414116        4348     1283340   1% /mnt/lustre[MDT:0]
lustre-MDT0001_UUID      1414116        4004     1283684   1% /mnt/lustre[MDT:1]
lustre-OST0000_UUID      3833116        1576     3604800   1% /mnt/lustre[OST:0]
lustre-OST0001_UUID      3833116        1572     3604776   1% /mnt/lustre[OST:1]
filesystem_summary:      7666232        3148     7209576   1% /mnt/lustre
test_70d fail mds2 1 times
Failing mds2 on oleg651-server
Stopping /mnt/lustre-mds2 (opts:) on oleg651-server
06:18:59 (1743502739) shut down
facet: mds2
facet_host: oleg651-server
facet_failover_host: oleg651-server
Failover mds2 to oleg651-server
mount facets: mds2
Starting mds2: -o localrecov /dev/mapper/mds2_flakey /mnt/lustre-mds2
oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1
Started lustre-MDT0001
06:19:19 (1743502759) targets are mounted
06:19:19 (1743502759) facet_failover done
oleg651-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0001-mdc-*.mds_server_uuid
mdc.lustre-MDT0001-mdc-*.mds_server_uuid in FULL state after 0 sec
UUID                   1K-blocks        Used   Available Use% Mounted on
lustre-MDT0000_UUID      1414116        4536     1283152   1% /mnt/lustre[MDT:0]
lustre-MDT0001_UUID      1414116        4336     1283352   1% /mnt/lustre[MDT:1]
lustre-OST0000_UUID      3833116        1576     3604800   1% /mnt/lustre[OST:0]
lustre-OST0001_UUID      3833116        1572     3604776   1% /mnt/lustre[OST:1]
filesystem_summary:      7666232        3148     7209576   1% /mnt/lustre
test_70d fail mds1 2 times
Failing mds1 on oleg651-server
Stopping /mnt/lustre-mds1 (opts:) on oleg651-server
06:22:00 (1743502920) shut down
facet: mds1
facet_host: oleg651-server
facet_failover_host: oleg651-server
Failover mds1 to oleg651-server
mount facets: mds1
Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1
Started lustre-MDT0000
06:22:29 (1743502949) targets are mounted
06:22:29 (1743502949) facet_failover done
oleg651-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid
mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec
/home/green/git/lustre-release/lustre/tests/test-framework.sh: line 4866: 142808 Killed ( while true; do $LFS mkdir -i0 -c2 $DIR/$tdir/test || { echo "mkdir fails"; break; }; $LFS mkdir -i1 -c2 $DIR/$tdir/test1 || { echo "mkdir fails"; break; }; touch $DIR/$tdir/test/a || { echo "touch fails"; break; }; mkdir $DIR/$tdir/test/b || { echo "mkdir fails"; break; }; rm -rf $DIR/$tdir/test || { echo "rmdir fails"; ls -lR $DIR/$tdir; break; }; touch $DIR/$tdir/test1/a || { echo "touch fails"; break; }; mkdir $DIR/$tdir/test1/b || { echo "mkdir fails"; break; }; rm -rf $DIR/$tdir/test1 || { echo "rmdir fails"; ls -lR $DIR/$tdir/test1; break; }; done ) (wd: ~)
PASS 70d (376s)
== replay-single test 70e: rename cross-MDT with random fails ========================================================== 06:22:50 (1743502970)
debug=+ha
Starting client oleg651-client.virtnet: -o user_xattr,flock 192.168.206.151@tcp:/lustre /mnt/lustre
Started clients oleg651-client.virtnet:
192.168.206.151@tcp:/lustre on /mnt/lustre type lustre (rw,checksum,flock,user_xattr,lruresize,lazystatfs,nouser_fid2path,verbose,encrypt,statfs_project)
Started PID=156440
UUID                   1K-blocks        Used   Available Use% Mounted on
lustre-MDT0000_UUID      1414116        4336     1283352   1% /mnt/lustre[MDT:0]
lustre-MDT0001_UUID      1414116        3740     1283948   1% /mnt/lustre[MDT:1]
lustre-OST0000_UUID      3833116        1576     3605444   1% /mnt/lustre[OST:0]
lustre-OST0001_UUID      3833116        1572     3604776   1% /mnt/lustre[OST:1]
filesystem_summary:      7666232        3148     7210220   1% /mnt/lustre
test_70e fail mds2 1 times
Failing mds2 on oleg651-server
Stopping /mnt/lustre-mds2 (opts:) on oleg651-server
06:25:15 (1743503115) shut down
facet: mds2
facet_host: oleg651-server
facet_failover_host: oleg651-server
Failover mds2 to oleg651-server
mount facets: mds2
Starting mds2: -o localrecov /dev/mapper/mds2_flakey /mnt/lustre-mds2
oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1
Started lustre-MDT0001
06:25:35 (1743503135) targets are mounted
06:25:35 (1743503135) facet_failover done
oleg651-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0001-mdc-*.mds_server_uuid
mdc.lustre-MDT0001-mdc-*.mds_server_uuid in FULL state after 0 sec
UUID                   1K-blocks        Used   Available Use% Mounted on
lustre-MDT0000_UUID      1414116        2848     1284840   1% /mnt/lustre[MDT:0]
lustre-MDT0001_UUID      1414116        2124     1285564   1% /mnt/lustre[MDT:1]
lustre-OST0000_UUID      3833116        1576     3605444   1% /mnt/lustre[OST:0]
lustre-OST0001_UUID      3833116        1572     3604776   1% /mnt/lustre[OST:1]
filesystem_summary:      7666232        3148     7210220   1% /mnt/lustre
test_70e fail mds2 2 times
Failing mds2 on oleg651-server
Stopping /mnt/lustre-mds2 (opts:) on oleg651-server
06:28:09 (1743503289) shut down
facet: mds2
facet_host: oleg651-server
facet_failover_host: oleg651-server
Failover mds2 to oleg651-server
mount facets: mds2
Starting mds2: -o localrecov /dev/mapper/mds2_flakey /mnt/lustre-mds2
oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1
Started lustre-MDT0001
06:28:29 (1743503309) targets are mounted
06:28:29 (1743503309) facet_failover done
oleg651-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0001-mdc-*.mds_server_uuid
mdc.lustre-MDT0001-mdc-*.mds_server_uuid in FULL state after 0 sec
PASS 70e (362s)
== replay-single test 70f: OSS O_DIRECT recovery with 1 clients ========================================================== 06:28:52 (1743503332)
mount clients oleg651-client.virtnet ...
Starting client oleg651-client.virtnet: -o user_xattr,flock 192.168.206.151@tcp:/lustre /mnt/lustre
Started clients oleg651-client.virtnet:
192.168.206.151@tcp:/lustre on /mnt/lustre type lustre (rw,checksum,flock,user_xattr,lruresize,lazystatfs,nouser_fid2path,verbose,encrypt,statfs_project)
Write/read files in: '/mnt/lustre/d70f.replay-single', clients: 'oleg651-client.virtnet' ...
ldlm.namespaces.MGC192.168.206.151@tcp.lru_size=clear
ldlm.namespaces.lustre-MDT0000-mdc-ffff95e5981e1000.lru_size=clear
ldlm.namespaces.lustre-MDT0001-mdc-ffff95e5981e1000.lru_size=clear
ldlm.namespaces.lustre-OST0000-osc-ffff95e5981e1000.lru_size=clear
ldlm.namespaces.lustre-OST0001-osc-ffff95e5981e1000.lru_size=clear
Write/read files in: '/mnt/lustre/d70f.replay-single', clients: 'oleg651-client.virtnet' ...
ldlm.namespaces.MGC192.168.206.151@tcp.lru_size=clear ldlm.namespaces.lustre-MDT0000-mdc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-MDT0001-mdc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-OST0000-osc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-OST0001-osc-ffff95e5981e1000.lru_size=clear Write/read files in: '/mnt/lustre/d70f.replay-single', clients: 'oleg651-client.virtnet' ... ldlm.namespaces.MGC192.168.206.151@tcp.lru_size=clear ldlm.namespaces.lustre-MDT0000-mdc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-MDT0001-mdc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-OST0000-osc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-OST0001-osc-ffff95e5981e1000.lru_size=clear Write/read files in: '/mnt/lustre/d70f.replay-single', clients: 'oleg651-client.virtnet' ... ldlm.namespaces.MGC192.168.206.151@tcp.lru_size=clear ldlm.namespaces.lustre-MDT0000-mdc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-MDT0001-mdc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-OST0000-osc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-OST0001-osc-ffff95e5981e1000.lru_size=clear Write/read files in: '/mnt/lustre/d70f.replay-single', clients: 'oleg651-client.virtnet' ... ldlm.namespaces.MGC192.168.206.151@tcp.lru_size=clear ldlm.namespaces.lustre-MDT0000-mdc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-MDT0001-mdc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-OST0000-osc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-OST0001-osc-ffff95e5981e1000.lru_size=clear Write/read files in: '/mnt/lustre/d70f.replay-single', clients: 'oleg651-client.virtnet' ... 
UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 2868 1284820 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 2124 1285564 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 2600 3599180 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1572 3597136 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 4172 7196316 1% /mnt/lustre ldlm.namespaces.MGC192.168.206.151@tcp.lru_size=clear ldlm.namespaces.lustre-MDT0000-mdc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-MDT0001-mdc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-OST0000-osc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-OST0001-osc-ffff95e5981e1000.lru_size=clear Write/read files in: '/mnt/lustre/d70f.replay-single', clients: 'oleg651-client.virtnet' ... ldlm.namespaces.MGC192.168.206.151@tcp.lru_size=clear ldlm.namespaces.lustre-MDT0000-mdc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-MDT0001-mdc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-OST0000-osc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-OST0001-osc-ffff95e5981e1000.lru_size=clear Write/read files in: '/mnt/lustre/d70f.replay-single', clients: 'oleg651-client.virtnet' ... test_70f failing OST 1 times Failing ost1 on oleg651-server Stopping /mnt/lustre-ost1 (opts:) on oleg651-server 06:29:07 (1743503347) shut down facet: ost1 facet_host: oleg651-server facet_failover_host: oleg651-server Failover ost1 to oleg651-server mount facets: ost1 Starting ost1: -o localrecov /dev/mapper/ost1_flakey /mnt/lustre-ost1 seq.cli-lustre-OST0000-super.width=65536 ldlm.namespaces.MGC192.168.206.151@tcp.lru_size=clear ldlm.namespaces.lustre-MDT0000-mdc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-MDT0001-mdc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-OST0000-osc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-OST0001-osc-ffff95e5981e1000.lru_size=clear Write/read files in: '/mnt/lustre/d70f.replay-single', clients: 'oleg651-client.virtnet' ... 
ldlm.namespaces.MGC192.168.206.151@tcp.lru_size=clear ldlm.namespaces.lustre-MDT0000-mdc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-MDT0001-mdc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-OST0000-osc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-OST0001-osc-ffff95e5981e1000.lru_size=clear Write/read files in: '/mnt/lustre/d70f.replay-single', clients: 'oleg651-client.virtnet' ... oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all ldlm.namespaces.MGC192.168.206.151@tcp.lru_size=clear ldlm.namespaces.lustre-MDT0000-mdc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-MDT0001-mdc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-OST0000-osc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-OST0001-osc-ffff95e5981e1000.lru_size=clear Write/read files in: '/mnt/lustre/d70f.replay-single', clients: 'oleg651-client.virtnet' ... pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1 ldlm.namespaces.MGC192.168.206.151@tcp.lru_size=clear ldlm.namespaces.lustre-MDT0000-mdc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-MDT0001-mdc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-OST0000-osc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-OST0001-osc-ffff95e5981e1000.lru_size=clear Write/read files in: '/mnt/lustre/d70f.replay-single', clients: 'oleg651-client.virtnet' ... Started lustre-OST0000 06:29:28 (1743503368) targets are mounted 06:29:28 (1743503368) facet_failover done ldlm.namespaces.MGC192.168.206.151@tcp.lru_size=clear ldlm.namespaces.lustre-MDT0000-mdc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-MDT0001-mdc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-OST0000-osc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-OST0001-osc-ffff95e5981e1000.lru_size=clear Write/read files in: '/mnt/lustre/d70f.replay-single', clients: 'oleg651-client.virtnet' ... 
ldlm.namespaces.MGC192.168.206.151@tcp.lru_size=clear ldlm.namespaces.lustre-MDT0000-mdc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-MDT0001-mdc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-OST0000-osc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-OST0001-osc-ffff95e5981e1000.lru_size=clear Write/read files in: '/mnt/lustre/d70f.replay-single', clients: 'oleg651-client.virtnet' ... ldlm.namespaces.MGC192.168.206.151@tcp.lru_size=clear ldlm.namespaces.lustre-MDT0000-mdc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-MDT0001-mdc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-OST0000-osc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-OST0001-osc-ffff95e5981e1000.lru_size=clear Write/read files in: '/mnt/lustre/d70f.replay-single', clients: 'oleg651-client.virtnet' ... ldlm.namespaces.MGC192.168.206.151@tcp.lru_size=clear ldlm.namespaces.lustre-MDT0000-mdc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-MDT0001-mdc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-OST0000-osc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-OST0001-osc-ffff95e5981e1000.lru_size=clear Write/read files in: '/mnt/lustre/d70f.replay-single', clients: 'oleg651-client.virtnet' ... ldlm.namespaces.MGC192.168.206.151@tcp.lru_size=clear ldlm.namespaces.lustre-MDT0000-mdc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-MDT0001-mdc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-OST0000-osc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-OST0001-osc-ffff95e5981e1000.lru_size=clear Write/read files in: '/mnt/lustre/d70f.replay-single', clients: 'oleg651-client.virtnet' ... 
oleg651-client.virtnet: executing wait_import_state_mount (FULL|IDLE) osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid ldlm.namespaces.MGC192.168.206.151@tcp.lru_size=clear ldlm.namespaces.lustre-MDT0000-mdc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-MDT0001-mdc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-OST0000-osc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-OST0001-osc-ffff95e5981e1000.lru_size=clear Write/read files in: '/mnt/lustre/d70f.replay-single', clients: 'oleg651-client.virtnet' ... ldlm.namespaces.MGC192.168.206.151@tcp.lru_size=clear ldlm.namespaces.lustre-MDT0000-mdc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-MDT0001-mdc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-OST0000-osc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-OST0001-osc-ffff95e5981e1000.lru_size=clear Write/read files in: '/mnt/lustre/d70f.replay-single', clients: 'oleg651-client.virtnet' ... ldlm.namespaces.MGC192.168.206.151@tcp.lru_size=clear ldlm.namespaces.lustre-MDT0000-mdc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-MDT0001-mdc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-OST0000-osc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-OST0001-osc-ffff95e5981e1000.lru_size=clear osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid in FULL state after 0 sec Write/read files in: '/mnt/lustre/d70f.replay-single', clients: 'oleg651-client.virtnet' ... ldlm.namespaces.MGC192.168.206.151@tcp.lru_size=clear ldlm.namespaces.lustre-MDT0000-mdc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-MDT0001-mdc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-OST0000-osc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-OST0001-osc-ffff95e5981e1000.lru_size=clear Write/read files in: '/mnt/lustre/d70f.replay-single', clients: 'oleg651-client.virtnet' ... 
ldlm.namespaces.MGC192.168.206.151@tcp.lru_size=clear ldlm.namespaces.lustre-MDT0000-mdc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-MDT0001-mdc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-OST0000-osc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-OST0001-osc-ffff95e5981e1000.lru_size=clear Write/read files in: '/mnt/lustre/d70f.replay-single', clients: 'oleg651-client.virtnet' ... ldlm.namespaces.MGC192.168.206.151@tcp.lru_size=clear ldlm.namespaces.lustre-MDT0000-mdc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-MDT0001-mdc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-OST0000-osc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-OST0001-osc-ffff95e5981e1000.lru_size=clear Write/read files in: '/mnt/lustre/d70f.replay-single', clients: 'oleg651-client.virtnet' ... ldlm.namespaces.MGC192.168.206.151@tcp.lru_size=clear ldlm.namespaces.lustre-MDT0000-mdc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-MDT0001-mdc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-OST0000-osc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-OST0001-osc-ffff95e5981e1000.lru_size=clear Write/read files in: '/mnt/lustre/d70f.replay-single', clients: 'oleg651-client.virtnet' ... ldlm.namespaces.MGC192.168.206.151@tcp.lru_size=clear ldlm.namespaces.lustre-MDT0000-mdc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-MDT0001-mdc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-OST0000-osc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-OST0001-osc-ffff95e5981e1000.lru_size=clear Write/read files in: '/mnt/lustre/d70f.replay-single', clients: 'oleg651-client.virtnet' ... 
ldlm.namespaces.MGC192.168.206.151@tcp.lru_size=clear ldlm.namespaces.lustre-MDT0000-mdc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-MDT0001-mdc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-OST0000-osc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-OST0001-osc-ffff95e5981e1000.lru_size=clear Write/read files in: '/mnt/lustre/d70f.replay-single', clients: 'oleg651-client.virtnet' ... ldlm.namespaces.MGC192.168.206.151@tcp.lru_size=clear ldlm.namespaces.lustre-MDT0000-mdc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-MDT0001-mdc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-OST0000-osc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-OST0001-osc-ffff95e5981e1000.lru_size=clear Write/read files in: '/mnt/lustre/d70f.replay-single', clients: 'oleg651-client.virtnet' ... ldlm.namespaces.MGC192.168.206.151@tcp.lru_size=clear ldlm.namespaces.lustre-MDT0000-mdc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-MDT0001-mdc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-OST0000-osc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-OST0001-osc-ffff95e5981e1000.lru_size=clear Write/read files in: '/mnt/lustre/d70f.replay-single', clients: 'oleg651-client.virtnet' ... ldlm.namespaces.MGC192.168.206.151@tcp.lru_size=clear ldlm.namespaces.lustre-MDT0000-mdc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-MDT0001-mdc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-OST0000-osc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-OST0001-osc-ffff95e5981e1000.lru_size=clear Write/read files in: '/mnt/lustre/d70f.replay-single', clients: 'oleg651-client.virtnet' ... 
ldlm.namespaces.MGC192.168.206.151@tcp.lru_size=clear ldlm.namespaces.lustre-MDT0000-mdc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-MDT0001-mdc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-OST0000-osc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-OST0001-osc-ffff95e5981e1000.lru_size=clear Write/read files in: '/mnt/lustre/d70f.replay-single', clients: 'oleg651-client.virtnet' ... ldlm.namespaces.MGC192.168.206.151@tcp.lru_size=clear ldlm.namespaces.lustre-MDT0000-mdc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-MDT0001-mdc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-OST0000-osc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-OST0001-osc-ffff95e5981e1000.lru_size=clear Write/read files in: '/mnt/lustre/d70f.replay-single', clients: 'oleg651-client.virtnet' ... ldlm.namespaces.MGC192.168.206.151@tcp.lru_size=clear ldlm.namespaces.lustre-MDT0000-mdc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-MDT0001-mdc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-OST0000-osc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-OST0001-osc-ffff95e5981e1000.lru_size=clear Write/read files in: '/mnt/lustre/d70f.replay-single', clients: 'oleg651-client.virtnet' ... ldlm.namespaces.MGC192.168.206.151@tcp.lru_size=clear ldlm.namespaces.lustre-MDT0000-mdc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-MDT0001-mdc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-OST0000-osc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-OST0001-osc-ffff95e5981e1000.lru_size=clear Write/read files in: '/mnt/lustre/d70f.replay-single', clients: 'oleg651-client.virtnet' ... 
UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 2868 1284820 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 2124 1285564 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 1576 3601324 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1572 3601328 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 3148 7202652 1% /mnt/lustre ldlm.namespaces.MGC192.168.206.151@tcp.lru_size=clear ldlm.namespaces.lustre-MDT0000-mdc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-MDT0001-mdc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-OST0000-osc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-OST0001-osc-ffff95e5981e1000.lru_size=clear Write/read files in: '/mnt/lustre/d70f.replay-single', clients: 'oleg651-client.virtnet' ... ldlm.namespaces.MGC192.168.206.151@tcp.lru_size=clear ldlm.namespaces.lustre-MDT0000-mdc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-MDT0001-mdc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-OST0000-osc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-OST0001-osc-ffff95e5981e1000.lru_size=clear Write/read files in: '/mnt/lustre/d70f.replay-single', clients: 'oleg651-client.virtnet' ... ldlm.namespaces.MGC192.168.206.151@tcp.lru_size=clear ldlm.namespaces.lustre-MDT0000-mdc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-MDT0001-mdc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-OST0000-osc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-OST0001-osc-ffff95e5981e1000.lru_size=clear Write/read files in: '/mnt/lustre/d70f.replay-single', clients: 'oleg651-client.virtnet' ... 
test_70f failing OST 2 times Failing ost1 on oleg651-server Stopping /mnt/lustre-ost1 (opts:) on oleg651-server 06:29:55 (1743503395) shut down facet: ost1 facet_host: oleg651-server facet_failover_host: oleg651-server Failover ost1 to oleg651-server mount facets: ost1 Starting ost1: -o localrecov /dev/mapper/ost1_flakey /mnt/lustre-ost1 seq.cli-lustre-OST0000-super.width=65536 ldlm.namespaces.MGC192.168.206.151@tcp.lru_size=clear ldlm.namespaces.lustre-MDT0000-mdc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-MDT0001-mdc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-OST0000-osc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-OST0001-osc-ffff95e5981e1000.lru_size=clear Write/read files in: '/mnt/lustre/d70f.replay-single', clients: 'oleg651-client.virtnet' ... ldlm.namespaces.MGC192.168.206.151@tcp.lru_size=clear ldlm.namespaces.lustre-MDT0000-mdc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-MDT0001-mdc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-OST0000-osc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-OST0001-osc-ffff95e5981e1000.lru_size=clear Write/read files in: '/mnt/lustre/d70f.replay-single', clients: 'oleg651-client.virtnet' ... ldlm.namespaces.MGC192.168.206.151@tcp.lru_size=clear ldlm.namespaces.lustre-MDT0000-mdc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-MDT0001-mdc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-OST0000-osc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-OST0001-osc-ffff95e5981e1000.lru_size=clear oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all Write/read files in: '/mnt/lustre/d70f.replay-single', clients: 'oleg651-client.virtnet' ... 
ldlm.namespaces.MGC192.168.206.151@tcp.lru_size=clear ldlm.namespaces.lustre-MDT0000-mdc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-MDT0001-mdc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-OST0000-osc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-OST0001-osc-ffff95e5981e1000.lru_size=clear pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1 Write/read files in: '/mnt/lustre/d70f.replay-single', clients: 'oleg651-client.virtnet' ... ldlm.namespaces.MGC192.168.206.151@tcp.lru_size=clear ldlm.namespaces.lustre-MDT0000-mdc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-MDT0001-mdc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-OST0000-osc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-OST0001-osc-ffff95e5981e1000.lru_size=clear Started lustre-OST0000 06:30:17 (1743503417) targets are mounted 06:30:17 (1743503417) facet_failover done Write/read files in: '/mnt/lustre/d70f.replay-single', clients: 'oleg651-client.virtnet' ... ldlm.namespaces.MGC192.168.206.151@tcp.lru_size=clear ldlm.namespaces.lustre-MDT0000-mdc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-MDT0001-mdc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-OST0000-osc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-OST0001-osc-ffff95e5981e1000.lru_size=clear Write/read files in: '/mnt/lustre/d70f.replay-single', clients: 'oleg651-client.virtnet' ... ldlm.namespaces.MGC192.168.206.151@tcp.lru_size=clear ldlm.namespaces.lustre-MDT0000-mdc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-MDT0001-mdc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-OST0000-osc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-OST0001-osc-ffff95e5981e1000.lru_size=clear Write/read files in: '/mnt/lustre/d70f.replay-single', clients: 'oleg651-client.virtnet' ... 
ldlm.namespaces.MGC192.168.206.151@tcp.lru_size=clear ldlm.namespaces.lustre-MDT0000-mdc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-MDT0001-mdc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-OST0000-osc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-OST0001-osc-ffff95e5981e1000.lru_size=clear Write/read files in: '/mnt/lustre/d70f.replay-single', clients: 'oleg651-client.virtnet' ... ldlm.namespaces.MGC192.168.206.151@tcp.lru_size=clear ldlm.namespaces.lustre-MDT0000-mdc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-MDT0001-mdc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-OST0000-osc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-OST0001-osc-ffff95e5981e1000.lru_size=clear Write/read files in: '/mnt/lustre/d70f.replay-single', clients: 'oleg651-client.virtnet' ... ldlm.namespaces.MGC192.168.206.151@tcp.lru_size=clear ldlm.namespaces.lustre-MDT0000-mdc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-MDT0001-mdc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-OST0000-osc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-OST0001-osc-ffff95e5981e1000.lru_size=clear oleg651-client.virtnet: executing wait_import_state_mount (FULL|IDLE) osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid Write/read files in: '/mnt/lustre/d70f.replay-single', clients: 'oleg651-client.virtnet' ... ldlm.namespaces.MGC192.168.206.151@tcp.lru_size=clear ldlm.namespaces.lustre-MDT0000-mdc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-MDT0001-mdc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-OST0000-osc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-OST0001-osc-ffff95e5981e1000.lru_size=clear Write/read files in: '/mnt/lustre/d70f.replay-single', clients: 'oleg651-client.virtnet' ... 
ldlm.namespaces.MGC192.168.206.151@tcp.lru_size=clear ldlm.namespaces.lustre-MDT0000-mdc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-MDT0001-mdc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-OST0000-osc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-OST0001-osc-ffff95e5981e1000.lru_size=clear Write/read files in: '/mnt/lustre/d70f.replay-single', clients: 'oleg651-client.virtnet' ... osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid in FULL state after 0 sec ldlm.namespaces.MGC192.168.206.151@tcp.lru_size=clear ldlm.namespaces.lustre-MDT0000-mdc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-MDT0001-mdc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-OST0000-osc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-OST0001-osc-ffff95e5981e1000.lru_size=clear Write/read files in: '/mnt/lustre/d70f.replay-single', clients: 'oleg651-client.virtnet' ... ldlm.namespaces.MGC192.168.206.151@tcp.lru_size=clear ldlm.namespaces.lustre-MDT0000-mdc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-MDT0001-mdc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-OST0000-osc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-OST0001-osc-ffff95e5981e1000.lru_size=clear Write/read files in: '/mnt/lustre/d70f.replay-single', clients: 'oleg651-client.virtnet' ... ldlm.namespaces.MGC192.168.206.151@tcp.lru_size=clear ldlm.namespaces.lustre-MDT0000-mdc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-MDT0001-mdc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-OST0000-osc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-OST0001-osc-ffff95e5981e1000.lru_size=clear Write/read files in: '/mnt/lustre/d70f.replay-single', clients: 'oleg651-client.virtnet' ... 
ldlm.namespaces.MGC192.168.206.151@tcp.lru_size=clear ldlm.namespaces.lustre-MDT0000-mdc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-MDT0001-mdc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-OST0000-osc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-OST0001-osc-ffff95e5981e1000.lru_size=clear Write/read files in: '/mnt/lustre/d70f.replay-single', clients: 'oleg651-client.virtnet' ... ldlm.namespaces.MGC192.168.206.151@tcp.lru_size=clear ldlm.namespaces.lustre-MDT0000-mdc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-MDT0001-mdc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-OST0000-osc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-OST0001-osc-ffff95e5981e1000.lru_size=clear Write/read files in: '/mnt/lustre/d70f.replay-single', clients: 'oleg651-client.virtnet' ... ldlm.namespaces.MGC192.168.206.151@tcp.lru_size=clear ldlm.namespaces.lustre-MDT0000-mdc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-MDT0001-mdc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-OST0000-osc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-OST0001-osc-ffff95e5981e1000.lru_size=clear Write/read files in: '/mnt/lustre/d70f.replay-single', clients: 'oleg651-client.virtnet' ... ldlm.namespaces.MGC192.168.206.151@tcp.lru_size=clear ldlm.namespaces.lustre-MDT0000-mdc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-MDT0001-mdc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-OST0000-osc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-OST0001-osc-ffff95e5981e1000.lru_size=clear Write/read files in: '/mnt/lustre/d70f.replay-single', clients: 'oleg651-client.virtnet' ... 
ldlm.namespaces.MGC192.168.206.151@tcp.lru_size=clear ldlm.namespaces.lustre-MDT0000-mdc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-MDT0001-mdc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-OST0000-osc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-OST0001-osc-ffff95e5981e1000.lru_size=clear Write/read files in: '/mnt/lustre/d70f.replay-single', clients: 'oleg651-client.virtnet' ... ldlm.namespaces.MGC192.168.206.151@tcp.lru_size=clear ldlm.namespaces.lustre-MDT0000-mdc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-MDT0001-mdc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-OST0000-osc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-OST0001-osc-ffff95e5981e1000.lru_size=clear Write/read files in: '/mnt/lustre/d70f.replay-single', clients: 'oleg651-client.virtnet' ... ldlm.namespaces.MGC192.168.206.151@tcp.lru_size=clear ldlm.namespaces.lustre-MDT0000-mdc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-MDT0001-mdc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-OST0000-osc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-OST0001-osc-ffff95e5981e1000.lru_size=clear Write/read files in: '/mnt/lustre/d70f.replay-single', clients: 'oleg651-client.virtnet' ... ldlm.namespaces.MGC192.168.206.151@tcp.lru_size=clear ldlm.namespaces.lustre-MDT0000-mdc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-MDT0001-mdc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-OST0000-osc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-OST0001-osc-ffff95e5981e1000.lru_size=clear Write/read files in: '/mnt/lustre/d70f.replay-single', clients: 'oleg651-client.virtnet' ... 
ldlm.namespaces.MGC192.168.206.151@tcp.lru_size=clear ldlm.namespaces.lustre-MDT0000-mdc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-MDT0001-mdc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-OST0000-osc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-OST0001-osc-ffff95e5981e1000.lru_size=clear Write/read files in: '/mnt/lustre/d70f.replay-single', clients: 'oleg651-client.virtnet' ... ldlm.namespaces.MGC192.168.206.151@tcp.lru_size=clear ldlm.namespaces.lustre-MDT0000-mdc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-MDT0001-mdc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-OST0000-osc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-OST0001-osc-ffff95e5981e1000.lru_size=clear Write/read files in: '/mnt/lustre/d70f.replay-single', clients: 'oleg651-client.virtnet' ... UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 2868 1284820 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 2124 1285564 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 1576 3605444 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1572 3605448 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 3148 7210892 1% /mnt/lustre ldlm.namespaces.MGC192.168.206.151@tcp.lru_size=clear ldlm.namespaces.lustre-MDT0000-mdc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-MDT0001-mdc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-OST0000-osc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-OST0001-osc-ffff95e5981e1000.lru_size=clear Write/read files in: '/mnt/lustre/d70f.replay-single', clients: 'oleg651-client.virtnet' ... 
test_70f failing OST 3 times Failing ost1 on oleg651-server Stopping /mnt/lustre-ost1 (opts:) on oleg651-server 06:30:45 (1743503445) shut down facet: ost1 facet_host: oleg651-server facet_failover_host: oleg651-server Failover ost1 to oleg651-server mount facets: ost1 Starting ost1: -o localrecov /dev/mapper/ost1_flakey /mnt/lustre-ost1 seq.cli-lustre-OST0000-super.width=65536 ldlm.namespaces.MGC192.168.206.151@tcp.lru_size=clear ldlm.namespaces.lustre-MDT0000-mdc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-MDT0001-mdc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-OST0000-osc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-OST0001-osc-ffff95e5981e1000.lru_size=clear Write/read files in: '/mnt/lustre/d70f.replay-single', clients: 'oleg651-client.virtnet' ... ldlm.namespaces.MGC192.168.206.151@tcp.lru_size=clear ldlm.namespaces.lustre-MDT0000-mdc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-MDT0001-mdc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-OST0000-osc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-OST0001-osc-ffff95e5981e1000.lru_size=clear Write/read files in: '/mnt/lustre/d70f.replay-single', clients: 'oleg651-client.virtnet' ... oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all ldlm.namespaces.MGC192.168.206.151@tcp.lru_size=clear ldlm.namespaces.lustre-MDT0000-mdc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-MDT0001-mdc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-OST0000-osc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-OST0001-osc-ffff95e5981e1000.lru_size=clear Write/read files in: '/mnt/lustre/d70f.replay-single', clients: 'oleg651-client.virtnet' ... 
pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1 ldlm.namespaces.MGC192.168.206.151@tcp.lru_size=clear ldlm.namespaces.lustre-MDT0000-mdc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-MDT0001-mdc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-OST0000-osc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-OST0001-osc-ffff95e5981e1000.lru_size=clear Write/read files in: '/mnt/lustre/d70f.replay-single', clients: 'oleg651-client.virtnet' ... Started lustre-OST0000 06:31:06 (1743503466) targets are mounted 06:31:06 (1743503466) facet_failover done ldlm.namespaces.MGC192.168.206.151@tcp.lru_size=clear ldlm.namespaces.lustre-MDT0000-mdc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-MDT0001-mdc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-OST0000-osc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-OST0001-osc-ffff95e5981e1000.lru_size=clear Write/read files in: '/mnt/lustre/d70f.replay-single', clients: 'oleg651-client.virtnet' ... ldlm.namespaces.MGC192.168.206.151@tcp.lru_size=clear ldlm.namespaces.lustre-MDT0000-mdc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-MDT0001-mdc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-OST0000-osc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-OST0001-osc-ffff95e5981e1000.lru_size=clear Write/read files in: '/mnt/lustre/d70f.replay-single', clients: 'oleg651-client.virtnet' ... ldlm.namespaces.MGC192.168.206.151@tcp.lru_size=clear ldlm.namespaces.lustre-MDT0000-mdc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-MDT0001-mdc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-OST0000-osc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-OST0001-osc-ffff95e5981e1000.lru_size=clear Write/read files in: '/mnt/lustre/d70f.replay-single', clients: 'oleg651-client.virtnet' ... 
ldlm.namespaces.MGC192.168.206.151@tcp.lru_size=clear ldlm.namespaces.lustre-MDT0000-mdc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-MDT0001-mdc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-OST0000-osc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-OST0001-osc-ffff95e5981e1000.lru_size=clear Write/read files in: '/mnt/lustre/d70f.replay-single', clients: 'oleg651-client.virtnet' ... ldlm.namespaces.MGC192.168.206.151@tcp.lru_size=clear ldlm.namespaces.lustre-MDT0000-mdc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-MDT0001-mdc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-OST0000-osc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-OST0001-osc-ffff95e5981e1000.lru_size=clear Write/read files in: '/mnt/lustre/d70f.replay-single', clients: 'oleg651-client.virtnet' ... oleg651-client.virtnet: executing wait_import_state_mount (FULL|IDLE) osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid ldlm.namespaces.MGC192.168.206.151@tcp.lru_size=clear ldlm.namespaces.lustre-MDT0000-mdc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-MDT0001-mdc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-OST0000-osc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-OST0001-osc-ffff95e5981e1000.lru_size=clear Write/read files in: '/mnt/lustre/d70f.replay-single', clients: 'oleg651-client.virtnet' ... ldlm.namespaces.MGC192.168.206.151@tcp.lru_size=clear ldlm.namespaces.lustre-MDT0000-mdc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-MDT0001-mdc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-OST0000-osc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-OST0001-osc-ffff95e5981e1000.lru_size=clear Write/read files in: '/mnt/lustre/d70f.replay-single', clients: 'oleg651-client.virtnet' ... 
ldlm.namespaces.MGC192.168.206.151@tcp.lru_size=clear ldlm.namespaces.lustre-MDT0000-mdc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-MDT0001-mdc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-OST0000-osc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-OST0001-osc-ffff95e5981e1000.lru_size=clear Write/read files in: '/mnt/lustre/d70f.replay-single', clients: 'oleg651-client.virtnet' ... osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid in FULL state after 0 sec ldlm.namespaces.MGC192.168.206.151@tcp.lru_size=clear ldlm.namespaces.lustre-MDT0000-mdc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-MDT0001-mdc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-OST0000-osc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-OST0001-osc-ffff95e5981e1000.lru_size=clear Write/read files in: '/mnt/lustre/d70f.replay-single', clients: 'oleg651-client.virtnet' ... ldlm.namespaces.MGC192.168.206.151@tcp.lru_size=clear ldlm.namespaces.lustre-MDT0000-mdc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-MDT0001-mdc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-OST0000-osc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-OST0001-osc-ffff95e5981e1000.lru_size=clear Write/read files in: '/mnt/lustre/d70f.replay-single', clients: 'oleg651-client.virtnet' ... ldlm.namespaces.MGC192.168.206.151@tcp.lru_size=clear ldlm.namespaces.lustre-MDT0000-mdc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-MDT0001-mdc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-OST0000-osc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-OST0001-osc-ffff95e5981e1000.lru_size=clear Write/read files in: '/mnt/lustre/d70f.replay-single', clients: 'oleg651-client.virtnet' ... 
ldlm.namespaces.MGC192.168.206.151@tcp.lru_size=clear ldlm.namespaces.lustre-MDT0000-mdc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-MDT0001-mdc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-OST0000-osc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-OST0001-osc-ffff95e5981e1000.lru_size=clear Write/read files in: '/mnt/lustre/d70f.replay-single', clients: 'oleg651-client.virtnet' ... ldlm.namespaces.MGC192.168.206.151@tcp.lru_size=clear ldlm.namespaces.lustre-MDT0000-mdc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-MDT0001-mdc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-OST0000-osc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-OST0001-osc-ffff95e5981e1000.lru_size=clear Write/read files in: '/mnt/lustre/d70f.replay-single', clients: 'oleg651-client.virtnet' ... ldlm.namespaces.MGC192.168.206.151@tcp.lru_size=clear ldlm.namespaces.lustre-MDT0000-mdc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-MDT0001-mdc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-OST0000-osc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-OST0001-osc-ffff95e5981e1000.lru_size=clear Write/read files in: '/mnt/lustre/d70f.replay-single', clients: 'oleg651-client.virtnet' ... ldlm.namespaces.MGC192.168.206.151@tcp.lru_size=clear ldlm.namespaces.lustre-MDT0000-mdc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-MDT0001-mdc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-OST0000-osc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-OST0001-osc-ffff95e5981e1000.lru_size=clear Write/read files in: '/mnt/lustre/d70f.replay-single', clients: 'oleg651-client.virtnet' ... 
ldlm.namespaces.MGC192.168.206.151@tcp.lru_size=clear ldlm.namespaces.lustre-MDT0000-mdc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-MDT0001-mdc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-OST0000-osc-ffff95e5981e1000.lru_size=clear ldlm.namespaces.lustre-OST0001-osc-ffff95e5981e1000.lru_size=clear PASS 70f (156s) == replay-single test 71a: mkdir/rmdir striped dir with 2 mdts recovery ========================================================== 06:31:28 (1743503488) Starting client oleg651-client.virtnet: -o user_xattr,flock 192.168.206.151@tcp:/lustre /mnt/lustre Started clients oleg651-client.virtnet: 192.168.206.151@tcp:/lustre on /mnt/lustre type lustre (rw,checksum,flock,user_xattr,lruresize,lazystatfs,nouser_fid2path,verbose,encrypt,statfs_project) Started 179996 UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 4196 1283492 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 3964 1283724 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 1576 3605444 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1572 3605448 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 3148 7210892 1% /mnt/lustre UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 4328 1283360 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 4164 1283524 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 1576 3605444 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1572 3605448 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 3148 7210892 1% /mnt/lustre fail mds2 mds1 1 times Failing mds2 on oleg651-server Stopping /mnt/lustre-mds2 (opts:) on oleg651-server Failing mds1 on oleg651-server Stopping /mnt/lustre-mds1 (opts:) on oleg651-server 06:34:01 (1743503641) shut down facet: mds2 facet_host: oleg651-server facet_failover_host: oleg651-server facet: mds1 facet_host: oleg651-server facet_failover_host: oleg651-server Failover mds1 to oleg651-server Failover mds2 to oleg651-server mount facets: mds2 mount facets: mds1 Starting mds1: -o 
localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 Starting mds2: -o localrecov /dev/mapper/mds2_flakey /mnt/lustre-mds2 oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1 pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1 Started lustre-MDT0001 Started lustre-MDT0000 06:34:32 (1743503672) targets are mounted 06:34:32 (1743503672) facet_failover done oleg651-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid,mdc.lustre-MDT0001-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec mdc.lustre-MDT0001-mdc-*.mds_server_uuid in FULL state after 0 sec UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 2872 1284816 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 3176 1284512 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 1576 3605444 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1572 3605448 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 3148 7210892 1% /mnt/lustre UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 3020 1284668 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 3340 1284348 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 1576 3605444 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1572 3605448 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 3148 7210892 1% /mnt/lustre fail mds2 mds1 2 times Failing mds2 on oleg651-server Stopping /mnt/lustre-mds2 (opts:) on oleg651-server Failing mds1 on oleg651-server Stopping /mnt/lustre-mds1 (opts:) on oleg651-server 06:37:22 (1743503842) shut down facet: mds2 facet_host: oleg651-server facet_failover_host: oleg651-server facet: mds1 facet_host: oleg651-server 
facet_failover_host: oleg651-server Failover mds1 to oleg651-server Failover mds2 to oleg651-server mount facets: mds2 mount facets: mds1 Starting mds2: -o localrecov /dev/mapper/mds2_flakey /mnt/lustre-mds2 Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1 pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1 Started lustre-MDT0001 Started lustre-MDT0000 06:37:54 (1743503874) targets are mounted 06:37:54 (1743503874) facet_failover done oleg651-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid,mdc.lustre-MDT0001-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec mdc.lustre-MDT0001-mdc-*.mds_server_uuid in FULL state after 0 sec /home/green/git/lustre-release/lustre/tests/test-framework.sh: line 4866: 179996 Killed ( while true; do $LFS mkdir -i0 -c2 $DIR/$tdir/test; rmdir $DIR/$tdir/test; done ) (wd: ~) PASS 71a (415s) == replay-single test 73a: open(O_CREAT), unlink, replay, reconnect before open replay, close ========================================================== 06:38:23 (1743503903) multiop /mnt/lustre/f73a.replay-single vO_tSc TMPPIPE=/tmp/multiop_open_wait_pipe.7954 UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 3544 1284144 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 3312 1284376 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 1576 3605444 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1572 3605448 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 3148 7210892 1% /mnt/lustre fail_loc=0x80000302 Failing mds1 on oleg651-server Stopping /mnt/lustre-mds1 (opts:) on 
oleg651-server 06:38:32 (1743503912) shut down facet: mds1 facet_host: oleg651-server facet_failover_host: oleg651-server Failover mds1 to oleg651-server mount facets: mds1 Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1 Started lustre-MDT0000 06:39:00 (1743503940) targets are mounted 06:39:00 (1743503940) facet_failover done oleg651-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec PASS 73a (65s) == replay-single test 73b: open(O_CREAT), unlink, replay, reconnect at open_replay reply, close ========================================================== 06:39:28 (1743503968) multiop /mnt/lustre/f73b.replay-single vO_tSc TMPPIPE=/tmp/multiop_open_wait_pipe.7954 UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 2420 1285268 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 2028 1285660 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 1576 3605444 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1572 3605448 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 3148 7210892 1% /mnt/lustre fail_loc=0x80000157 Failing mds1 on oleg651-server Stopping /mnt/lustre-mds1 (opts:) on oleg651-server 06:39:37 (1743503977) shut down facet: mds1 facet_host: oleg651-server facet_failover_host: oleg651-server Failover mds1 to oleg651-server mount facets: mds1 Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1 Started lustre-MDT0000 06:39:57 (1743503997) targets are mounted 06:39:57 (1743503997) facet_failover 
done oleg651-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec PASS 73b (56s) == replay-single test 74: Ensure applications don't fail waiting for OST recovery ========================================================== 06:40:24 (1743504024) Stopping clients: oleg651-client.virtnet /mnt/lustre (opts:) Stopping client oleg651-client.virtnet /mnt/lustre opts: Stopping /mnt/lustre-ost1 (opts:) on oleg651-server Failing mds1 on oleg651-server Stopping /mnt/lustre-mds1 (opts:) on oleg651-server 06:40:32 (1743504032) shut down facet: mds1 facet_host: oleg651-server facet_failover_host: oleg651-server Failover mds1 to oleg651-server mount facets: mds1 Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1 Started lustre-MDT0000 06:40:51 (1743504051) targets are mounted 06:40:51 (1743504051) facet_failover done Starting client oleg651-client.virtnet: -o user_xattr,flock 192.168.206.151@tcp:/lustre /mnt/lustre Started clients oleg651-client.virtnet: 192.168.206.151@tcp:/lustre on /mnt/lustre type lustre (rw,checksum,flock,user_xattr,lruresize,lazystatfs,nouser_fid2path,verbose,encrypt,statfs_project) Starting ost1: -o localrecov /dev/mapper/ost1_flakey /mnt/lustre-ost1 seq.cli-lustre-OST0000-super.width=65536 oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1 Started lustre-OST0000 PASS 74 (48s) == replay-single test 80a: DNE: create remote dir, drop update rep from MDT0, fail MDT0 ========================================================== 06:41:12 (1743504072) fail_loc=0x1701 UUID 1K-blocks Used 
Available Use% Mounted on lustre-MDT0000_UUID 1414116 2424 1285264 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 2068 1285620 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 1576 3605444 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1572 3605448 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 3148 7210892 1% /mnt/lustre Failing mds1 on oleg651-server Stopping /mnt/lustre-mds1 (opts:) on oleg651-server 06:41:21 (1743504081) shut down facet: mds1 facet_host: oleg651-server facet_failover_host: oleg651-server Failover mds1 to oleg651-server mount facets: mds1 Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1 Started lustre-MDT0000 06:41:50 (1743504110) targets are mounted 06:41:50 (1743504110) facet_failover done oleg651-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec total: 20 open/close in 0.17 seconds: 114.82 ops/second PASS 80a (53s) == replay-single test 80b: DNE: create remote dir, drop update rep from MDT0, fail MDT1 ========================================================== 06:42:05 (1743504125) fail_loc=0x1701 UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 2464 1285224 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 2068 1285620 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 1576 3605444 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1572 3605448 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 3148 7210892 1% /mnt/lustre UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 2464 1285224 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 2068 1285620 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 1576 3605444 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 
3833116 1572 3605448 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 3148 7210892 1% /mnt/lustre Failing mds2 on oleg651-server Stopping /mnt/lustre-mds2 (opts:) on oleg651-server 06:42:24 (1743504144) shut down facet: mds2 facet_host: oleg651-server facet_failover_host: oleg651-server Failover mds2 to oleg651-server mount facets: mds2 Starting mds2: -o localrecov /dev/mapper/mds2_flakey /mnt/lustre-mds2 oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1 Started lustre-MDT0001 06:42:44 (1743504164) targets are mounted 06:42:44 (1743504164) facet_failover done oleg651-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0001-mdc-*.mds_server_uuid mdc.lustre-MDT0001-mdc-*.mds_server_uuid in FULL state after 0 sec total: 20 open/close in 0.17 seconds: 120.73 ops/second PASS 80b (56s) == replay-single test 80c: DNE: create remote dir, drop update rep from MDT1, fail MDT[0,1] ========================================================== 06:43:01 (1743504181) fail_loc=0x1701 UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 2500 1285188 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 2104 1285584 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 1576 3605444 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1572 3605448 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 3148 7210892 1% /mnt/lustre UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 2500 1285188 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 2104 1285584 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 1576 3605444 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1572 3605448 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 3148 7210892 1% /mnt/lustre Failing mds1 on oleg651-server Stopping /mnt/lustre-mds1 (opts:) on oleg651-server 06:43:15 (1743504195) shut down facet: mds1 
facet_host: oleg651-server facet_failover_host: oleg651-server Failover mds1 to oleg651-server mount facets: mds1 Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1 Started lustre-MDT0000 06:43:35 (1743504215) targets are mounted 06:43:35 (1743504215) facet_failover done oleg651-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec Failing mds2 on oleg651-server Stopping /mnt/lustre-mds2 (opts:) on oleg651-server 06:43:47 (1743504227) shut down facet: mds2 facet_host: oleg651-server facet_failover_host: oleg651-server Failover mds2 to oleg651-server mount facets: mds2 Starting mds2: -o localrecov /dev/mapper/mds2_flakey /mnt/lustre-mds2 oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1 Started lustre-MDT0001 06:44:07 (1743504247) targets are mounted 06:44:07 (1743504247) facet_failover done oleg651-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0001-mdc-*.mds_server_uuid mdc.lustre-MDT0001-mdc-*.mds_server_uuid in FULL state after 0 sec total: 20 open/close in 0.20 seconds: 100.22 ops/second PASS 80c (82s) == replay-single test 80d: DNE: create remote dir, drop update rep from MDT1, fail 2 MDTs ========================================================== 06:44:23 (1743504263) fail_loc=0x1701 UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 2536 1285152 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 2068 1285620 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 1576 3605444 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1572 
3605448 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 3148 7210892 1% /mnt/lustre UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 2536 1285152 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 2068 1285620 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 1576 3605444 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1572 3605448 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 3148 7210892 1% /mnt/lustre Failing mds1 on oleg651-server Stopping /mnt/lustre-mds1 (opts:) on oleg651-server Failing mds2 on oleg651-server Stopping /mnt/lustre-mds2 (opts:) on oleg651-server 06:44:50 (1743504290) shut down facet: mds1 facet_host: oleg651-server facet_failover_host: oleg651-server facet: mds2 facet_host: oleg651-server facet_failover_host: oleg651-server Failover mds1 to oleg651-server mount facets: mds1 Failover mds2 to oleg651-server mount facets: mds2 Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 Starting mds2: -o localrecov /dev/mapper/mds2_flakey /mnt/lustre-mds2 oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1 pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1 Started lustre-MDT0000 Started lustre-MDT0001 06:45:34 (1743504334) targets are mounted 06:45:34 (1743504334) facet_failover done oleg651-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid,mdc.lustre-MDT0001-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec mdc.lustre-MDT0001-mdc-*.mds_server_uuid in FULL state after 0 sec total: 20 open/close in 0.14 seconds: 144.96 ops/second PASS 80d (104s) == replay-single test 80e: DNE: create remote dir, drop MDT1 rep, fail MDT0 
========================================================== 06:46:07 (1743504367) fail_loc=0x119 UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 2464 1285224 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 2068 1285620 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 1576 3605444 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1572 3605448 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 3148 7210892 1% /mnt/lustre Failing mds1 on oleg651-server Stopping /mnt/lustre-mds1 (opts:) on oleg651-server 06:46:19 (1743504379) shut down facet: mds1 facet_host: oleg651-server facet_failover_host: oleg651-server Failover mds1 to oleg651-server mount facets: mds1 Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1 Started lustre-MDT0000 06:46:40 (1743504400) targets are mounted 06:46:40 (1743504400) facet_failover done oleg651-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec total: 20 open/close in 0.15 seconds: 131.77 ops/second PASS 80e (47s) == replay-single test 80f: DNE: create remote dir, drop MDT1 rep, fail MDT1 ========================================================== 06:46:54 (1743504414) fail_loc=0x119 UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 2464 1285224 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 2068 1285620 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 1576 3605444 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1572 3605448 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 3148 7210892 1% /mnt/lustre Failing mds2 on oleg651-server Stopping /mnt/lustre-mds2 (opts:) on oleg651-server 06:47:03 (1743504423) shut down facet: mds2 facet_host: oleg651-server 
facet_failover_host: oleg651-server Failover mds2 to oleg651-server mount facets: mds2 Starting mds2: -o localrecov /dev/mapper/mds2_flakey /mnt/lustre-mds2 oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1 Started lustre-MDT0001 06:47:23 (1743504443) targets are mounted 06:47:23 (1743504443) facet_failover done oleg651-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0001-mdc-*.mds_server_uuid mdc.lustre-MDT0001-mdc-*.mds_server_uuid in FULL state after 0 sec total: 20 open/close in 0.21 seconds: 97.13 ops/second PASS 80f (44s) == replay-single test 80g: DNE: create remote dir, drop MDT1 rep, fail MDT0, then MDT1 ========================================================== 06:47:38 (1743504458) fail_loc=0x119 UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 2464 1285224 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 2068 1285620 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 1576 3605444 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1572 3605448 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 3148 7210892 1% /mnt/lustre UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 2464 1285224 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 2068 1285620 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 1576 3605444 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1572 3605448 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 3148 7210892 1% /mnt/lustre Failing mds1 on oleg651-server Stopping /mnt/lustre-mds1 (opts:) on oleg651-server 06:47:54 (1743504474) shut down facet: mds1 facet_host: oleg651-server facet_failover_host: oleg651-server Failover mds1 to oleg651-server mount facets: mds1 Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace 
rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1 Started lustre-MDT0000 06:48:14 (1743504494) targets are mounted 06:48:14 (1743504494) facet_failover done oleg651-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec Failing mds2 on oleg651-server Stopping /mnt/lustre-mds2 (opts:) on oleg651-server 06:48:26 (1743504506) shut down facet: mds2 facet_host: oleg651-server facet_failover_host: oleg651-server Failover mds2 to oleg651-server mount facets: mds2 Starting mds2: -o localrecov /dev/mapper/mds2_flakey /mnt/lustre-mds2 oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1 Started lustre-MDT0001 06:48:46 (1743504526) targets are mounted 06:48:46 (1743504526) facet_failover done oleg651-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0001-mdc-*.mds_server_uuid mdc.lustre-MDT0001-mdc-*.mds_server_uuid in FULL state after 0 sec total: 20 open/close in 0.17 seconds: 118.60 ops/second PASS 80g (84s) == replay-single test 80h: DNE: create remote dir, drop MDT1 rep, fail 2 MDTs ========================================================== 06:49:02 (1743504542) fail_loc=0x119 UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 2464 1285224 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 2068 1285620 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 1576 3605444 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1572 3605448 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 3148 7210892 1% /mnt/lustre UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 2464 1285224 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 2068 1285620 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 
3833116 1576 3605444 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1572 3605448 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 3148 7210892 1% /mnt/lustre Failing mds1 on oleg651-server Stopping /mnt/lustre-mds1 (opts:) on oleg651-server Failing mds2 on oleg651-server Stopping /mnt/lustre-mds2 (opts:) on oleg651-server 06:49:21 (1743504561) shut down facet: mds1 facet_host: oleg651-server facet_failover_host: oleg651-server facet: mds2 facet_host: oleg651-server facet_failover_host: oleg651-server Failover mds1 to oleg651-server mount facets: mds1 Failover mds2 to oleg651-server mount facets: mds2 Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 Starting mds2: -o localrecov /dev/mapper/mds2_flakey /mnt/lustre-mds2 oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1 pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1 Started lustre-MDT0000 Started lustre-MDT0001 06:49:50 (1743504590) targets are mounted 06:49:50 (1743504590) facet_failover done oleg651-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid,mdc.lustre-MDT0001-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec mdc.lustre-MDT0001-mdc-*.mds_server_uuid in FULL state after 0 sec total: 20 open/close in 0.14 seconds: 139.81 ops/second PASS 80h (70s) == replay-single test 81a: DNE: unlink remote dir, drop MDT0 update rep, fail MDT1 ========================================================== 06:50:12 (1743504612) fail_loc=0x1701 UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 2460 1285228 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 2068 1285620 1% /mnt/lustre[MDT:1] 
lustre-OST0000_UUID 3833116 1576 3605444 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1572 3605448 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 3148 7210892 1% /mnt/lustre Failing mds2 on oleg651-server Stopping /mnt/lustre-mds2 (opts:) on oleg651-server 06:50:27 (1743504627) shut down facet: mds2 facet_host: oleg651-server facet_failover_host: oleg651-server Failover mds2 to oleg651-server mount facets: mds2 Starting mds2: -o localrecov /dev/mapper/mds2_flakey /mnt/lustre-mds2 oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1 Started lustre-MDT0001 06:50:48 (1743504648) targets are mounted 06:50:48 (1743504648) facet_failover done oleg651-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0001-mdc-*.mds_server_uuid mdc.lustre-MDT0001-mdc-*.mds_server_uuid in FULL state after 0 sec PASS 81a (52s) == replay-single test 81b: DNE: unlink remote dir, drop MDT0 update reply, fail MDT0 ========================================================== 06:51:05 (1743504665) fail_loc=0x1701 UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 2496 1285192 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 2072 1285616 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 1576 3605444 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1572 3605448 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 3148 7210892 1% /mnt/lustre Failing mds1 on oleg651-server Stopping /mnt/lustre-mds1 (opts:) on oleg651-server 06:51:14 (1743504674) shut down facet: mds1 facet_host: oleg651-server facet_failover_host: oleg651-server Failover mds1 to oleg651-server mount facets: mds1 Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg651-client: 
oleg651-server: ssh exited with exit code 1 Started lustre-MDT0000 06:51:33 (1743504693) targets are mounted 06:51:33 (1743504693) facet_failover done oleg651-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec PASS 81b (43s) == replay-single test 81c: DNE: unlink remote dir, drop MDT0 update reply, fail MDT0,MDT1 ========================================================== 06:51:47 (1743504707) fail_loc=0x1701 UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 2460 1285228 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 2072 1285616 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 1576 3605444 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1572 3605448 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 3148 7210892 1% /mnt/lustre UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 2460 1285228 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 2072 1285616 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 1576 3605444 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1572 3605448 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 3148 7210892 1% /mnt/lustre Failing mds1 on oleg651-server Stopping /mnt/lustre-mds1 (opts:) on oleg651-server 06:52:00 (1743504720) shut down facet: mds1 facet_host: oleg651-server facet_failover_host: oleg651-server Failover mds1 to oleg651-server mount facets: mds1 Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1 Started lustre-MDT0000 06:52:20 (1743504740) targets are mounted 06:52:20 (1743504740) facet_failover done oleg651-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid 
mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec Failing mds2 on oleg651-server Stopping /mnt/lustre-mds2 (opts:) on oleg651-server 06:52:31 (1743504751) shut down facet: mds2 facet_host: oleg651-server facet_failover_host: oleg651-server Failover mds2 to oleg651-server mount facets: mds2 Starting mds2: -o localrecov /dev/mapper/mds2_flakey /mnt/lustre-mds2 oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1 Started lustre-MDT0001 06:52:50 (1743504770) targets are mounted 06:52:50 (1743504770) facet_failover done oleg651-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0001-mdc-*.mds_server_uuid mdc.lustre-MDT0001-mdc-*.mds_server_uuid in FULL state after 0 sec PASS 81c (77s) == replay-single test 81d: DNE: unlink remote dir, drop MDT0 update reply, fail 2 MDTs ========================================================== 06:53:04 (1743504784) fail_loc=0x1701 UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 2496 1285192 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 2076 1285612 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 1576 3605444 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1572 3605448 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 3148 7210892 1% /mnt/lustre UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 2496 1285192 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 2076 1285612 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 1576 3605444 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1572 3605448 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 3148 7210892 1% /mnt/lustre Failing mds1 on oleg651-server Stopping /mnt/lustre-mds1 (opts:) on oleg651-server Failing mds2 on oleg651-server Stopping /mnt/lustre-mds2 (opts:) on oleg651-server 06:53:26 (1743504806) shut down facet: mds1 facet_host: 
oleg651-server facet_failover_host: oleg651-server facet: mds2 facet_host: oleg651-server facet_failover_host: oleg651-server Failover mds1 to oleg651-server mount facets: mds1 Failover mds2 to oleg651-server mount facets: mds2 Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 Starting mds2: -o localrecov /dev/mapper/mds2_flakey /mnt/lustre-mds2 oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1 pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1 Started lustre-MDT0001 Started lustre-MDT0000 06:54:10 (1743504850) targets are mounted 06:54:10 (1743504850) facet_failover done oleg651-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid,mdc.lustre-MDT0001-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec mdc.lustre-MDT0001-mdc-*.mds_server_uuid in FULL state after 0 sec PASS 81d (82s) == replay-single test 81e: DNE: unlink remote dir, drop MDT1 req reply, fail MDT0 ========================================================== 06:54:26 (1743504866) fail_loc=0x119 fail_loc=0 UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 2460 1285228 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 2076 1285612 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 1576 3605444 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1572 3605448 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 3148 7210892 1% /mnt/lustre Failing mds1 on oleg651-server Stopping /mnt/lustre-mds1 (opts:) on oleg651-server 06:54:36 (1743504876) shut down facet: mds1 facet_host: oleg651-server facet_failover_host: oleg651-server Failover mds1 to oleg651-server mount facets: mds1 Starting mds1: 
-o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1 Started lustre-MDT0000 06:54:56 (1743504896) targets are mounted 06:54:56 (1743504896) facet_failover done oleg651-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec PASS 81e (45s) == replay-single test 81f: DNE: unlink remote dir, drop MDT1 req reply, fail MDT1 ========================================================== 06:55:12 (1743504912) fail_loc=0x119 UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 2460 1285228 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 2076 1285612 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 1576 3605444 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1572 3605448 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 3148 7210892 1% /mnt/lustre Failing mds2 on oleg651-server Stopping /mnt/lustre-mds2 (opts:) on oleg651-server 06:55:21 (1743504921) shut down facet: mds2 facet_host: oleg651-server facet_failover_host: oleg651-server Failover mds2 to oleg651-server mount facets: mds2 Starting mds2: -o localrecov /dev/mapper/mds2_flakey /mnt/lustre-mds2 oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1 Started lustre-MDT0001 06:55:41 (1743504941) targets are mounted 06:55:41 (1743504941) facet_failover done oleg651-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0001-mdc-*.mds_server_uuid mdc.lustre-MDT0001-mdc-*.mds_server_uuid in FULL state after 0 sec PASS 81f (45s) == replay-single test 81g: DNE: unlink remote dir, drop req reply, fail M0, then M1 
========================================================== 06:55:57 (1743504957) fail_loc=0x119 UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 2460 1285228 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 2076 1285612 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 1576 3605444 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1572 3605448 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 3148 7210892 1% /mnt/lustre UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 2460 1285228 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 2076 1285612 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 1576 3605444 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1572 3605448 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 3148 7210892 1% /mnt/lustre Failing mds1 on oleg651-server Stopping /mnt/lustre-mds1 (opts:) on oleg651-server 06:56:11 (1743504971) shut down facet: mds1 facet_host: oleg651-server facet_failover_host: oleg651-server Failover mds1 to oleg651-server mount facets: mds1 Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1 Started lustre-MDT0000 06:56:31 (1743504991) targets are mounted 06:56:31 (1743504991) facet_failover done oleg651-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec Failing mds2 on oleg651-server Stopping /mnt/lustre-mds2 (opts:) on oleg651-server 06:56:43 (1743505003) shut down facet: mds2 facet_host: oleg651-server facet_failover_host: oleg651-server Failover mds2 to oleg651-server mount facets: mds2 Starting mds2: -o localrecov /dev/mapper/mds2_flakey /mnt/lustre-mds2 oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace 
dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1 Started lustre-MDT0001 06:57:03 (1743505023) targets are mounted 06:57:03 (1743505023) facet_failover done oleg651-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0001-mdc-*.mds_server_uuid mdc.lustre-MDT0001-mdc-*.mds_server_uuid in FULL state after 0 sec PASS 81g (81s) == replay-single test 81h: DNE: unlink remote dir, drop request reply, fail 2 MDTs ========================================================== 06:57:18 (1743505038) fail_loc=0x119 UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 2460 1285228 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 2076 1285612 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 1576 3605444 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1572 3605448 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 3148 7210892 1% /mnt/lustre UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 2460 1285228 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 2076 1285612 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 1576 3605444 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1572 3605448 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 3148 7210892 1% /mnt/lustre Failing mds1 on oleg651-server Stopping /mnt/lustre-mds1 (opts:) on oleg651-server Failing mds2 on oleg651-server Stopping /mnt/lustre-mds2 (opts:) on oleg651-server 06:57:34 (1743505054) shut down facet: mds1 facet_host: oleg651-server facet_failover_host: oleg651-server facet: mds2 facet_host: oleg651-server facet_failover_host: oleg651-server Failover mds1 to oleg651-server mount facets: mds1 Failover mds2 to oleg651-server mount facets: mds2 Starting mds2: -o localrecov /dev/mapper/mds2_flakey /mnt/lustre-mds2 Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha 
config ioctl super lfsck all oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1 pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1 Started lustre-MDT0001 Started lustre-MDT0000 06:58:03 (1743505083) targets are mounted 06:58:04 (1743505084) facet_failover done oleg651-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid,mdc.lustre-MDT0001-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec mdc.lustre-MDT0001-mdc-*.mds_server_uuid in FULL state after 0 sec PASS 81h (71s) == replay-single test 84a: stale open during export disconnect ========================================================== 06:58:29 (1743505109) fail_loc=0x80000144 total: 1 open/close in 0.01 seconds: 73.89 ops/second pdsh@oleg651-client: oleg651-client: ssh exited with exit code 5 PASS 84a (11s) == replay-single test 85a: check the cancellation of unused locks during recovery(IBITS) ========================================================== 06:58:40 (1743505120) before recovery: unused locks count = 201 Failing mds1 on oleg651-server Stopping /mnt/lustre-mds1 (opts:) on oleg651-server 06:58:47 (1743505127) shut down facet: mds1 facet_host: oleg651-server facet_failover_host: oleg651-server Failover mds1 to oleg651-server mount facets: mds1 Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1 Started lustre-MDT0000 06:59:05 (1743505145) targets are mounted 06:59:05 (1743505145) facet_failover done oleg651-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid 
mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec after recovery: unused locks count = 101 PASS 85a (40s) == replay-single test 85b: check the cancellation of unused locks during recovery(EXTENT) ========================================================== 06:59:20 (1743505160) before recovery: unused locks count = 100 Failing ost1 on oleg651-server Stopping /mnt/lustre-ost1 (opts:) on oleg651-server 06:59:32 (1743505172) shut down facet: ost1 facet_host: oleg651-server facet_failover_host: oleg651-server Failover ost1 to oleg651-server mount facets: ost1 Starting ost1: -o localrecov /dev/mapper/ost1_flakey /mnt/lustre-ost1 seq.cli-lustre-OST0000-super.width=65536 oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1 Started lustre-OST0000 06:59:51 (1743505191) targets are mounted 06:59:51 (1743505191) facet_failover done oleg651-client.virtnet: executing wait_import_state_mount (FULL|IDLE) osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid in FULL state after 0 sec after recovery: unused locks count = 0 PASS 85b (45s) == replay-single test 86: umount server after clear nid_stats should not hit LBUG ========================================================== 07:00:05 (1743505205) Stopping clients: oleg651-client.virtnet /mnt/lustre (opts:) Stopping client oleg651-client.virtnet /mnt/lustre opts: mdt.lustre-MDT0000.exports.clear=0 mdt.lustre-MDT0001.exports.clear=0 Stopping /mnt/lustre-mds1 (opts:) on oleg651-server Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1 Started lustre-MDT0000 Starting client oleg651-client.virtnet: -o 
user_xattr,flock 192.168.206.151@tcp:/lustre /mnt/lustre Started clients oleg651-client.virtnet: 192.168.206.151@tcp:/lustre on /mnt/lustre type lustre (rw,checksum,flock,user_xattr,lruresize,lazystatfs,nouser_fid2path,verbose,encrypt,statfs_project) PASS 86 (27s) == replay-single test 87a: write replay ================== 07:00:32 (1743505232) UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 2456 1285232 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 2080 1285608 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 1776 3605244 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1772 3605248 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 3548 7210492 1% /mnt/lustre 8+0 records in 8+0 records out 8388608 bytes (8.4 MB, 8.0 MiB) copied, 0.33419 s, 25.1 MB/s Failing ost1 on oleg651-server Stopping /mnt/lustre-ost1 (opts:) on oleg651-server 07:00:42 (1743505242) shut down facet: ost1 facet_host: oleg651-server facet_failover_host: oleg651-server Failover ost1 to oleg651-server mount facets: ost1 Starting ost1: -o localrecov /dev/mapper/ost1_flakey /mnt/lustre-ost1 seq.cli-lustre-OST0000-super.width=65536 oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1 Started lustre-OST0000 07:01:04 (1743505264) targets are mounted 07:01:04 (1743505264) facet_failover done oleg651-client.virtnet: executing wait_import_state_mount (FULL|IDLE) osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid in FULL state after 0 sec 8+0 records in 8+0 records out 8388608 bytes (8.4 MB, 8.0 MiB) copied, 0.310037 s, 27.1 MB/s PASS 87a (47s) == replay-single test 87b: write replay with changed data (checksum resend) ========================================================== 07:01:19 (1743505279) UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 2456 1285232 
1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 2080 1285608 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 9968 3597052 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1772 3605248 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 11740 7202300 1% /mnt/lustre 8+0 records in 8+0 records out 8388608 bytes (8.4 MB, 8.0 MiB) copied, 0.273084 s, 30.7 MB/s 8+0 records in 8+0 records out 8 bytes copied, 0.00357732 s, 2.2 kB/s Failing ost1 on oleg651-server Stopping /mnt/lustre-ost1 (opts:) on oleg651-server 07:01:29 (1743505289) shut down facet: ost1 facet_host: oleg651-server facet_failover_host: oleg651-server Failover ost1 to oleg651-server mount facets: ost1 Starting ost1: -o localrecov /dev/mapper/ost1_flakey /mnt/lustre-ost1 seq.cli-lustre-OST0000-super.width=65536 oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1 Started lustre-OST0000 07:01:50 (1743505310) targets are mounted 07:01:50 (1743505310) facet_failover done oleg651-client.virtnet: executing wait_import_state_mount (FULL|IDLE) osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid in FULL state after 0 sec 0+1 records in 0+1 records out 72 bytes copied, 0.0101041 s, 7.1 kB/s PASS 87b (47s) == replay-single test 88: MDS should not assign same objid to different files ========================================================== 07:02:06 (1743505326) UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 2460 1285228 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 2080 1285608 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 9972 3597048 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1772 3605248 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 11744 7202296 1% /mnt/lustre UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 2460 1285228 1% /mnt/lustre[MDT:0] 
lustre-MDT0001_UUID 1414116 2080 1285608 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 9972 3597048 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1772 3605248 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 11744 7202296 1% /mnt/lustre before test: last_id = 9441, next_id = 9412 Creating to objid 9441 on ost lustre-OST0000... total: 31 open/close in 0.21 seconds: 148.41 ops/second total: 8 open/close in 0.04 seconds: 178.77 ops/second before recovery: last_id = 9473, next_id = 9450 Stopping /mnt/lustre-mds1 (opts:) on oleg651-server Stopping /mnt/lustre-ost1 (opts:) on oleg651-server Failover mds1 to oleg651-server oleg651-server.virtnet Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1 Started lustre-MDT0000 Failover ost1 to oleg651-server oleg651-server.virtnet Starting ost1: -o localrecov /dev/mapper/ost1_flakey /mnt/lustre-ost1 seq.cli-lustre-OST0000-super.width=65536 oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1 Started lustre-OST0000 after recovery: last_id = 9481, next_id = 9450 128+0 records in 128+0 records out 524288 bytes (524 kB, 512 KiB) copied, 0.0557098 s, 9.4 MB/s 128+0 records in 128+0 records out 524288 bytes (524 kB, 512 KiB) copied, 0.0581995 s, 9.0 MB/s 128+0 records in 128+0 records out 524288 bytes (524 kB, 512 KiB) copied, 0.060058 s, 8.7 MB/s 128+0 records in 128+0 records out 524288 bytes (524 kB, 512 KiB) copied, 0.068413 s, 7.7 MB/s 128+0 records in 128+0 records out 524288 bytes (524 kB, 512 KiB) copied, 0.0548536 s, 9.6 MB/s 128+0 records in 128+0 records out 524288 bytes (524 kB, 512 KiB) copied, 0.059961 s, 8.7 MB/s 128+0 records in 128+0 records out 
524288 bytes (524 kB, 512 KiB) copied, 0.0674237 s, 7.8 MB/s 128+0 records in 128+0 records out 524288 bytes (524 kB, 512 KiB) copied, 0.0609857 s, 8.6 MB/s -rw-r--r-- 1 root root 0 Apr 1 07:02 /mnt/lustre/d88.replay-single/f-9412 -rw-r--r-- 1 root root 0 Apr 1 07:02 /mnt/lustre/d88.replay-single/f-9413 -rw-r--r-- 1 root root 0 Apr 1 07:02 /mnt/lustre/d88.replay-single/f-9414 -rw-r--r-- 1 root root 0 Apr 1 07:02 /mnt/lustre/d88.replay-single/f-9415 -rw-r--r-- 1 root root 0 Apr 1 07:02 /mnt/lustre/d88.replay-single/f-9416 -rw-r--r-- 1 root root 0 Apr 1 07:02 /mnt/lustre/d88.replay-single/f-9417 -rw-r--r-- 1 root root 0 Apr 1 07:02 /mnt/lustre/d88.replay-single/f-9418 -rw-r--r-- 1 root root 0 Apr 1 07:02 /mnt/lustre/d88.replay-single/f-9419 -rw-r--r-- 1 root root 0 Apr 1 07:02 /mnt/lustre/d88.replay-single/f-9420 -rw-r--r-- 1 root root 0 Apr 1 07:02 /mnt/lustre/d88.replay-single/f-9421 -rw-r--r-- 1 root root 0 Apr 1 07:02 /mnt/lustre/d88.replay-single/f-9422 -rw-r--r-- 1 root root 0 Apr 1 07:02 /mnt/lustre/d88.replay-single/f-9423 -rw-r--r-- 1 root root 0 Apr 1 07:02 /mnt/lustre/d88.replay-single/f-9424 -rw-r--r-- 1 root root 0 Apr 1 07:02 /mnt/lustre/d88.replay-single/f-9425 -rw-r--r-- 1 root root 0 Apr 1 07:02 /mnt/lustre/d88.replay-single/f-9426 -rw-r--r-- 1 root root 0 Apr 1 07:02 /mnt/lustre/d88.replay-single/f-9427 -rw-r--r-- 1 root root 0 Apr 1 07:02 /mnt/lustre/d88.replay-single/f-9428 -rw-r--r-- 1 root root 0 Apr 1 07:02 /mnt/lustre/d88.replay-single/f-9429 -rw-r--r-- 1 root root 0 Apr 1 07:02 /mnt/lustre/d88.replay-single/f-9430 -rw-r--r-- 1 root root 0 Apr 1 07:02 /mnt/lustre/d88.replay-single/f-9431 -rw-r--r-- 1 root root 0 Apr 1 07:02 /mnt/lustre/d88.replay-single/f-9432 -rw-r--r-- 1 root root 0 Apr 1 07:02 /mnt/lustre/d88.replay-single/f-9433 -rw-r--r-- 1 root root 0 Apr 1 07:02 /mnt/lustre/d88.replay-single/f-9434 -rw-r--r-- 1 root root 0 Apr 1 07:02 /mnt/lustre/d88.replay-single/f-9435 -rw-r--r-- 1 root root 0 Apr 1 07:02 
/mnt/lustre/d88.replay-single/f-9436 -rw-r--r-- 1 root root 0 Apr 1 07:02 /mnt/lustre/d88.replay-single/f-9437 -rw-r--r-- 1 root root 0 Apr 1 07:02 /mnt/lustre/d88.replay-single/f-9438 -rw-r--r-- 1 root root 0 Apr 1 07:02 /mnt/lustre/d88.replay-single/f-9439 -rw-r--r-- 1 root root 0 Apr 1 07:02 /mnt/lustre/d88.replay-single/f-9440 -rw-r--r-- 1 root root 0 Apr 1 07:02 /mnt/lustre/d88.replay-single/f-9441 -rw-r--r-- 1 root root 0 Apr 1 07:02 /mnt/lustre/d88.replay-single/f-9442 -rw-r--r-- 1 root root 0 Apr 1 07:02 /mnt/lustre/d88.replay-single/f-9443 -rw-r--r-- 1 root root 0 Apr 1 07:02 /mnt/lustre/d88.replay-single/f-9444 -rw-r--r-- 1 root root 0 Apr 1 07:02 /mnt/lustre/d88.replay-single/f-9445 -rw-r--r-- 1 root root 0 Apr 1 07:02 /mnt/lustre/d88.replay-single/f-9446 -rw-r--r-- 1 root root 0 Apr 1 07:02 /mnt/lustre/d88.replay-single/f-9447 -rw-r--r-- 1 root root 0 Apr 1 07:02 /mnt/lustre/d88.replay-single/f-9448 -rw-r--r-- 1 root root 0 Apr 1 07:02 /mnt/lustre/d88.replay-single/f-9449 -rw-r--r-- 1 root root 524288 Apr 1 07:03 /mnt/lustre/d88.replay-single/f-9453 -rw-r--r-- 1 root root 524288 Apr 1 07:03 /mnt/lustre/d88.replay-single/f-9454 -rw-r--r-- 1 root root 524288 Apr 1 07:03 /mnt/lustre/d88.replay-single/f-9455 -rw-r--r-- 1 root root 524288 Apr 1 07:03 /mnt/lustre/d88.replay-single/f-9456 -rw-r--r-- 1 root root 524288 Apr 1 07:03 /mnt/lustre/d88.replay-single/f-9457 -rw-r--r-- 1 root root 524288 Apr 1 07:03 /mnt/lustre/d88.replay-single/f-9458 -rw-r--r-- 1 root root 524288 Apr 1 07:03 /mnt/lustre/d88.replay-single/f-9459 -rw-r--r-- 1 root root 524288 Apr 1 07:03 /mnt/lustre/d88.replay-single/f-9460 128+0 records in 128+0 records out 524288 bytes (524 kB, 512 KiB) copied, 0.0587239 s, 8.9 MB/s 128+0 records in 128+0 records out 524288 bytes (524 kB, 512 KiB) copied, 0.0484135 s, 10.8 MB/s 128+0 records in 128+0 records out 524288 bytes (524 kB, 512 KiB) copied, 0.0646566 s, 8.1 MB/s 128+0 records in 128+0 records out 524288 bytes (524 kB, 512 KiB) copied, 
0.0693338 s, 7.6 MB/s 128+0 records in 128+0 records out 524288 bytes (524 kB, 512 KiB) copied, 0.0505183 s, 10.4 MB/s 128+0 records in 128+0 records out 524288 bytes (524 kB, 512 KiB) copied, 0.0661373 s, 7.9 MB/s 128+0 records in 128+0 records out 524288 bytes (524 kB, 512 KiB) copied, 0.0501671 s, 10.5 MB/s 128+0 records in 128+0 records out 524288 bytes (524 kB, 512 KiB) copied, 0.0653196 s, 8.0 MB/s PASS 88 (80s) == replay-single test 89: no disk space leak on late ost connection ========================================================== 07:03:26 (1743505406) Waiting for orphan cleanup... osp.lustre-OST0000-osc-MDT0000.old_sync_processed osp.lustre-OST0000-osc-MDT0001.old_sync_processed osp.lustre-OST0001-osc-MDT0000.old_sync_processed osp.lustre-OST0001-osc-MDT0001.old_sync_processed wait 40 secs maximumly for oleg651-server mds-ost sync done. Waiting for MDT destroys to complete /mnt/lustre-ost1: 194.5 MiB (203886592 bytes) trimmed /mnt/lustre-ost2: 205.8 MiB (215748608 bytes) trimmed 10+0 records in 10+0 records out 10485760 bytes (10 MB, 10 MiB) copied, 0.214953 s, 48.8 MB/s Stopping /mnt/lustre-ost1 (opts:) on oleg651-server Failing mds1 on oleg651-server Stopping /mnt/lustre-mds1 (opts:) on oleg651-server 07:03:39 (1743505419) shut down facet: mds1 facet_host: oleg651-server facet_failover_host: oleg651-server Failover mds1 to oleg651-server mount facets: mds1 Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1 Started lustre-MDT0000 07:03:56 (1743505436) targets are mounted 07:03:56 (1743505436) facet_failover done Starting ost1: -o localrecov /dev/mapper/ost1_flakey /mnt/lustre-ost1 seq.cli-lustre-OST0000-super.width=65536 oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config 
ioctl super lfsck all pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1 Started lustre-OST0000 Starting client: oleg651-client.virtnet: -o user_xattr,flock 192.168.206.151@tcp:/lustre /mnt/lustre osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid in FULL state after 59 sec Waiting for orphan cleanup... osp.lustre-OST0000-osc-MDT0000.old_sync_processed osp.lustre-OST0000-osc-MDT0001.old_sync_processed osp.lustre-OST0001-osc-MDT0000.old_sync_processed osp.lustre-OST0001-osc-MDT0001.old_sync_processed wait 40 secs maximumly for oleg651-server mds-ost sync done. Waiting for MDT destroys to complete /mnt/lustre-ost2: 0 B (0 bytes) trimmed /mnt/lustre-ost1: 66.5 MiB (69713920 bytes) trimmed free_before: 7646296 free_after: 7646296 PASS 89 (126s) == replay-single test 90: lfs find identifies the missing striped file segments ========================================================== 07:05:32 (1743505532) Create the files Fail ost1 lustre-OST0000_UUID, display the list of affected files Stopping /mnt/lustre-ost1 (opts:) on oleg651-server General Query: lfs find /mnt/lustre/d90.replay-single /mnt/lustre/d90.replay-single /mnt/lustre/d90.replay-single/f0 /mnt/lustre/d90.replay-single/all /mnt/lustre/d90.replay-single/f1 Querying files on shutdown ost1: lfs find --obd lustre-OST0000_UUID /mnt/lustre/d90.replay-single/f0 /mnt/lustre/d90.replay-single/all Check getstripe: /home/green/git/lustre-release/lustre/utils/lfs getstripe -r --obd lustre-OST0000_UUID /mnt/lustre/d90.replay-single/f0 lmm_stripe_count: 1 lmm_stripe_size: 4194304 lmm_pattern: raid0 lmm_layout_gen: 0 lmm_stripe_offset: 0 obdidx objid objid group 0 9483 0x250b 0x280000401 * /mnt/lustre/d90.replay-single/all lmm_stripe_count: 2 lmm_stripe_size: 4194304 lmm_pattern: raid0 lmm_layout_gen: 0 lmm_stripe_offset: 0 obdidx objid objid group 0 9482 0x250a 0x280000401 * /mnt/lustre/d90.replay-single/all /mnt/lustre/d90.replay-single/f0 Failover ost1 to oleg651-server oleg651-server.virtnet Starting ost1: 
-o localrecov /dev/mapper/ost1_flakey /mnt/lustre-ost1 seq.cli-lustre-OST0000-super.width=65536 oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1 Started lustre-OST0000 PASS 90 (36s) == replay-single test 93a: replay + reconnect ============ 07:06:08 (1743505568) 1+0 records in 1+0 records out 1024 bytes (1.0 kB, 1.0 KiB) copied, 0.00238195 s, 430 kB/s fail_val=40 fail_loc=0x715 Failing ost1 on oleg651-server Stopping /mnt/lustre-ost1 (opts:) on oleg651-server 07:06:14 (1743505574) shut down facet: ost1 facet_host: oleg651-server facet_failover_host: oleg651-server Failover ost1 to oleg651-server mount facets: ost1 Starting ost1: -o localrecov /dev/mapper/ost1_flakey /mnt/lustre-ost1 seq.cli-lustre-OST0000-super.width=65536 oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1 Started lustre-OST0000 07:06:34 (1743505594) targets are mounted 07:06:34 (1743505594) facet_failover done oleg651-client.virtnet: executing wait_import_state_mount (FULL|IDLE) osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid in FULL state after 0 sec PASS 93a (78s) == replay-single test 93b: replay + reconnect on mds ===== 07:07:26 (1743505646) total: 20 open/close in 0.18 seconds: 112.45 ops/second fail_val=80 fail_loc=0x715 Failing mds1 on oleg651-server Stopping /mnt/lustre-mds1 (opts:) on oleg651-server 07:07:32 (1743505652) shut down facet: mds1 facet_host: oleg651-server facet_failover_host: oleg651-server Failover mds1 to oleg651-server mount facets: mds1 Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl 
super lfsck all pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1 Started lustre-MDT0000 07:07:51 (1743505671) targets are mounted 07:07:51 (1743505671) facet_failover done oleg651-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec PASS 93b (121s) == replay-single test 100a: DNE: create striped dir, drop update rep from MDT1, fail MDT1 ========================================================== 07:09:27 (1743505767) fail_loc=0x1701 Failing mds2 on oleg651-server Stopping /mnt/lustre-mds2 (opts:) on oleg651-server 07:09:32 (1743505772) shut down facet: mds2 facet_host: oleg651-server facet_failover_host: oleg651-server Failover mds2 to oleg651-server mount facets: mds2 Starting mds2: -o localrecov /dev/mapper/mds2_flakey /mnt/lustre-mds2 oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1 Started lustre-MDT0001 07:09:51 (1743505791) targets are mounted 07:09:51 (1743505791) facet_failover done oleg651-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0001-mdc-*.mds_server_uuid mdc.lustre-MDT0001-mdc-*.mds_server_uuid in FULL state after 0 sec lmv_stripe_count: 2 lmv_stripe_offset: 0 lmv_hash_type: crush mdtidx FID[seq:oid:ver] 0 [0x2000320e0:0x1:0x0] 1 [0x24000a42a:0x1:0x0] total: 20 open/close in 0.13 seconds: 149.74 ops/second PASS 100a (41s) == replay-single test 100b: DNE: create striped dir, fail MDT0 ========================================================== 07:10:08 (1743505808) fail_loc=0x119 Failing mds1 on oleg651-server Stopping /mnt/lustre-mds1 (opts:) on oleg651-server 07:10:13 (1743505813) shut down facet: mds1 facet_host: oleg651-server facet_failover_host: oleg651-server Failover mds1 to oleg651-server mount facets: mds1 Starting mds1: -o 
localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1 Started lustre-MDT0000 07:10:31 (1743505831) targets are mounted 07:10:31 (1743505831) facet_failover done oleg651-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec lmv_stripe_count: 2 lmv_stripe_offset: 0 lmv_hash_type: crush mdtidx FID[seq:oid:ver] 0 [0x2000320e0:0x4:0x0] 1 [0x24000a42a:0x4:0x0] total: 20 open/close in 0.14 seconds: 141.47 ops/second PASS 100b (37s) == replay-single test 100c: DNE: create striped dir, abort_recov_mdt mds2 ========================================================== 07:10:45 (1743505845) UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 2552 1285136 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 2156 1285532 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 18168 3588852 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1772 3605248 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 19940 7194100 1% /mnt/lustre Stopping /mnt/lustre-mds2 (opts:) on oleg651-server Failover mds2 to oleg651-server oleg651-server.virtnet Starting mds2: -o localrecov -o abort_recov_mdt /dev/mapper/mds2_flakey /mnt/lustre-mds2 oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1 Started lustre-MDT0001 total: 20 open/close in 0.14 seconds: 137.96 ops/second Failing mds2 on oleg651-server Stopping /mnt/lustre-mds2 (opts:) on oleg651-server 07:11:18 (1743505878) shut down facet: mds2 facet_host: oleg651-server facet_failover_host: oleg651-server Failover mds2 to oleg651-server mount facets: mds2 Starting mds2: -o 
localrecov /dev/mapper/mds2_flakey /mnt/lustre-mds2 oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1 Started lustre-MDT0001 07:11:36 (1743505896) targets are mounted 07:11:36 (1743505896) facet_failover done oleg651-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0001-mdc-*.mds_server_uuid mdc.lustre-MDT0001-mdc-*.mds_server_uuid in FULL state after 0 sec lmv_stripe_count: 2 lmv_stripe_offset: 1 lmv_hash_type: crush mdtidx FID[seq:oid:ver] 1 [0x24000abf8:0x3:0x0] 0 [0x2000320e1:0x3:0x0] total: 20 open/close in 0.10 seconds: 200.24 ops/second PASS 100c (65s) == replay-single test 100d: DNE: cancel update logs upon recovery abort ========================================================== 07:11:50 (1743505910) striped dir -i0 -c2 -H crush2 /mnt/lustre/d100d.replay-single total: 100 mkdir in 0.62 seconds: 162.26 ops/second lustre-MDT0000-osd [catalog]: [0x20001a210:0x1:0x0] [index]: 00015 [logid]: [0x2000328b0:0x1:0x0] lustre-MDT0001-osp-MDT0000 [catalog]: [0x2400007ec:0x1:0x0] [index]: 00023 [logid]: [0x24000abf9:0x1:0x0] [index]: 00024 [logid]: [0x24000abf9:0x2:0x0] Stopping /mnt/lustre-mds1 (opts:) on oleg651-server Failover mds1 to oleg651-server oleg651-server.virtnet Starting mds1: -o localrecov -o abort_recovery /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1 Started lustre-MDT0000 pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1 find: '/mnt/lustre/d100d.replay-single': No such file or directory find: '/mnt/lustre/d100d.replay-single': No such file or directory PASS 100d (40s) == replay-single test 100e: DNE: create striped dir on MDT0 and MDT1, fail MDT0, MDT1 
========================================================== 07:12:30 (1743505950) UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 2968 1284720 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 2540 1285148 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 18168 3588852 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1772 3605248 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 19940 7194100 1% /mnt/lustre UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 2968 1284720 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 2540 1285148 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 18168 3588852 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1772 3605248 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 19940 7194100 1% /mnt/lustre lmv_stripe_count: 2 lmv_stripe_offset: 0 lmv_hash_type: crush mdtidx FID[seq:oid:ver] 0 [0x200033080:0x2:0x0] 1 [0x24000bb99:0x2:0x0] Failing mds1 on oleg651-server Stopping /mnt/lustre-mds1 (opts:) on oleg651-server Failing mds2 on oleg651-server Stopping /mnt/lustre-mds2 (opts:) on oleg651-server 07:12:45 (1743505965) shut down facet: mds1 facet_host: oleg651-server facet_failover_host: oleg651-server facet: mds2 facet_host: oleg651-server facet_failover_host: oleg651-server Failover mds1 to oleg651-server mount facets: mds1 Failover mds2 to oleg651-server mount facets: mds2 Starting mds2: -o localrecov /dev/mapper/mds2_flakey /mnt/lustre-mds2 Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1 pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1 Started lustre-MDT0001 Started lustre-MDT0000 07:13:14 (1743505994) targets are 
mounted 07:13:14 (1743505994) facet_failover done oleg651-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid,mdc.lustre-MDT0001-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec mdc.lustre-MDT0001-mdc-*.mds_server_uuid in FULL state after 0 sec lmv_stripe_count: 2 lmv_stripe_offset: 0 lmv_hash_type: crush mdtidx FID[seq:oid:ver] 0 [0x200033080:0x2:0x0] 1 [0x24000bb99:0x2:0x0] PASS 100e (62s) == replay-single test 101: Shouldn't reassign precreated objs to other files after recovery ========================================================== 07:13:32 (1743506012) UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 2888 1284800 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 2644 1285044 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 18168 3588852 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1772 3605248 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 19940 7194100 1% /mnt/lustre Stopping /mnt/lustre-mds1 (opts:) on oleg651-server Failover mds1 to oleg651-server oleg651-server.virtnet Starting mds1: -o localrecov -o abort_recovery /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1 Started lustre-MDT0000 PASS 101 (78s) == replay-single test 102a: check resend (request lost) with multiple modify RPCs in flight ========================================================== 07:14:50 (1743506090) creating 7 files ... fail_loc=0x159 launch 7 chmod in parallel (07:14:53) ... 
fail_loc=0 done (07:15:09) /mnt/lustre/d102a.replay-single/file-1 has perms 0600 OK /mnt/lustre/d102a.replay-single/file-2 has perms 0600 OK /mnt/lustre/d102a.replay-single/file-3 has perms 0600 OK /mnt/lustre/d102a.replay-single/file-4 has perms 0600 OK /mnt/lustre/d102a.replay-single/file-5 has perms 0600 OK /mnt/lustre/d102a.replay-single/file-6 has perms 0600 OK /mnt/lustre/d102a.replay-single/file-7 has perms 0600 OK PASS 102a (26s) == replay-single test 102b: check resend (reply lost) with multiple modify RPCs in flight ========================================================== 07:15:16 (1743506116) creating 7 files ... fail_loc=0x15a launch 7 chmod in parallel (07:15:19) ... fail_loc=0 done (07:15:35) /mnt/lustre/d102b.replay-single/file-1 has perms 0600 OK /mnt/lustre/d102b.replay-single/file-2 has perms 0600 OK /mnt/lustre/d102b.replay-single/file-3 has perms 0600 OK /mnt/lustre/d102b.replay-single/file-4 has perms 0600 OK /mnt/lustre/d102b.replay-single/file-5 has perms 0600 OK /mnt/lustre/d102b.replay-single/file-6 has perms 0600 OK /mnt/lustre/d102b.replay-single/file-7 has perms 0600 OK PASS 102b (26s) == replay-single test 102c: check replay w/o reconstruction with multiple mod RPCs in flight ========================================================== 07:15:42 (1743506142) creating 7 files ... UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 2940 1284748 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 2428 1285260 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 18168 3580648 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1772 3597044 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 19940 7177692 1% /mnt/lustre fail_loc=0x15a launch 7 chmod in parallel (07:15:50) ... 
fail_loc=0 Failing mds1 on oleg651-server Stopping /mnt/lustre-mds1 (opts:) on oleg651-server 07:15:55 (1743506155) shut down facet: mds1 facet_host: oleg651-server facet_failover_host: oleg651-server Failover mds1 to oleg651-server mount facets: mds1 Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1 Started lustre-MDT0000 07:16:16 (1743506176) targets are mounted 07:16:16 (1743506176) facet_failover done oleg651-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec done (07:16:26) /mnt/lustre/d102c.replay-single/file-1 has perms 0600 OK /mnt/lustre/d102c.replay-single/file-2 has perms 0600 OK /mnt/lustre/d102c.replay-single/file-3 has perms 0600 OK /mnt/lustre/d102c.replay-single/file-4 has perms 0600 OK /mnt/lustre/d102c.replay-single/file-5 has perms 0600 OK /mnt/lustre/d102c.replay-single/file-6 has perms 0600 OK /mnt/lustre/d102c.replay-single/file-7 has perms 0600 OK PASS 102c (52s) == replay-single test 102d: check replay & reconstruction with multiple mod RPCs in flight ========================================================== 07:16:34 (1743506194) creating 7 files ... fail_loc=0x15a launch 7 chmod in parallel (07:16:37) ... 
fail_loc=0 Failing mds2 on oleg651-server Stopping /mnt/lustre-mds2 (opts:) on oleg651-server 07:16:42 (1743506202) shut down facet: mds2 facet_host: oleg651-server facet_failover_host: oleg651-server Failover mds2 to oleg651-server mount facets: mds2 Starting mds2: -o localrecov /dev/mapper/mds2_flakey /mnt/lustre-mds2 oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1 Started lustre-MDT0001 07:17:00 (1743506220) targets are mounted 07:17:00 (1743506220) facet_failover done oleg651-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0001-mdc-*.mds_server_uuid mdc.lustre-MDT0001-mdc-*.mds_server_uuid in FULL state after 0 sec done (07:17:09) /mnt/lustre/d102d.replay-single/file-1 has perms 0600 OK /mnt/lustre/d102d.replay-single/file-2 has perms 0600 OK /mnt/lustre/d102d.replay-single/file-3 has perms 0600 OK /mnt/lustre/d102d.replay-single/file-4 has perms 0600 OK /mnt/lustre/d102d.replay-single/file-5 has perms 0600 OK /mnt/lustre/d102d.replay-single/file-6 has perms 0600 OK /mnt/lustre/d102d.replay-single/file-7 has perms 0600 OK PASS 102d (41s) == replay-single test 103: Check otr_next_id overflow ==== 07:17:16 (1743506236) fail_loc=0x80000162 total: 30 open/close in 0.16 seconds: 185.84 ops/second Failing mds1 on oleg651-server Stopping /mnt/lustre-mds1 (opts:) on oleg651-server 07:17:22 (1743506242) shut down facet: mds1 facet_host: oleg651-server facet_failover_host: oleg651-server Failover mds1 to oleg651-server mount facets: mds1 Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1 Started lustre-MDT0000 07:17:40 (1743506260) targets are mounted 07:17:41 (1743506261) 
facet_failover done oleg651-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec PASS 103 (43s) == replay-single test 110a: DNE: create striped dir, fail MDT1 ========================================================== 07:17:58 (1743506278) UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 2888 1284800 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 2432 1285256 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 18168 3580648 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1772 3597044 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 19940 7177692 1% /mnt/lustre Failing mds1 on oleg651-server Stopping /mnt/lustre-mds1 (opts:) on oleg651-server 07:18:07 (1743506287) shut down facet: mds1 facet_host: oleg651-server facet_failover_host: oleg651-server Failover mds1 to oleg651-server mount facets: mds1 Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1 Started lustre-MDT0000 07:18:37 (1743506317) targets are mounted 07:18:38 (1743506318) facet_failover done oleg651-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec /mnt/lustre/d110a.replay-single/striped_dir has type dir OK PASS 110a (55s) == replay-single test 110b: DNE: create striped dir, fail MDT1 and client ========================================================== 07:18:53 (1743506333) UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 2892 1284796 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 2436 1285252 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 18168 3588852 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 
3833116 1772 3605248 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 19940 7194100 1% /mnt/lustre Failing mds1 on oleg651-server Stopping /mnt/lustre-mds1 (opts:) on oleg651-server 07:19:03 (1743506343) shut down facet: mds1 facet_host: oleg651-server facet_failover_host: oleg651-server Failover mds1 to oleg651-server mount facets: mds1 Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1 Started lustre-MDT0000 07:19:24 (1743506364) targets are mounted 07:19:24 (1743506364) facet_failover done pdsh@oleg651-client: oleg651-client: ssh exited with exit code 95 oleg651-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid pdsh@oleg651-client: oleg651-client: ssh exited with exit code 95 Starting client: oleg651-client.virtnet: -o user_xattr,flock 192.168.206.151@tcp:/lustre /mnt/lustre /mnt/lustre/d110b.replay-single/striped_dir has type dir OK PASS 110b (114s) == replay-single test 110c: DNE: create striped dir, fail MDT2 ========================================================== 07:20:47 (1743506447) UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 2896 1284792 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 2436 1285252 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 18168 3588852 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1772 3605248 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 19940 7194100 1% /mnt/lustre Failing mds2 on oleg651-server Stopping /mnt/lustre-mds2 (opts:) on oleg651-server 07:20:57 (1743506457) shut down facet: mds2 facet_host: oleg651-server facet_failover_host: oleg651-server Failover mds2 to oleg651-server mount facets: mds2 Starting mds2: -o localrecov /dev/mapper/mds2_flakey /mnt/lustre-mds2 oleg651-server: oleg651-server.virtnet: 
executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1 Started lustre-MDT0001 07:21:18 (1743506478) targets are mounted 07:21:18 (1743506478) facet_failover done oleg651-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0001-mdc-*.mds_server_uuid mdc.lustre-MDT0001-mdc-*.mds_server_uuid in FULL state after 0 sec /mnt/lustre/d110c.replay-single/striped_dir has type dir OK PASS 110c (46s) == replay-single test 110d: DNE: create striped dir, fail MDT2 and client ========================================================== 07:21:33 (1743506493) UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 2888 1284800 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 2436 1285252 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 18168 3588852 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1772 3605248 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 19940 7194100 1% /mnt/lustre Failing mds2 on oleg651-server Stopping /mnt/lustre-mds2 (opts:) on oleg651-server 07:21:43 (1743506503) shut down facet: mds2 facet_host: oleg651-server facet_failover_host: oleg651-server Failover mds2 to oleg651-server mount facets: mds2 Starting mds2: -o localrecov /dev/mapper/mds2_flakey /mnt/lustre-mds2 oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1 Started lustre-MDT0001 07:22:05 (1743506525) targets are mounted 07:22:05 (1743506525) facet_failover done pdsh@oleg651-client: oleg651-client: ssh exited with exit code 95 oleg651-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0001-mdc-*.mds_server_uuid pdsh@oleg651-client: oleg651-client: ssh exited with exit code 95 Starting client: oleg651-client.virtnet: -o user_xattr,flock 192.168.206.151@tcp:/lustre /mnt/lustre 
/mnt/lustre/d110d.replay-single/striped_dir has type dir OK PASS 110d (113s) == replay-single test 110e: DNE: create striped dir, uncommit on MDT2, fail client/MDT1/MDT2 ========================================================== 07:23:26 (1743506606) UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 2888 1284800 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 2428 1285260 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 18168 3588852 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1772 3605248 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 19940 7194100 1% /mnt/lustre df:/mnt/lustre Not a Lustre filesystem Failing mds1 on oleg651-server Stopping /mnt/lustre-mds1 (opts:) on oleg651-server Failing mds2 on oleg651-server Stopping /mnt/lustre-mds2 (opts:) on oleg651-server 07:23:43 (1743506623) shut down facet: mds1 facet_host: oleg651-server facet_failover_host: oleg651-server facet: mds2 facet_host: oleg651-server facet_failover_host: oleg651-server Failover mds1 to oleg651-server mount facets: mds1 Failover mds2 to oleg651-server mount facets: mds2 Starting mds2: -o localrecov /dev/mapper/mds2_flakey /mnt/lustre-mds2 Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1 pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1 Started lustre-MDT0001 Started lustre-MDT0000 07:24:13 (1743506653) targets are mounted 07:24:13 (1743506653) facet_failover done pdsh@oleg651-client: oleg651-client: ssh exited with exit code 95 oleg651-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid,mdc.lustre-MDT0001-mdc-*.mds_server_uuid 
pdsh@oleg651-client: oleg651-client: ssh exited with exit code 95 Starting client: oleg651-client.virtnet: -o user_xattr,flock 192.168.206.151@tcp:/lustre /mnt/lustre /mnt/lustre/d110e.replay-single/striped_dir has type dir OK PASS 110e (123s) SKIP: replay-single test_110f skipping excluded test 110f == replay-single test 110g: DNE: create striped dir, uncommit on MDT1, fail client/MDT1/MDT2 ========================================================== 07:25:31 (1743506731) UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 2924 1284764 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 2428 1285260 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 18168 3588852 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1772 3605248 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 19940 7194100 1% /mnt/lustre df:/mnt/lustre Not a Lustre filesystem Failing mds1 on oleg651-server Stopping /mnt/lustre-mds1 (opts:) on oleg651-server Failing mds2 on oleg651-server Stopping /mnt/lustre-mds2 (opts:) on oleg651-server 07:25:47 (1743506747) shut down facet: mds1 facet_host: oleg651-server facet_failover_host: oleg651-server facet: mds2 facet_host: oleg651-server facet_failover_host: oleg651-server Failover mds1 to oleg651-server mount facets: mds1 Failover mds2 to oleg651-server mount facets: mds2 Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 Starting mds2: -o localrecov /dev/mapper/mds2_flakey /mnt/lustre-mds2 oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1 pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1 Started lustre-MDT0000 Started lustre-MDT0001 07:26:17 (1743506777) targets are mounted 07:26:17 (1743506777) facet_failover done 
pdsh@oleg651-client: oleg651-client: ssh exited with exit code 95 oleg651-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid,mdc.lustre-MDT0001-mdc-*.mds_server_uuid pdsh@oleg651-client: oleg651-client: ssh exited with exit code 95 Starting client: oleg651-client.virtnet: -o user_xattr,flock 192.168.206.151@tcp:/lustre /mnt/lustre /mnt/lustre/d110g.replay-single/striped_dir has type dir OK PASS 110g (123s) == replay-single test 111a: DNE: unlink striped dir, fail MDT1 ========================================================== 07:27:35 (1743506855) UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 2900 1284788 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 2440 1285248 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 18168 3588852 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1772 3605248 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 19940 7194100 1% /mnt/lustre Failing mds1 on oleg651-server Stopping /mnt/lustre-mds1 (opts:) on oleg651-server 07:27:45 (1743506865) shut down facet: mds1 facet_host: oleg651-server facet_failover_host: oleg651-server Failover mds1 to oleg651-server mount facets: mds1 Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1 Started lustre-MDT0000 07:28:06 (1743506886) targets are mounted 07:28:06 (1743506886) facet_failover done oleg651-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec Can't lstat /mnt/lustre/d111a.replay-single/striped_dir: No such file or directory PASS 111a (48s) == replay-single test 111b: DNE: unlink striped dir, fail MDT2 ========================================================== 07:28:24 
(1743506904) UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 2904 1284784 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 2448 1285240 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 18168 3588852 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1772 3605248 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 19940 7194100 1% /mnt/lustre Failing mds2 on oleg651-server Stopping /mnt/lustre-mds2 (opts:) on oleg651-server 07:28:34 (1743506914) shut down facet: mds2 facet_host: oleg651-server facet_failover_host: oleg651-server Failover mds2 to oleg651-server mount facets: mds2 Starting mds2: -o localrecov /dev/mapper/mds2_flakey /mnt/lustre-mds2 oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1 Started lustre-MDT0001 07:28:55 (1743506935) targets are mounted 07:28:55 (1743506935) facet_failover done pdsh@oleg651-client: oleg651-client: ssh exited with exit code 95 oleg651-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0001-mdc-*.mds_server_uuid pdsh@oleg651-client: oleg651-client: ssh exited with exit code 95 Starting client: oleg651-client.virtnet: -o user_xattr,flock 192.168.206.151@tcp:/lustre /mnt/lustre Can't lstat /mnt/lustre/d111b.replay-single/striped_dir: No such file or directory PASS 111b (114s) == replay-single test 111c: DNE: unlink striped dir, uncommit on MDT1, fail client/MDT1/MDT2 ========================================================== 07:30:17 (1743507017) UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 2900 1284788 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 2444 1285244 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 18168 3588852 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1772 3605248 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 19940 7194100 1% /mnt/lustre df:/mnt/lustre Not a Lustre 
filesystem Failing mds1 on oleg651-server Stopping /mnt/lustre-mds1 (opts:) on oleg651-server Failing mds2 on oleg651-server Stopping /mnt/lustre-mds2 (opts:) on oleg651-server 07:30:40 (1743507040) shut down facet: mds1 facet_host: oleg651-server facet_failover_host: oleg651-server facet: mds2 facet_host: oleg651-server facet_failover_host: oleg651-server Failover mds1 to oleg651-server mount facets: mds1 Failover mds2 to oleg651-server mount facets: mds2 Starting mds2: -o localrecov /dev/mapper/mds2_flakey /mnt/lustre-mds2 Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1 pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1 Started lustre-MDT0001 Started lustre-MDT0000 07:31:24 (1743507084) targets are mounted 07:31:24 (1743507084) facet_failover done pdsh@oleg651-client: oleg651-client: ssh exited with exit code 95 oleg651-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid,mdc.lustre-MDT0001-mdc-*.mds_server_uuid pdsh@oleg651-client: oleg651-client: ssh exited with exit code 95 Starting client: oleg651-client.virtnet: -o user_xattr,flock 192.168.206.151@tcp:/lustre /mnt/lustre Can't lstat /mnt/lustre/d111c.replay-single/striped_dir: No such file or directory PASS 111c (146s) == replay-single test 111d: DNE: unlink striped dir, uncommit on MDT2, fail client/MDT1/MDT2 ========================================================== 07:32:43 (1743507163) UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 2908 1284780 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 2444 1285244 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 18168 
3588852 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1772 3605248 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 19940 7194100 1% /mnt/lustre df:/mnt/lustre Not a Lustre filesystem Failing mds1 on oleg651-server Stopping /mnt/lustre-mds1 (opts:) on oleg651-server Failing mds2 on oleg651-server Stopping /mnt/lustre-mds2 (opts:) on oleg651-server 07:33:01 (1743507181) shut down facet: mds1 facet_host: oleg651-server facet_failover_host: oleg651-server facet: mds2 facet_host: oleg651-server facet_failover_host: oleg651-server Failover mds1 to oleg651-server mount facets: mds1 Failover mds2 to oleg651-server mount facets: mds2 Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 Starting mds2: -o localrecov /dev/mapper/mds2_flakey /mnt/lustre-mds2 oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1 pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1 Started lustre-MDT0001 Started lustre-MDT0000 07:33:29 (1743507209) targets are mounted 07:33:30 (1743507210) facet_failover done pdsh@oleg651-client: oleg651-client: ssh exited with exit code 95 oleg651-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid,mdc.lustre-MDT0001-mdc-*.mds_server_uuid pdsh@oleg651-client: oleg651-client: ssh exited with exit code 95 Starting client: oleg651-client.virtnet: -o user_xattr,flock 192.168.206.151@tcp:/lustre /mnt/lustre Can't lstat /mnt/lustre/d111d.replay-single/striped_dir: No such file or directory PASS 111d (124s) == replay-single test 111e: DNE: unlink striped dir, uncommit on MDT2, fail MDT1/MDT2 ========================================================== 07:34:47 (1743507287) UUID 1K-blocks Used Available Use% 
Mounted on lustre-MDT0000_UUID 1414116 2952 1284736 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 2480 1285208 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 18168 3588852 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1772 3605248 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 19940 7194100 1% /mnt/lustre UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 2944 1284744 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 2472 1285216 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 18168 3588852 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1772 3605248 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 19940 7194100 1% /mnt/lustre Failing mds1 on oleg651-server Stopping /mnt/lustre-mds1 (opts:) on oleg651-server Failing mds2 on oleg651-server Stopping /mnt/lustre-mds2 (opts:) on oleg651-server 07:35:04 (1743507304) shut down facet: mds1 facet_host: oleg651-server facet_failover_host: oleg651-server facet: mds2 facet_host: oleg651-server facet_failover_host: oleg651-server Failover mds1 to oleg651-server mount facets: mds1 Failover mds2 to oleg651-server mount facets: mds2 Starting mds2: -o localrecov /dev/mapper/mds2_flakey /mnt/lustre-mds2 Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1 pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1 Started lustre-MDT0001 Started lustre-MDT0000 07:35:33 (1743507333) targets are mounted 07:35:33 (1743507333) facet_failover done oleg651-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid,mdc.lustre-MDT0001-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL 
state after 0 sec mdc.lustre-MDT0001-mdc-*.mds_server_uuid in FULL state after 0 sec Can't lstat /mnt/lustre/d111e.replay-single/striped_dir: No such file or directory PASS 111e (70s) == replay-single test 111f: DNE: unlink striped dir, uncommit on MDT1, fail MDT1/MDT2 ========================================================== 07:35:57 (1743507357) UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 2992 1284696 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 2516 1285172 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 18168 3588852 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1772 3605248 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 19940 7194100 1% /mnt/lustre UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 2908 1284780 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 2436 1285252 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 18168 3588852 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1772 3605248 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 19940 7194100 1% /mnt/lustre Failing mds1 on oleg651-server Stopping /mnt/lustre-mds1 (opts:) on oleg651-server Failing mds2 on oleg651-server Stopping /mnt/lustre-mds2 (opts:) on oleg651-server 07:36:15 (1743507375) shut down facet: mds1 facet_host: oleg651-server facet_failover_host: oleg651-server facet: mds2 facet_host: oleg651-server facet_failover_host: oleg651-server Failover mds1 to oleg651-server mount facets: mds1 Failover mds2 to oleg651-server mount facets: mds2 Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 Starting mds2: -o localrecov /dev/mapper/mds2_flakey /mnt/lustre-mds2 oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1 
pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1 Started lustre-MDT0000 Started lustre-MDT0001 07:36:44 (1743507404) targets are mounted 07:36:44 (1743507404) facet_failover done oleg651-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid,mdc.lustre-MDT0001-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec mdc.lustre-MDT0001-mdc-*.mds_server_uuid in FULL state after 0 sec Can't lstat /mnt/lustre/d111f.replay-single/striped_dir: No such file or directory PASS 111f (66s) == replay-single test 111g: DNE: unlink striped dir, fail MDT1/MDT2 ========================================================== 07:37:03 (1743507423) UUID Inodes IUsed IFree IUse% Mounted on lustre-MDT0000_UUID 1024000 553 1023447 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1024000 320 1023680 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 262144 557 261587 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 262144 487 261657 1% /mnt/lustre[OST:1] filesystem_summary: 524117 873 523244 1% /mnt/lustre UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 2920 1284768 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 2444 1285244 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 18168 3588852 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1772 3605248 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 19940 7194100 1% /mnt/lustre UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 2920 1284768 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 2444 1285244 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 18168 3588852 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1772 3605248 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 19940 7194100 1% /mnt/lustre Failing mds1 on oleg651-server Stopping /mnt/lustre-mds1 (opts:) on oleg651-server Failing mds2 on oleg651-server Stopping /mnt/lustre-mds2 (opts:) on oleg651-server 07:37:19 (1743507439) shut down facet: mds1 
facet_host: oleg651-server facet_failover_host: oleg651-server facet: mds2 facet_host: oleg651-server facet_failover_host: oleg651-server Failover mds1 to oleg651-server mount facets: mds1 Failover mds2 to oleg651-server mount facets: mds2 Starting mds2: -o localrecov /dev/mapper/mds2_flakey /mnt/lustre-mds2 Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1 pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1 Started lustre-MDT0001 Started lustre-MDT0000 07:37:49 (1743507469) targets are mounted 07:37:49 (1743507469) facet_failover done oleg651-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid,mdc.lustre-MDT0001-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec mdc.lustre-MDT0001-mdc-*.mds_server_uuid in FULL state after 0 sec Can't lstat /mnt/lustre/d111g.replay-single/striped_dir: No such file or directory PASS 111g (70s) == replay-single test 112a: DNE: cross MDT rename, fail MDT1 ========================================================== 07:38:13 (1743507493) SKIP: replay-single test_112a needs >= 4 MDTs SKIP 112a (3s) == replay-single test 112b: DNE: cross MDT rename, fail MDT2 ========================================================== 07:38:16 (1743507496) SKIP: replay-single test_112b needs >= 4 MDTs SKIP 112b (3s) == replay-single test 112c: DNE: cross MDT rename, fail MDT3 ========================================================== 07:38:20 (1743507500) SKIP: replay-single test_112c needs >= 4 MDTs SKIP 112c (3s) == replay-single test 112d: DNE: cross MDT rename, fail MDT4 
========================================================== 07:38:23 (1743507503) SKIP: replay-single test_112d needs >= 4 MDTs SKIP 112d (4s) == replay-single test 112e: DNE: cross MDT rename, fail MDT1 and MDT2 ========================================================== 07:38:27 (1743507507) SKIP: replay-single test_112e needs >= 4 MDTs SKIP 112e (4s) == replay-single test 112f: DNE: cross MDT rename, fail MDT1 and MDT3 ========================================================== 07:38:31 (1743507511) SKIP: replay-single test_112f needs >= 4 MDTs SKIP 112f (4s) == replay-single test 112g: DNE: cross MDT rename, fail MDT1 and MDT4 ========================================================== 07:38:35 (1743507515) SKIP: replay-single test_112g needs >= 4 MDTs SKIP 112g (5s) == replay-single test 112h: DNE: cross MDT rename, fail MDT2 and MDT3 ========================================================== 07:38:40 (1743507520) SKIP: replay-single test_112h needs >= 4 MDTs SKIP 112h (4s) == replay-single test 112i: DNE: cross MDT rename, fail MDT2 and MDT4 ========================================================== 07:38:45 (1743507525) SKIP: replay-single test_112i needs >= 4 MDTs SKIP 112i (4s) == replay-single test 112j: DNE: cross MDT rename, fail MDT3 and MDT4 ========================================================== 07:38:49 (1743507529) SKIP: replay-single test_112j needs >= 4 MDTs SKIP 112j (4s) == replay-single test 112k: DNE: cross MDT rename, fail MDT1,MDT2,MDT3 ========================================================== 07:38:53 (1743507533) SKIP: replay-single test_112k needs >= 4 MDTs SKIP 112k (5s) == replay-single test 112l: DNE: cross MDT rename, fail MDT1,MDT2,MDT4 ========================================================== 07:38:58 (1743507538) SKIP: replay-single test_112l needs >= 4 MDTs SKIP 112l (4s) == replay-single test 112m: DNE: cross MDT rename, fail MDT1,MDT3,MDT4 ========================================================== 07:39:03 (1743507543) SKIP: 
replay-single test_112m needs >= 4 MDTs SKIP 112m (4s) == replay-single test 112n: DNE: cross MDT rename, fail MDT2,MDT3,MDT4 ========================================================== 07:39:07 (1743507547) SKIP: replay-single test_112n needs >= 4 MDTs SKIP 112n (4s) == replay-single test 115: failover for create/unlink striped directory ========================================================== 07:39:11 (1743507551) UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 2916 1284772 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 2432 1285256 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 18168 3588852 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1772 3605248 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 19940 7194100 1% /mnt/lustre striped dir -i1 -c2 -H fnv_1a_64 /mnt/lustre/d115.replay-single/test_0 striped dir -i1 -c2 -H fnv_1a_64 /mnt/lustre/d115.replay-single/test_1 striped dir -i1 -c2 -H all_char /mnt/lustre/d115.replay-single/test_2 striped dir -i1 -c2 -H crush2 /mnt/lustre/d115.replay-single/test_3 striped dir -i1 -c2 -H all_char /mnt/lustre/d115.replay-single/test_4 Failing mds2 on oleg651-server Stopping /mnt/lustre-mds2 (opts:) on oleg651-server 07:39:21 (1743507561) shut down facet: mds2 facet_host: oleg651-server facet_failover_host: oleg651-server Failover mds2 to oleg651-server mount facets: mds2 Starting mds2: -o localrecov /dev/mapper/mds2_flakey /mnt/lustre-mds2 oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1 Started lustre-MDT0001 07:39:41 (1743507581) targets are mounted 07:39:41 (1743507581) facet_failover done oleg651-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0001-mdc-*.mds_server_uuid mdc.lustre-MDT0001-mdc-*.mds_server_uuid in FULL state after 0 sec UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 
2948 1284740 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 2540 1285148 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 18168 3588852 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1772 3605248 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 19940 7194100 1% /mnt/lustre striped dir -i0 -c2 -H all_char /mnt/lustre/d115.replay-single/test_0 striped dir -i0 -c2 -H crush2 /mnt/lustre/d115.replay-single/test_1 striped dir -i0 -c2 -H fnv_1a_64 /mnt/lustre/d115.replay-single/test_2 striped dir -i0 -c2 -H fnv_1a_64 /mnt/lustre/d115.replay-single/test_3 striped dir -i0 -c2 -H crush /mnt/lustre/d115.replay-single/test_4 Failing mds1 on oleg651-server Stopping /mnt/lustre-mds1 (opts:) on oleg651-server 07:40:00 (1743507600) shut down facet: mds1 facet_host: oleg651-server facet_failover_host: oleg651-server Failover mds1 to oleg651-server mount facets: mds1 Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1 Started lustre-MDT0000 07:40:19 (1743507619) targets are mounted 07:40:19 (1743507619) facet_failover done oleg651-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec PASS 115 (83s) == replay-single test 116a: large update log master MDT recovery ========================================================== 07:40:35 (1743507635) UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 3008 1284680 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 2568 1285120 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 18168 3588852 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1772 3605248 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 19940 7194100 1% /mnt/lustre fail_loc=0x80001702 Failing mds1 on oleg651-server 
Stopping /mnt/lustre-mds1 (opts:) on oleg651-server 07:40:44 (1743507644) shut down facet: mds1 facet_host: oleg651-server facet_failover_host: oleg651-server Failover mds1 to oleg651-server mount facets: mds1 Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1 Started lustre-MDT0000 07:41:04 (1743507664) targets are mounted 07:41:04 (1743507664) facet_failover done oleg651-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec /mnt/lustre/d116a.replay-single/striped_dir has type dir OK PASS 116a (44s) == replay-single test 116b: large update log slave MDT recovery ========================================================== 07:41:20 (1743507680) UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 3012 1284676 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 2524 1285164 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 18168 3588852 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1772 3605248 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 19940 7194100 1% /mnt/lustre fail_loc=0x80001702 Failing mds2 on oleg651-server Stopping /mnt/lustre-mds2 (opts:) on oleg651-server 07:41:29 (1743507689) shut down facet: mds2 facet_host: oleg651-server facet_failover_host: oleg651-server Failover mds2 to oleg651-server mount facets: mds2 Starting mds2: -o localrecov /dev/mapper/mds2_flakey /mnt/lustre-mds2 oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1 Started lustre-MDT0001 07:41:49 (1743507709) targets are mounted 07:41:49 (1743507709) facet_failover done 
oleg651-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0001-mdc-*.mds_server_uuid mdc.lustre-MDT0001-mdc-*.mds_server_uuid in FULL state after 0 sec /mnt/lustre/d116b.replay-single/striped_dir has type dir OK PASS 116b (46s) == replay-single test 117: DNE: cross MDT unlink, fail MDT1 and MDT2 ========================================================== 07:42:06 (1743507726) SKIP: replay-single test_117 needs >= 4 MDTs SKIP 117 (4s) == replay-single test 118: invalidate osp update will not cause update log corruption ========================================================== 07:42:10 (1743507730) fail_loc=0x1705 lfs setdirstripe: dirstripe error on '/mnt/lustre/d118.replay-single/striped_dir': Input/output error lfs setdirstripe: cannot create dir '/mnt/lustre/d118.replay-single/striped_dir': Input/output error lfs setdirstripe: dirstripe error on '/mnt/lustre/d118.replay-single/striped_dir1': Input/output error lfs setdirstripe: cannot create dir '/mnt/lustre/d118.replay-single/striped_dir1': Input/output error fail_loc=0x0 UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 3156 1284532 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 2620 1285068 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 18168 3588852 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1772 3605248 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 19940 7194100 1% /mnt/lustre Failing mds1 on oleg651-server Stopping /mnt/lustre-mds1 (opts:) on oleg651-server 07:42:21 (1743507741) shut down facet: mds1 facet_host: oleg651-server facet_failover_host: oleg651-server Failover mds1 to oleg651-server mount facets: mds1 Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1 Started lustre-MDT0000 07:42:51 (1743507771) targets are mounted 
07:42:51 (1743507771) facet_failover done oleg651-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec PASS 118 (55s) == replay-single test 119: timeout of normal replay does not cause DNE replay fails ========================================================== 07:43:05 (1743507785) UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 3072 1284616 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 2544 1285144 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 18168 3588852 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1772 3605248 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 19940 7194100 1% /mnt/lustre Stopping /mnt/lustre-mds1 (opts:) on oleg651-server Failover mds1 to oleg651-server oleg651-server.virtnet fail_loc=0x80000714 fail_val=65 Starting mds1: -o localrecov -o recovery_time_hard=60 /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1 Started lustre-MDT0000 oleg651-client.virtnet: executing wait_import_state_mount FULL mdc.lustre-MDT0000-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 62 sec mdt.lustre-MDT0000.recovery_time_hard=180 PASS 119 (103s) == replay-single test 120: DNE fail abort should stop both normal and DNE replay ========================================================== 07:44:48 (1743507888) Replay barrier on lustre-MDT0000 Stopping /mnt/lustre-mds1 (opts:) on oleg651-server Failover mds1 to oleg651-server oleg651-server.virtnet Starting mds1: -o localrecov -o abort_recovery /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg651-client: 
oleg651-server: ssh exited with exit code 1 Started lustre-MDT0000 find: '/mnt/lustre/d120.replay-single': No such file or directory find: '/mnt/lustre/d120.replay-single': No such file or directory PASS 120 (44s) == replay-single test 121: lock replay timed out and race ========================================================== 07:45:32 (1743507932) multiop /mnt/lustre/f121.replay-single vs_s TMPPIPE=/tmp/multiop_open_wait_pipe.7954 Stopping /mnt/lustre-mds1 (opts:) on oleg651-server Failover mds1 to oleg651-server oleg651-server.virtnet fail_loc=0x721 fail_val=0 at_max=0 Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1 Started lustre-MDT0000 pdsh@oleg651-client: oleg651-client: ssh exited with exit code 5 fail_loc=0x0 at_max=600 PASS 121 (214s) == replay-single test 130a: DoM file create (setstripe) replay ========================================================== 07:49:06 (1743508146) UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 3428 1284260 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 2848 1284840 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 18168 3588852 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1772 3605248 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 19940 7194100 1% /mnt/lustre Failing mds1 on oleg651-server Stopping /mnt/lustre-mds1 (opts:) on oleg651-server 07:49:15 (1743508155) shut down facet: mds1 facet_host: oleg651-server facet_failover_host: oleg651-server Failover mds1 to oleg651-server mount facets: mds1 Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1 Started 
lustre-MDT0000 07:49:36 (1743508176) targets are mounted 07:49:36 (1743508176) facet_failover done oleg651-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec PASS 130a (44s) == replay-single test 130b: DoM file create (inherited) replay ========================================================== 07:49:50 (1743508190) UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 3432 1284256 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 2848 1284840 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 18168 3588852 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1772 3605248 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 19940 7194100 1% /mnt/lustre Failing mds1 on oleg651-server Stopping /mnt/lustre-mds1 (opts:) on oleg651-server 07:49:59 (1743508199) shut down facet: mds1 facet_host: oleg651-server facet_failover_host: oleg651-server Failover mds1 to oleg651-server mount facets: mds1 Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1 Started lustre-MDT0000 07:50:19 (1743508219) targets are mounted 07:50:19 (1743508219) facet_failover done oleg651-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec PASS 130b (42s) == replay-single test 131a: DoM file write lock replay === 07:50:33 (1743508233) UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 3432 1284256 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 2848 1284840 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 18168 3588852 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1772 3605248 1% /mnt/lustre[OST:1] 
filesystem_summary: 7666232 19940 7194100 1% /mnt/lustre 1+0 records in 1+0 records out 8 bytes copied, 0.00265778 s, 3.0 kB/s Failing mds1 on oleg651-server Stopping /mnt/lustre-mds1 (opts:) on oleg651-server 07:50:41 (1743508241) shut down facet: mds1 facet_host: oleg651-server facet_failover_host: oleg651-server Failover mds1 to oleg651-server mount facets: mds1 Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1 Started lustre-MDT0000 07:51:01 (1743508261) targets are mounted 07:51:01 (1743508261) facet_failover done oleg651-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec PASS 131a (42s) SKIP: replay-single test_131b skipping excluded test 131b == replay-single test 132a: PFL new component instantiate replay ========================================================== 07:51:17 (1743508277) UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 3436 1284252 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 2848 1284840 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 18168 3588852 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1772 3605248 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 19940 7194100 1% /mnt/lustre 1+0 records in 1+0 records out 1048576 bytes (1.0 MB, 1.0 MiB) copied, 0.0371761 s, 28.2 MB/s /mnt/lustre/f132a.replay-single lcm_layout_gen: 3 lcm_mirror_count: 1 lcm_entry_count: 2 lcme_id: 1 lcme_mirror_id: 0 lcme_flags: init lcme_extent.e_start: 0 lcme_extent.e_end: 1048576 lmm_stripe_count: 1 lmm_stripe_size: 1048576 lmm_pattern: raid0 lmm_layout_gen: 0 lmm_stripe_offset: 0 lmm_objects: - 0: { l_ost_idx: 0, l_fid: [0x280000401:0x2c4a:0x0] } lcme_id: 2 lcme_mirror_id: 0 lcme_flags: init 
lcme_extent.e_start: 1048576 lcme_extent.e_end: EOF lmm_stripe_count: 2 lmm_stripe_size: 4194304 lmm_pattern: raid0 lmm_layout_gen: 0 lmm_stripe_offset: 1 lmm_objects: - 0: { l_ost_idx: 1, l_fid: [0x2c0000401:0x2c22:0x0] } - 1: { l_ost_idx: 0, l_fid: [0x280000401:0x2c4b:0x0] } Failing mds1 on oleg651-server Stopping /mnt/lustre-mds1 (opts:) on oleg651-server 07:51:25 (1743508285) shut down facet: mds1 facet_host: oleg651-server facet_failover_host: oleg651-server Failover mds1 to oleg651-server mount facets: mds1 Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1 Started lustre-MDT0000 07:51:45 (1743508305) targets are mounted 07:51:45 (1743508305) facet_failover done oleg651-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec /mnt/lustre/f132a.replay-single lcm_layout_gen: 4 lcm_mirror_count: 1 lcm_entry_count: 2 lcme_id: 1 lcme_mirror_id: 0 lcme_flags: init lcme_extent.e_start: 0 lcme_extent.e_end: 1048576 lmm_stripe_count: 1 lmm_stripe_size: 1048576 lmm_pattern: raid0 lmm_layout_gen: 65535 lmm_stripe_offset: 0 lmm_objects: - 0: { l_ost_idx: 0, l_fid: [0x280000401:0x2c4a:0x0] } lcme_id: 2 lcme_mirror_id: 0 lcme_flags: init lcme_extent.e_start: 1048576 lcme_extent.e_end: EOF lmm_stripe_count: 2 lmm_stripe_size: 4194304 lmm_pattern: raid0 lmm_layout_gen: 65535 lmm_stripe_offset: 1 lmm_objects: - 0: { l_ost_idx: 1, l_fid: [0x2c0000401:0x2c22:0x0] } - 1: { l_ost_idx: 0, l_fid: [0x280000401:0x2c4b:0x0] } PASS 132a (42s) == replay-single test 133: check resend of ongoing requests for lwp during failover ========================================================== 07:51:59 (1743508319) seq.srv-lustre-MDT0001.space=clear Starting client: 
oleg651-client.virtnet: -o user_xattr,flock 192.168.206.151@tcp:/lustre /mnt/lustre fail_val=700 fail_loc=0x80000123 Failing mds1 on oleg651-server Stopping /mnt/lustre-mds1 (opts:) on oleg651-server 07:52:09 (1743508329) shut down facet: mds1 facet_host: oleg651-server facet_failover_host: oleg651-server Failover mds1 to oleg651-server mount facets: mds1 Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1 Started lustre-MDT0000 07:52:26 (1743508346) targets are mounted 07:52:26 (1743508346) facet_failover done PASS 133 (36s) == replay-single test 134: replay creation of a file created in a pool ========================================================== 07:52:36 (1743508356) Creating new pool pool_134 oleg651-server: Pool lustre.pool_134 created Adding targets to pool oleg651-server: OST lustre-OST0001_UUID added to pool lustre.pool_134 UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 3492 1284196 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 2860 1284828 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 18168 3588852 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 2796 3604224 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 20964 7193076 1% /mnt/lustre Failing mds1 on oleg651-server Stopping /mnt/lustre-mds1 (opts:) on oleg651-server 07:52:51 (1743508371) shut down facet: mds1 facet_host: oleg651-server facet_failover_host: oleg651-server Failover mds1 to oleg651-server mount facets: mds1 Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1 Started lustre-MDT0000 07:53:11 (1743508391) targets are 
mounted 07:53:11 (1743508391) facet_failover done oleg651-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec Destroy the created pools: pool_134 lustre.pool_134 oleg651-server: OST lustre-OST0001_UUID removed from pool lustre.pool_134 oleg651-server: Pool lustre.pool_134 destroyed Waiting 90s for 'foo' PASS 134 (56s) == replay-single test 135: Server failure in lock replay phase ========================================================== 07:53:32 (1743508412) Failing ost1 on oleg651-server Stopping /mnt/lustre-ost1 (opts:) on oleg651-server 07:53:36 (1743508416) shut down facet: ost1 facet_host: oleg651-server facet_failover_host: oleg651-server Failover ost1 to oleg651-server mount facets: ost1 Starting ost1: -o localrecov /dev/mapper/ost1_flakey /mnt/lustre-ost1 seq.cli-lustre-OST0000-super.width=65536 oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1 Started lustre-OST0000 07:53:55 (1743508435) targets are mounted 07:53:55 (1743508435) facet_failover done oleg651-client.virtnet: executing wait_import_state_mount (FULL|IDLE) osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid in FULL state after 0 sec UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 3472 1284216 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 2860 1284828 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 18168 3588852 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 2796 3604224 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 20964 7193076 1% /mnt/lustre ldlm.cancel_unused_locks_before_replay=0 debug_mb=100 debug=+info +ha +dlmtrace Stopping /mnt/lustre-ost1 (opts:) on oleg651-server Failover ost1 to oleg651-server oleg651-server.virtnet oleg651-server: 
oleg651-server.virtnet: executing load_module ../libcfs/libcfs/libcfs fail_loc=0x32d fail_val=20 debug_mb=100 debug=+info +ha +dlmtrace Starting ost1: -o localrecov /dev/mapper/ost1_flakey /mnt/lustre-ost1 seq.cli-lustre-OST0000-super.width=65536 oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1 Started lustre-OST0000 oleg651-client.virtnet: executing wait_import_state_mount REPLAY_LOCKS osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid in REPLAY_LOCKS state after 0 sec Stopping /mnt/lustre-ost1 (opts:) on oleg651-server Failover ost1 to oleg651-server oleg651-server.virtnet oleg651-server: oleg651-server.virtnet: executing load_module ../libcfs/libcfs/libcfs fail_loc=0 Starting ost1: -o localrecov /dev/mapper/ost1_flakey /mnt/lustre-ost1 seq.cli-lustre-OST0000-super.width=65536 End of sync oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1 Started lustre-OST0000 Stopping /mnt/lustre-ost1 (opts:-f) on oleg651-server Stopping /mnt/lustre-ost2 (opts:-f) on oleg651-server Starting ost1: -o localrecov /dev/mapper/ost1_flakey /mnt/lustre-ost1 seq.cli-lustre-OST0000-super.width=65536 oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1 Started lustre-OST0000 Starting ost2: -o localrecov /dev/mapper/ost2_flakey /mnt/lustre-ost2 seq.cli-lustre-OST0001-super.width=65536 oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1 Started 
lustre-OST0001 pdsh@oleg651-client: oleg651-client: ssh exited with exit code 5 debug_mb=21 debug=super ioctl neterror warning dlmtrace error emerg ha rpctrace vfstrace config console lfsck ldlm.cancel_unused_locks_before_replay=1 PASS 135 (139s) == replay-single test 136: MDS to disconnect all OSPs first, then cleanup ldlm ========================================================== 07:55:51 (1743508551) SKIP: replay-single test_136 needs > 2 MDTs SKIP 136 (3s) == replay-single test 137a: DNE: create under striped dir, fail MDT1 ========================================================== 07:55:54 (1743508554) llite.lustre-ffff95e5981e1000.intent_mkdir=1 UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 3480 1284208 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 2880 1284808 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 18168 3588852 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 2800 3604220 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 20968 7193072 1% /mnt/lustre Failing mds1 on oleg651-server Stopping /mnt/lustre-mds1 (opts:) on oleg651-server 07:56:03 (1743508563) shut down facet: mds1 facet_host: oleg651-server facet_failover_host: oleg651-server Failover mds1 to oleg651-server mount facets: mds1 Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1 Started lustre-MDT0000 07:56:22 (1743508582) targets are mounted 07:56:22 (1743508582) facet_failover done oleg651-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec /mnt/lustre/d137a.replay-single/striped_dir/dir0 has type dir OK lmv_stripe_count: 0 lmv_stripe_offset: 0 lmv_hash_type: none mdtidx FID[seq:oid:ver] 
/mnt/lustre/d137a.replay-single/striped_dir/dir1 has type dir OK lmv_stripe_count: 0 lmv_stripe_offset: 1 lmv_hash_type: none mdtidx FID[seq:oid:ver] PASS 137a (42s) == replay-single test 137b: DNE: create under striped dir, fail MDT2 ========================================================== 07:56:36 (1743508596) llite.lustre-ffff95e5981e1000.intent_mkdir=1 UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 3484 1284204 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 2884 1284804 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 18168 3588852 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 2800 3604220 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 20968 7193072 1% /mnt/lustre Failing mds2 on oleg651-server Stopping /mnt/lustre-mds2 (opts:) on oleg651-server 07:56:44 (1743508604) shut down facet: mds2 facet_host: oleg651-server facet_failover_host: oleg651-server Failover mds2 to oleg651-server mount facets: mds2 Starting mds2: -o localrecov /dev/mapper/mds2_flakey /mnt/lustre-mds2 oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1 Started lustre-MDT0001 07:57:03 (1743508623) targets are mounted 07:57:03 (1743508623) facet_failover done oleg651-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0001-mdc-*.mds_server_uuid mdc.lustre-MDT0001-mdc-*.mds_server_uuid in FULL state after 0 sec /mnt/lustre/d137b.replay-single/striped_dir/dir0 has type dir OK lmv_stripe_count: 0 lmv_stripe_offset: 0 lmv_hash_type: none mdtidx FID[seq:oid:ver] /mnt/lustre/d137b.replay-single/striped_dir/dir1 has type dir OK lmv_stripe_count: 0 lmv_stripe_offset: 1 lmv_hash_type: none mdtidx FID[seq:oid:ver] PASS 137b (42s) == replay-single test 137c: DNE: create under striped dir, fail MDT1/MDT2 ========================================================== 07:57:18 (1743508638) 
llite.lustre-ffff95e5981e1000.intent_mkdir=1 UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 3480 1284208 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 2868 1284820 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 18168 3588852 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 2800 3604220 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 20968 7193072 1% /mnt/lustre UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 3480 1284208 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 2868 1284820 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 18168 3588852 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 2800 3604220 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 20968 7193072 1% /mnt/lustre Failing mds2 on oleg651-server Stopping /mnt/lustre-mds2 (opts:) on oleg651-server Failing mds1 on oleg651-server Stopping /mnt/lustre-mds1 (opts:) on oleg651-server 07:57:33 (1743508653) shut down facet: mds2 facet_host: oleg651-server facet_failover_host: oleg651-server facet: mds1 facet_host: oleg651-server facet_failover_host: oleg651-server Failover mds1 to oleg651-server Failover mds2 to oleg651-server mount facets: mds2 mount facets: mds1 Starting mds2: -o localrecov /dev/mapper/mds2_flakey /mnt/lustre-mds2 Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1 pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1 Started lustre-MDT0001 Started lustre-MDT0000 07:57:57 (1743508677) targets are mounted 07:57:57 (1743508677) facet_failover done oleg651-client.virtnet: executing wait_import_state_mount (FULL|IDLE) 
mdc.lustre-MDT0000-mdc-*.mds_server_uuid,mdc.lustre-MDT0001-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec mdc.lustre-MDT0001-mdc-*.mds_server_uuid in FULL state after 0 sec /mnt/lustre/d137c.replay-single/striped_dir/dir0 has type dir OK lmv_stripe_count: 0 lmv_stripe_offset: 0 lmv_hash_type: none mdtidx FID[seq:oid:ver] /mnt/lustre/d137c.replay-single/striped_dir/dir1 has type dir OK lmv_stripe_count: 0 lmv_stripe_offset: 1 lmv_hash_type: none mdtidx FID[seq:oid:ver] PASS 137c (55s) == replay-single test 200: Dropping one OBD_PING should not cause disconnect ========================================================== 07:58:13 (1743508693) SKIP: replay-single test_200 Need remote client SKIP 200 (3s) == replay-single test 201: MDT umount cascading disconnects timeouts ========================================================== 07:58:16 (1743508696) fail_loc=0x245 fail_val=8 fail_loc=0x245 fail_val=8 Stopping /mnt/lustre-mds2 (opts:) on oleg651-server Starting mds2: -o localrecov /dev/mapper/mds2_flakey /mnt/lustre-mds2 oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1 Started lustre-MDT0001 Umount took 18 seconds PASS 201 (35s) == replay-single test 202: pfl replay should recovery layout generation ========================================================== 07:58:51 (1743508731) UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 3432 1284256 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 2820 1284868 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 18168 3588852 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 2800 3604220 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 20968 7193072 1% /mnt/lustre Failing mds1 on oleg651-server Stopping /mnt/lustre-mds1 (opts:) on oleg651-server 07:59:01 (1743508741) shut down facet: mds1 facet_host: 
oleg651-server facet_failover_host: oleg651-server Failover mds1 to oleg651-server mount facets: mds1 Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg651-server: oleg651-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all pdsh@oleg651-client: oleg651-server: ssh exited with exit code 1 Started lustre-MDT0000 07:59:20 (1743508760) targets are mounted 07:59:20 (1743508760) facet_failover done oleg651-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec PASS 202 (42s) == replay-single test complete, duration 15267 sec ======= 07:59:33 (1743508773) === replay-single: start cleanup 07:59:35 (1743508775) === === replay-single: finish cleanup 07:59:43 (1743508783) ===