-----============= acceptance-small: insanity ============----- Tue Apr 1 03:48:09 EDT 2025
mgs: Rocky Linux release 8.10 (Green Obsidian)
MGS_OS_ID_LIKE=rhel centos fedora rocky
MGS_OS_VERSION_ID=8.10
MGS_OS_ID=rocky
MGS_OS_VERSION_CODE=134873088
mds1: Rocky Linux release 8.10 (Green Obsidian)
MDS1_OS_VERSION_ID=8.10
MDS1_OS_VERSION_CODE=134873088
MDS1_OS_ID_LIKE=rhel centos fedora rocky
MDS1_OS_ID=rocky
ost1: Rocky Linux release 8.10 (Green Obsidian)
OST1_OS_VERSION_CODE=134873088
OST1_OS_ID_LIKE=rhel centos fedora rocky
OST1_OS_VERSION_ID=8.10
OST1_OS_ID=rocky
client: Rocky Linux release 8.10 (Green Obsidian)
CLIENT_OS_ID=rocky
CLIENT_OS_VERSION_CODE=134873088
CLIENT_OS_VERSION_ID=8.10
CLIENT_OS_ID_LIKE=rhel centos fedora rocky
oleg607-server: ls: cannot access '/home/green/git/lustre-release/lustre/tests/except/insanity.*ex': No such file or directory
excepting tests:
=== insanity: start setup 03:49:01 (1743493741) ===
oleg607-client.virtnet: executing check_config_client /mnt/lustre
oleg607-client.virtnet: Checking config lustre mounted on /mnt/lustre
Checking servers environments
Checking clients oleg607-client.virtnet environments
Using TIMEOUT=20
osc.lustre-OST0000-osc-ffff90f3c656c000.idle_timeout=debug
osc.lustre-OST0001-osc-ffff90f3c656c000.idle_timeout=debug
disable quota as required
oleg607-server: oleg607-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
=== insanity: finish setup 03:50:24 (1743493824) ===
== insanity test 0: Fail all nodes, independently ======== 03:50:29 (1743493829)
Failing mds1 on oleg607-server
Stopping /mnt/lustre-mds1 (opts:) on oleg607-server
03:50:44 (1743493844) shut down
facet: mds1
facet_host: oleg607-server
facet_failover_host: oleg607-server
Failover mds1 to oleg607-server
mount facets: mds1
Starting mds1: -o localrecov lustre-mdt1/mdt1 /mnt/lustre-mds1
oleg607-server: oleg607-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
pdsh@oleg607-client: oleg607-server: ssh exited with exit code 1
Started lustre-MDT0000
03:51:36 (1743493896) targets are mounted
03:51:37 (1743493897) facet_failover done
oleg607-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid
mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec
Failing ost1 on oleg607-server
Stopping /mnt/lustre-ost1 (opts:) on oleg607-server
03:52:09 (1743493929) shut down
facet: ost1
facet_host: oleg607-server
facet_failover_host: oleg607-server
Failover ost1 to oleg607-server
mount facets: ost1
Starting ost1: -o localrecov lustre-ost1/ost1 /mnt/lustre-ost1
seq.cli-lustre-OST0000-super.width=65536
oleg607-server: oleg607-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
pdsh@oleg607-client: oleg607-server: ssh exited with exit code 1
Started lustre-OST0000
03:52:53 (1743493973) targets are mounted
03:52:53 (1743493973) facet_failover done
oleg607-client.virtnet: executing wait_import_state_mount (FULL|IDLE) osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid
osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid in FULL state after 0 sec
Failing ost2 on oleg607-server
Stopping /mnt/lustre-ost2 (opts:) on oleg607-server
03:53:24 (1743494004) shut down
facet: ost2
facet_host: oleg607-server
facet_failover_host: oleg607-server
Failover ost2 to oleg607-server
mount facets: ost2
Starting ost2: -o localrecov lustre-ost2/ost2 /mnt/lustre-ost2
seq.cli-lustre-OST0001-super.width=65536
oleg607-server: oleg607-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
pdsh@oleg607-client: oleg607-server: ssh exited with exit code 1
Started lustre-OST0001
03:54:06 (1743494046) targets are mounted
03:54:06 (1743494046) facet_failover done
oleg607-client.virtnet: executing wait_import_state_mount (FULL|IDLE) osc.lustre-OST0001-osc-[-0-9a-f]*.ost_server_uuid
osc.lustre-OST0001-osc-[-0-9a-f]*.ost_server_uuid in FULL state after 0 sec
PASS 0 (253s)
== insanity test 1: MDS/MDS failure ====================== 03:54:41 (1743494081)
SKIP: insanity test_1 needs >= 2 MDTs
SKIP 1 (8s)
== insanity test 2: Second Failure Mode: MDS/OST Tue Apr 1 03:54:50 EDT 2025 ========================================================== 03:54:50 (1743494090)
Verify Lustre filesystem is up and running
Stopping /mnt/lustre-mds1 (opts:) on oleg607-server
Failover mds1 to oleg607-server
Stopping /mnt/lustre-ost1 (opts:) on oleg607-server
Reintegrating OST
oleg607-server.virtnet
Starting ost1: -o localrecov lustre-ost1/ost1 /mnt/lustre-ost1
seq.cli-lustre-OST0000-super.width=65536
oleg607-server: oleg607-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
pdsh@oleg607-client: oleg607-server: ssh exited with exit code 1
Started lustre-OST0000
oleg607-server.virtnet
Starting mds1: -o localrecov lustre-mdt1/mdt1 /mnt/lustre-mds1
oleg607-server: oleg607-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
pdsh@oleg607-client: oleg607-server: ssh exited with exit code 1
Started lustre-MDT0000
Verify reintegration
PASS 2 (245s)
== insanity test 3: Third Failure Mode: MDS/CLIENT Tue Apr 1 03:58:55 EDT 2025 ========================================================== 03:58:56 (1743494336)
Verify Lustre filesystem is up and running
Failing mds1 on oleg607-server
Stopping /mnt/lustre-mds1 (opts:) on oleg607-server
03:59:11 (1743494351) shut down
facet: mds1
facet_host: oleg607-server
facet_failover_host: oleg607-server
Failover mds1 to oleg607-server
mount facets: mds1
Starting mds1: -o localrecov lustre-mdt1/mdt1 /mnt/lustre-mds1
oleg607-server: oleg607-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
pdsh@oleg607-client: oleg607-server: ssh exited with exit code 1
Started lustre-MDT0000
04:00:01 (1743494401) targets are mounted
04:00:01 (1743494401) facet_failover done
oleg607-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid
mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec
Test Lustre stability after MDS failover
Failing 2 CLIENTS
Request fail clients: 2, to fail: 0, failed: 0
No clients failed!
Test Lustre stability after CLIENT failure
Reintegrating CLIENTS
PASS 3 (121s)
== insanity test 4: Fourth Failure Mode: OST/MDS Tue Apr 1 04:00:57 EDT 2025 ========================================================== 04:00:57 (1743494457)
Fourth Failure Mode: OST/MDS Tue Apr 1 04:01:01 EDT 2025
Stopping /mnt/lustre-ost1 (opts:) on oleg607-server
Test Lustre stability after OST failure
Stopping /mnt/lustre-mds1 (opts:) on oleg607-server
Failover mds1 to oleg607-server
Reintegrating OST
oleg607-server.virtnet
Starting ost1: -o localrecov lustre-ost1/ost1 /mnt/lustre-ost1
seq.cli-lustre-OST0000-super.width=65536
oleg607-server: oleg607-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
pdsh@oleg607-client: oleg607-server: ssh exited with exit code 1
Started lustre-OST0000
oleg607-server.virtnet
Starting mds1: -o localrecov lustre-mdt1/mdt1 /mnt/lustre-mds1
oleg607-server: oleg607-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
pdsh@oleg607-client: oleg607-server: ssh exited with exit code 1
Started lustre-MDT0000
Test Lustre stability after MDS failover
PASS 4 (234s)
== insanity test 5: Fifth Failure Mode: OST/OST Tue Apr 1 04:04:51 EDT 2025 ========================================================== 04:04:51 (1743494691)
Fifth Failure Mode: OST/OST Tue Apr 1 04:04:56 EDT 2025
Verify Lustre filesystem is up and running
Stopping /mnt/lustre-ost1 (opts:) on oleg607-server
Test Lustre stability after OST failure
Stopping /mnt/lustre-ost2 (opts:) on oleg607-server
Test Lustre stability after OST failure
Reintegrating OSTs
oleg607-server.virtnet
Starting ost1: -o localrecov lustre-ost1/ost1 /mnt/lustre-ost1
seq.cli-lustre-OST0000-super.width=65536
oleg607-server: oleg607-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
pdsh@oleg607-client: oleg607-server: ssh exited with exit code 1
Started lustre-OST0000
oleg607-server.virtnet
Starting ost2: -o localrecov lustre-ost2/ost2 /mnt/lustre-ost2
seq.cli-lustre-OST0001-super.width=65536
oleg607-server: oleg607-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
pdsh@oleg607-client: oleg607-server: ssh exited with exit code 1
Started lustre-OST0001
PASS 5 (173s)
== insanity test 6: Sixth Failure Mode: OST/CLIENT Tue Apr 1 04:07:44 EDT 2025 ========================================================== 04:07:44 (1743494864)
Sixth Failure Mode: OST/CLIENT Tue Apr 1 04:07:48 EDT 2025
Verify Lustre filesystem is up and running
Stopping /mnt/lustre-ost1 (opts:) on oleg607-server
Test Lustre stability after OST failure
DFPIDA=18829
Failing CLIENTs
Request fail clients: , to fail: 0, failed: 0
No clients failed!
Test Lustre stability after CLIENTs failure
DFPIDB=19147
Reintegrating OST/CLIENTs
oleg607-server.virtnet
Starting ost1: -o localrecov lustre-ost1/ost1 /mnt/lustre-ost1
seq.cli-lustre-OST0000-super.width=65536
oleg607-server: oleg607-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
pdsh@oleg607-client: oleg607-server: ssh exited with exit code 1
Started lustre-OST0000
Verifying mount
PASS 6 (106s)
== insanity test 7: Seventh Failure Mode: CLIENT/MDS Tue Apr 1 04:09:30 EDT 2025 ========================================================== 04:09:30 (1743494970)
Seventh Failure Mode: CLIENT/MDS Tue Apr 1 04:09:34 EDT 2025
Verify Lustre filesystem is up and running
Part 1: Failing CLIENT
Request fail clients: 2, to fail: 0, failed: 0
No clients failed!
Test Lustre stability after CLIENTs failure
oleg607-client: total 1
oleg607-client: -rw-r--r-- 1 root root 0 Apr 1 04:09 oleg607-client.virtnet_testfile
Wait 1 minutes
Verify Lustre filesystem is up and running
oleg607-client: rm: cannot remove '/mnt/lustre/d0.insanity/oleg607-client.virtnet_testfile': No such file or directory
pdsh@oleg607-client: oleg607-client: ssh exited with exit code 1
Failing mds1 on oleg607-server
Stopping /mnt/lustre-mds1 (opts:) on oleg607-server
04:11:09 (1743495069) shut down
facet: mds1
facet_host: oleg607-server
facet_failover_host: oleg607-server
Failover mds1 to oleg607-server
mount facets: mds1
Starting mds1: -o localrecov lustre-mdt1/mdt1 /mnt/lustre-mds1
oleg607-server: oleg607-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
pdsh@oleg607-client: oleg607-server: ssh exited with exit code 1
Started lustre-MDT0000
04:11:47 (1743495107) targets are mounted
04:11:47 (1743495107) facet_failover done
oleg607-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid
mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec
oleg607-client: total 0
Reintegrating CLIENTs
wait 1 minutes
PASS 7 (238s)
== insanity test 8: Eighth Failure Mode: CLIENT/OST Tue Apr 1 04:13:28 EDT 2025 ========================================================== 04:13:28 (1743495208)
Eighth Failure Mode: CLIENT/OST Tue Apr 1 04:13:31 EDT 2025
Verify Lustre filesystem is up and running
Failing CLIENTs
Request fail clients: 2, to fail: 0, failed: 0
No clients failed!
Test Lustre stability after CLIENTs failure
oleg607-client: total 1
oleg607-client: -rw-r--r-- 1 root root 0 Apr 1 04:13 oleg607-client.virtnet_testfile
Wait 1 minutes
Verify Lustre filesystem is up and running
Stopping /mnt/lustre-ost1 (opts:) on oleg607-server
Test Lustre stability after OST failure
Reintegrating CLIENTs/OST
oleg607-server.virtnet
Starting ost1: -o localrecov lustre-ost1/ost1 /mnt/lustre-ost1
seq.cli-lustre-OST0000-super.width=65536
oleg607-server: oleg607-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
pdsh@oleg607-client: oleg607-server: ssh exited with exit code 1
Started lustre-OST0000
Wait 1 minutes
PASS 8 (227s)
== insanity test 9: Ninth Failure Mode: CLIENT/CLIENT Tue Apr 1 04:17:15 EDT 2025 ========================================================== 04:17:16 (1743495436)
Verify Lustre filesystem is up and running
Failing CLIENTs
Request fail clients: 2, to fail: 0, failed: 0
No clients failed!
Test Lustre stability after CLIENTs failure
oleg607-client: total 1
oleg607-client: -rw-r--r-- 1 root root 0 Apr 1 04:17 oleg607-client.virtnet_testfile
oleg607-client: -rw-r--r-- 1 root root 0 Apr 1 04:16 oleg607-client.virtnet_testfile2
Wait 1 minutes
Verify Lustre filesystem is up and running
Failing CLIENTs
Request fail clients: 2, to fail: 0, failed: 0
No clients failed!
Test Lustre stability after CLIENTs failure
oleg607-client: total 1
oleg607-client: -rw-r--r-- 1 root root 0 Apr 1 04:18 oleg607-client.virtnet_testfile
oleg607-client: -rw-r--r-- 1 root root 0 Apr 1 04:16 oleg607-client.virtnet_testfile2
Reintegrating CLIENTs/CLIENTs
Wait 1 minutes
PASS 9 (181s)
== insanity test 10: Tenth Failure Mode: MDT0/OST/MDT1 Tue Apr 1 04:20:16 EDT 2025 ========================================================== 04:20:16 (1743495616)
SKIP: insanity test_10 needs >= 2 MDTs
SKIP 10 (8s)
== insanity test 11: Eleventh Failure Mode: MDS0/CLIENT/MDS1 Tue Apr 1 04:20:24 EDT 2025 ========================================================== 04:20:25 (1743495625)
SKIP: insanity test_11 needs >= 2 MDTs
SKIP 11 (9s)
== insanity test 12: Twelve Failure Mode: MDS0,MDS1/OST0, OST1/CLIENTS Tue Apr 1 04:20:33 EDT 2025 ========================================================== 04:20:33 (1743495633)
SKIP: insanity test_12 needs >= 2 MDTs
SKIP 12 (8s)
== insanity test 13: Thirteen Failure Mode: MDS0,MDS1/CLIENTS/OST0,OST1 Tue Apr 1 04:20:41 EDT 2025 ========================================================== 04:20:41 (1743495641)
SKIP: insanity test_13 needs >= 2 MDTs
SKIP 13 (8s)
== insanity test 14: Fourteen Failure Mode: OST0,OST1/CLIENTS/MDS0,MDS1 Tue Apr 1 04:20:49 EDT 2025 ========================================================== 04:20:49 (1743495649)
SKIP: insanity test_14 needs >= 2 MDTs
SKIP 14 (8s)
== insanity test complete, duration 1964 sec ============= 04:20:57 (1743495657)
=== insanity: start cleanup 04:21:01 (1743495661) ===
=== insanity: finish cleanup 04:21:09 (1743495669) ===