-----============= acceptance-small: insanity ============----- Tue Apr 1 03:48:49 EDT 2025
mgs: Rocky Linux release 8.10 (Green Obsidian)
MGS_OS_ID_LIKE=rhel centos fedora rocky
MGS_OS_VERSION_ID=8.10
MGS_OS_ID=rocky
MGS_OS_VERSION_CODE=134873088
mds1: Rocky Linux release 8.10 (Green Obsidian)
MDS1_OS_VERSION_ID=8.10
MDS1_OS_VERSION_CODE=134873088
MDS1_OS_ID_LIKE=rhel centos fedora rocky
MDS1_OS_ID=rocky
ost1: Rocky Linux release 8.10 (Green Obsidian)
OST1_OS_VERSION_CODE=134873088
OST1_OS_ID_LIKE=rhel centos fedora rocky
OST1_OS_VERSION_ID=8.10
OST1_OS_ID=rocky
client: Rocky Linux release 8.10 (Green Obsidian)
CLIENT_OS_ID=rocky
CLIENT_OS_VERSION_CODE=134873088
CLIENT_OS_VERSION_ID=8.10
CLIENT_OS_ID_LIKE=rhel centos fedora rocky
oleg625-server: ls: cannot access '/home/green/git/lustre-release/lustre/tests/except/insanity.*ex': No such file or directory
excepting tests:
=== insanity: start setup 03:49:34 (1743493774) ===
oleg625-client.virtnet: executing check_config_client /mnt/lustre
oleg625-client.virtnet: Checking config lustre mounted on /mnt/lustre
Checking servers environments
Checking clients oleg625-client.virtnet environments
Using TIMEOUT=20
osc.lustre-OST0000-osc-ffff9d1246791000.idle_timeout=debug
osc.lustre-OST0001-osc-ffff9d1246791000.idle_timeout=debug
disable quota as required
oleg625-server: oleg625-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
osd-ldiskfs.track_declares_assert=1
=== insanity: finish setup 03:50:55 (1743493855) ===

== insanity test 0: Fail all nodes, independently ======== 03:50:59 (1743493859)
Failing mds1 on oleg625-server
Stopping /mnt/lustre-mds1 (opts:) on oleg625-server
03:51:10 (1743493870) shut down
facet: mds1
facet_host: oleg625-server
facet_failover_host: oleg625-server
Failover mds1 to oleg625-server
mount facets: mds1
Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
oleg625-server: oleg625-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
pdsh@oleg625-client: oleg625-server: ssh exited with exit code 1
Started lustre-MDT0000
03:52:00 (1743493920) targets are mounted
03:52:00 (1743493920) facet_failover done
oleg625-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid
mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec
Failing mds2 on oleg625-server
Stopping /mnt/lustre-mds2 (opts:) on oleg625-server
03:52:25 (1743493945) shut down
facet: mds2
facet_host: oleg625-server
facet_failover_host: oleg625-server
Failover mds2 to oleg625-server
mount facets: mds2
Starting mds2: -o localrecov /dev/mapper/mds2_flakey /mnt/lustre-mds2
oleg625-server: oleg625-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
pdsh@oleg625-client: oleg625-server: ssh exited with exit code 1
Started lustre-MDT0001
03:53:00 (1743493980) targets are mounted
03:53:00 (1743493980) facet_failover done
oleg625-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0001-mdc-*.mds_server_uuid
mdc.lustre-MDT0001-mdc-*.mds_server_uuid in FULL state after 0 sec
Failing ost1 on oleg625-server
Stopping /mnt/lustre-ost1 (opts:) on oleg625-server
03:53:25 (1743494005) shut down
facet: ost1
facet_host: oleg625-server
facet_failover_host: oleg625-server
Failover ost1 to oleg625-server
mount facets: ost1
Starting ost1: -o localrecov /dev/mapper/ost1_flakey /mnt/lustre-ost1
seq.cli-lustre-OST0000-super.width=65536
oleg625-server: oleg625-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
pdsh@oleg625-client: oleg625-server: ssh exited with exit code 1
Started lustre-OST0000
03:53:59 (1743494039) targets are mounted
03:53:59 (1743494039) facet_failover done
oleg625-client.virtnet: executing wait_import_state_mount (FULL|IDLE) osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid
osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid in FULL state after 0 sec
Failing ost2 on oleg625-server
Stopping /mnt/lustre-ost2 (opts:) on oleg625-server
03:54:25 (1743494065) shut down
facet: ost2
facet_host: oleg625-server
facet_failover_host: oleg625-server
Failover ost2 to oleg625-server
mount facets: ost2
Starting ost2: -o localrecov /dev/mapper/ost2_flakey /mnt/lustre-ost2
seq.cli-lustre-OST0001-super.width=65536
oleg625-server: oleg625-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
pdsh@oleg625-client: oleg625-server: ssh exited with exit code 1
Started lustre-OST0001
03:55:04 (1743494104) targets are mounted
03:55:04 (1743494104) facet_failover done
oleg625-client.virtnet: executing wait_import_state_mount (FULL|IDLE) osc.lustre-OST0001-osc-[-0-9a-f]*.ost_server_uuid
osc.lustre-OST0001-osc-[-0-9a-f]*.ost_server_uuid in FULL state after 0 sec
PASS 0 (274s)

== insanity test 1: MDS/MDS failure ====================== 03:55:33 (1743494133)
Stopping /mnt/lustre-mds1 (opts:) on oleg625-server
Failover mds1 to oleg625-server
Stopping /mnt/lustre-mds2 (opts:) on oleg625-server
Reintegrating MDS2
oleg625-server.virtnet
Starting mds2: -o localrecov /dev/mapper/mds2_flakey /mnt/lustre-mds2
oleg625-server: oleg625-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
pdsh@oleg625-client: oleg625-server: ssh exited with exit code 1
Started lustre-MDT0001
oleg625-server.virtnet
Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
oleg625-server: oleg625-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
pdsh@oleg625-client: oleg625-server: ssh exited with exit code 1
Started lustre-MDT0000
Verify reintegration
PASS 1 (251s)

== insanity test 2: Second Failure Mode: MDS/OST Tue Apr 1 03:59:44 EDT 2025 ========================================================== 03:59:44 (1743494384)
Verify Lustre filesystem is up and running
Stopping /mnt/lustre-mds1 (opts:) on oleg625-server
Failover mds1 to oleg625-server
Stopping /mnt/lustre-mds2 (opts:) on oleg625-server
Failover mds2 to oleg625-server
Stopping /mnt/lustre-ost1 (opts:) on oleg625-server
Reintegrating OST
oleg625-server.virtnet
Starting ost1: -o localrecov /dev/mapper/ost1_flakey /mnt/lustre-ost1
seq.cli-lustre-OST0000-super.width=65536
oleg625-server: oleg625-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
pdsh@oleg625-client: oleg625-server: ssh exited with exit code 1
Started lustre-OST0000
oleg625-server.virtnet
Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
oleg625-server: oleg625-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
pdsh@oleg625-client: oleg625-server: ssh exited with exit code 1
Started lustre-MDT0000
oleg625-server.virtnet
Starting mds2: -o localrecov /dev/mapper/mds2_flakey /mnt/lustre-mds2
oleg625-server: oleg625-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
pdsh@oleg625-client: oleg625-server: ssh exited with exit code 1
Started lustre-MDT0001
Verify reintegration
PASS 2 (247s)

== insanity test 3: Third Failure Mode: MDS/CLIENT Tue Apr 1 04:03:51 EDT 2025 ========================================================== 04:03:51 (1743494631)
Verify Lustre filesystem is up and running
Failing mds1 on oleg625-server
Stopping /mnt/lustre-mds1 (opts:) on oleg625-server
04:04:02 (1743494642) shut down
facet: mds1
facet_host: oleg625-server
facet_failover_host: oleg625-server
Failover mds1 to oleg625-server
mount facets: mds1
Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
oleg625-server: oleg625-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
pdsh@oleg625-client: oleg625-server: ssh exited with exit code 1
Started lustre-MDT0000
04:04:36 (1743494676) targets are mounted
04:04:36 (1743494676) facet_failover done
oleg625-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid
mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec
Failing mds2 on oleg625-server
Stopping /mnt/lustre-mds2 (opts:) on oleg625-server
04:05:03 (1743494703) shut down
facet: mds2
facet_host: oleg625-server
facet_failover_host: oleg625-server
Failover mds2 to oleg625-server
mount facets: mds2
Starting mds2: -o localrecov /dev/mapper/mds2_flakey /mnt/lustre-mds2
oleg625-server: oleg625-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
pdsh@oleg625-client: oleg625-server: ssh exited with exit code 1
Started lustre-MDT0001
04:05:41 (1743494741) targets are mounted
04:05:41 (1743494741) facet_failover done
oleg625-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0001-mdc-*.mds_server_uuid
mdc.lustre-MDT0001-mdc-*.mds_server_uuid in FULL state after 0 sec
Test Lustre stability after MDS failover
Failing 2 CLIENTS
Request fail clients: 2, to fail: 0, failed: 0
No clients failed!
Test Lustre stability after CLIENT failure
Reintegrating CLIENTS
PASS 3 (160s)

== insanity test 4: Fourth Failure Mode: OST/MDS Tue Apr 1 04:06:31 EDT 2025 ========================================================== 04:06:31 (1743494791)
Fourth Failure Mode: OST/MDS Tue Apr 1 04:06:35 EDT 2025
Stopping /mnt/lustre-ost1 (opts:) on oleg625-server
Test Lustre stability after OST failure
Stopping /mnt/lustre-mds1 (opts:) on oleg625-server
Failover mds1 to oleg625-server
Stopping /mnt/lustre-mds2 (opts:) on oleg625-server
Failover mds2 to oleg625-server
Reintegrating OST
oleg625-server.virtnet
Starting ost1: -o localrecov /dev/mapper/ost1_flakey /mnt/lustre-ost1
seq.cli-lustre-OST0000-super.width=65536
oleg625-server: oleg625-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
pdsh@oleg625-client: oleg625-server: ssh exited with exit code 1
Started lustre-OST0000
oleg625-server.virtnet
Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
oleg625-server: oleg625-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
pdsh@oleg625-client: oleg625-server: ssh exited with exit code 1
Started lustre-MDT0000
oleg625-server.virtnet
Starting mds2: -o localrecov /dev/mapper/mds2_flakey /mnt/lustre-mds2
oleg625-server: oleg625-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
pdsh@oleg625-client: oleg625-server: ssh exited with exit code 1
Started lustre-MDT0001
Test Lustre stability after MDS failover
PASS 4 (260s)

== insanity test 5: Fifth Failure Mode: OST/OST Tue Apr 1 04:10:52 EDT 2025 ========================================================== 04:10:52 (1743495052)
Fifth Failure Mode: OST/OST Tue Apr 1 04:10:55 EDT 2025
Verify Lustre filesystem is up and running
Stopping /mnt/lustre-ost1 (opts:) on oleg625-server
Test Lustre stability after OST failure
Stopping /mnt/lustre-ost2 (opts:) on oleg625-server
Test Lustre stability after OST failure
Reintegrating OSTs
oleg625-server.virtnet
Starting ost1: -o localrecov /dev/mapper/ost1_flakey /mnt/lustre-ost1
seq.cli-lustre-OST0000-super.width=65536
oleg625-server: oleg625-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
pdsh@oleg625-client: oleg625-server: ssh exited with exit code 1
Started lustre-OST0000
oleg625-server.virtnet
Starting ost2: -o localrecov /dev/mapper/ost2_flakey /mnt/lustre-ost2
seq.cli-lustre-OST0001-super.width=65536
oleg625-server: oleg625-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
pdsh@oleg625-client: oleg625-server: ssh exited with exit code 1
Started lustre-OST0001
PASS 5 (145s)

== insanity test 6: Sixth Failure Mode: OST/CLIENT Tue Apr 1 04:13:17 EDT 2025 ========================================================== 04:13:18 (1743495198)
Sixth Failure Mode: OST/CLIENT Tue Apr 1 04:13:21 EDT 2025
Verify Lustre filesystem is up and running
Stopping /mnt/lustre-ost1 (opts:) on oleg625-server
Test Lustre stability after OST failure
DFPIDA=22760
Failing CLIENTs
Request fail clients: , to fail: 0, failed: 0
No clients failed!
Test Lustre stability after CLIENTs failure
DFPIDB=23074
Reintegrating OST/CLIENTs
oleg625-server.virtnet
Starting ost1: -o localrecov /dev/mapper/ost1_flakey /mnt/lustre-ost1
seq.cli-lustre-OST0000-super.width=65536
oleg625-server: oleg625-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
pdsh@oleg625-client: oleg625-server: ssh exited with exit code 1
Started lustre-OST0000
Verifying mount
PASS 6 (94s)

== insanity test 7: Seventh Failure Mode: CLIENT/MDS Tue Apr 1 04:14:52 EDT 2025 ========================================================== 04:14:52 (1743495292)
Seventh Failure Mode: CLIENT/MDS Tue Apr 1 04:14:56 EDT 2025
Verify Lustre filesystem is up and running
Part 1: Failing CLIENT
Request fail clients: 2, to fail: 0, failed: 0
No clients failed!
Test Lustre stability after CLIENTs failure
oleg625-client: total 0
oleg625-client: -rw-r--r-- 1 root root 0 Apr 1 04:15 oleg625-client.virtnet_testfile
Wait 1 minutes
Verify Lustre filesystem is up and running
oleg625-client: rm: cannot remove '/mnt/lustre/d0.insanity/oleg625-client.virtnet_testfile': No such file or directory
pdsh@oleg625-client: oleg625-client: ssh exited with exit code 1
Failing mds1 on oleg625-server
Stopping /mnt/lustre-mds1 (opts:) on oleg625-server
04:16:28 (1743495388) shut down
facet: mds1
facet_host: oleg625-server
facet_failover_host: oleg625-server
Failover mds1 to oleg625-server
mount facets: mds1
Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
oleg625-server: oleg625-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
pdsh@oleg625-client: oleg625-server: ssh exited with exit code 1
Started lustre-MDT0000
04:17:00 (1743495420) targets are mounted
04:17:00 (1743495420) facet_failover done
oleg625-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid
mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec
Failing mds2 on oleg625-server
Stopping /mnt/lustre-mds2 (opts:) on oleg625-server
04:17:25 (1743495445) shut down
facet: mds2
facet_host: oleg625-server
facet_failover_host: oleg625-server
Failover mds2 to oleg625-server
mount facets: mds2
Starting mds2: -o localrecov /dev/mapper/mds2_flakey /mnt/lustre-mds2
oleg625-server: oleg625-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
pdsh@oleg625-client: oleg625-server: ssh exited with exit code 1
Started lustre-MDT0001
04:17:59 (1743495479) targets are mounted
04:17:59 (1743495479) facet_failover done
oleg625-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0001-mdc-*.mds_server_uuid
mdc.lustre-MDT0001-mdc-*.mds_server_uuid in FULL state after 0 sec
oleg625-client: total 0
Reintegrating CLIENTs
wait 1 minutes
PASS 7 (279s)

== insanity test 8: Eighth Failure Mode: CLIENT/OST Tue Apr 1 04:19:32 EDT 2025 ========================================================== 04:19:32 (1743495572)
Eighth Failure Mode: CLIENT/OST Tue Apr 1 04:19:35 EDT 2025
Verify Lustre filesystem is up and running
Failing CLIENTs
Request fail clients: 2, to fail: 0, failed: 0
No clients failed!
Test Lustre stability after CLIENTs failure
oleg625-client: total 0
oleg625-client: -rw-r--r-- 1 root root 0 Apr 1 04:19 oleg625-client.virtnet_testfile
Wait 1 minutes
Verify Lustre filesystem is up and running
Stopping /mnt/lustre-ost1 (opts:) on oleg625-server
Test Lustre stability after OST failure
Reintegrating CLIENTs/OST
oleg625-server.virtnet
Starting ost1: -o localrecov /dev/mapper/ost1_flakey /mnt/lustre-ost1
seq.cli-lustre-OST0000-super.width=65536
oleg625-server: oleg625-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
pdsh@oleg625-client: oleg625-server: ssh exited with exit code 1
Started lustre-OST0000
Wait 1 minutes
PASS 8 (221s)

== insanity test 9: Ninth Failure Mode: CLIENT/CLIENT Tue Apr 1 04:23:13 EDT 2025 ========================================================== 04:23:13 (1743495793)
Verify Lustre filesystem is up and running
Failing CLIENTs
Request fail clients: 2, to fail: 0, failed: 0
No clients failed!
Test Lustre stability after CLIENTs failure
oleg625-client: total 0
oleg625-client: -rw-r--r-- 1 root root 0 Apr 1 04:23 oleg625-client.virtnet_testfile
oleg625-client: -rw-r--r-- 1 root root 0 Apr 1 04:21 oleg625-client.virtnet_testfile2
Wait 1 minutes
Verify Lustre filesystem is up and running
Failing CLIENTs
Request fail clients: 2, to fail: 0, failed: 0
No clients failed!
Test Lustre stability after CLIENTs failure
oleg625-client: total 0
oleg625-client: -rw-r--r-- 1 root root 0 Apr 1 04:24 oleg625-client.virtnet_testfile
oleg625-client: -rw-r--r-- 1 root root 0 Apr 1 04:21 oleg625-client.virtnet_testfile2
Reintegrating CLIENTs/CLIENTs
Wait 1 minutes
PASS 9 (177s)

== insanity test 10: Tenth Failure Mode: MDT0/OST/MDT1 Tue Apr 1 04:26:11 EDT 2025 ========================================================== 04:26:11 (1743495971)
Stopping /mnt/lustre-mds1 (opts:) on oleg625-server
Failover mds1 to oleg625-server
Stopping /mnt/lustre-ost1 (opts:) on oleg625-server
Reintegrating OST
oleg625-server.virtnet
Starting ost1: -o localrecov /dev/mapper/ost1_flakey /mnt/lustre-ost1
seq.cli-lustre-OST0000-super.width=65536
oleg625-server: oleg625-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
pdsh@oleg625-client: oleg625-server: ssh exited with exit code 1
Started lustre-OST0000
Stopping /mnt/lustre-mds2 (opts:) on oleg625-server
Failover mds2 to oleg625-server
oleg625-server.virtnet
Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
oleg625-server: oleg625-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
pdsh@oleg625-client: oleg625-server: ssh exited with exit code 1
Started lustre-MDT0000
oleg625-server.virtnet
Starting mds2: -o localrecov /dev/mapper/mds2_flakey /mnt/lustre-mds2
oleg625-server: oleg625-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
pdsh@oleg625-client: oleg625-server: ssh exited with exit code 1
Started lustre-MDT0001
Verify reintegration
PASS 10 (252s)

== insanity test 11: Eleventh Failure Mode: MDS0/CLIENT/MDS1 Tue Apr 1 04:30:23 EDT 2025 ========================================================== 04:30:24 (1743496224)
Verify Lustre filesystem is up and running
Failing mds1 on oleg625-server
Stopping /mnt/lustre-mds1 (opts:) on oleg625-server
04:30:36 (1743496236) shut down
facet: mds1
facet_host: oleg625-server
facet_failover_host: oleg625-server
Failover mds1 to oleg625-server
mount facets: mds1
Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
oleg625-server: oleg625-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
pdsh@oleg625-client: oleg625-server: ssh exited with exit code 1
Started lustre-MDT0000
04:31:11 (1743496271) targets are mounted
04:31:11 (1743496271) facet_failover done
oleg625-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid
mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec
Test Lustre stability after MDS failover
Failing 2 CLIENTS
Request fail clients: 2, to fail: 0, failed: 0
No clients failed!
Test Lustre stability after CLIENT failure
Reintegrating CLIENTS
Failing mds2 on oleg625-server
Stopping /mnt/lustre-mds2 (opts:) on oleg625-server
04:31:50 (1743496310) shut down
facet: mds2
facet_host: oleg625-server
facet_failover_host: oleg625-server
Failover mds2 to oleg625-server
mount facets: mds2
Starting mds2: -o localrecov /dev/mapper/mds2_flakey /mnt/lustre-mds2
oleg625-server: oleg625-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
pdsh@oleg625-client: oleg625-server: ssh exited with exit code 1
Started lustre-MDT0001
04:32:22 (1743496342) targets are mounted
04:32:22 (1743496342) facet_failover done
oleg625-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0001-mdc-*.mds_server_uuid
mdc.lustre-MDT0001-mdc-*.mds_server_uuid in FULL state after 0 sec
PASS 11 (152s)

== insanity test 12: Twelve Failure Mode: MDS0,MDS1/OST0, OST1/CLIENTS Tue Apr 1 04:32:55 EDT 2025 ========================================================== 04:32:55 (1743496375)
Verify Lustre filesystem is up and running
Failing mds1 on oleg625-server
Stopping /mnt/lustre-mds1 (opts:) on oleg625-server
Failing mds2 on oleg625-server
Stopping /mnt/lustre-mds2 (opts:) on oleg625-server
04:33:15 (1743496395) shut down
facet: mds1
facet_host: oleg625-server
facet_failover_host: oleg625-server
facet: mds2
facet_host: oleg625-server
facet_failover_host: oleg625-server
Failover mds1 to oleg625-server
mount facets: mds1
Failover mds2 to oleg625-server
mount facets: mds2
Starting mds2: -o localrecov /dev/mapper/mds2_flakey /mnt/lustre-mds2
Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
oleg625-server: oleg625-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
oleg625-server: oleg625-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
pdsh@oleg625-client: oleg625-server: ssh exited with exit code 1
pdsh@oleg625-client: oleg625-server: ssh exited with exit code 1
Started lustre-MDT0001
Started lustre-MDT0000
04:34:07 (1743496447) targets are mounted
04:34:07 (1743496447) facet_failover done
oleg625-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid,mdc.lustre-MDT0001-mdc-*.mds_server_uuid
mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec
mdc.lustre-MDT0001-mdc-*.mds_server_uuid in FULL state after 0 sec
Failing ost1 on oleg625-server
Stopping /mnt/lustre-ost1 (opts:) on oleg625-server
Failing ost2 on oleg625-server
Stopping /mnt/lustre-ost2 (opts:) on oleg625-server
04:34:44 (1743496484) shut down
facet: ost1
facet_host: oleg625-server
facet_failover_host: oleg625-server
facet: ost2
facet_host: oleg625-server
facet_failover_host: oleg625-server
Failover ost1 to oleg625-server
mount facets: ost1
Failover ost2 to oleg625-server
mount facets: ost2
Starting ost2: -o localrecov /dev/mapper/ost2_flakey /mnt/lustre-ost2
Starting ost1: -o localrecov /dev/mapper/ost1_flakey /mnt/lustre-ost1
seq.cli-lustre-OST0000-super.width=65536
seq.cli-lustre-OST0001-super.width=65536
oleg625-server: oleg625-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
oleg625-server: oleg625-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
pdsh@oleg625-client: oleg625-server: ssh exited with exit code 1
pdsh@oleg625-client: oleg625-server: ssh exited with exit code 1
Started lustre-OST0000
Started lustre-OST0001
04:35:26 (1743496526) targets are mounted
04:35:26 (1743496526) facet_failover done
oleg625-client.virtnet: executing wait_import_state_mount (FULL|IDLE) osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid,osc.lustre-OST0001-osc-[-0-9a-f]*.ost_server_uuid
osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid in FULL state after 0 sec
osc.lustre-OST0001-osc-[-0-9a-f]*.ost_server_uuid in FULL state after 0 sec
Failing 2 CLIENTS
Request fail clients: 2, to fail: 0, failed: 0
No clients failed!
Test Lustre stability after CLIENT failure
Reintegrating CLIENTS
PASS 12 (204s)

== insanity test 13: Thirteen Failure Mode: MDS0,MDS1/CLIENTS/OST0,OST1 Tue Apr 1 04:36:19 EDT 2025 ========================================================== 04:36:19 (1743496579)
Verify Lustre filesystem is up and running
Failing mds1 on oleg625-server
Stopping /mnt/lustre-mds1 (opts:) on oleg625-server
Failing mds2 on oleg625-server
Stopping /mnt/lustre-mds2 (opts:) on oleg625-server
04:36:37 (1743496597) shut down
facet: mds1
facet_host: oleg625-server
facet_failover_host: oleg625-server
facet: mds2
facet_host: oleg625-server
facet_failover_host: oleg625-server
Failover mds1 to oleg625-server
mount facets: mds1
Failover mds2 to oleg625-server
mount facets: mds2
Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
Starting mds2: -o localrecov /dev/mapper/mds2_flakey /mnt/lustre-mds2
oleg625-server: oleg625-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
oleg625-server: oleg625-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
pdsh@oleg625-client: oleg625-server: ssh exited with exit code 1
pdsh@oleg625-client: oleg625-server: ssh exited with exit code 1
Started lustre-MDT0001
Started lustre-MDT0000
04:37:31 (1743496651) targets are mounted
04:37:31 (1743496651) facet_failover done
oleg625-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid,mdc.lustre-MDT0001-mdc-*.mds_server_uuid
mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec
mdc.lustre-MDT0001-mdc-*.mds_server_uuid in FULL state after 0 sec
Failing 2 CLIENTS
Request fail clients: 2, to fail: 0, failed: 0
No clients failed!
Test Lustre stability after CLIENT failure
Reintegrating CLIENTS
Failing ost1 on oleg625-server
Stopping /mnt/lustre-ost1 (opts:) on oleg625-server
Failing ost2 on oleg625-server
Stopping /mnt/lustre-ost2 (opts:) on oleg625-server
04:38:25 (1743496705) shut down
facet: ost1
facet_host: oleg625-server
facet_failover_host: oleg625-server
facet: ost2
facet_host: oleg625-server
facet_failover_host: oleg625-server
Failover ost1 to oleg625-server
mount facets: ost1
Failover ost2 to oleg625-server
mount facets: ost2
Starting ost1: -o localrecov /dev/mapper/ost1_flakey /mnt/lustre-ost1
Starting ost2: -o localrecov /dev/mapper/ost2_flakey /mnt/lustre-ost2
seq.cli-lustre-OST0001-super.width=65536
seq.cli-lustre-OST0000-super.width=65536
oleg625-server: oleg625-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
oleg625-server: oleg625-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
pdsh@oleg625-client: oleg625-server: ssh exited with exit code 1
pdsh@oleg625-client: oleg625-server: ssh exited with exit code 1
Started lustre-OST0001
Started lustre-OST0000
04:39:09 (1743496749) targets are mounted
04:39:09 (1743496749) facet_failover done
oleg625-client.virtnet: executing wait_import_state_mount (FULL|IDLE) osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid,osc.lustre-OST0001-osc-[-0-9a-f]*.ost_server_uuid
osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid in FULL state after 0 sec
osc.lustre-OST0001-osc-[-0-9a-f]*.ost_server_uuid in FULL state after 0 sec
PASS 13 (207s)

== insanity test 14: Fourteen Failure Mode: OST0,OST1/CLIENTS/MDS0,MDS1 Tue Apr 1 04:39:47 EDT 2025 ========================================================== 04:39:47 (1743496787)
Verify Lustre filesystem is up and running
Failing ost1 on oleg625-server
Stopping /mnt/lustre-ost1 (opts:) on oleg625-server
Failing ost2 on oleg625-server
Stopping /mnt/lustre-ost2 (opts:) on oleg625-server
04:40:05 (1743496805) shut down
facet: ost1
facet_host: oleg625-server
facet_failover_host: oleg625-server
facet: ost2
facet_host: oleg625-server
facet_failover_host: oleg625-server
Failover ost1 to oleg625-server
mount facets: ost1
Failover ost2 to oleg625-server
mount facets: ost2
Starting ost2: -o localrecov /dev/mapper/ost2_flakey /mnt/lustre-ost2
Starting ost1: -o localrecov /dev/mapper/ost1_flakey /mnt/lustre-ost1
seq.cli-lustre-OST0001-super.width=65536
seq.cli-lustre-OST0000-super.width=65536
oleg625-server: oleg625-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
oleg625-server: oleg625-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
pdsh@oleg625-client: oleg625-server: ssh exited with exit code 1
pdsh@oleg625-client: oleg625-server: ssh exited with exit code 1
Started lustre-OST0001
Started lustre-OST0000
04:40:48 (1743496848) targets are mounted
04:40:48 (1743496848) facet_failover done
oleg625-client.virtnet: executing wait_import_state_mount (FULL|IDLE) osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid,osc.lustre-OST0001-osc-[-0-9a-f]*.ost_server_uuid
osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid in FULL state after 0 sec
osc.lustre-OST0001-osc-[-0-9a-f]*.ost_server_uuid in FULL state after 0 sec
Failing 2 CLIENTS
Request fail clients: 2, to fail: 0, failed: 0
No clients failed!
Test Lustre stability after CLIENT failure
Reintegrating CLIENTS
Failing mds1 on oleg625-server
Stopping /mnt/lustre-mds1 (opts:) on oleg625-server
Failing mds2 on oleg625-server
Stopping /mnt/lustre-mds2 (opts:) on oleg625-server
04:41:41 (1743496901) shut down
facet: mds1
facet_host: oleg625-server
facet_failover_host: oleg625-server
facet: mds2
facet_host: oleg625-server
facet_failover_host: oleg625-server
Failover mds1 to oleg625-server
mount facets: mds1
Failover mds2 to oleg625-server
mount facets: mds2
Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
Starting mds2: -o localrecov /dev/mapper/mds2_flakey /mnt/lustre-mds2
oleg625-server: oleg625-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
oleg625-server: oleg625-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
pdsh@oleg625-client: oleg625-server: ssh exited with exit code 1
pdsh@oleg625-client: oleg625-server: ssh exited with exit code 1
Started lustre-MDT0000
Started lustre-MDT0001
04:42:35 (1743496955) targets are mounted
04:42:35 (1743496955) facet_failover done
oleg625-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid,mdc.lustre-MDT0001-mdc-*.mds_server_uuid
mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec
mdc.lustre-MDT0001-mdc-*.mds_server_uuid in FULL state after 0 sec
PASS 14 (203s)

== insanity test complete, duration 3256 sec ============= 04:43:10 (1743496990)
=== insanity: start cleanup 04:43:15 (1743496995) ===
=== insanity: finish cleanup 04:43:23 (1743497003) ===