== replay-dual test 26: dbench and tar with mds failover ========================================================== 05:07:36 (1743498456)
Starting client oleg601-client.virtnet: -o user_xattr,flock 192.168.206.101@tcp:/lustre /mnt/lustre
Started clients oleg601-client.virtnet:
192.168.206.101@tcp:/lustre on /mnt/lustre type lustre (rw,checksum,flock,user_xattr,lruresize,lazystatfs,nouser_fid2path,verbose,encrypt,statfs_project)
Started tar loop with pid 64442
Started dbench loop with 64444
striped dir -i0 -c2 -H crush2 /mnt/lustre/d26.replay-dual/run_tar
striped dir -i0 -c2 -H fnv_1a_64 /mnt/lustre2/d26.replay-dual/run_dbench
looking for dbench program
/usr/bin/dbench
found dbench client file /usr/share/dbench/client.txt
UUID                   1K-blocks        Used   Available Use% Mounted on
lustre-MDT0000_UUID      1414116        2024     1285664   1% /mnt/lustre[MDT:0]
lustre-MDT0001_UUID      1414116        1832     1285856   1% /mnt/lustre[MDT:1]
lustre-OST0000_UUID      3833116        1544     3601308   1% /mnt/lustre[OST:0]
lustre-OST0001_UUID      3833116        1536     3605484   1% /mnt/lustre[OST:1]
filesystem_summary:      7666232        3080     7206792   1% /mnt/lustre
'/usr/share/dbench/client.txt' -> 'client.txt'
running 'dbench 1 -D /mnt/lustre2/d26.replay-dual/run_dbench -t 100' on /mnt/lustre at Tue Apr 1 05:07:46 EDT 2025
waiting for dbench pid 64545
dbench version 4.00 - Copyright Andrew Tridgell 1999-2004
Running for 100 seconds with load 'client.txt' and minimum warmup 20 secs
failed to create barrier semaphore
0 of 1 processes prepared for launch 0 sec
1 of 1 processes prepared for launch 0 sec
releasing clients
1 31 1.05 MB/sec warmup 1 sec latency 62.218 ms
1 87 2.00 MB/sec warmup 2 sec latency 93.597 ms
1 121 1.83 MB/sec warmup 3 sec latency 209.655 ms
1 157 1.63 MB/sec warmup 4 sec latency 132.000 ms
1 199 1.56 MB/sec warmup 5 sec latency 120.923 ms
1 247 1.56 MB/sec warmup 6 sec latency 111.915 ms
1 289 1.53 MB/sec warmup 7 sec latency 113.524 ms
test_26 fail mds1 1 times
1 324 1.47 MB/sec warmup 8 sec latency 115.685 ms
1 351 1.39 MB/sec warmup 9 sec latency 179.114 ms
1 394 1.39 MB/sec warmup 10 sec latency 112.083 ms
Failing mds1 on oleg601-server
1 429 1.37 MB/sec warmup 11 sec latency 125.824 ms
Stopping /mnt/lustre-mds1 (opts:) on oleg601-server
1 470 1.37 MB/sec warmup 12 sec latency 105.870 ms
1 509 1.36 MB/sec warmup 13 sec latency 132.810 ms
1 584 1.42 MB/sec warmup 14 sec latency 141.430 ms
1 606 1.40 MB/sec warmup 15 sec latency 802.403 ms
1 606 1.31 MB/sec warmup 16 sec latency 1802.688 ms
1 606 1.23 MB/sec warmup 17 sec latency 2804.336 ms
1 606 1.16 MB/sec warmup 18 sec latency 3805.239 ms
1 606 1.10 MB/sec warmup 19 sec latency 4807.076 ms
1 606 0.00 MB/sec execute 1 sec latency 6809.389 ms
05:08:07 (1743498487) shut down
facet: mds1
facet_host: oleg601-server
facet_failover_host: oleg601-server
1 606 0.00 MB/sec execute 2 sec latency 7809.610 ms
1 606 0.00 MB/sec execute 3 sec latency 8809.876 ms
1 606 0.00 MB/sec execute 4 sec latency 9810.149 ms
1 606 0.00 MB/sec execute 5 sec latency 10810.422 ms
1 606 0.00 MB/sec execute 6 sec latency 11810.756 ms
1 606 0.00 MB/sec execute 7 sec latency 12813.738 ms
1 606 0.00 MB/sec execute 8 sec latency 13814.676 ms
1 606 0.00 MB/sec execute 9 sec latency 14817.058 ms
1 606 0.00 MB/sec execute 10 sec latency 15821.903 ms
1 606 0.00 MB/sec execute 11 sec latency 16823.643 ms
Failover mds1 to oleg601-server
mount facets: mds1
1 606 0.00 MB/sec execute 12 sec latency 17824.738 ms
1 606 0.00 MB/sec execute 13 sec latency 18836.160 ms
1 606 0.00 MB/sec execute 14 sec latency 19840.071 ms
1 606 0.00 MB/sec execute 15 sec latency 20842.144 ms
1 606 0.00 MB/sec execute 16 sec latency 21844.508 ms
1 606 0.00 MB/sec execute 17 sec latency 22845.180 ms
1 606 0.00 MB/sec execute 18 sec latency 23857.786 ms
1 606 0.00 MB/sec execute 19 sec latency 24858.057 ms
1 606 0.00 MB/sec execute 20 sec latency 25858.358 ms
1 606 0.00 MB/sec execute 21 sec latency 26858.583 ms
1 606 0.00 MB/sec execute 22 sec latency 27858.951 ms
Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
1 606 0.00 MB/sec execute 23 sec latency 28859.172 ms
1 606 0.00 MB/sec execute 24 sec latency 29859.913 ms
1 606 0.00 MB/sec execute 25 sec latency 30861.437 ms
1 606 0.00 MB/sec execute 26 sec latency 31861.800 ms
1 606 0.00 MB/sec execute 27 sec latency 32862.340 ms
1 606 0.00 MB/sec execute 28 sec latency 33868.494 ms
1 606 0.00 MB/sec execute 29 sec latency 34868.786 ms
1 606 0.00 MB/sec execute 30 sec latency 35870.473 ms
1 606 0.00 MB/sec execute 31 sec latency 36870.669 ms
1 606 0.00 MB/sec execute 32 sec latency 37871.882 ms
1 606 0.00 MB/sec execute 33 sec latency 38872.121 ms
oleg601-server: oleg601-server.virtnet: executing set_default_debug -1 all
1 606 0.00 MB/sec execute 34 sec latency 39872.403 ms
pdsh@oleg601-client: oleg601-server: ssh exited with exit code 1
1 606 0.00 MB/sec execute 35 sec latency 40874.408 ms
Started lustre-MDT0000
1 606 0.00 MB/sec execute 36 sec latency 41874.756 ms
05:08:43 (1743498523) targets are mounted
05:08:43 (1743498523) facet_failover done
1 612 0.00 MB/sec execute 37 sec latency 42723.645 ms
1 666 0.01 MB/sec execute 38 sec latency 109.937 ms
1 699 0.01 MB/sec execute 39 sec latency 113.723 ms
1 734 0.01 MB/sec execute 40 sec latency 131.556 ms
1 772 0.01 MB/sec execute 41 sec latency 115.849 ms
oleg601-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid
1 799 0.01 MB/sec execute 42 sec latency 146.005 ms
1 821 0.01 MB/sec execute 43 sec latency 142.294 ms
1 869 0.02 MB/sec execute 44 sec latency 104.512 ms
1 882 0.02 MB/sec execute 45 sec latency 307.069 ms
1 909 0.02 MB/sec execute 46 sec latency 163.148 ms
mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec
1 934 0.02 MB/sec execute 47 sec latency 162.218 ms
1 967 0.02 MB/sec execute 48 sec latency 163.888 ms
1 986 0.02 MB/sec execute 49 sec latency 139.389 ms
1 1033 0.04 MB/sec execute 50 sec latency 123.916 ms
1 1079 0.04 MB/sec execute 51 sec latency 78.184 ms
1 1132 0.04 MB/sec execute 52 sec latency 72.470 ms
1 1164 0.04 MB/sec execute 53 sec latency 131.519 ms
1 1215 0.04 MB/sec execute 54 sec latency 78.166 ms
1 1255 0.04 MB/sec execute 55 sec latency 127.291 ms
1 1378 0.09 MB/sec execute 56 sec latency 119.299 ms
UUID                   1K-blocks        Used   Available Use% Mounted on
lustre-MDT0000_UUID      1414116        2320     1285368   1% /mnt/lustre[MDT:0]
lustre-MDT0001_UUID      1414116        2108     1285580   1% /mnt/lustre[MDT:1]
lustre-OST0000_UUID      3833116       43776     3562432   2% /mnt/lustre[OST:0]
lustre-OST0001_UUID      3833116       15140     3590872   1% /mnt/lustre[OST:1]
filesystem_summary:      7666232       58916     7153304   1% /mnt/lustre
1 1403 0.09 MB/sec execute 57 sec latency 102.701 ms
1 1445 0.09 MB/sec execute 58 sec latency 92.019 ms
1 1487 0.09 MB/sec execute 59 sec latency 91.619 ms
1 1516 0.08 MB/sec execute 60 sec latency 121.298 ms
1 1563 0.08 MB/sec execute 61 sec latency 83.573 ms
1 1593 0.08 MB/sec execute 62 sec latency 95.651 ms
1 1634 0.08 MB/sec execute 63 sec latency 108.338 ms
1 1675 0.08 MB/sec execute 64 sec latency 96.869 ms
1 1709 0.08 MB/sec execute 65 sec latency 157.453 ms
1 1785 0.08 MB/sec execute 66 sec latency 67.001 ms
1 1849 0.08 MB/sec execute 67 sec latency 71.930 ms
test_26 fail mds2 2 times
1 1901 0.08 MB/sec execute 68 sec latency 133.129 ms
1 1948 0.08 MB/sec execute 69 sec latency 162.838 ms
1 1988 0.08 MB/sec execute 70 sec latency 104.137 ms
1 2026 0.08 MB/sec execute 71 sec latency 134.779 ms
Failing mds2 on oleg601-server
1 2063 0.08 MB/sec execute 72 sec latency 109.212 ms
1 2128 0.08 MB/sec execute 73 sec latency 131.409 ms
Stopping /mnt/lustre-mds2 (opts:) on oleg601-server
1 2164 0.08 MB/sec execute 74 sec latency 92.420 ms
1 2210 0.08 MB/sec execute 75 sec latency 91.916 ms
1 2259 0.08 MB/sec execute 76 sec latency 333.416 ms
1 2259 0.08 MB/sec execute 77 sec latency 1333.597 ms
05:09:24 (1743498564) shut down
facet: mds2
facet_host: oleg601-server
facet_failover_host: oleg601-server
1 2259 0.07 MB/sec execute 78 sec latency 2334.604 ms
1 2259 0.07 MB/sec execute 79 sec latency 3334.912 ms
1 2259 0.07 MB/sec execute 80 sec latency 4335.332 ms
1 2259 0.07 MB/sec execute 81 sec latency 5335.600 ms
1 2259 0.07 MB/sec execute 82 sec latency 6335.892 ms
1 2259 0.07 MB/sec execute 83 sec latency 7339.982 ms
1 2259 0.07 MB/sec execute 84 sec latency 8340.867 ms
1 2259 0.07 MB/sec execute 85 sec latency 9343.408 ms
1 2259 0.07 MB/sec execute 86 sec latency 10345.546 ms
1 2259 0.07 MB/sec execute 87 sec latency 11345.848 ms
1 2259 0.07 MB/sec execute 88 sec latency 12349.975 ms
Failover mds2 to oleg601-server
mount facets: mds2
1 2259 0.07 MB/sec execute 89 sec latency 13350.316 ms
1 2259 0.06 MB/sec execute 90 sec latency 14350.658 ms
1 2259 0.06 MB/sec execute 91 sec latency 15356.998 ms
1 2259 0.06 MB/sec execute 92 sec latency 16358.895 ms
1 2259 0.06 MB/sec execute 93 sec latency 17360.858 ms
1 2259 0.06 MB/sec execute 94 sec latency 18361.153 ms
1 2259 0.06 MB/sec execute 95 sec latency 19364.268 ms
1 2259 0.06 MB/sec execute 96 sec latency 20364.551 ms
1 2259 0.06 MB/sec execute 97 sec latency 21364.918 ms
1 2259 0.06 MB/sec execute 98 sec latency 22365.191 ms
1 2259 0.06 MB/sec execute 99 sec latency 23365.404 ms
Starting mds2: -o localrecov /dev/mapper/mds2_flakey /mnt/lustre-mds2
1 cleanup 100 sec
1 cleanup 101 sec
1 cleanup 102 sec
1 cleanup 103 sec
1 cleanup 104 sec
1 cleanup 105 sec
1 cleanup 106 sec
oleg601-server: oleg601-server.virtnet: executing set_default_debug -1 all
1 cleanup 107 sec
pdsh@oleg601-client: oleg601-server: ssh exited with exit code 1
1 cleanup 108 sec
1 cleanup 109 sec
Started lustre-MDT0001
05:09:56 (1743498596) targets are mounted
05:09:56 (1743498596) facet_failover done
1 cleanup 110 sec
1 cleanup 111 sec
0 cleanup 112 sec
Operation      Count    AvgLat    MaxLat
----------------------------------------
NTCreateX        314   162.056 32374.918
Close            270    13.126    57.596
Rename            11   103.217   157.426
Unlink            30    30.454    59.784
Qpathinfo        295   161.432 42723.621
Qfileinfo         52     2.206    13.470
Qfsinfo           29     5.384    33.284
Sfileinfo         34    68.460   193.163
Find              91    16.806    52.503
WriteX           150    14.202    83.773
ReadX            348     4.608   109.180
Flush             30    65.377   307.046
Throughput 0.0586375 MB/sec 1 clients 1 procs max_latency=42723.645 ms
stopping dbench on /mnt/lustre at Tue Apr 1 05:09:59 EDT 2025 with return code 0
clean dbench files on /mnt/lustre
/mnt/lustre /mnt/lustre
removed 'client.txt'
/mnt/lustre
dbench successfully finished
oleg601-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0001-mdc-*.mds_server_uuid
striped dir -i0 -c2 -H crush2 /mnt/lustre2/d26.replay-dual/run_dbench
looking for dbench program
/usr/bin/dbench
found dbench client file /usr/share/dbench/client.txt
mdc.lustre-MDT0001-mdc-*.mds_server_uuid in FULL state after 0 sec
'/usr/share/dbench/client.txt' -> 'client.txt'
running 'dbench 1 -D /mnt/lustre2/d26.replay-dual/run_dbench -t 100' on /mnt/lustre at Tue Apr 1 05:10:12 EDT 2025
waiting for dbench pid 66548
dbench version 4.00 - Copyright Andrew Tridgell 1999-2004
Running for 100 seconds with load 'client.txt' and minimum warmup 20 secs
0 of 1 processes prepared for launch 0 sec
1 of 1 processes prepared for launch 0 sec
releasing clients
1 25 0.72 MB/sec warmup 1 sec latency 24.776 ms
1 84 1.90 MB/sec warmup 2 sec latency 114.191 ms
1 137 2.00 MB/sec warmup 3 sec latency 126.125 ms
1 174 1.76 MB/sec warmup 4 sec latency 121.104 ms
UUID                   1K-blocks        Used   Available Use% Mounted on
lustre-MDT0000_UUID      1414116        2804     1284884   1% /mnt/lustre[MDT:0]
lustre-MDT0001_UUID      1414116        2552     1285136   1% /mnt/lustre[MDT:1]
lustre-OST0000_UUID      3833116       11616     3595168   1% /mnt/lustre[OST:0]
lustre-OST0001_UUID      3833116       36868     3560584   2% /mnt/lustre[OST:1]
filesystem_summary:      7666232       48484     7155752   1% /mnt/lustre
1 215 1.67 MB/sec warmup 5 sec latency 144.445 ms
1 256 1.61 MB/sec warmup 6 sec latency 116.196 ms
1 297 1.56 MB/sec warmup 7 sec latency 114.094 ms
1 337 1.52 MB/sec warmup 8 sec latency 127.497 ms
1 373 1.48 MB/sec warmup 9 sec latency 112.864 ms
1 413 1.46 MB/sec warmup 10 sec latency 112.026 ms
1 452 1.44 MB/sec warmup 11 sec latency 124.620 ms
1 469 1.37 MB/sec warmup 12 sec latency 632.976 ms
1 506 1.34 MB/sec warmup 13 sec latency 121.484 ms
1 578 1.39 MB/sec warmup 14 sec latency 118.816 ms
1 658 1.40 MB/sec warmup 15 sec latency 69.280 ms
test_26 fail mds1 3 times
1 695 1.32 MB/sec warmup 16 sec latency 140.587 ms
1 730 1.25 MB/sec warmup 17 sec latency 101.665 ms
1 762 1.18 MB/sec warmup 18 sec latency 131.167 ms
Failing mds1 on oleg601-server
1 784 1.12 MB/sec warmup 19 sec latency 222.093 ms
Stopping /mnt/lustre-mds1 (opts:) on oleg601-server
1 874 0.17 MB/sec execute 1 sec latency 101.046 ms
1 874 0.09 MB/sec execute 2 sec latency 1016.638 ms
1 874 0.06 MB/sec execute 3 sec latency 2017.266 ms
05:10:36 (1743498636) shut down
facet: mds1
facet_host: oleg601-server
facet_failover_host: oleg601-server
1 874 0.04 MB/sec execute 4 sec latency 3019.248 ms
1 874 0.03 MB/sec execute 5 sec latency 4033.324 ms
1 874 0.03 MB/sec execute 6 sec latency 5036.399 ms
1 874 0.02 MB/sec execute 7 sec latency 6036.652 ms
1 874 0.02 MB/sec execute 8 sec latency 7038.054 ms
1 874 0.02 MB/sec execute 9 sec latency 8038.497 ms
1 874 0.02 MB/sec execute 10 sec latency 9038.905 ms
1 874 0.02 MB/sec execute 11 sec latency 10039.300 ms
1 874 0.01 MB/sec execute 12 sec latency 11039.646 ms
1 874 0.01 MB/sec execute 13 sec latency 12039.990 ms
1 874 0.01 MB/sec execute 14 sec latency 13040.205 ms
Failover mds1 to oleg601-server
mount facets: mds1
1 874 0.01 MB/sec execute 15 sec latency 14040.514 ms
1 874 0.01 MB/sec execute 16 sec latency 15041.432 ms
1 874 0.01 MB/sec execute 17 sec latency 16042.335 ms
1 874 0.01 MB/sec execute 18 sec latency 17043.327 ms
1 874 0.01 MB/sec execute 19 sec latency 18043.603 ms
1 874 0.01 MB/sec execute 20 sec latency 19043.884 ms
1 874 0.01 MB/sec execute 21 sec latency 20044.286 ms
1 874 0.01 MB/sec execute 22 sec latency 21044.548 ms
1 874 0.01 MB/sec execute 23 sec latency 22044.906 ms
1 874 0.01 MB/sec execute 24 sec latency 23047.133 ms
Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
1 874 0.01 MB/sec execute 25 sec latency 24059.343 ms
1 874 0.01 MB/sec execute 26 sec latency 25059.533 ms
1 874 0.01 MB/sec execute 27 sec latency 26061.591 ms
1 874 0.01 MB/sec execute 28 sec latency 27069.768 ms
1 874 0.01 MB/sec execute 29 sec latency 28070.007 ms
1 874 0.01 MB/sec execute 30 sec latency 29070.498 ms
1 874 0.01 MB/sec execute 31 sec latency 30072.806 ms
1 874 0.01 MB/sec execute 32 sec latency 31073.370 ms
1 874 0.01 MB/sec execute 33 sec latency 32073.676 ms
oleg601-server: oleg601-server.virtnet: executing set_default_debug -1 all
1 874 0.01 MB/sec execute 34 sec latency 33074.007 ms
1 874 0.00 MB/sec execute 35 sec latency 34075.564 ms
pdsh@oleg601-client: oleg601-server: ssh exited with exit code 1
1 874 0.00 MB/sec execute 36 sec latency 35075.995 ms
Started lustre-MDT0000
05:11:08 (1743498668) targets are mounted
05:11:08 (1743498668) facet_failover done
1 874 0.00 MB/sec execute 37 sec latency 36077.218 ms
1 890 0.01 MB/sec execute 38 sec latency 36482.512 ms
1 921 0.01 MB/sec execute 39 sec latency 110.878 ms
1 959 0.01 MB/sec execute 40 sec latency 196.743 ms
1 985 0.01 MB/sec execute 41 sec latency 122.404 ms
1 1048 0.03 MB/sec execute 42 sec latency 124.518 ms
oleg601-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid
1 1084 0.03 MB/sec execute 43 sec latency 109.748 ms
1 1132 0.03 MB/sec execute 44 sec latency 119.865 ms
1 1163 0.03 MB/sec execute 45 sec latency 123.334 ms
mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec
1 1200 0.03 MB/sec execute 46 sec latency 108.171 ms
1 1231 0.03 MB/sec execute 47 sec latency 132.113 ms
1 1263 0.03 MB/sec execute 48 sec latency 139.678 ms
1 1387 0.09 MB/sec execute 49 sec latency 88.911 ms
1 1426 0.09 MB/sec execute 50 sec latency 86.887 ms
1 1481 0.09 MB/sec execute 51 sec latency 102.325 ms
1 1514 0.09 MB/sec execute 52 sec latency 98.945 ms
1 1567 0.09 MB/sec execute 53 sec latency 72.640 ms
1 1607 0.09 MB/sec execute 54 sec latency 80.119 ms
1 1655 0.09 MB/sec execute 55 sec latency 95.125 ms
1 1696 0.08 MB/sec execute 56 sec latency 107.657 ms
1 1771 0.08 MB/sec execute 57 sec latency 74.876 ms
tar: Unexpected EOF in archive
tar: Unexpected EOF in archive
1 1836 0.08 MB/sec execute 58 sec latency 80.813 ms
1 1897 0.08 MB/sec execute 59 sec latency 78.502 ms
1 1965 0.08 MB/sec execute 60 sec latency 89.172 ms
1 2014 0.08 MB/sec execute 61 sec latency 127.087 ms
1 2069 0.08 MB/sec execute 62 sec latency 87.113 ms
1 2138 0.08 MB/sec execute 63 sec latency 93.429 ms
1 2170 0.08 MB/sec execute 64 sec latency 130.186 ms
1 2233 0.08 MB/sec execute 65 sec latency 98.654 ms
1 2327 0.10 MB/sec execute 66 sec latency 80.091 ms
1 2361 0.10 MB/sec execute 67 sec latency 108.814 ms
1 2404 0.10 MB/sec execute 68 sec latency 100.947 ms
tar: Error is not recoverable: exiting now
dbench killed by signal 15
1 2440 0.10 MB/sec execute 69 sec latency 141.075 ms
stopping dbench on /mnt/lustre at Tue Apr 1 05:11:41 EDT 2025 with return code 0
66548 pts/0 S+ 0:00 dbench -c client.txt 1 -D /mnt/lustre2/d26.replay-dual/run_dbench -t 100
killed dbench main pid 66548
clean dbench files on /mnt/lustre
/mnt/lustre /mnt/lustre
removed 'client.txt'
/mnt/lustre
dbench successfully finished