[ 0.000000] Initializing cgroup subsys cpuset [ 0.000000] Initializing cgroup subsys cpu [ 0.000000] Initializing cgroup subsys cpuacct [ 0.000000] Linux version 3.10.0-7.9-debug (green@centos7-base) (gcc version 4.8.5 20150623 (Red Hat 4.8.5-44) (GCC) ) #1 SMP Sat Mar 26 23:28:42 EDT 2022 [ 0.000000] Command line: rd.shell root=nbd:192.168.200.253:centos7:ext4:ro:-p,-b4096 ro crashkernel=128M panic=1 nomodeset ipmtu=9000 ip=dhcp rd.neednet=1 init_on_free=off mitigations=off console=ttyS1,115200 audit=0 [ 0.000000] e820: BIOS-provided physical RAM map: [ 0.000000] BIOS-e820: [mem 0x0000000000000000-0x000000000009fbff] usable [ 0.000000] BIOS-e820: [mem 0x000000000009fc00-0x000000000009ffff] reserved [ 0.000000] BIOS-e820: [mem 0x00000000000f0000-0x00000000000fffff] reserved [ 0.000000] BIOS-e820: [mem 0x0000000000100000-0x00000000bffcdfff] usable [ 0.000000] BIOS-e820: [mem 0x00000000bffce000-0x00000000bfffffff] reserved [ 0.000000] BIOS-e820: [mem 0x00000000feffc000-0x00000000feffffff] reserved [ 0.000000] BIOS-e820: [mem 0x00000000fffc0000-0x00000000ffffffff] reserved [ 0.000000] BIOS-e820: [mem 0x0000000100000000-0x000000013edfffff] usable [ 0.000000] NX (Execute Disable) protection: active [ 0.000000] SMBIOS 2.8 present. 
[ 0.000000] DMI: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-1.fc38 04/01/2014 [ 0.000000] Hypervisor detected: KVM [ 0.000000] e820: last_pfn = 0x13ee00 max_arch_pfn = 0x400000000 [ 0.000000] PAT configuration [0-7]: WB WC UC- UC WB WP UC- UC [ 0.000000] e820: last_pfn = 0xbffce max_arch_pfn = 0x400000000 [ 0.000000] found SMP MP-table at [mem 0x000f5b30-0x000f5b3f] mapped at [ffffffffff200b30] [ 0.000000] Using GB pages for direct mapping [ 0.000000] RAMDISK: [mem 0xbc2e2000-0xbffbffff] [ 0.000000] Early table checksum verification disabled [ 0.000000] ACPI: RSDP 00000000000f5950 00014 (v00 BOCHS ) [ 0.000000] ACPI: RSDT 00000000bffe1bb7 00034 (v01 BOCHS BXPC 00000001 BXPC 00000001) [ 0.000000] ACPI: FACP 00000000bffe1a53 00074 (v01 BOCHS BXPC 00000001 BXPC 00000001) [ 0.000000] ACPI: DSDT 00000000bffe0040 01A13 (v01 BOCHS BXPC 00000001 BXPC 00000001) [ 0.000000] ACPI: FACS 00000000bffe0000 00040 [ 0.000000] ACPI: APIC 00000000bffe1ac7 00090 (v01 BOCHS BXPC 00000001 BXPC 00000001) [ 0.000000] ACPI: HPET 00000000bffe1b57 00038 (v01 BOCHS BXPC 00000001 BXPC 00000001) [ 0.000000] ACPI: WAET 00000000bffe1b8f 00028 (v01 BOCHS BXPC 00000001 BXPC 00000001) [ 0.000000] No NUMA configuration found [ 0.000000] Faking a node at [mem 0x0000000000000000-0x000000013edfffff] [ 0.000000] NODE_DATA(0) allocated [mem 0x13e5e3000-0x13e609fff] [ 0.000000] Reserving 128MB of memory at 768MB for crashkernel (System RAM: 4077MB) [ 0.000000] kvm-clock: cpu 0, msr 1:3e592001, primary cpu clock [ 0.000000] kvm-clock: Using msrs 4b564d01 and 4b564d00 [ 0.000000] kvm-clock: using sched offset of 347995529 cycles [ 0.000000] Zone ranges: [ 0.000000] DMA [mem 0x00001000-0x00ffffff] [ 0.000000] DMA32 [mem 0x01000000-0xffffffff] [ 0.000000] Normal [mem 0x100000000-0x13edfffff] [ 0.000000] Movable zone start for each node [ 0.000000] Early memory node ranges [ 0.000000] node 0: [mem 0x00001000-0x0009efff] [ 0.000000] node 0: [mem 0x00100000-0xbffcdfff] [ 0.000000] node 0: [mem 
0x100000000-0x13edfffff] [ 0.000000] Initmem setup node 0 [mem 0x00001000-0x13edfffff] [ 0.000000] ACPI: PM-Timer IO Port: 0x608 [ 0.000000] ACPI: LAPIC (acpi_id[0x00] lapic_id[0x00] enabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x01] lapic_id[0x01] enabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x02] lapic_id[0x02] enabled) [ 0.000000] ACPI: LAPIC (acpi_id[0x03] lapic_id[0x03] enabled) [ 0.000000] ACPI: LAPIC_NMI (acpi_id[0xff] dfl dfl lint[0x1]) [ 0.000000] ACPI: IOAPIC (id[0x00] address[0xfec00000] gsi_base[0]) [ 0.000000] IOAPIC[0]: apic_id 0, version 17, address 0xfec00000, GSI 0-23 [ 0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl) [ 0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 5 global_irq 5 high level) [ 0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level) [ 0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 10 global_irq 10 high level) [ 0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 11 global_irq 11 high level) [ 0.000000] Using ACPI (MADT) for SMP configuration information [ 0.000000] ACPI: HPET id: 0x8086a201 base: 0xfed00000 [ 0.000000] smpboot: Allowing 4 CPUs, 0 hotplug CPUs [ 0.000000] PM: Registered nosave memory: [mem 0x0009f000-0x0009ffff] [ 0.000000] PM: Registered nosave memory: [mem 0x000a0000-0x000effff] [ 0.000000] PM: Registered nosave memory: [mem 0x000f0000-0x000fffff] [ 0.000000] PM: Registered nosave memory: [mem 0xbffce000-0xbfffffff] [ 0.000000] PM: Registered nosave memory: [mem 0xc0000000-0xfeffbfff] [ 0.000000] PM: Registered nosave memory: [mem 0xfeffc000-0xfeffffff] [ 0.000000] PM: Registered nosave memory: [mem 0xff000000-0xfffbffff] [ 0.000000] PM: Registered nosave memory: [mem 0xfffc0000-0xffffffff] [ 0.000000] e820: [mem 0xc0000000-0xfeffbfff] available for PCI devices [ 0.000000] Booting paravirtualized kernel on KVM [ 0.000000] setup_percpu: NR_CPUS:5120 nr_cpumask_bits:4 nr_cpu_ids:4 nr_node_ids:1 [ 0.000000] percpu: Embedded 38 pages/cpu s115176 r8192 d32280 u524288 [ 0.000000] KVM setup async PF for cpu 0 [ 
0.000000] kvm-stealtime: cpu 0, msr 13e2135c0 [ 0.000000] PV qspinlock hash table entries: 256 (order: 0, 4096 bytes) [ 0.000000] Built 1 zonelists in Node order, mobility grouping on. Total pages: 1027487 [ 0.000000] Policy zone: Normal [ 0.000000] Kernel command line: rd.shell root=nbd:192.168.200.253:centos7:ext4:ro:-p,-b4096 ro crashkernel=128M panic=1 nomodeset ipmtu=9000 ip=dhcp rd.neednet=1 init_on_free=off mitigations=off console=ttyS1,115200 audit=0 [ 0.000000] audit: disabled (until reboot) [ 0.000000] PID hash table entries: 4096 (order: 3, 32768 bytes) [ 0.000000] x86/fpu: xstate_offset[2]: 0240, xstate_sizes[2]: 0100 [ 0.000000] xsave: enabled xstate_bv 0x7, cntxt size 0x340 using standard form [ 0.000000] Memory: 3820268k/5224448k available (8172k kernel code, 1049168k absent, 355012k reserved, 5773k data, 2532k init) [ 0.000000] SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=4, Nodes=1 [ 0.000000] Hierarchical RCU implementation. [ 0.000000] RCU restricting CPUs from NR_CPUS=5120 to nr_cpu_ids=4. [ 0.000000] Offload RCU callbacks from all CPUs [ 0.000000] Offload RCU callbacks from CPUs: 0-3. [ 0.000000] NR_IRQS:327936 nr_irqs:456 0 [ 0.000000] Console: colour *CGA 80x25 [ 0.000000] console [ttyS1] enabled [ 0.000000] allocated 25165824 bytes of page_cgroup [ 0.000000] please try 'cgroup_disable=memory' option if you don't want memory cgroups [ 0.000000] kmemleak: Kernel memory leak detector disabled [ 0.000000] tsc: Detected 2399.998 MHz processor [ 0.496270] Calibrating delay loop (skipped) preset value.. 4799.99 BogoMIPS (lpj=2399998) [ 0.499349] pid_max: default: 32768 minimum: 301 [ 0.501132] Security Framework initialized [ 0.502755] SELinux: Initializing. 
[ 0.505857] Dentry cache hash table entries: 524288 (order: 10, 4194304 bytes) [ 0.510641] Inode-cache hash table entries: 262144 (order: 9, 2097152 bytes) [ 0.513619] Mount-cache hash table entries: 8192 (order: 4, 65536 bytes) [ 0.516069] Mountpoint-cache hash table entries: 8192 (order: 4, 65536 bytes) [ 0.519029] Initializing cgroup subsys memory [ 0.520776] Initializing cgroup subsys devices [ 0.522365] Initializing cgroup subsys freezer [ 0.524101] Initializing cgroup subsys net_cls [ 0.525839] Initializing cgroup subsys blkio [ 0.527437] Initializing cgroup subsys perf_event [ 0.529203] Initializing cgroup subsys hugetlb [ 0.530865] Initializing cgroup subsys pids [ 0.532411] Initializing cgroup subsys net_prio [ 0.534276] x86/cpu: User Mode Instruction Prevention (UMIP) activated [ 0.537737] Last level iTLB entries: 4KB 0, 2MB 0, 4MB 0 [ 0.539769] Last level dTLB entries: 4KB 0, 2MB 0, 4MB 0 [ 0.541789] tlb_flushall_shift: 6 [ 0.543096] FEATURE SPEC_CTRL Present [ 0.544345] FEATURE IBPB_SUPPORT Present [ 0.545922] Spectre V2 : Enabling Indirect Branch Prediction Barrier [ 0.548140] Spectre V2 : Vulnerable [ 0.549402] Speculative Store Bypass: Vulnerable [ 0.552140] debug: unmapping init [mem 0xffffffff82019000-0xffffffff8201ffff] [ 0.560708] ACPI: Core revision 20130517 [ 0.563865] ACPI: All ACPI Tables successfully acquired [ 0.565703] ftrace: allocating 30294 entries in 119 pages [ 0.620739] Enabling x2apic [ 0.621902] Enabled x2apic [ 0.623514] Switched APIC routing to physical x2apic. [ 0.628222] ..TIMER: vector=0x30 apic1=0 pin1=2 apic2=-1 pin2=-1 [ 0.631433] smpboot: CPU0: Intel(R) Xeon(R) CPU E5-2695 v2 @ 2.40GHz (fam: 06, model: 3e, stepping: 04) [ 0.636947] Performance Events: IvyBridge events, full-width counters, Intel PMU driver. [ 0.640768] ... version: 2 [ 0.642773] ... bit width: 48 [ 0.644776] ... generic registers: 4 [ 0.646941] ... value mask: 0000ffffffffffff [ 0.649533] ... max period: 00007fffffffffff [ 0.651846] ... 
fixed-purpose events: 3 [ 0.654049] ... event mask: 000000070000000f [ 0.656720] KVM setup paravirtual spinlock [ 0.661795] smpboot: Booting Node 0, Processors #1[ 0.665364] kvm-clock: cpu 1, msr 1:3e592041, secondary cpu clock [ 0.669227] KVM setup async PF for cpu 1 [ 0.670637] kvm-stealtime: cpu 1, msr 13e2935c0 #2[ 0.675011] kvm-clock: cpu 2, msr 1:3e592081, secondary cpu clock [ 0.679394] KVM setup async PF for cpu 2 [ 0.680322] kvm-clock: cpu 3, msr 1:3e5920c1, secondary cpu clock #3 OK [ 0.685838] kvm-stealtime: cpu 2, msr 13e3135c0 [ 0.688670] KVM setup async PF for cpu 3 [ 0.690137] kvm-stealtime: cpu 3, msr 13e3935c0 [ 0.691901] Brought up 4 CPUs [ 0.691902] smpboot: Max logical packages: 1 [ 0.691905] smpboot: Total of 4 processors activated (19199.98 BogoMIPS) [ 0.706681] devtmpfs: initialized [ 0.708198] x86/mm: Memory block size: 128MB [ 0.714323] EVM: security.selinux [ 0.716110] EVM: security.ima [ 0.717662] EVM: security.capability [ 0.721412] atomic64 test passed for x86-64 platform with CX8 and with SSE [ 0.724444] NET: Registered protocol family 16 [ 0.726344] cpuidle: using governor haltpoll [ 0.728538] ACPI: bus type PCI registered [ 0.730091] acpiphp: ACPI Hot Plug PCI Controller Driver version: 0.5 [ 0.732730] PCI: Using configuration type 1 for base access [ 0.735104] core: PMU erratum BJ122, BV98, HSD29 worked around, HT is on [ 0.749646] ACPI: Added _OSI(Module Device) [ 0.754011] ACPI: Added _OSI(Processor Device) [ 0.755747] ACPI: Added _OSI(3.0 _SCP Extensions) [ 0.757466] ACPI: Added _OSI(Processor Aggregator Device) [ 0.759546] ACPI: Added _OSI(Linux-Dell-Video) [ 0.764628] ACPI: Interpreter enabled [ 0.765920] ACPI: (supports S0 S3 S4 S5) [ 0.767560] ACPI: Using IOAPIC for interrupt routing [ 0.774240] PCI: Using host bridge windows from ACPI; if necessary, use "pci=nocrs" and report a bug [ 0.778072] ACPI: Enabled 2 GPEs in block 00 to 0F [ 0.785894] ACPI: PCI Root Bridge [PCI0] (domain 0000 [bus 00-ff]) [ 0.792285] acpi 
PNP0A03:00: _OSC: OS supports [ASPM ClockPM Segments MSI] [ 0.794837] acpi PNP0A03:00: _OSC failed (AE_NOT_FOUND); disabling ASPM [ 0.797067] acpi PNP0A03:00: fail to add MMCONFIG information, can't access extended PCI configuration space under this bridge. [ 0.801779] acpiphp: Slot [2] registered [ 0.804009] acpiphp: Slot [3] registered [ 0.805642] acpiphp: Slot [4] registered [ 0.809691] acpiphp: Slot [5] registered [ 0.811356] acpiphp: Slot [6] registered [ 0.812932] acpiphp: Slot [7] registered [ 0.814432] acpiphp: Slot [8] registered [ 0.816995] acpiphp: Slot [9] registered [ 0.818477] acpiphp: Slot [10] registered [ 0.819848] acpiphp: Slot [11] registered [ 0.823411] acpiphp: Slot [12] registered [ 0.825477] acpiphp: Slot [13] registered [ 0.827715] acpiphp: Slot [14] registered [ 0.829165] acpiphp: Slot [15] registered [ 0.830420] acpiphp: Slot [16] registered [ 0.832031] acpiphp: Slot [17] registered [ 0.833523] acpiphp: Slot [18] registered [ 0.834901] acpiphp: Slot [19] registered [ 0.837687] acpiphp: Slot [20] registered [ 0.840016] acpiphp: Slot [21] registered [ 0.842590] acpiphp: Slot [22] registered [ 0.843887] acpiphp: Slot [23] registered [ 0.846623] acpiphp: Slot [24] registered [ 0.848777] acpiphp: Slot [25] registered [ 0.850451] acpiphp: Slot [26] registered [ 0.851656] acpiphp: Slot [27] registered [ 0.853133] acpiphp: Slot [28] registered [ 0.854429] acpiphp: Slot [29] registered [ 0.855674] acpiphp: Slot [30] registered [ 0.856895] acpiphp: Slot [31] registered [ 0.858126] PCI host bridge to bus 0000:00 [ 0.859560] pci_bus 0000:00: root bus resource [io 0x0000-0x0cf7 window] [ 0.861673] pci_bus 0000:00: root bus resource [io 0x0d00-0xffff window] [ 0.863888] pci_bus 0000:00: root bus resource [mem 0x000a0000-0x000bffff window] [ 0.866606] pci_bus 0000:00: root bus resource [mem 0xc0000000-0xfebfffff window] [ 0.869185] pci_bus 0000:00: root bus resource [mem 0x140000000-0x1bfffffff window] [ 0.873144] pci_bus 0000:00: root bus resource [bus 
00-ff] [ 0.889987] pci 0000:00:01.1: legacy IDE quirk: reg 0x10: [io 0x01f0-0x01f7] [ 0.893573] pci 0000:00:01.1: legacy IDE quirk: reg 0x14: [io 0x03f6] [ 0.896143] pci 0000:00:01.1: legacy IDE quirk: reg 0x18: [io 0x0170-0x0177] [ 0.899547] pci 0000:00:01.1: legacy IDE quirk: reg 0x1c: [io 0x0376] [ 0.903410] pci 0000:00:01.3: quirk: [io 0x0600-0x063f] claimed by PIIX4 ACPI [ 0.906074] pci 0000:00:01.3: quirk: [io 0x0700-0x070f] claimed by PIIX4 SMB [ 1.116209] ACPI: PCI Interrupt Link [LNKA] (IRQs 5 *10 11) [ 1.118920] ACPI: PCI Interrupt Link [LNKB] (IRQs 5 *10 11) [ 1.124019] ACPI: PCI Interrupt Link [LNKC] (IRQs 5 10 *11) [ 1.126172] ACPI: PCI Interrupt Link [LNKD] (IRQs 5 10 *11) [ 1.128790] ACPI: PCI Interrupt Link [LNKS] (IRQs *9) [ 1.132104] vgaarb: loaded [ 1.133840] SCSI subsystem initialized [ 1.135776] ACPI: bus type USB registered [ 1.137569] usbcore: registered new interface driver usbfs [ 1.141192] usbcore: registered new interface driver hub [ 1.143676] usbcore: registered new device driver usb [ 1.146219] PCI: Using ACPI for IRQ routing [ 1.149124] NetLabel: Initializing [ 1.151050] NetLabel: domain hash size = 128 [ 1.152648] NetLabel: protocols = UNLABELED CIPSOv4 [ 1.158754] NetLabel: unlabeled traffic allowed by default [ 1.161337] hpet0: at MMIO 0xfed00000, IRQs 2, 8, 0 [ 1.162732] hpet0: 3 comparators, 64-bit 100.000000 MHz counter [ 1.170885] amd_nb: Cannot enumerate AMD northbridges [ 1.174098] Switched to clocksource kvm-clock [ 1.206987] pnp: PnP ACPI init [ 1.208302] ACPI: bus type PNP registered [ 1.210459] pnp: PnP ACPI: found 6 devices [ 1.211505] ACPI: bus type PNP unregistered [ 1.225556] NET: Registered protocol family 2 [ 1.227753] TCP established hash table entries: 32768 (order: 6, 262144 bytes) [ 1.231058] TCP bind hash table entries: 32768 (order: 8, 1048576 bytes) [ 1.234475] TCP: Hash tables configured (established 32768 bind 32768) [ 1.236087] TCP: reno registered [ 1.237219] UDP hash table entries: 2048 (order: 5, 196608 
bytes) [ 1.239469] UDP-Lite hash table entries: 2048 (order: 5, 196608 bytes) [ 1.242673] NET: Registered protocol family 1 [ 1.244946] RPC: Registered named UNIX socket transport module. [ 1.247127] RPC: Registered udp transport module. [ 1.248647] RPC: Registered tcp transport module. [ 1.250068] RPC: Registered tcp NFSv4.1 backchannel transport module. [ 1.252978] pci 0000:00:00.0: Limiting direct PCI/PCI transfers [ 1.255130] pci 0000:00:01.0: PIIX3: Enabling Passive Release [ 1.256954] pci 0000:00:01.0: Activating ISA DMA hang workarounds [ 1.259625] Unpacking initramfs... [ 2.700146] debug: unmapping init [mem 0xffff8800bc2e2000-0xffff8800bffbffff] [ 2.703911] PCI-DMA: Using software bounce buffering for IO (SWIOTLB) [ 2.705242] software IO TLB [mem 0xb82e2000-0xbc2e2000] (64MB) mapped at [ffff8800b82e2000-ffff8800bc2e1fff] [ 2.708255] RAPL PMU: API unit is 2^-32 Joules, 3 fixed counters, 10737418240 ms ovfl timer [ 2.710537] RAPL PMU: hw unit of domain pp0-core 2^-0 Joules [ 2.712111] RAPL PMU: hw unit of domain package 2^-0 Joules [ 2.714038] RAPL PMU: hw unit of domain dram 2^-0 Joules [ 2.719271] cryptomgr_test (52) used greatest stack depth: 14128 bytes left [ 2.721723] futex hash table entries: 1024 (order: 4, 65536 bytes) [ 2.721817] Initialise system trusted keyring [ 2.754033] HugeTLB registered 1 GB page size, pre-allocated 0 pages [ 2.756760] HugeTLB registered 2 MB page size, pre-allocated 0 pages [ 2.765444] zpool: loaded [ 2.767027] zbud: loaded [ 2.768551] VFS: Disk quotas dquot_6.6.0 [ 2.770498] Dquot-cache hash table entries: 512 (order 0, 4096 bytes) [ 2.775773] NFS: Registering the id_resolver key type [ 2.777902] Key type id_resolver registered [ 2.779461] Key type id_legacy registered [ 2.781072] nfs4filelayout_init: NFSv4 File Layout Driver Registering... 
[ 2.784869] Key type big_key registered [ 2.790509] cryptomgr_test (58) used greatest stack depth: 14048 bytes left [ 2.794031] cryptomgr_test (63) used greatest stack depth: 13984 bytes left [ 2.797187] NET: Registered protocol family 38 [ 2.800913] Key type asymmetric registered [ 2.803352] Asymmetric key parser 'x509' registered [ 2.808159] Block layer SCSI generic (bsg) driver version 0.4 loaded (major 250) [ 2.812871] io scheduler noop registered [ 2.814363] io scheduler deadline registered (default) [ 2.816484] io scheduler cfq registered [ 2.819110] io scheduler mq-deadline registered [ 2.820561] io scheduler kyber registered [ 2.824846] pci_hotplug: PCI Hot Plug PCI Core version: 0.5 [ 2.826916] pciehp: PCI Express Hot Plug Controller Driver version: 0.4 [ 2.830012] input: Power Button as /devices/LNXSYSTM:00/LNXPWRBN:00/input/input0 [ 2.832735] ACPI: Power Button [PWRF] [ 2.835295] GHES: HEST is not enabled! [ 2.895604] ACPI: PCI Interrupt Link [LNKB] enabled at IRQ 10 [ 2.959819] ACPI: PCI Interrupt Link [LNKA] enabled at IRQ 11 [ 3.081751] ACPI: PCI Interrupt Link [LNKC] enabled at IRQ 11 [ 3.154671] ACPI: PCI Interrupt Link [LNKD] enabled at IRQ 10 [ 3.281668] Serial: 8250/16550 driver, 4 ports, IRQ sharing enabled [ 3.310080] 00:03: ttyS1 at I/O 0x2f8 (irq = 3) is a 16550A [ 3.341291] 00:04: ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A [ 3.347139] Non-volatile memory driver v1.3 [ 3.349044] Linux agpgart interface v0.103 [ 3.351042] crash memory driver: version 1.1 [ 3.353393] nbd: registered device at major 43 [ 3.370198] virtio_blk virtio1: [vda] 59912 512-byte logical blocks (30.6 MB/29.2 MiB) [ 3.382676] virtio_blk virtio2: [vdb] 2097152 512-byte logical blocks (1.07 GB/1.00 GiB) [ 3.398325] virtio_blk virtio3: [vdc] 5120000 512-byte logical blocks (2.62 GB/2.44 GiB) [ 3.411401] virtio_blk virtio4: [vdd] 5120000 512-byte logical blocks (2.62 GB/2.44 GiB) [ 3.429854] virtio_blk virtio5: [vde] 8388608 512-byte logical blocks (4.29 GB/4.00 GiB) [ 
3.444583] virtio_blk virtio6: [vdf] 8388608 512-byte logical blocks (4.29 GB/4.00 GiB) [ 3.451456] rdac: device handler registered [ 3.453268] hp_sw: device handler registered [ 3.454839] emc: device handler registered [ 3.456627] libphy: Fixed MDIO Bus: probed [ 3.462338] ehci_hcd: USB 2.0 'Enhanced' Host Controller (EHCI) Driver [ 3.465216] ehci-pci: EHCI PCI platform driver [ 3.467679] ohci_hcd: USB 1.1 'Open' Host Controller (OHCI) Driver [ 3.470443] ohci-pci: OHCI PCI platform driver [ 3.472218] uhci_hcd: USB Universal Host Controller Interface driver [ 3.475010] i8042: PNP: PS/2 Controller [PNP0303:KBD,PNP0f13:MOU] at 0x60,0x64 irq 1,12 [ 3.478807] serio: i8042 KBD port at 0x60,0x64 irq 1 [ 3.480114] serio: i8042 AUX port at 0x60,0x64 irq 12 [ 3.482016] mousedev: PS/2 mouse device common for all mice [ 3.484506] input: AT Translated Set 2 keyboard as /devices/platform/i8042/serio0/input/input1 [ 3.490902] rtc_cmos 00:05: RTC can wake from S4 [ 3.493652] rtc_cmos 00:05: rtc core: registered rtc_cmos as rtc0 [ 3.496270] rtc_cmos 00:05: alarms up to one day, y3k, 242 bytes nvram, hpet irqs [ 3.500647] hidraw: raw HID events driver (C) Jiri Kosina [ 3.503065] usbcore: registered new interface driver usbhid [ 3.504447] usbhid: USB HID core driver [ 3.505949] drop_monitor: Initializing network drop monitor service [ 3.507778] Netfilter messages via NETLINK v0.30. 
[ 3.509439] TCP: cubic registered [ 3.510594] Initializing XFRM netlink socket [ 3.512928] NET: Registered protocol family 10 [ 3.515989] NET: Registered protocol family 17 [ 3.517838] Key type dns_resolver registered [ 3.519872] mce: Using 10 MCE banks [ 3.521952] Loading compiled-in X.509 certificates [ 3.524656] Loaded X.509 cert 'Magrathea: Glacier signing key: e34d0e1b7fcf5b414cce75d36d8482945c781ed6' [ 3.527100] registered taskstats version 1 [ 3.531403] modprobe (71) used greatest stack depth: 13456 bytes left [ 3.539071] Key type trusted registered [ 3.556626] Key type encrypted registered [ 3.559289] IMA: No TPM chip found, activating TPM-bypass! (rc=-19) [ 3.564177] BERT: Boot Error Record Table support is disabled. Enable it by using bert_enable as kernel parameter. [ 3.572074] rtc_cmos 00:05: setting system clock to 2024-05-21 01:09:38 UTC (1716253778) [ 3.578399] debug: unmapping init [mem 0xffffffff81da0000-0xffffffff82018fff] [ 3.583108] Write protecting the kernel read-only data: 12288k [ 3.585714] debug: unmapping init [mem 0xffff8800017fe000-0xffff8800017fffff] [ 3.587721] debug: unmapping init [mem 0xffff880001b9b000-0xffff880001bfffff] [ 3.598385] random: systemd: uninitialized urandom read (16 bytes read) [ 3.603262] random: systemd: uninitialized urandom read (16 bytes read) [ 3.606019] random: systemd: uninitialized urandom read (16 bytes read) [ 3.610540] systemd[1]: systemd 219 running in system mode. (+PAM +AUDIT +SELINUX +IMA -APPARMOR +SMACK +SYSVINIT +UTMP +LIBCRYPTSETUP +GCRYPT +GNUTLS +ACL +XZ +LZ4 -SECCOMP +BLKID +ELFUTILS +KMOD +IDN) [ 3.620198] systemd[1]: Detected virtualization kvm. [ 3.622067] systemd[1]: Detected architecture x86-64. [ 3.624261] systemd[1]: Running in initial RAM disk. Welcome to CentOS Linux 7 (Core) dracut-033-572.el7 (Initramfs)! [ 3.630032] systemd[1]: No hostname configured. [ 3.632153] systemd[1]: Set hostname to . 
[ 3.633954] random: systemd: uninitialized urandom read (16 bytes read) [ 3.636670] systemd[1]: Initializing machine ID from random generator. [ 3.695879] dracut-rootfs-g (85) used greatest stack depth: 13264 bytes left [ 3.699650] random: systemd: uninitialized urandom read (16 bytes read) [ 3.701833] random: systemd: uninitialized urandom read (16 bytes read) [ 3.704312] random: systemd: uninitialized urandom read (16 bytes read) [ 3.706163] random: systemd: uninitialized urandom read (16 bytes read) [ 3.709827] random: systemd: uninitialized urandom read (16 bytes read) [ 3.712498] random: systemd: uninitialized urandom read (16 bytes read) [ 3.719277] tsc: Refined TSC clocksource calibration: 2399.951 MHz [ 3.723842] systemd[1]: Created slice Root Slice. [ OK ] Created slice Root Slice. [ 3.728749] systemd[1]: Reached target Swap. [ OK ] Reached target Swap. [ 3.733037] systemd[1]: Listening on Journal Socket. [ OK ] Listening on Journal Socket. [ 3.738245] systemd[1]: Created slice System Slice. [ OK ] Created slice System Slice. [ 3.742723] systemd[1]: Starting Journal Service... Starting Journal Service... [ 3.749970] systemd[1]: Starting Setup Virtual Console... Starting Setup Virtual Console... [ 3.756080] systemd[1]: Starting Load Kernel Modules... Starting Load Kernel Modules... [ 3.763344] systemd[1]: Reached target Slices. [ OK ] Reached target Slices. [ 3.767832] systemd[1]: Reached target Local File Systems. [ OK ] Reached target Local File Systems. [ 3.777746] systemd[1]: Starting dracut cmdline hook... Starting dracut cmdline hook... [ 3.783432] systemd[1]: Starting Create list of required static device nodes for the current kernel... Starting Create list of required st... nodes for the current kernel... [ 3.794877] systemd[1]: Listening on udev Control Socket. [ OK ] Listening on udev Control Socket. [ 3.799606] systemd[1]: Listening on udev Kernel Socket. [ OK ] Listening on udev Kernel Socket. [ 3.804219] systemd[1]: Reached target Sockets. 
[ OK ] Reached target Sockets. [ 3.808597] systemd[1]: Reached target Timers. [ OK ] Reached target Timers. [ 3.813823] systemd[1]: Started Journal Service. [ OK ] Started Journal Service. [ OK ] Started Setup Virtual Console. [ OK ] Started Load Kernel Modules. [ OK ] Started Create list of required sta...ce nodes for the current kernel. Starting Create Static Device Nodes in /dev... Starting Apply Kernel Variables... [ OK ] Started Create Static Device Nodes in /dev. [ OK ] Started Apply Kernel Variables. [ 3.892099] random: fast init done [ OK ] Started dracut cmdline hook. Starting dracut pre-udev hook... [ OK ] Started dracut pre-udev hook. [ 4.329617] input: ImExPS/2 Generic Explorer Mouse as /devices/platform/i8042/serio1/input/input2 Starting udev Kernel Device Manager... [ OK ] Started udev Kernel Device Manager. Starting dracut pre-trigger hook... [ OK ] Started dracut pre-trigger hook. Starting udev Coldplug all Devices... Mounting Configuration File System... [ OK ] Mounted Configuration File System. [ OK ] Started udev Coldplug all Devices. [ OK ] Reached target System Initialization. Starting dracut initqueue hook... Starting Show Plymouth Boot Screen... [ 4.539696] scsi host0: ata_piix [ 4.557345] scsi host1: ata_piix [ 4.572052] ata1: PATA max MWDMA2 cmd 0x1f0 ctl 0x3f6 bmdma 0xc320 irq 14 [ 4.574457] ata2: PATA max MWDMA2 cmd 0x170 ctl 0x376 bmdma 0xc328 irq 15 [ OK ] Started Show Plymouth Boot Screen. [ OK ] Started Forward Password Requests to Plymouth Directory Watch. [ OK ] Reached target Paths. [ OK ] Reached target Basic System. [ 4.667639] ip (319) used greatest stack depth: 13080 bytes left [ 4.740592] ip (346) used greatest stack depth: 12336 bytes left [ 6.876440] dracut-initqueue[278]: RTNETLINK answers: File exists [ 7.268668] dracut-initqueue[278]: bs=4096, sz=32212254720 bytes [ OK ] Started dracut initqueue hook. [ OK ] Reached target Remote File Systems (Pre). [ OK ] Reached target Remote File Systems. Mounting /sysroot... 
[ OK ] Reached target Initrd Root File System. Starting Reload Configuration from the Real Root... [ OK ] Started Reload C[ 8.328671] EXT4-fs (nbd0): mounted filesystem with ordered data mode. Opts: (null) onfiguration from the Real Root. [ OK ] Reached target Initrd File Systems. [ OK ] Reached target Initrd Default Target. [ OK ] Mounted /sysroot. Starting dracut pre-pivot and cleanup hook... [ OK ] Started dracut pre-pivot and cleanup hook. Starting Cleaning Up and Shutting Down Daemons... [ OK ] Stopped dracut pre-pivot and cleanup hook. [ OK ] Stopped target Initrd Default Target. [ OK ] Stopped target Basic System. [ OK ] Stopped target Sockets. [ OK ] Stopped target System Initialization. [ OK ] Stopped target Local File Systems. [ OK ] Stopped Apply Kernel Variables. [ OK ] Stopped target Swap. [ OK ] Stopped target Paths. [ OK ] Stopped target Slices. [ OK ] Stopped target Remote File Systems. [ OK ] Stopped target Timers. [ OK ] Stopped target Remote File Systems (Pre). [ OK ] Stopped dracut initqueue hook. [ OK ] Stopped udev Coldplug all Devices. [ OK ] Stopped dracut pre-trigger hook. Stopping udev Kernel Device Manager... Starting Plymouth switch root service... [ OK ] Stopped Load Kernel Modules. [ OK ] Started Cleaning Up and Shutting Down Daemons. [ OK ] Stopped udev Kernel Device Manager. [ OK ] Started Plymouth switch root service. [ OK ] Stopped Create Static Device Nodes in /dev. [ OK ] Stopped Create list of required sta...ce nodes for the current kernel. [ OK ] Stopped dracut pre-udev hook. [ OK ] Stopped dracut cmdline hook. [ OK ] Closed udev Control Socket. [ OK ] Closed udev Kernel Socket. Starting Cleanup udevd DB... [ OK ] Started Cleanup udevd DB. [ OK ] Reached target Switch Root. Starting Switch Root... [ 9.043407] systemd-journald[100]: Received SIGTERM from PID 1 (systemd). [ 9.784986] SELinux: Disabled at runtime. 
[ 9.981315] ip_tables: (C) 2000-2006 Netfilter Core Team [ 9.988898] systemd[1]: Inserted module 'ip_tables' Welcome to CentOS Linux 7 (Core)! [ OK ] Stopped Switch Root. [ OK ] Stopped Journal Service. Starting Journal Service... [ OK ] Created slice User and Session Slice. [ OK ] Listening on /dev/initctl Compatibility Named Pipe. [ OK ] Created slice system-selinux\x2dpol...grate\x2dlocal\x2dchanges.slice. Mounting POSIX Message Queue File System... [ OK ] Reached target Slices. [ OK ] Listening on udev Kernel Socket. [ OK ] Started Forward Password Requests to Wall Directory Watch. [ OK ] Listening on Delayed Shutdown Socket. Starting Read and set NIS domainname from /etc/sysconfig/network... [ OK ] Listening on udev Control Socket. Starting udev Coldplug all Devices... [ OK ] Set up automount Arbitrary Executab...ats File System Automount Point. [ OK ] Reached target rpc_pipefs.target. [ OK ] Created slice system-getty.slice. Starting Create list of required st... nodes for the current kernel... [ OK ] Created slice system-serial\x2dgetty.slice. Starting Set Up Additional Binary Formats... Starting Load Kernel Modules... Starting Remount Root and Kernel File Systems... [ OK ] Stopped target Switch Root. [ OK ] Stopped target Initrd File Systems. [ OK ] Stopped target Initrd Root File System. Mounting Huge Pages File System... Mounting Debug File System... [ OK ] Reached target Local Encrypted Volumes. [ OK ] Started Create list of required sta...ce nodes for the current kernel. Mounting Arbitrary Executable File Formats File System... Starting Create Static Device Nodes in /dev... [ OK ] Mounted POSIX Message Queue File System. [ OK ] Started Load Kernel Modules. Starting Apply Kernel Variables... [ OK ] Mounted Huge Pages File System. [ OK ] Mounted Debug File System. [ OK ] Started Read and set NIS domainname from /etc/sysconfig/network. [ OK ] Started Journal Service. [ OK ] Started udev Coldplug all Devices. 
[ OK ] Mounted Arbitrary Executable File Formats File System. [ OK ] Started Apply Kernel Variables. [FAILED] Failed to start Remount Root and Kernel File Systems. See 'systemctl status systemd-remount-fs.service' for details. Starting Flush Journal to Persistent Storage... Starting Configure read-only root support... [ OK ] Started Set Up Additional Binary Formats. [ OK ] Started Create Static Device Nodes in /dev. Starting udev Kernel Device Manager... [ OK ] Reached target Local File Systems (Pre). Mounting /mnt... [ 11.073107] systemd-journald[571]: Received request to flush runtime journal from PID 1 [ OK ] Started Flush Journal to Persistent Storage. [ OK ] Mounted /mnt. [ OK ] Started udev Kernel Device Manager. [ 11.407171] input: PC Speaker as /devices/platform/pcspkr/input/input3 [ OK ] Found device /dev/ttyS1. [ OK ] Found device /dev/ttyS0. [ 11.500037] piix4_smbus 0000:00:01.3: SMBus Host Controller at 0x700, revision 0 [ OK ] Found device /dev/vda. [ 11.614991] cryptd: max_cpu_qlen set to 1000 Mounting /home/green/git/lustre-release... [ OK ] Found device /dev/disk/by-label/SWAP. Activating swap /dev/disk/by-label/SWAP... [ 11.735842] Adding 1048572k swap on /dev/vdb. Priority:-2 extents:1 across:1048572k FS [ OK ] Activated swap /dev/disk/by-label/SWAP. [ 11.760534] AVX version of gcm_enc/dec engaged. [[ 11.762948] AES CTR mode by8 optimization enabled  OK ] Reached target Swap. [ 11.769163] squashfs: version 4.0 (2009/01/31) Phillip Lougher [ 11.790909] alg: No test for __gcm-aes-aesni (__driver-gcm-aes-aesni) [ OK ] Mounted /home/green/git/lustre-release. [ 11.804345] alg: No test for __generic-gcm-aes-aesni (__driver-generic-gcm-aes-aesni) [ 12.023172] EDAC MC: Ver: 3.0.0 [ 12.070293] EDAC sbridge: Ver: 1.1.2 [ 15.827213] mount.nfs (773) used greatest stack depth: 10704 bytes left [ OK ] Started Configure read-only root support. Starting Load/Save Random Seed... [ OK ] Reached target Local File Systems. Starting Rebuild Journal Catalog... 
Starting Preprocess NFS configuration... Starting Mark the need to relabel after reboot... Starting Tell Plymouth To Write Out Runtime Data... Starting Create Volatile Files and Directories... [ OK ] Started Load/Save Random Seed. [ OK ] Started Preprocess NFS configuration. [ OK ] Started Mark the need to relabel after reboot. [FAILED] Failed to start Create Volatile Files and Directories. See 'systemctl status systemd-tmpfiles-setup.service' for details. Starting Update UTMP about System Boot/Shutdown... [FAILED] Failed to start Rebuild Journal Catalog. See 'systemctl status systemd-journal-catalog-update.service' for details. [ OK ] Started Tell Plymouth To Write Out Runtime Data. Starting Update is Completed... [ OK ] Started Update is Completed. [ OK ] Started Update UTMP about System Boot/Shutdown. [ OK ] Reached target System Initialization. [ OK ] Listening on D-Bus System Message Bus Socket. [ OK ] Started Daily Cleanup of Temporary Directories. [ OK ] Reached target Timers. [ OK ] Listening on RPCbind Server Activation Socket. [ OK ] Reached target Sockets. [ OK ] Started Flexible branding. [ OK ] Reached target Paths. [ OK ] Reached target Basic System. Starting Login Service... Starting Dump dmesg to /var/log/dmesg... Starting GSSAPI Proxy Daemon... [ OK ] Started D-Bus System Message Bus. Starting Network Manager... [ OK ] Started Dump dmesg to /var/log/dmesg. [ OK ] Started GSSAPI Proxy Daemon. [ OK ] Reached target NFS client services. [ OK ] Reached target Remote File Systems (Pre). [ OK ] Reached target Remote File Systems. Starting Permit User Sessions... [ OK ] Started Login Service. [ OK ] Started Permit User Sessions. [ OK ] Started Network Manager. [ OK ] Reached target Network. Starting OpenSSH server daemon... Starting /etc/rc.d/rc.local Compatibility... Starting Network Manager Wait Online... Starting Hostname Service... [ OK ] Started /etc/rc.d/rc.local Compatibility. Starting Terminate Plymouth Boot Screen... 
Starting Wait for Plymouth Boot Screen to Quit... [ OK ] Started OpenSSH server daemon. [ OK ] Started Hostname Service. CentOS Linux 7 (Core) Kernel 3.10.0-7.9-debug on an x86_64 oleg449-server login: [ 38.781304] libcfs: loading out-of-tree module taints kernel. [ 38.783939] libcfs: module verification failed: signature and/or required key missing - tainting kernel [ 38.860822] Lustre: DEBUG MARKER: oleg449-server.virtnet: executing set_hostid [ 45.972076] Lustre: DEBUG MARKER: oleg449-server.virtnet: executing load_modules_local [ 46.261445] alg: No test for adler32 (adler32-zlib) [ 47.013212] libcfs: HW NUMA nodes: 1, HW CPU cores: 4, npartitions: 1 [ 47.231966] Lustre: Lustre: Build Version: 2.15.63_2_g37bb2f4 [ 47.568066] LNet: Added LNI 192.168.204.149@tcp [8/256/0/180] [ 47.574445] LNet: Accept secure, port 988 [ 49.146145] Key type lgssc registered [ 49.663714] Lustre: Echo OBD driver; http://www.lustre.org/ [ 50.338316] icp: module license 'CDDL' taints kernel. [ 50.339932] Disabling lock debugging due to kernel taint [ 53.427544] ZFS: Loaded module v0.8.6-1, ZFS pool version 5000, ZFS filesystem version 5 [ 56.852670] vdc: vdc1 vdc9 [ 60.969314] vde: vde1 vde9 [ 64.268467] vdf: vdf1 vdf9 [ 68.873556] Lustre: DEBUG MARKER: oleg449-server.virtnet: executing load_modules_local [ 71.398975] Lustre: lustre-MDT0000: mounting server target with '-t lustre' deprecated, use '-t lustre_tgt' [ 72.536229] Lustre: Setting parameter lustre-MDT0000.mdt.identity_upcall in log lustre-MDT0000 [ 72.638418] Lustre: ctl-lustre-MDT0000: No data found on store. Initialize space. 
[ 72.684193] Lustre: lustre-MDT0000: new disk, initializing [ 72.875452] Lustre: lustre-MDT0000: Imperative Recovery not enabled, recovery window 60-180 [ 72.912753] Lustre: ctl-lustre-MDT0000: super-sequence allocation rc = 0 [0x0000000200000400-0x0000000240000400]:0:mdt [ 73.032451] mount.lustre (6905) used greatest stack depth: 10144 bytes left [ 73.979484] Lustre: DEBUG MARKER: oleg449-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all [ 76.905167] random: crng init done [ 78.101502] Lustre: lustre-OST0000: new disk, initializing [ 78.104644] Lustre: srv-lustre-OST0000: No data found on store. Initialize space. [ 78.107234] Lustre: Skipped 1 previous similar message [ 78.135347] Lustre: lustre-OST0000: Imperative Recovery not enabled, recovery window 60-180 [ 79.346554] Lustre: ctl-lustre-MDT0000: super-sequence allocation rc = 0 [0x0000000240000400-0x0000000280000400]:0:ost [ 79.351185] Lustre: cli-lustre-OST0000-super: Allocated super-sequence [0x0000000240000400-0x0000000280000400]:0:ost] [ 79.393484] Lustre: lustre-OST0000-osc-MDT0000: update sequence from 0x100000000 to 0x240000400 [ 79.887167] Lustre: DEBUG MARKER: oleg449-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all [ 84.121752] Lustre: lustre-OST0001: new disk, initializing [ 84.124059] Lustre: srv-lustre-OST0001: No data found on store. Initialize space. 
[ 84.159876] Lustre: lustre-OST0001: Imperative Recovery not enabled, recovery window 60-180 [ 85.863887] Lustre: DEBUG MARKER: oleg449-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all [ 87.580710] Lustre: ctl-lustre-MDT0000: super-sequence allocation rc = 0 [0x0000000280000400-0x00000002c0000400]:1:ost [ 87.584054] Lustre: cli-lustre-OST0001-super: Allocated super-sequence [0x0000000280000400-0x00000002c0000400]:1:ost] [ 87.621138] Lustre: lustre-OST0001-osc-MDT0000: update sequence from 0x100010000 to 0x280000400 [ 90.795767] Lustre: DEBUG MARKER: Using TIMEOUT=20 [ 94.859343] Lustre: Modifying parameter general.lod.*.mdt_hash in log params [ 100.502062] Lustre: DEBUG MARKER: oleg449-server.virtnet: executing check_logdir /tmp/testlogs/ [ 101.313504] Lustre: DEBUG MARKER: oleg449-server.virtnet: executing yml_node [ 102.302531] Lustre: DEBUG MARKER: Client: 2.15.63.2 [ 102.873036] Lustre: DEBUG MARKER: MDS: 2.15.63.2 [ 103.430073] Lustre: DEBUG MARKER: OSS: 2.15.63.2 [ 103.770659] Lustre: DEBUG MARKER: -----============= acceptance-small: recovery-small ============----- Mon May 20 21:11:18 EDT 2024 [ 107.034839] Lustre: DEBUG MARKER: excepting tests: 136 [ 107.645181] Lustre: DEBUG MARKER: oleg449-client.virtnet: executing check_config_client /mnt/lustre [ 111.659633] Lustre: DEBUG MARKER: Using TIMEOUT=20 [ 112.474995] Lustre: Modifying parameter general.lod.*.mdt_hash in log params [ 113.061568] Lustre: DEBUG MARKER: oleg449-server.virtnet: executing set_default_debug -1 all [ 114.496906] Lustre: DEBUG MARKER: == recovery-small test 1: create, chmod, stat: drop req, drop rep ========================================================== 21:11:29 (1716253889) [ 114.753523] Lustre: *** cfs_fail_loc=123, val=2147483648*** [ 130.765388] Lustre: lustre-MDT0000: Client b6518fdd-009b-4ee1-a989-3ff958cfcc66 (at 192.168.204.49@tcp) reconnecting [ 131.227550] Lustre: *** cfs_fail_loc=119, val=2147483648*** [ 
131.229597] LustreError: 15103:0:(ldlm_lib.c:3271:target_send_reply_msg()) @@@ dropping reply req@ffff88008d99e680 x1799622561371264/t4294967298(0) o36->b6518fdd-009b-4ee1-a989-3ff958cfcc66@192.168.204.49@tcp:752/0 lens 520/448 e 0 to 0 dl 1716253917 ref 1 fl Interpret:/200/0 rc 0/0 job:'mcreate.0' uid:0 gid:0 [ 147.242377] Lustre: lustre-MDT0000: Client b6518fdd-009b-4ee1-a989-3ff958cfcc66 (at 192.168.204.49@tcp) reconnecting [ 147.249039] Lustre: 6972:0:(mdt_recovery.c:148:mdt_req_from_lrd()) @@@ restoring transno req@ffff88008d9a6a00 x1799622561371264/t4294967298(0) o36->b6518fdd-009b-4ee1-a989-3ff958cfcc66@192.168.204.49@tcp:13/0 lens 520/2880 e 0 to 0 dl 1716253933 ref 1 fl Interpret:/202/0 rc 0/0 job:'mcreate.0' uid:0 gid:0 [ 147.734378] Lustre: *** cfs_fail_loc=123, val=2147483648*** [ 163.754700] Lustre: lustre-MDT0000: Client b6518fdd-009b-4ee1-a989-3ff958cfcc66 (at 192.168.204.49@tcp) reconnecting [ 164.401120] Lustre: *** cfs_fail_loc=119, val=2147483648*** [ 164.404496] LustreError: 15103:0:(ldlm_lib.c:3271:target_send_reply_msg()) @@@ dropping reply req@ffff88008d459880 x1799622561372992/t4294967300(0) o36->b6518fdd-009b-4ee1-a989-3ff958cfcc66@192.168.204.49@tcp:30/0 lens 488/456 e 0 to 0 dl 1716253950 ref 1 fl Interpret:/200/0 rc 0/0 job:'tchmod.0' uid:0 gid:0 [ 180.423677] Lustre: lustre-MDT0000: Client b6518fdd-009b-4ee1-a989-3ff958cfcc66 (at 192.168.204.49@tcp) reconnecting [ 180.438379] Lustre: 6973:0:(mdt_recovery.c:148:mdt_req_from_lrd()) @@@ restoring transno req@ffff88008d45b480 x1799622561372992/t4294967300(0) o36->b6518fdd-009b-4ee1-a989-3ff958cfcc66@192.168.204.49@tcp:46/0 lens 488/3152 e 0 to 0 dl 1716253966 ref 1 fl Interpret:/202/0 rc 0/0 job:'tchmod.0' uid:0 gid:0 [ 181.141418] Lustre: *** cfs_fail_loc=123, val=2147483648*** [ 197.177395] Lustre: lustre-MDT0000: Client b6518fdd-009b-4ee1-a989-3ff958cfcc66 (at 192.168.204.49@tcp) reconnecting [ 197.823795] Lustre: *** cfs_fail_loc=122, val=2147483648*** [ 197.826358] LustreError: 
11325:0:(ldlm_lib.c:3271:target_send_reply_msg()) @@@ dropping reply req@ffff88008d5bb800 x1799622561374272/t0(0) o34->b6518fdd-009b-4ee1-a989-3ff958cfcc66@192.168.204.49@tcp:63/0 lens 472/464 e 0 to 0 dl 1716253983 ref 1 fl Interpret:/200/0 rc 0/0 job:'statone.0' uid:0 gid:0 [ 213.845235] Lustre: lustre-MDT0000: Client b6518fdd-009b-4ee1-a989-3ff958cfcc66 (at 192.168.204.49@tcp) reconnecting [ 217.939261] Lustre: DEBUG MARKER: == recovery-small test 4: open: drop req, drop rep ======= 21:13:13 (1716253993) [ 218.352789] Lustre: *** cfs_fail_loc=123, val=2147483648*** [ 234.375515] Lustre: lustre-MDT0000: Client b6518fdd-009b-4ee1-a989-3ff958cfcc66 (at 192.168.204.49@tcp) reconnecting [ 235.060358] Lustre: *** cfs_fail_loc=122, val=2147483648*** [ 235.062996] LustreError: 6976:0:(ldlm_lib.c:3271:target_send_reply_msg()) @@@ dropping reply req@ffff88008d100700 x1799622561376832/t4294967306(0) o35->b6518fdd-009b-4ee1-a989-3ff958cfcc66@192.168.204.49@tcp:100/0 lens 392/456 e 0 to 0 dl 1716254020 ref 1 fl Interpret:/200/0 rc 0/0 job:'cat.0' uid:0 gid:0 [ 251.063918] Lustre: 6976:0:(mdt_recovery.c:148:mdt_req_from_lrd()) @@@ restoring transno req@ffff88008d103800 x1799622561376832/t4294967306(0) o35->b6518fdd-009b-4ee1-a989-3ff958cfcc66@192.168.204.49@tcp:116/0 lens 392/456 e 0 to 0 dl 1716254036 ref 1 fl Interpret:/202/0 rc 0/0 job:'cat.0' uid:0 gid:0 [ 255.629984] Lustre: DEBUG MARKER: == recovery-small test 5: rename: drop req, drop rep ===== 21:13:50 (1716254030) [ 255.991954] Lustre: *** cfs_fail_loc=123, val=2147483648*** [ 272.029147] Lustre: lustre-MDT0000: Client b6518fdd-009b-4ee1-a989-3ff958cfcc66 (at 192.168.204.49@tcp) reconnecting [ 272.036018] Lustre: Skipped 1 previous similar message [ 272.728855] Lustre: *** cfs_fail_loc=119, val=2147483648*** [ 272.732540] LustreError: 6987:0:(ldlm_lib.c:3271:target_send_reply_msg()) @@@ dropping reply req@ffff880131ce3100 x1799622561379904/t4294967310(0) 
o36->b6518fdd-009b-4ee1-a989-3ff958cfcc66@192.168.204.49@tcp:138/0 lens 552/456 e 0 to 0 dl 1716254058 ref 1 fl Interpret:/200/0 rc 0/0 job:'mv.0' uid:0 gid:0 [ 288.730503] Lustre: 6987:0:(mdt_recovery.c:148:mdt_req_from_lrd()) @@@ restoring transno req@ffff88008e080700 x1799622561379904/t4294967310(0) o36->b6518fdd-009b-4ee1-a989-3ff958cfcc66@192.168.204.49@tcp:154/0 lens 552/2888 e 0 to 0 dl 1716254074 ref 1 fl Interpret:/202/0 rc 0/0 job:'mv.0' uid:0 gid:0 [ 293.402112] Lustre: DEBUG MARKER: == recovery-small test 6: link, unlink: drop req, drop rep ========================================================== 21:14:28 (1716254068) [ 293.813792] Lustre: *** cfs_fail_loc=123, val=2147483648*** [ 310.485590] Lustre: *** cfs_fail_loc=119, val=2147483648*** [ 310.488148] LustreError: 15103:0:(ldlm_lib.c:3271:target_send_reply_msg()) @@@ dropping reply req@ffff88009b610000 x1799622561383552/t4294967316(0) o36->b6518fdd-009b-4ee1-a989-3ff958cfcc66@192.168.204.49@tcp:176/0 lens 512/440 e 0 to 0 dl 1716254096 ref 1 fl Interpret:/200/0 rc 0/0 job:'link.0' uid:0 gid:0 [ 326.486906] Lustre: 11325:0:(mdt_recovery.c:148:mdt_req_from_lrd()) @@@ restoring transno req@ffff880135fa4e00 x1799622561383552/t4294967316(0) o36->b6518fdd-009b-4ee1-a989-3ff958cfcc66@192.168.204.49@tcp:192/0 lens 512/440 e 0 to 0 dl 1716254112 ref 1 fl Interpret:/202/0 rc 0/0 job:'link.0' uid:0 gid:0 [ 327.168756] Lustre: *** cfs_fail_loc=123, val=2147483648*** [ 343.207014] Lustre: lustre-MDT0000: Client b6518fdd-009b-4ee1-a989-3ff958cfcc66 (at 192.168.204.49@tcp) reconnecting [ 343.212218] Lustre: Skipped 3 previous similar messages [ 343.905177] Lustre: *** cfs_fail_loc=119, val=2147483648*** [ 343.908476] LustreError: 11325:0:(ldlm_lib.c:3271:target_send_reply_msg()) @@@ dropping reply req@ffff88008cd3c000 x1799622561385856/t4294967318(0) o36->b6518fdd-009b-4ee1-a989-3ff958cfcc66@192.168.204.49@tcp:209/0 lens 504/456 e 0 to 0 dl 1716254129 ref 1 fl Interpret:/200/0 rc 0/0 job:'unlink.0' uid:0 gid:0 [ 
359.907327] Lustre: 6974:0:(mdt_recovery.c:148:mdt_req_from_lrd()) @@@ restoring transno req@ffff88008cd3ea00 x1799622561385856/t4294967318(0) o36->b6518fdd-009b-4ee1-a989-3ff958cfcc66@192.168.204.49@tcp:225/0 lens 504/2888 e 0 to 0 dl 1716254145 ref 1 fl Interpret:/202/0 rc 0/0 job:'unlink.0' uid:0 gid:0 [ 364.507079] Lustre: DEBUG MARKER: == recovery-small test 8: touch: drop rep (bug 1423) ===== 21:15:39 (1716254139) [ 380.858535] Lustre: 6974:0:(mdt_recovery.c:148:mdt_req_from_lrd()) @@@ restoring transno req@ffff88008d87f800 x1799622561387328/t4294967321(0) o36->b6518fdd-009b-4ee1-a989-3ff958cfcc66@192.168.204.49@tcp:246/0 lens 488/3152 e 0 to 0 dl 1716254166 ref 1 fl Interpret:/202/0 rc 0/0 job:'touch.0' uid:0 gid:0 [ 385.537831] Lustre: DEBUG MARKER: == recovery-small test 9: pause bulk on OST (bug 1420) === 21:16:00 (1716254160) [ 386.254251] LustreError: 20871:0:(tgt_handler.c:2714:tgt_brw_write()) cfs_fail_timeout id 214 sleeping for 5000ms [ 391.258232] LustreError: 20871:0:(tgt_handler.c:2714:tgt_brw_write()) cfs_fail_timeout id 214 awake [ 396.143844] Lustre: DEBUG MARKER: == recovery-small test 10a: finish request on server after client eviction (bug 1521) ========================================================== 21:16:11 (1716254171) [ 412.261248] Lustre: 6973:0:(client.c:2343:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1716254171/real 1716254171] req@ffff880131d1d500 x1799622567704320/t0(0) o104->lustre-MDT0000@192.168.204.49@tcp:15/16 lens 328/224 e 0 to 1 dl 1716254187 ref 1 fl Rpc:XQr/0/ffffffff rc 0/-1 job:'' uid:4294967295 gid:4294967295 [ 413.890238] Lustre: 8416:0:(client.c:2343:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1716254172/real 1716254172] req@ffff880135fa4000 x1799622567704448/t0(0) o104->lustre-OST0001@192.168.204.49@tcp:15/16 lens 328/224 e 0 to 1 dl 1716254188 ref 1 fl Rpc:XQr/0/ffffffff rc 0/-1 job:'' uid:4294967295 gid:4294967295 [ 428.275238] Lustre: 
6973:0:(client.c:2343:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1716254187/real 1716254187] req@ffff880131d1d500 x1799622567704320/t0(0) o104->lustre-MDT0000@192.168.204.49@tcp:15/16 lens 328/224 e 0 to 1 dl 1716254203 ref 1 fl Rpc:XQr/2/ffffffff rc 0/-1 job:'' uid:4294967295 gid:4294967295 [ 436.391203] Lustre: mdt00_001: service thread pid 6973 was inactive for 40.130 seconds. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes: [ 436.402112] Pid: 6973, comm: mdt00_001 3.10.0-7.9-debug #1 SMP Sat Mar 26 23:28:42 EDT 2022 [ 436.405717] Call Trace: [ 436.407285] [<0>] ptlrpc_set_wait+0x7cf/0x850 [ptlrpc] [ 436.409632] [<0>] ldlm_run_ast_work+0xe3/0x400 [ptlrpc] [ 436.412299] [<0>] ldlm_handle_conflict_lock+0x70/0x300 [ptlrpc] [ 436.415721] [<0>] ldlm_lock_enqueue+0x5c2/0xbb0 [ptlrpc] [ 436.418453] [<0>] ldlm_cli_enqueue_local+0x1ec/0x880 [ptlrpc] [ 436.421285] [<0>] mdt_object_lock_internal+0x1a9/0x420 [mdt] [ 436.424040] [<0>] mdt_object_lock+0x88/0x1c0 [mdt] [ 436.426664] [<0>] mdt_object_stripes_lock+0x126/0x660 [mdt] [ 436.429778] [<0>] mdt_reint_setattr+0x73b/0x15f0 [mdt] [ 436.432871] [<0>] mdt_reint_rec+0x87/0x240 [mdt] [ 436.435327] [<0>] mdt_reint_internal+0x74c/0xbc0 [mdt] [ 436.438266] [<0>] mdt_reint+0x67/0x150 [mdt] [ 436.440966] [<0>] tgt_request_handle+0x74e/0x1a50 [ptlrpc] [ 436.444223] [<0>] ptlrpc_server_handle_request+0x26c/0xcb0 [ptlrpc] [ 436.447678] [<0>] ptlrpc_main+0xc76/0x1690 [ptlrpc] [ 436.450593] [<0>] kthread+0xe4/0xf0 [ 436.452814] [<0>] ret_from_fork_nospec_begin+0x7/0x21 [ 436.455783] [<0>] 0xfffffffffffffffe [ 437.927204] Lustre: ll_ost00_001: service thread pid 8416 was inactive for 40.037 seconds. The thread might be hung, or it might only be slow and will resume later. 
Dumping the stack trace for debugging purposes: [ 437.936389] Pid: 8416, comm: ll_ost00_001 3.10.0-7.9-debug #1 SMP Sat Mar 26 23:28:42 EDT 2022 [ 437.940959] Call Trace: [ 437.942875] [<0>] ptlrpc_set_wait+0x7cf/0x850 [ptlrpc] [ 437.946215] [<0>] ldlm_run_ast_work+0xe3/0x400 [ptlrpc] [ 437.949293] [<0>] ldlm_handle_conflict_lock+0x70/0x300 [ptlrpc] [ 437.952265] [<0>] ldlm_lock_enqueue+0x5c2/0xbb0 [ptlrpc] [ 437.954904] [<0>] ldlm_cli_enqueue_local+0x377/0x880 [ptlrpc] [ 437.957711] [<0>] ofd_destroy_by_fid+0x1d1/0x520 [ofd] [ 437.960206] [<0>] ofd_destroy_hdl+0x20c/0xae0 [ofd] [ 437.962808] [<0>] tgt_request_handle+0x74e/0x1a50 [ptlrpc] [ 437.965501] [<0>] ptlrpc_server_handle_request+0x26c/0xcb0 [ptlrpc] [ 437.968532] [<0>] ptlrpc_main+0xc76/0x1690 [ptlrpc] [ 437.970896] [<0>] kthread+0xe4/0xf0 [ 437.972674] [<0>] ret_from_fork_nospec_begin+0x7/0x21 [ 437.975127] [<0>] 0xfffffffffffffffe [ 444.282281] Lustre: 6973:0:(client.c:2343:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1716254203/real 1716254203] req@ffff880131d1d500 x1799622567704320/t0(0) o104->lustre-MDT0000@192.168.204.49@tcp:15/16 lens 328/224 e 0 to 1 dl 1716254219 ref 1 fl Rpc:XQr/2/ffffffff rc 0/-1 job:'' uid:4294967295 gid:4294967295 [ 444.295663] Lustre: 6973:0:(client.c:2343:ptlrpc_expire_one_request()) Skipped 1 previous similar message [ 460.300241] Lustre: 6973:0:(client.c:2343:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1716254219/real 1716254219] req@ffff880131d1d500 x1799622567704320/t0(0) o104->lustre-MDT0000@192.168.204.49@tcp:15/16 lens 328/224 e 0 to 1 dl 1716254235 ref 1 fl Rpc:XQr/2/ffffffff rc 0/-1 job:'' uid:4294967295 gid:4294967295 [ 460.314259] Lustre: 6973:0:(client.c:2343:ptlrpc_expire_one_request()) Skipped 1 previous similar message [ 476.319239] Lustre: 6973:0:(client.c:2343:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1716254235/real 1716254235] 
req@ffff880131d1d500 x1799622567704320/t0(0) o104->lustre-MDT0000@192.168.204.49@tcp:15/16 lens 328/224 e 0 to 1 dl 1716254251 ref 1 fl Rpc:XQr/2/ffffffff rc 0/-1 job:'' uid:4294967295 gid:4294967295 [ 476.332890] Lustre: 6973:0:(client.c:2343:ptlrpc_expire_one_request()) Skipped 1 previous similar message [ 492.337234] Lustre: 6973:0:(client.c:2343:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1716254251/real 1716254251] req@ffff880131d1d500 x1799622567704320/t0(0) o104->lustre-MDT0000@192.168.204.49@tcp:15/16 lens 328/224 e 0 to 1 dl 1716254267 ref 1 fl Rpc:XQr/2/ffffffff rc 0/-1 job:'' uid:4294967295 gid:4294967295 [ 492.351465] Lustre: 6973:0:(client.c:2343:ptlrpc_expire_one_request()) Skipped 1 previous similar message [ 508.356336] LustreError: 6973:0:(ldlm_lockd.c:780:ldlm_handle_ast_error()) ### client (nid 192.168.204.49@tcp) failed to reply to blocking AST (req@ffff880131d1d500 x1799622567704320 status 0 rc -110), evict it ns: mdt-lustre-MDT0000_UUID lock: ffff8800a5925680/0xc8a46ac832f0c94e lrc: 4/0,0 mode: PR/PR res: [0x200000007:0x1:0x0].0x0 bits 0x13/0x0 rrc: 3 type: IBT gid 0 flags: 0x60200400000020 nid: 192.168.204.49@tcp remote: 0x443fa76813616a1e expref: 9 pid: 6973 timeout: 591 lvb_type: 0 [ 508.373054] LustreError: 138-a: lustre-MDT0000: A client on nid 192.168.204.49@tcp was evicted due to a lock blocking callback time out: rc -110 [ 508.379680] LustreError: 6963:0:(ldlm_lockd.c:261:expired_lock_main()) ### lock callback timer expired after 16s: evicting client at 192.168.204.49@tcp ns: mdt-lustre-MDT0000_UUID lock: ffff8800a5925680/0xc8a46ac832f0c94e lrc: 3/0,0 mode: PR/PR res: [0x200000007:0x1:0x0].0x0 bits 0x13/0x0 rrc: 4 type: IBT gid 0 flags: 0x60200400000020 nid: 192.168.204.49@tcp remote: 0x443fa76813616a1e expref: 10 pid: 6973 timeout: 0 lvb_type: 0 [ 509.903437] LustreError: 8416:0:(ldlm_lockd.c:780:ldlm_handle_ast_error()) ### client (nid 192.168.204.49@tcp) failed to reply to blocking AST 
(req@ffff880135fa4000 x1799622567704448 status 0 rc -110), evict it ns: filter-lustre-OST0001_UUID lock: ffff8800a6a7b180/0xc8a46ac832f0c80c lrc: 4/0,0 mode: PW/PW res: [0x280000400:0x4:0x0].0x0 rrc: 3 type: EXT [0->18446744073709551615] (req 0->4194303) gid 0 flags: 0x60000400030020 nid: 192.168.204.49@tcp remote: 0x443fa768136169bc expref: 6 pid: 9086 timeout: 593 lvb_type: 0 [ 509.937418] LustreError: 138-a: lustre-OST0001: A client on nid 192.168.204.49@tcp was evicted due to a lock blocking callback time out: rc -110 [ 509.945242] LustreError: 6963:0:(ldlm_lockd.c:261:expired_lock_main()) ### lock callback timer expired after 16s: evicting client at 192.168.204.49@tcp ns: filter-lustre-OST0001_UUID lock: ffff8800a6a7b180/0xc8a46ac832f0c80c lrc: 3/0,0 mode: PW/PW res: [0x280000400:0x4:0x0].0x0 rrc: 3 type: EXT [0->18446744073709551615] (req 0->4194303) gid 0 flags: 0x60000400030020 nid: 192.168.204.49@tcp remote: 0x443fa768136169bc expref: 7 pid: 9086 timeout: 0 lvb_type: 0 [ 512.969564] Lustre: DEBUG MARKER: == recovery-small test 10b: re-send BL AST =============== 21:18:08 (1716254288) [ 529.090315] Lustre: 21848:0:(client.c:2343:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1716254288/real 1716254288] req@ffff88008b8a3480 x1799622567717248/t0(0) o104->lustre-MDT0000@192.168.204.49@tcp:15/16 lens 328/224 e 0 to 1 dl 1716254304 ref 1 fl Rpc:XQr/0/ffffffff rc 0/-1 job:'' uid:4294967295 gid:4294967295 [ 529.101395] Lustre: 21848:0:(client.c:2343:ptlrpc_expire_one_request()) Skipped 3 previous similar messages [ 533.591151] Lustre: DEBUG MARKER: == recovery-small test 10c: re-send BL AST vs reconnect race (LU-5569) ========================================================== 21:18:28 (1716254308) [ 534.693515] Lustre: lustre-MDT0000: Client b6518fdd-009b-4ee1-a989-3ff958cfcc66 (at 192.168.204.49@tcp) reconnecting [ 534.699761] Lustre: Skipped 2 previous similar messages [ 539.006449] Lustre: DEBUG MARKER: == recovery-small test 
10d: test failed blocking ast ===== 21:18:34 (1716254314) [ 540.788987] LustreError: 21788:0:(ldlm_lockd.c:780:ldlm_handle_ast_error()) ### client (nid 192.168.204.49@tcp) returned error from blocking AST (req@ffff88008c200e00 x1799622567721408 status -71 rc -71), evict it ns: filter-lustre-OST0000_UUID lock: ffff88009c455200/0xc8a46ac832f0ccce lrc: 4/0,0 mode: PW/PW res: [0x240000400:0x7:0x0].0x0 rrc: 3 type: EXT [0->18446744073709551615] (req 0->18446744073709551615) gid 0 flags: 0x60000480000020 nid: 192.168.204.49@tcp remote: 0x443fa76813616bec expref: 5 pid: 21788 timeout: 640 lvb_type: 0 [ 540.816066] LustreError: 138-a: lustre-OST0000: A client on nid 192.168.204.49@tcp was evicted due to a lock blocking callback time out: rc -71 [ 540.821694] LustreError: 6963:0:(ldlm_lockd.c:261:expired_lock_main()) ### lock callback timer expired after 0s: evicting client at 192.168.204.49@tcp ns: filter-lustre-OST0000_UUID lock: ffff88009c455200/0xc8a46ac832f0ccce lrc: 3/0,0 mode: PW/PW res: [0x240000400:0x7:0x0].0x0 rrc: 4 type: EXT [0->18446744073709551615] (req 0->18446744073709551615) gid 0 flags: 0x60000480000020 nid: 192.168.204.49@tcp remote: 0x443fa76813616bec expref: 6 pid: 21788 timeout: 0 lvb_type: 0 [ 545.630031] Lustre: DEBUG MARKER: == recovery-small test 10e: re-send BL AST vs reconnect race 2 ========================================================== 21:18:40 (1716254320) [ 546.129074] Lustre: DEBUG MARKER: SKIP: recovery-small test_10e need two clients [ 548.824862] Lustre: DEBUG MARKER: == recovery-small test 11: wake up a thread waiting for completion after eviction (b=2460) ========================================================== 21:18:43 (1716254323) [ 570.881638] Lustre: DEBUG MARKER: == recovery-small test 12: recover from timed out resend in ptlrpcd (b=2494) ========================================================== 21:19:05 (1716254345) [ 571.254288] Lustre: *** cfs_fail_loc=115, val=2147483648*** [ 614.235524] Lustre: DEBUG MARKER: == 
recovery-small test 13: mdc_readpage restart test (bug 1138) ========================================================== 21:19:49 (1716254389) [ 635.314913] Lustre: DEBUG MARKER: == recovery-small test 14: mdc_readpage resend test (bug 1138) ========================================================== 21:20:10 (1716254410) [ 635.755152] Lustre: *** cfs_fail_loc=106, val=0*** [ 635.758143] Lustre: Skipped 1 previous similar message [ 640.323178] Lustre: DEBUG MARKER: == recovery-small test 15: failed open (-ENOMEM) ========= 21:20:15 (1716254415) [ 640.656300] Lustre: *** cfs_fail_loc=128, val=0*** [ 644.933238] Lustre: DEBUG MARKER: == recovery-small test 16: timeout bulk put, don't evict client (2732) ========================================================== 21:20:19 (1716254419) [ 645.510047] Lustre: *** cfs_fail_loc=504, val=0*** [ 645.512293] LustreError: 20870:0:(ldlm_lib.c:3601:target_bulk_io()) @@@ truncated bulk READ 0(102400) req@ffff88008d87f100 x1799622561427200/t0(0) o3->2334086a-d9b9-46aa-b8af-be07c8ec2467@192.168.204.49@tcp:511/0 lens 488/440 e 0 to 0 dl 1716254431 ref 1 fl Interpret:/200/0 rc 0/0 job:'cmp.0' uid:0 gid:0 [ 645.523895] Lustre: lustre-OST0000: Bulk IO read error with 2334086a-d9b9-46aa-b8af-be07c8ec2467 (at 192.168.204.49@tcp), client will retry: rc -110 [ 686.371797] Lustre: DEBUG MARKER: == recovery-small test 17a: timeout bulk get, don't evict client (2732) ========================================================== 21:21:01 (1716254461) [ 733.144018] Lustre: DEBUG MARKER: == recovery-small test 17b: timeout bulk get, dont evict client (3582) ========================================================== 21:21:48 (1716254508) [ 733.653819] Lustre: DEBUG MARKER: SKIP: recovery-small test_17b Needs multiple clients [ 736.398561] Lustre: DEBUG MARKER: == recovery-small test 18a: manual ost invalidate clears page cache immediately ========================================================== 21:21:51 (1716254511) [ 740.931937] Lustre: DEBUG 
MARKER: == recovery-small test 18b: eviction and reconnect clears page cache (2766) ========================================================== 21:21:55 (1716254515) [ 741.564498] Lustre: 468:0:(genops.c:1671:obd_export_evict_by_uuid()) lustre-OST0000: evicting 2334086a-d9b9-46aa-b8af-be07c8ec2467 at adminstrative request [ 767.962184] Lustre: DEBUG MARKER: == recovery-small test 18c: Dropped connect reply after eviction handing (14755) ========================================================== 21:22:23 (1716254543) [ 768.582229] Lustre: 1359:0:(genops.c:1671:obd_export_evict_by_uuid()) lustre-OST0000: evicting 2334086a-d9b9-46aa-b8af-be07c8ec2467 at adminstrative request [ 769.957570] Lustre: *** cfs_fail_loc=225, val=0*** [ 769.959598] Lustre: Skipped 1 previous similar message [ 786.220023] Lustre: DEBUG MARKER: == recovery-small test 19a: test expired_lock_main on mds (2867) ========================================================== 21:22:41 (1716254561) [ 786.948927] Lustre: *** cfs_fail_loc=304, val=0*** [ 802.950499] Lustre: lustre-MDT0000: Client 2334086a-d9b9-46aa-b8af-be07c8ec2467 (at 192.168.204.49@tcp) reconnecting [ 802.958645] Lustre: Skipped 5 previous similar messages [ 802.973047] Lustre: *** cfs_fail_loc=304, val=0*** [ 818.977054] Lustre: *** cfs_fail_loc=304, val=0*** [ 827.047202] Lustre: mdt00_000: service thread pid 6972 was inactive for 40.101 seconds. The thread might be hung, or it might only be slow and will resume later. 
Dumping the stack trace for debugging purposes: [ 827.055288] Pid: 6972, comm: mdt00_000 3.10.0-7.9-debug #1 SMP Sat Mar 26 23:28:42 EDT 2022 [ 827.058872] Call Trace: [ 827.060308] [<0>] ldlm_completion_ast+0x963/0xd00 [ptlrpc] [ 827.062916] [<0>] ldlm_cli_enqueue_local+0x259/0x880 [ptlrpc] [ 827.065606] [<0>] mdt_object_lock_internal+0x1a9/0x420 [mdt] [ 827.068258] [<0>] mdt_object_lock+0x88/0x1c0 [mdt] [ 827.070574] [<0>] mdt_object_stripes_lock+0x126/0x660 [mdt] [ 827.073210] [<0>] mdt_reint_setattr+0x73b/0x15f0 [mdt] [ 827.075607] [<0>] mdt_reint_rec+0x87/0x240 [mdt] [ 827.077771] [<0>] mdt_reint_internal+0x74c/0xbc0 [mdt] [ 827.080126] [<0>] mdt_reint+0x67/0x150 [mdt] [ 827.082246] [<0>] tgt_request_handle+0x74e/0x1a50 [ptlrpc] [ 827.084895] [<0>] ptlrpc_server_handle_request+0x26c/0xcb0 [ptlrpc] [ 827.087827] [<0>] ptlrpc_main+0xc76/0x1690 [ptlrpc] [ 827.090114] [<0>] kthread+0xe4/0xf0 [ 827.091852] [<0>] ret_from_fork_nospec_begin+0x7/0x21 [ 827.094183] [<0>] 0xfffffffffffffffe [ 834.980981] Lustre: *** cfs_fail_loc=304, val=0*** [ 851.002382] Lustre: *** cfs_fail_loc=304, val=0*** [ 867.005645] Lustre: *** cfs_fail_loc=304, val=0*** [ 883.009998] Lustre: *** cfs_fail_loc=304, val=0*** [ 886.951366] LustreError: 6963:0:(ldlm_lockd.c:261:expired_lock_main()) ### lock callback timer expired after 100s: evicting client at 192.168.204.49@tcp ns: mdt-lustre-MDT0000_UUID lock: ffff8800a7265440/0xc8a46ac832f0d517 lrc: 3/0,0 mode: PR/PR res: [0x200000007:0x1:0x0].0x0 bits 0x13/0x0 rrc: 3 type: IBT gid 0 flags: 0x60200400000020 nid: 192.168.204.49@tcp remote: 0x443fa76813616e93 expref: 16 pid: 11325 timeout: 886 lvb_type: 0 [ 891.965939] Lustre: DEBUG MARKER: == recovery-small test 19b: test expired_lock_main on ost (2867) ========================================================== 21:24:27 (1716254667) [ 925.131846] Lustre: *** cfs_fail_loc=304, val=0*** [ 925.134250] Lustre: Skipped 2 previous similar messages [ 989.203748] Lustre: *** cfs_fail_loc=304, val=0*** [ 
989.205952] Lustre: Skipped 3 previous similar messages [ 992.935267] LustreError: 6963:0:(ldlm_lockd.c:261:expired_lock_main()) ### lock callback timer expired after 101s: evicting client at 192.168.204.49@tcp ns: filter-lustre-OST0001_UUID lock: ffff8800a5927cc0/0xc8a46ac832f0d7b0 lrc: 3/0,0 mode: PW/PW res: [0x280000400:0xc:0x0].0x0 rrc: 3 type: EXT [0->18446744073709551615] (req 0->4095) gid 0 flags: 0x60000400000020 nid: 192.168.204.49@tcp remote: 0x443fa76813617006 expref: 6 pid: 9086 timeout: 991 lvb_type: 0 [ 998.032400] Lustre: DEBUG MARKER: == recovery-small test 19c: check reconnect and lock resend do not trigger expired_lock_main ========================================================== 21:26:13 (1716254773) [ 1010.022632] Lustre: DEBUG MARKER: == recovery-small test 20a: ldlm_handle_enqueue error (should return error) ========================================================== 21:26:25 (1716254785) [ 1014.706937] Lustre: DEBUG MARKER: == recovery-small test 20b: ldlm_handle_enqueue error (should return error) ========================================================== 21:26:29 (1716254789) [ 1019.434204] Lustre: DEBUG MARKER: == recovery-small test 21a: drop close request while close and open are both in flight ========================================================== 21:26:34 (1716254794) [ 1019.716471] LustreError: 6974:0:(mdt_open.c:1392:mdt_reint_open()) cfs_fail_timeout id 129 sleeping for 5000ms [ 1021.118166] LustreError: 6974:0:(mdt_open.c:1392:mdt_reint_open()) cfs_fail_timeout interrupted [ 1021.358824] Lustre: *** cfs_fail_loc=115, val=2147483648*** [ 1042.141299] Lustre: DEBUG MARKER: == recovery-small test 21b: drop open request while close and open are both in flight ========================================================== 21:26:57 (1716254817) [ 1181.912061] Lustre: DEBUG MARKER: == recovery-small test 21c: drop both request while close and open are both in flight ========================================================== 21:29:16 
(1716254956) [ 1206.675263] Lustre: DEBUG MARKER: == recovery-small test 21d: drop close reply while close and open are both in flight ========================================================== 21:29:41 (1716254981) [ 1207.082901] LustreError: 6974:0:(mdt_open.c:1392:mdt_reint_open()) cfs_fail_timeout id 129 sleeping for 5000ms [ 1208.387231] LustreError: 6974:0:(mdt_open.c:1392:mdt_reint_open()) cfs_fail_timeout interrupted [ 1208.705266] Lustre: *** cfs_fail_loc=122, val=2147483648*** [ 1208.708432] LustreError: 6976:0:(ldlm_lib.c:3271:target_send_reply_msg()) @@@ dropping reply req@ffff880131a6d880 x1799622561493888/t4294967553(0) o35->2334086a-d9b9-46aa-b8af-be07c8ec2467@192.168.204.49@tcp:319/0 lens 392/456 e 0 to 0 dl 1716254994 ref 1 fl Interpret:/200/0 rc 0/0 job:'multiop.0' uid:0 gid:0 [ 1208.723239] LustreError: 6976:0:(ldlm_lib.c:3271:target_send_reply_msg()) Skipped 1 previous similar message [ 1224.708237] Lustre: 6976:0:(mdt_recovery.c:148:mdt_req_from_lrd()) @@@ restoring transno req@ffff8800a1315850 x1799622561493888/t4294967553(0) o35->2334086a-d9b9-46aa-b8af-be07c8ec2467@192.168.204.49@tcp:335/0 lens 392/456 e 0 to 0 dl 1716255010 ref 1 fl Interpret:/202/0 rc 0/0 job:'multiop.0' uid:0 gid:0 [ 1229.309954] Lustre: DEBUG MARKER: == recovery-small test 21e: drop open reply while close and open are both in flight ========================================================== 21:30:04 (1716255004) [ 1229.718518] LustreError: 6973:0:(ldlm_lib.c:3271:target_send_reply_msg()) @@@ dropping reply req@ffff88008b89ed80 x1799622561498752/t4294967570(0) o36->2334086a-d9b9-46aa-b8af-be07c8ec2467@192.168.204.49@tcp:459/0 lens 488/456 e 0 to 0 dl 1716255134 ref 1 fl Interpret:/200/0 rc 0/0 job:'touch.0' uid:0 gid:0 [ 1364.746553] Lustre: lustre-MDT0000: Client 2334086a-d9b9-46aa-b8af-be07c8ec2467 (at 192.168.204.49@tcp) reconnecting [ 1364.751699] Lustre: Skipped 15 previous similar messages [ 1364.769321] Lustre: 6974:0:(mdt_recovery.c:148:mdt_req_from_lrd()) @@@ 
restoring transno req@ffff88008d87f800 x1799622561498752/t4294967570(0) o36->2334086a-d9b9-46aa-b8af-be07c8ec2467@192.168.204.49@tcp:594/0 lens 488/3152 e 0 to 0 dl 1716255269 ref 1 fl Interpret:/202/0 rc 0/0 job:'touch.0' uid:0 gid:0 [ 1367.481751] Lustre: DEBUG MARKER: == recovery-small test 21f: drop both reply while close and open are both in flight ========================================================== 21:32:22 (1716255142) [ 1367.906045] Lustre: *** cfs_fail_loc=119, val=2147483648*** [ 1367.908613] Lustre: Skipped 1 previous similar message [ 1367.911185] LustreError: 15103:0:(ldlm_lib.c:3271:target_send_reply_msg()) @@@ dropping reply req@ffff88008d100700 x1799622561508800/t4294967589(0) o36->2334086a-d9b9-46aa-b8af-be07c8ec2467@192.168.204.49@tcp:597/0 lens 488/456 e 0 to 0 dl 1716255272 ref 1 fl Interpret:/200/0 rc 0/0 job:'touch.0' uid:0 gid:0 [ 1385.548189] Lustre: 6973:0:(mdt_recovery.c:148:mdt_req_from_lrd()) @@@ restoring transno req@ffff880134bdad80 x1799622561508800/t4294967589(0) o36->2334086a-d9b9-46aa-b8af-be07c8ec2467@192.168.204.49@tcp:615/0 lens 488/3152 e 0 to 0 dl 1716255290 ref 1 fl Interpret:/202/0 rc 0/0 job:'touch.0' uid:0 gid:0 [ 1385.559912] Lustre: 6973:0:(mdt_recovery.c:148:mdt_req_from_lrd()) Skipped 1 previous similar message [ 1390.263583] Lustre: DEBUG MARKER: == recovery-small test 21g: drop open reply and close request while close and open are both in flight ========================================================== 21:32:45 (1716255165) [ 1390.682136] LustreError: 11325:0:(ldlm_lib.c:3271:target_send_reply_msg()) @@@ dropping reply req@ffff880131d27b80 x1799622561514112/t4294967608(0) o36->2334086a-d9b9-46aa-b8af-be07c8ec2467@192.168.204.49@tcp:620/0 lens 488/456 e 0 to 0 dl 1716255295 ref 1 fl Interpret:/200/0 rc 0/0 job:'touch.0' uid:0 gid:0 [ 1390.701548] LustreError: 11325:0:(ldlm_lib.c:3271:target_send_reply_msg()) Skipped 1 previous similar message [ 1392.328502] Lustre: *** cfs_fail_loc=115, val=2147483648*** [ 
1392.331123] Lustre: Skipped 3 previous similar messages [ 1408.331429] Lustre: 15103:0:(mdt_recovery.c:148:mdt_req_from_lrd()) @@@ restoring transno req@ffff8801326ab100 x1799622561514112/t4294967608(0) o36->2334086a-d9b9-46aa-b8af-be07c8ec2467@192.168.204.49@tcp:638/0 lens 488/3152 e 0 to 0 dl 1716255313 ref 1 fl Interpret:/202/0 rc 0/0 job:'touch.0' uid:0 gid:0 [ 1413.079009] Lustre: DEBUG MARKER: == recovery-small test 21h: drop open request and close reply while close and open are both in flight ========================================================== 21:33:08 (1716255188) [ 1435.687371] Lustre: DEBUG MARKER: == recovery-small test 22: drop close request and do mknod ========================================================== 21:33:30 (1716255210) [ 1456.299293] Lustre: DEBUG MARKER: == recovery-small test 23: client hang when close a file after mds crash ========================================================== 21:33:51 (1716255231) [ 1462.669471] Lustre: Failing over lustre-MDT0000 [ 1462.811625] Lustre: server umount lustre-MDT0000 complete [ 1475.220881] LustreError: 166-1: MGC192.168.204.149@tcp: Connection to MGS (at 0@lo) was lost; in progress operations using this service will fail [ 1475.429715] Lustre: lustre-MDT0000: Imperative Recovery not enabled, recovery window 60-180 [ 1475.462933] Lustre: lustre-MDT0000: in recovery but waiting for the first client to connect [ 1476.520742] Lustre: DEBUG MARKER: oleg449-server.virtnet: executing set_default_debug -1 all [ 1477.065910] Lustre: lustre-MDT0000: Will be in recovery for at least 1:00, or until 1 client reconnects [ 1477.091008] Lustre: lustre-MDT0000: Recovery over after 0:01, of 1 clients 1 recovered and 0 were evicted. 
[ 1477.115219] Lustre: lustre-OST0001: new connection from lustre-MDT0000-mdtlov (cleaning up unused objects from 0x280000400:22 to 0x280000400:65) [ 1477.115222] Lustre: lustre-OST0000: new connection from lustre-MDT0000-mdtlov (cleaning up unused objects from 0x240000400:25 to 0x240000400:65) [ 1479.345609] Lustre: DEBUG MARKER: oleg449-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid [ 1479.895850] Lustre: DEBUG MARKER: mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec [ 1480.388235] Lustre: 3230:0:(client.c:2343:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1716255239/real 1716255239] req@ffff88008c0a6300 x1799622567821888/t0(0) o400->lustre-MDT0000-lwp-OST0000@0@lo:12/10 lens 224/224 e 0 to 1 dl 1716255255 ref 1 fl Rpc:XNQr/200/ffffffff rc 0/-1 job:'kworker.0' uid:0 gid:0 [ 1480.395293] Lustre: lustre-MDT0000-lwp-OST0001: Connection to lustre-MDT0000 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete [ 1480.398710] Lustre: lustre-MDT0000-lwp-OST0001: Connection restored to 192.168.204.149@tcp (at 0@lo) [ 1480.411517] Lustre: 3230:0:(client.c:2343:ptlrpc_expire_one_request()) Skipped 2 previous similar messages [ 1485.471948] Lustre: DEBUG MARKER: == recovery-small test 24a: fsync error (should return error) ========================================================== 21:34:20 (1716255260) [ 1486.011436] Lustre: 19169:0:(genops.c:1671:obd_export_evict_by_uuid()) lustre-OST0000: evicting 2334086a-d9b9-46aa-b8af-be07c8ec2467 at adminstrative request [ 1490.399227] Lustre: 3229:0:(client.c:2343:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1716255249/real 1716255249] req@ffff88008c2f9f80 x1799622567822336/t0(0) o400->lustre-MDT0000-lwp-OST0001@0@lo:12/10 lens 224/224 e 0 to 1 dl 1716255265 ref 1 fl Rpc:XNQr/200/ffffffff rc 0/-1 job:'kworker.0' uid:0 gid:0 [ 1490.413269] Lustre: 
3229:0:(client.c:2343:ptlrpc_expire_one_request()) Skipped 2 previous similar messages [ 1490.459723] Lustre: DEBUG MARKER: == recovery-small test 24b: test dirty page discard due to client eviction ========================================================== 21:34:25 (1716255265) [ 1490.992739] Lustre: 19969:0:(genops.c:1671:obd_export_evict_by_uuid()) lustre-OST0000: evicting 2334086a-d9b9-46aa-b8af-be07c8ec2467 at adminstrative request [ 1495.397259] Lustre: DEBUG MARKER: == recovery-small test 26a: evict dead exports =========== 21:34:30 (1716255270) [ 1496.097436] Lustre: DEBUG MARKER: SKIP: recovery-small test_26a msg and ost1 are at the same node [ 1498.811662] Lustre: DEBUG MARKER: == recovery-small test 26b: evict dead exports =========== 21:34:33 (1716255273) [ 1499.390893] Lustre: DEBUG MARKER: SKIP: recovery-small test_26b msg and ost1 are at the same node [ 1502.074324] Lustre: DEBUG MARKER: == recovery-small test 27: fail LOV while using OSC's ==== 21:34:37 (1716255277) [ 1503.560801] Lustre: Failing over lustre-MDT0000 [ 1503.666031] Lustre: server umount lustre-MDT0000 complete [ 1516.018781] LustreError: 166-1: MGC192.168.204.149@tcp: Connection to MGS (at 0@lo) was lost; in progress operations using this service will fail [ 1516.190464] Lustre: lustre-MDT0000-lwp-OST0000: Connection to lustre-MDT0000 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete [ 1516.198279] Lustre: Skipped 2 previous similar messages [ 1516.283363] Lustre: lustre-MDT0000: Imperative Recovery not enabled, recovery window 60-180 [ 1516.331539] Lustre: lustre-MDT0000: in recovery but waiting for the first client to connect [ 1517.335501] Lustre: DEBUG MARKER: oleg449-server.virtnet: executing set_default_debug -1 all [ 1521.242763] Lustre: lustre-MDT0000-lwp-OST0000: Connection restored to 192.168.204.149@tcp (at 0@lo) [ 1521.249984] Lustre: Skipped 1 previous similar message [ 1521.327210] Lustre: 
3231:0:(client.c:2343:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1716255280/real 1716255280] req@ffff88009c8f4a80 x1799622567831360/t0(0) o400->lustre-MDT0000-lwp-OST0001@0@lo:12/10 lens 224/224 e 0 to 1 dl 1716255296 ref 1 fl Rpc:XNQr/200/ffffffff rc 0/-1 job:'kworker.0' uid:0 gid:0 [ 1521.341355] Lustre: 3231:0:(client.c:2343:ptlrpc_expire_one_request()) Skipped 2 previous similar messages [ 1522.144483] Lustre: lustre-MDT0000: Will be in recovery for at least 1:00, or until 1 client reconnects [ 1522.196622] Lustre: lustre-MDT0000: Recovery over after 0:01, of 1 clients 1 recovered and 0 were evicted. [ 1522.221063] Lustre: lustre-OST0000: new connection from lustre-MDT0000-mdtlov (cleaning up unused objects from 0x240000400:179 to 0x240000400:225) [ 1522.221126] Lustre: lustre-OST0001: new connection from lustre-MDT0000-mdtlov (cleaning up unused objects from 0x280000400:176 to 0x280000400:193) [ 1609.707918] Lustre: Failing over lustre-MDT0000 [ 1609.827116] Lustre: server umount lustre-MDT0000 complete [ 1622.180922] LustreError: 166-1: MGC192.168.204.149@tcp: Connection to MGS (at 0@lo) was lost; in progress operations using this service will fail [ 1622.351687] Lustre: lustre-MDT0000-lwp-OST0001: Connection to lustre-MDT0000 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete [ 1622.362592] Lustre: Skipped 1 previous similar message [ 1622.439889] Lustre: lustre-MDT0000: Imperative Recovery not enabled, recovery window 60-180 [ 1622.480293] Lustre: lustre-MDT0000: in recovery but waiting for the first client to connect [ 1623.595491] Lustre: DEBUG MARKER: oleg449-server.virtnet: executing set_default_debug -1 all [ 1627.389559] Lustre: lustre-MDT0000: Will be in recovery for at least 1:00, or until 1 client reconnects [ 1627.402605] Lustre: lustre-MDT0000-lwp-OST0001: Connection restored to 192.168.204.149@tcp (at 0@lo) [ 1627.408741] Lustre: Skipped 1 previous similar 
message [ 1627.464018] Lustre: lustre-MDT0000: Recovery over after 0:01, of 1 clients 1 recovered and 0 were evicted. [ 1627.470702] Lustre: 32702:0:(mdt_recovery.c:148:mdt_req_from_lrd()) @@@ restoring transno req@ffff88008c098e00 x1799622567400128/t12884936139(0) o36->2334086a-d9b9-46aa-b8af-be07c8ec2467@192.168.204.49@tcp:738/0 lens 512/2888 e 0 to 0 dl 1716255413 ref 1 fl Interpret:/202/0 rc 0/0 job:'writemany.0' uid:0 gid:0 [ 1627.470705] Lustre: 32702:0:(mdt_recovery.c:148:mdt_req_from_lrd()) Skipped 1 previous similar message [ 1627.521920] Lustre: lustre-OST0001: new connection from lustre-MDT0000-mdtlov (cleaning up unused objects from 0x280000400:5874 to 0x280000400:5889) [ 1627.522315] Lustre: lustre-OST0000: new connection from lustre-MDT0000-mdtlov (cleaning up unused objects from 0x240000400:5906 to 0x240000400:5921) [ 1627.637245] Lustre: 3228:0:(client.c:2343:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1716255386/real 1716255386] req@ffff88008aec9500 x1799622569050560/t0(0) o400->lustre-MDT0000-lwp-OST0000@0@lo:12/10 lens 224/224 e 0 to 1 dl 1716255402 ref 1 fl Rpc:XNQr/200/ffffffff rc 0/-1 job:'kworker.0' uid:0 gid:0 [ 1627.650429] Lustre: 3228:0:(client.c:2343:ptlrpc_expire_one_request()) Skipped 4 previous similar messages [ 1631.825836] Lustre: DEBUG MARKER: == recovery-small test 28: handle error adding new clients (bug 6086) ========================================================== 21:36:46 (1716255406) [ 1650.379600] Lustre: Failing over lustre-MDT0000 [ 1650.536698] Lustre: server umount lustre-MDT0000 complete [ 1662.942918] LustreError: 166-1: MGC192.168.204.149@tcp: Connection to MGS (at 0@lo) was lost; in progress operations using this service will fail [ 1663.098843] Lustre: *** cfs_fail_loc=12f, val=0*** [ 1663.099029] Lustre: lustre-MDT0000-lwp-OST0000: Connection to lustre-MDT0000 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete [ 1663.108600] 
LustreError: 24734:0:(tgt_lastrcvd.c:1071:tgt_client_new()) lustre-OST0000: no room for 0 clients - fix LR_MAX_CLIENTS [ 1663.114892] LustreError: 11-0: lustre-OST0000-osc-MDT0000: operation ost_connect to node 0@lo failed: rc = -75 [ 1663.158029] Lustre: lustre-MDT0000: Imperative Recovery not enabled, recovery window 60-180 [ 1663.195901] Lustre: lustre-MDT0000: in recovery but waiting for the first client to connect [ 1664.289255] Lustre: DEBUG MARKER: oleg449-server.virtnet: executing set_default_debug -1 all [ 1666.312841] Lustre: lustre-MDT0000: Will be in recovery for at least 1:00, or until 1 client reconnects [ 1666.334081] Lustre: lustre-MDT0000: Recovery over after 0:01, of 1 clients 1 recovered and 0 were evicted. [ 1666.358511] Lustre: lustre-OST0001: new connection from lustre-MDT0000-mdtlov (cleaning up unused objects from 0x280000400:5874 to 0x280000400:5921) [ 1667.164748] Lustre: DEBUG MARKER: oleg449-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid [ 1667.734680] Lustre: DEBUG MARKER: mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec [ 1668.154383] Lustre: lustre-MDT0000-lwp-OST0001: Connection restored to 192.168.204.149@tcp (at 0@lo) [ 1668.158555] Lustre: Skipped 1 previous similar message [ 1668.161723] Lustre: lustre-OST0000: new connection from lustre-MDT0000-mdtlov (cleaning up unused objects from 0x240000400:5906 to 0x240000400:5953) [ 1673.413009] Lustre: DEBUG MARKER: == recovery-small test 29a: error adding new clients doesn't cause LBUG (bug 22273) ========================================================== 21:37:28 (1716255448) [ 1674.412306] Lustre: Failing over lustre-MDT0000 [ 1674.560198] Lustre: server umount lustre-MDT0000 complete [ 1677.312781] LustreError: 166-1: MGC192.168.204.149@tcp: Connection to MGS (at 0@lo) was lost; in progress operations using this service will fail [ 1677.472699] Lustre: lustre-MDT0000: Not available for connect from 
192.168.204.49@tcp (not set up) [ 1677.494996] Lustre: *** cfs_fail_loc=711, val=0*** [ 1677.495614] Lustre: lustre-MDT0000-lwp-OST0001: Connection to lustre-MDT0000 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete [ 1677.495618] Lustre: Skipped 1 previous similar message [ 1677.555009] Lustre: lustre-MDT0000: Imperative Recovery not enabled, recovery window 60-180 [ 1677.587509] Lustre: lustre-MDT0000: in recovery but waiting for the first client to connect [ 1677.587691] Lustre: lustre-MDT0000: Aborting client recovery [ 1677.587695] LustreError: 5124:0:(ldlm_lib.c:2927:target_stop_recovery_thread()) lustre-MDT0000: Aborting recovery [ 1677.598382] Lustre: 5282:0:(ldlm_lib.c:2310:target_recovery_overseer()) recovery is aborted, evict exports in recovery [ 1677.603004] Lustre: 5282:0:(genops.c:1528:class_disconnect_stale_exports()) lustre-MDT0000: disconnect stale client 2334086a-d9b9-46aa-b8af-be07c8ec2467@ [ 1677.609911] Lustre: lustre-MDT0000: disconnecting 1 stale clients [ 1677.625275] Lustre: lustre-MDT0000-osd: cancel update llog [0x200000400:0x1:0x0] [ 1677.679682] Lustre: lustre-OST0001: new connection from lustre-MDT0000-mdtlov (cleaning up unused objects from 0x280000400:5874 to 0x280000400:5953) [ 1678.763859] Lustre: DEBUG MARKER: oleg449-server.virtnet: executing set_default_debug -1 all [ 1682.554194] Lustre: lustre-MDT0000-lwp-OST0001: Connection restored to 192.168.204.149@tcp (at 0@lo) [ 1682.561510] Lustre: Skipped 1 previous similar message [ 1682.567178] Lustre: lustre-OST0000: new connection from lustre-MDT0000-mdtlov (cleaning up unused objects from 0x240000400:5906 to 0x240000400:5985) [ 1684.561623] Lustre: DEBUG MARKER: oleg449-server.virtnet: executing wait_import_state FULL os[cp].lustre-OST0000-osc-MDT0000.ost_server_uuid 50 [ 1684.647076] Lustre: DEBUG MARKER: os[cp].lustre-OST0000-osc-MDT0000.ost_server_uuid in FULL state after 0 sec [ 1689.097805] Lustre: DEBUG MARKER: == 
recovery-small test 29b: error adding new clients doesn't cause LBUG (bug 22273) ========================================================== 21:37:44 (1716255464) [ 1690.126263] Lustre: Failing over lustre-OST0000 [ 1690.146818] Lustre: server umount lustre-OST0000 complete [ 1692.489412] LustreError: 137-5: lustre-OST0000: not available for connect from 192.168.204.49@tcp (no target). If you are running an HA pair check that the target is mounted on the other server. [ 1692.567974] Lustre: lustre-OST0000-osc-MDT0000: Connection to lustre-OST0000 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete [ 1692.574236] Lustre: Skipped 1 previous similar message [ 1692.950054] Lustre: lustre-OST0000: Imperative Recovery not enabled, recovery window 60-180 [ 1692.960535] Lustre: lustre-OST0000: in recovery but waiting for the first client to connect [ 1692.960672] Lustre: lustre-OST0000: Aborting recovery [ 1692.960680] LustreError: 7717:0:(ldlm_lib.c:2927:target_stop_recovery_thread()) lustre-OST0000: Aborting recovery [ 1692.976159] Lustre: Skipped 2 previous similar messages [ 1692.979585] Lustre: 7754:0:(ldlm_lib.c:2310:target_recovery_overseer()) recovery is aborted, evict exports in recovery [ 1692.986702] Lustre: 7754:0:(ldlm_lib.c:2310:target_recovery_overseer()) Skipped 2 previous similar messages [ 1692.993330] Lustre: 7754:0:(genops.c:1528:class_disconnect_stale_exports()) lustre-OST0000: disconnect stale client 2334086a-d9b9-46aa-b8af-be07c8ec2467@ [ 1693.002743] Lustre: lustre-OST0000: disconnecting 2 stale clients [ 1693.034168] LustreError: 7754:0:(ofd_obd.c:1315:ofd_iocontrol()) lustre-OST0000: iocontrol from 'tgt_recover_0' cmd=c00866c1 _IOWR('f', 193, 8) unrecognized: rc = -25 [ 1693.047715] mount.lustre (7717) used greatest stack depth: 9608 bytes left [ 1694.075803] Lustre: *** cfs_fail_loc=711, val=0*** [ 1694.671614] LustreError: 167-0: lustre-OST0000-osc-MDT0000: This client was evicted by lustre-OST0000; in 
progress operations using this service will fail. [ 1694.678372] Lustre: lustre-OST0000-osc-MDT0000: Connection restored to 192.168.204.149@tcp (at 0@lo) [ 1694.682786] Lustre: Skipped 1 previous similar message [ 1694.688212] Lustre: DEBUG MARKER: oleg449-server.virtnet: executing set_default_debug -1 all [ 1704.476715] Lustre: DEBUG MARKER: == recovery-small test 50: failover MDS under load ======= 21:37:59 (1716255479) [ 1709.171158] Lustre: 3229:0:(client.c:2343:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1716255427/real 1716255427] req@ffff880089da6680 x1799622569140672/t0(0) o400->lustre-MDT0000-lwp-OST0001@0@lo:12/10 lens 224/224 e 0 to 1 dl 1716255482 ref 1 fl Rpc:XNQr/200/ffffffff rc 0/-1 job:'kworker.0' uid:0 gid:0 [ 1709.183680] Lustre: 3229:0:(client.c:2343:ptlrpc_expire_one_request()) Skipped 6 previous similar messages [ 1715.026024] Lustre: Failing over lustre-MDT0000 [ 1715.178342] Lustre: server umount lustre-MDT0000 complete [ 1727.586204] LustreError: 166-1: MGC192.168.204.149@tcp: Connection to MGS (at 0@lo) was lost; in progress operations using this service will fail [ 1727.759939] Lustre: lustre-MDT0000-lwp-OST0001: Connection to lustre-MDT0000 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete [ 1727.766807] Lustre: Skipped 1 previous similar message [ 1727.856812] Lustre: lustre-MDT0000: Imperative Recovery not enabled, recovery window 60-180 [ 1727.888567] Lustre: lustre-MDT0000: in recovery but waiting for the first client to connect [ 1727.890120] Lustre: Skipped 2 previous similar messages [ 1728.902672] Lustre: DEBUG MARKER: oleg449-server.virtnet: executing set_default_debug -1 all [ 1732.527985] Lustre: lustre-MDT0000: Will be in recovery for at least 1:00, or until 1 client reconnects [ 1732.584737] Lustre: lustre-MDT0000: Recovery over after 0:01, of 1 clients 1 recovered and 0 were evicted. 
[ 1732.615401] Lustre: lustre-OST0001: new connection from lustre-MDT0000-mdtlov (cleaning up unused objects from 0x280000400:6718 to 0x280000400:6753) [ 1732.615658] Lustre: lustre-OST0000: new connection from lustre-MDT0000-mdtlov (cleaning up unused objects from 0x240000400:6750 to 0x240000400:6785) [ 1732.808940] Lustre: lustre-MDT0000-lwp-OST0001: Connection restored to 192.168.204.149@tcp (at 0@lo) [ 1733.170205] Lustre: DEBUG MARKER: oleg449-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid [ 1733.546412] Lustre: DEBUG MARKER: mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec [ 1795.238181] Lustre: Failing over lustre-MDT0000 [ 1795.346035] Lustre: server umount lustre-MDT0000 complete [ 1807.784984] LustreError: 166-1: MGC192.168.204.149@tcp: Connection to MGS (at 0@lo) was lost; in progress operations using this service will fail [ 1807.927598] Lustre: lustre-MDT0000-lwp-OST0000: Connection to lustre-MDT0000 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete [ 1807.942881] Lustre: Skipped 1 previous similar message [ 1808.016845] Lustre: lustre-MDT0000: Imperative Recovery not enabled, recovery window 60-180 [ 1808.101562] Lustre: lustre-MDT0000: in recovery but waiting for the first client to connect [ 1809.122260] Lustre: DEBUG MARKER: oleg449-server.virtnet: executing set_default_debug -1 all [ 1810.704434] Lustre: lustre-MDT0000: Will be in recovery for at least 1:00, or until 1 client reconnects [ 1810.791362] Lustre: lustre-MDT0000: Recovery over after 0:01, of 1 clients 1 recovered and 0 were evicted. 
[ 1810.819184] Lustre: lustre-OST0000: new connection from lustre-MDT0000-mdtlov (cleaning up unused objects from 0x240000400:11547 to 0x240000400:11585) [ 1810.819205] Lustre: lustre-OST0001: new connection from lustre-MDT0000-mdtlov (cleaning up unused objects from 0x280000400:11515 to 0x280000400:11553) [ 1811.631238] Lustre: DEBUG MARKER: oleg449-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid [ 1811.998684] Lustre: DEBUG MARKER: mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec [ 1812.985521] Lustre: lustre-MDT0000-lwp-OST0001: Connection restored to 192.168.204.149@tcp (at 0@lo) [ 1812.987500] Lustre: Skipped 1 previous similar message [ 1873.693814] Lustre: Failing over lustre-MDT0000 [ 1873.796081] Lustre: server umount lustre-MDT0000 complete [ 1886.175265] LustreError: 166-1: MGC192.168.204.149@tcp: Connection to MGS (at 0@lo) was lost; in progress operations using this service will fail [ 1886.355361] Lustre: lustre-MDT0000-lwp-OST0001: Connection to lustre-MDT0000 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete [ 1886.485150] Lustre: lustre-MDT0000: Imperative Recovery not enabled, recovery window 60-180 [ 1886.569358] Lustre: lustre-MDT0000: in recovery but waiting for the first client to connect [ 1887.522013] Lustre: DEBUG MARKER: oleg449-server.virtnet: executing set_default_debug -1 all [ 1890.863665] Lustre: lustre-MDT0000: Will be in recovery for at least 1:00, or until 1 client reconnects [ 1890.940323] Lustre: lustre-MDT0000: Recovery over after 0:01, of 1 clients 1 recovered and 0 were evicted. 
[ 1890.969037] Lustre: lustre-OST0001: new connection from lustre-MDT0000-mdtlov (cleaning up unused objects from 0x280000400:16415 to 0x280000400:16449) [ 1890.969137] Lustre: lustre-OST0000: new connection from lustre-MDT0000-mdtlov (cleaning up unused objects from 0x240000400:16447 to 0x240000400:16481) [ 1891.418610] Lustre: lustre-MDT0000-lwp-OST0001: Connection restored to 192.168.204.149@tcp (at 0@lo) [ 1891.423139] Lustre: Skipped 1 previous similar message [ 1891.572253] Lustre: DEBUG MARKER: oleg449-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid [ 1891.947600] Lustre: DEBUG MARKER: mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec [ 1894.383148] Lustre: 3229:0:(client.c:2343:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1716255653/real 1716255653] req@ffff88012df7ca80 x1799622571407040/t0(0) o400->lustre-MDT0000-lwp-OST0001@0@lo:12/10 lens 224/224 e 0 to 1 dl 1716255669 ref 1 fl Rpc:XNQr/200/ffffffff rc 0/-1 job:'kworker.0' uid:0 gid:0 [ 1894.397881] Lustre: 3229:0:(client.c:2343:ptlrpc_expire_one_request()) Skipped 10 previous similar messages [ 1917.542926] Lustre: DEBUG MARKER: == recovery-small test 51: failover MDS during recovery == 21:41:32 (1716255692) [ 1919.428819] Lustre: Failing over lustre-MDT0000 [ 1919.549987] Lustre: server umount lustre-MDT0000 complete [ 1933.113949] Lustre: DEBUG MARKER: oleg449-server.virtnet: executing set_default_debug -1 all [ 1933.998962] Lustre: DEBUG MARKER: test_51: failover in 1 sec [ 1935.699603] Lustre: Failing over lustre-MDT0000 [ 1935.716159] LustreError: 21874:0:(ldlm_lib.c:2927:target_stop_recovery_thread()) lustre-MDT0000: Aborting recovery [ 1935.720925] Lustre: 21226:0:(ldlm_lib.c:2310:target_recovery_overseer()) recovery is aborted, evict exports in recovery [ 1935.726329] Lustre: 21226:0:(ldlm_lib.c:2310:target_recovery_overseer()) Skipped 2 previous similar messages [ 1935.730955] Lustre: 
lustre-MDT0000-osd: cancel update llog [0x200002b10:0x1:0x0] [ 1935.784946] Lustre: 21226:0:(mdt_handler.c:7962:mdt_postrecov()) lustre-MDT0000: auto trigger paused LFSCK failed: rc = -6 [ 1935.899073] Lustre: server umount lustre-MDT0000 complete [ 1949.299298] Lustre: DEBUG MARKER: oleg449-server.virtnet: executing set_default_debug -1 all [ 1950.549256] Lustre: DEBUG MARKER: test_51: failover in 5 sec [ 1955.921674] Lustre: lustre-MDT0000: Will be in recovery for at least 1:00, or until 1 client reconnects [ 1955.987001] Lustre: lustre-MDT0000: Recovery over after 0:01, of 1 clients 1 recovered and 0 were evicted. [ 1956.010840] Lustre: lustre-OST0001: new connection from lustre-MDT0000-mdtlov (cleaning up unused objects from 0x280000400:18286 to 0x280000400:18305) [ 1956.010865] Lustre: lustre-OST0000: new connection from lustre-MDT0000-mdtlov (cleaning up unused objects from 0x240000400:18318 to 0x240000400:18337) [ 1956.240011] Lustre: Failing over lustre-MDT0000 [ 1956.354625] Lustre: server umount lustre-MDT0000 complete [ 1968.541188] LustreError: 166-1: MGC192.168.204.149@tcp: Connection to MGS (at 0@lo) was lost; in progress operations using this service will fail [ 1968.543948] LustreError: Skipped 2 previous similar messages [ 1969.644470] Lustre: DEBUG MARKER: oleg449-server.virtnet: executing set_default_debug -1 all [ 1970.795930] Lustre: DEBUG MARKER: test_51: failover in 10 sec [ 1970.979487] Lustre: lustre-OST0001: new connection from lustre-MDT0000-mdtlov (cleaning up unused objects from 0x280000400:18315 to 0x280000400:18337) [ 1970.979650] Lustre: lustre-OST0000: new connection from lustre-MDT0000-mdtlov (cleaning up unused objects from 0x240000400:18347 to 0x240000400:18369) [ 1981.466170] Lustre: Failing over lustre-MDT0000 [ 1981.592728] Lustre: server umount lustre-MDT0000 complete [ 1995.442769] Lustre: DEBUG MARKER: oleg449-server.virtnet: executing set_default_debug -1 all [ 1996.750489] Lustre: DEBUG MARKER: test_51: failover in 20 sec 
[ 1997.507906] Lustre: 26041:0:(mdt_recovery.c:148:mdt_req_from_lrd()) @@@ restoring transno req@ffff880098abb100 x1799622580709760/t47244645388(0) o101->2334086a-d9b9-46aa-b8af-be07c8ec2467@192.168.204.49@tcp:353/0 lens 672/3488 e 0 to 0 dl 1716255783 ref 1 fl Interpret:/202/0 rc 0/0 job:'writemany.0' uid:0 gid:0 [ 1997.522226] Lustre: lustre-OST0000: new connection from lustre-MDT0000-mdtlov (cleaning up unused objects from 0x240000400:19223 to 0x240000400:19297) [ 1997.522298] Lustre: lustre-OST0001: new connection from lustre-MDT0000-mdtlov (cleaning up unused objects from 0x280000400:19190 to 0x280000400:19233) [ 2017.290733] Lustre: Failing over lustre-MDT0000 [ 2017.416979] Lustre: server umount lustre-MDT0000 complete [ 2029.959592] Lustre: lustre-MDT0000-lwp-OST0001: Connection to lustre-MDT0000 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete [ 2029.970315] Lustre: Skipped 4 previous similar messages [ 2030.068219] Lustre: lustre-MDT0000: Imperative Recovery not enabled, recovery window 60-180 [ 2030.070472] Lustre: Skipped 4 previous similar messages [ 2030.125184] Lustre: lustre-MDT0000: in recovery but waiting for the first client to connect [ 2030.127187] Lustre: Skipped 6 previous similar messages [ 2031.239803] Lustre: DEBUG MARKER: oleg449-server.virtnet: executing set_default_debug -1 all [ 2032.492484] Lustre: DEBUG MARKER: test_51: failover in 25 sec [ 2035.021954] Lustre: lustre-MDT0000-lwp-OST0000: Connection restored to 192.168.204.149@tcp (at 0@lo) [ 2035.026896] Lustre: Skipped 3 previous similar messages [ 2037.553325] Lustre: lustre-MDT0000: Will be in recovery for at least 1:00, or until 1 client reconnects [ 2037.557523] Lustre: Skipped 2 previous similar messages [ 2037.652149] Lustre: lustre-MDT0000: Recovery over after 0:01, of 1 clients 1 recovered and 0 were evicted. 
[ 2037.656892] Lustre: Skipped 2 previous similar messages [ 2037.678418] Lustre: lustre-OST0000: new connection from lustre-MDT0000-mdtlov (cleaning up unused objects from 0x240000400:20829 to 0x240000400:20865) [ 2037.678422] Lustre: lustre-OST0001: new connection from lustre-MDT0000-mdtlov (cleaning up unused objects from 0x280000400:20765 to 0x280000400:20801) [ 2058.182468] Lustre: Failing over lustre-MDT0000 [ 2058.321069] Lustre: server umount lustre-MDT0000 complete [ 2071.905581] Lustre: DEBUG MARKER: oleg449-server.virtnet: executing set_default_debug -1 all [ 2073.072289] Lustre: DEBUG MARKER: test_51: failover in 30 sec [ 2077.663446] Lustre: lustre-OST0000: new connection from lustre-MDT0000-mdtlov (cleaning up unused objects from 0x240000400:22283 to 0x240000400:22305) [ 2077.663556] Lustre: lustre-OST0001: new connection from lustre-MDT0000-mdtlov (cleaning up unused objects from 0x280000400:22219 to 0x280000400:22241) [ 2103.533171] Lustre: Failing over lustre-MDT0000 [ 2103.644428] Lustre: server umount lustre-MDT0000 complete [ 2115.854933] LustreError: 166-1: MGC192.168.204.149@tcp: Connection to MGS (at 0@lo) was lost; in progress operations using this service will fail [ 2115.861012] LustreError: Skipped 3 previous similar messages [ 2115.912314] LustreError: 137-5: lustre-MDT0000: not available for connect from 0@lo (no target). If you are running an HA pair check that the target is mounted on the other server. 
[ 2115.917955] LustreError: Skipped 2 previous similar messages [ 2116.842537] Lustre: DEBUG MARKER: oleg449-server.virtnet: executing set_default_debug -1 all [ 2122.792489] Lustre: lustre-OST0000: new connection from lustre-MDT0000-mdtlov (cleaning up unused objects from 0x240000400:24353 to 0x240000400:24385) [ 2122.792615] Lustre: lustre-OST0001: new connection from lustre-MDT0000-mdtlov (cleaning up unused objects from 0x280000400:24290 to 0x280000400:24321) [ 2141.828501] Lustre: DEBUG MARKER: == recovery-small test 52: failover OST under load ======= 21:45:16 (1716255916) [ 2152.485276] Lustre: Failing over lustre-OST0000 [ 2152.644404] LustreError: 11-0: lustre-OST0000-osc-MDT0000: operation ost_create to node 0@lo failed: rc = -107 [ 2152.647299] Lustre: lustre-OST0000: Not available for connect from 0@lo (stopping) [ 2154.540760] Lustre: server umount lustre-OST0000 complete [ 2156.104146] LustreError: 137-5: lustre-OST0000: not available for connect from 0@lo (no target). If you are running an HA pair check that the target is mounted on the other server. [ 2161.111920] LustreError: 137-5: lustre-OST0000: not available for connect from 0@lo (no target). If you are running an HA pair check that the target is mounted on the other server. [ 2161.119919] LustreError: Skipped 1 previous similar message [ 2166.120304] LustreError: 137-5: lustre-OST0000: not available for connect from 0@lo (no target). If you are running an HA pair check that the target is mounted on the other server. [ 2166.129639] LustreError: Skipped 1 previous similar message [ 2167.879477] Lustre: lustre-OST0000: Will be in recovery for at least 1:00, or until 2 clients reconnect [ 2167.885277] Lustre: Skipped 2 previous similar messages [ 2168.540538] Lustre: DEBUG MARKER: oleg449-server.virtnet: executing set_default_debug -1 all [ 2168.689612] Lustre: lustre-OST0000: Recovery over after 0:01, of 2 clients 2 recovered and 0 were evicted. 
[ 2168.699815] Lustre: Skipped 2 previous similar messages [ 2170.812417] Lustre: DEBUG MARKER: oleg449-client.virtnet: executing wait_import_state_mount (FULL|IDLE) osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid [ 2171.196458] Lustre: DEBUG MARKER: osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid in FULL state after 0 sec [ 2483.064794] Lustre: Failing over lustre-OST0000 [ 2483.096747] Lustre: server umount lustre-OST0000 complete [ 2484.452153] LustreError: 11-0: lustre-OST0000-osc-MDT0000: operation ost_create to node 0@lo failed: rc = -107 [ 2484.457164] Lustre: lustre-OST0000-osc-MDT0000: Connection to lustre-OST0000 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete [ 2484.465258] Lustre: Skipped 5 previous similar messages [ 2484.468326] LustreError: 137-5: lustre-OST0000: not available for connect from 0@lo (no target). If you are running an HA pair check that the target is mounted on the other server. [ 2495.556693] Lustre: lustre-OST0000: Imperative Recovery not enabled, recovery window 60-180 [ 2495.558506] Lustre: Skipped 3 previous similar messages [ 2495.561274] Lustre: lustre-OST0000: in recovery but waiting for the first client to connect [ 2495.563164] Lustre: Skipped 3 previous similar messages [ 2496.657884] Lustre: lustre-OST0000: Will be in recovery for at least 1:00, or until 2 clients reconnect [ 2496.699807] Lustre: lustre-OST0000-osc-MDT0000: Connection restored to 192.168.204.149@tcp (at 0@lo) [ 2496.699975] Lustre: lustre-OST0000: Recovery over after 0:01, of 2 clients 2 recovered and 0 were evicted. 
[ 2496.708331] Lustre: Skipped 6 previous similar messages [ 2497.057742] Lustre: DEBUG MARKER: oleg449-server.virtnet: executing set_default_debug -1 all [ 2499.255016] Lustre: DEBUG MARKER: oleg449-client.virtnet: executing wait_import_state_mount (FULL|IDLE) osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid [ 2499.628534] Lustre: DEBUG MARKER: osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid in FULL state after 0 sec [ 2718.804808] Lustre: lustre-OST0001-osc-MDT0000: update sequence from 0x280000400 to 0x280000401 [ 2737.621760] Lustre: lustre-OST0000-osc-MDT0000: update sequence from 0x240000400 to 0x240000bd0 [ 2814.090034] Lustre: Failing over lustre-OST0000 [ 2814.134377] Lustre: server umount lustre-OST0000 complete [ 2816.057181] LustreError: 137-5: lustre-OST0000: not available for connect from 0@lo (no target). If you are running an HA pair check that the target is mounted on the other server. [ 2816.064730] LustreError: Skipped 4 previous similar messages [ 2826.484016] Lustre: lustre-OST0000: Imperative Recovery enabled, recovery window shrunk from 60-180 down to 60-180 [ 2828.193966] Lustre: DEBUG MARKER: oleg449-server.virtnet: executing set_default_debug -1 all [ 2830.540654] Lustre: DEBUG MARKER: oleg449-client.virtnet: executing wait_import_state_mount (FULL|IDLE) osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid [ 2830.912129] Lustre: DEBUG MARKER: osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid in FULL state after 0 sec [ 3109.030051] Lustre: DEBUG MARKER: == recovery-small test 53a: touch: drop rep ============== 22:01:24 (1716256884) [ 3109.611229] Lustre: *** cfs_fail_loc=157, val=2147483648*** [ 3109.613461] Lustre: Skipped 3 previous similar messages [ 3109.615069] LustreError: 1850:0:(ldlm_lib.c:3271:target_send_reply_msg()) @@@ dropping reply req@ffff88009e596680 x1799622653655168/t0(0) o101->2334086a-d9b9-46aa-b8af-be07c8ec2467@192.168.204.49@tcp:710/0 lens 576/688 e 0 to 0 dl 1716256895 ref 1 fl Interpret:/200/0 rc 0/0 
job:'openfile.0' uid:0 gid:0 [ 3109.621972] LustreError: 1850:0:(ldlm_lib.c:3271:target_send_reply_msg()) Skipped 1 previous similar message [ 3125.632153] Lustre: lustre-MDT0000: Client 2334086a-d9b9-46aa-b8af-be07c8ec2467 (at 192.168.204.49@tcp) reconnecting [ 3125.637907] Lustre: Skipped 4 previous similar messages [ 3130.405773] Lustre: DEBUG MARKER: == recovery-small test 53b: touch: drop rep ============== 22:01:45 (1716256905) [ 3130.827706] LustreError: 1045:0:(ldlm_lib.c:3271:target_send_reply_msg()) @@@ dropping reply req@ffff8800a06ae300 x1799622653659712/t0(0) o101->2334086a-d9b9-46aa-b8af-be07c8ec2467@192.168.204.49@tcp:731/0 lens 576/688 e 0 to 0 dl 1716256916 ref 1 fl Interpret:/200/0 rc 0/0 job:'openfile.0' uid:0 gid:0 [ 3151.459405] Lustre: DEBUG MARKER: == recovery-small test 53c: touch: drop rep ============== 22:02:06 (1716256926) [ 3151.965404] Lustre: *** cfs_fail_loc=157, val=2147483648*** [ 3151.967980] Lustre: Skipped 1 previous similar message [ 3151.970259] LustreError: 1043:0:(ldlm_lib.c:3271:target_send_reply_msg()) @@@ dropping reply req@ffff88009eba8000 x1799622653661312/t64424914435(0) o101->2334086a-d9b9-46aa-b8af-be07c8ec2467@192.168.204.49@tcp:752/0 lens 664/664 e 0 to 0 dl 1716256937 ref 1 fl Interpret:/200/0 rc 0/0 job:'openfile.0' uid:0 gid:0 [ 3167.966454] Lustre: 1045:0:(mdt_recovery.c:148:mdt_req_from_lrd()) @@@ restoring transno req@ffff88009ead3b80 x1799622653661312/t64424914435(0) o101->2334086a-d9b9-46aa-b8af-be07c8ec2467@192.168.204.49@tcp:13/0 lens 664/3488 e 0 to 0 dl 1716256953 ref 1 fl Interpret:H/202/0 rc 0/0 job:'openfile.0' uid:0 gid:0 [ 3167.980413] Lustre: 1045:0:(mdt_recovery.c:148:mdt_req_from_lrd()) Skipped 1 previous similar message [ 3172.744137] Lustre: DEBUG MARKER: == recovery-small test 54: back in time ================== 22:02:27 (1716256947) [ 3183.681933] Lustre: Failing over lustre-MDT0000 [ 3183.812434] Lustre: server umount lustre-MDT0000 complete [ 3196.238487] LustreError: 166-1: 
MGC192.168.204.149@tcp: Connection to MGS (at 0@lo) was lost; in progress operations using this service will fail [ 3196.393288] Lustre: lustre-MDT0000-lwp-OST0000: Connection to lustre-MDT0000 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete [ 3196.401268] Lustre: Skipped 1 previous similar message [ 3196.435691] Lustre: lustre-MDT0000: Imperative Recovery not enabled, recovery window 60-180 [ 3196.467756] Lustre: lustre-MDT0000: in recovery but waiting for the first client to connect [ 3196.470715] Lustre: Skipped 1 previous similar message [ 3197.522514] Lustre: DEBUG MARKER: oleg449-server.virtnet: executing set_default_debug -1 all [ 3197.907497] Lustre: lustre-MDT0000: Will be in recovery for at least 1:00, or until 2 clients reconnect [ 3197.911223] Lustre: Skipped 1 previous similar message [ 3197.934055] Lustre: lustre-MDT0000: Recovery over after 0:01, of 2 clients 2 recovered and 0 were evicted. [ 3197.939110] Lustre: Skipped 1 previous similar message [ 3197.958354] Lustre: lustre-OST0001: new connection from lustre-MDT0000-mdtlov (cleaning up unused objects from 0x280000401:27035 to 0x280000401:27073) [ 3197.958365] Lustre: lustre-OST0000: new connection from lustre-MDT0000-mdtlov (cleaning up unused objects from 0x240000bd0:24989 to 0x240000bd0:25345) [ 3200.313825] Lustre: DEBUG MARKER: oleg449-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid [ 3200.864709] Lustre: DEBUG MARKER: mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec [ 3201.433804] Lustre: lustre-MDT0000-lwp-OST0001: Connection restored to 192.168.204.149@tcp (at 0@lo) [ 3201.437813] Lustre: Skipped 1 previous similar message [ 3202.401243] Lustre: 3231:0:(client.c:2343:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1716256961/real 1716256961] req@ffff880091811f80 x1799622587627264/t0(0) o400->lustre-MDT0000-lwp-OST0001@0@lo:12/10 lens 
224/224 e 0 to 1 dl 1716256977 ref 1 fl Rpc:XNQr/200/ffffffff rc 0/-1 job:'kworker.0' uid:0 gid:0 [ 3202.422336] Lustre: 3231:0:(client.c:2343:ptlrpc_expire_one_request()) Skipped 23 previous similar messages [ 3206.530283] Lustre: DEBUG MARKER: == recovery-small test 55: ost_brw_read/write drops timed-out read/write request ========================================================== 22:03:01 (1716256981) [ 3210.482438] Lustre: *** cfs_fail_loc=21d, val=0*** [ 3210.483440] Lustre: Skipped 3 previous similar messages [ 3210.484342] LustreError: 21850:0:(tgt_handler.c:2780:tgt_brw_write()) lustre-OST0000: Dropping timed-out write from 12345-192.168.204.49@tcp because locking object 0x240000bd0:25347 took 0 seconds (limit was 11). [ 3210.487734] Lustre: lustre-OST0000: Bulk IO write error with 2334086a-d9b9-46aa-b8af-be07c8ec2467 (at 192.168.204.49@tcp), client will retry: rc = -110 [ 3210.490482] Lustre: Skipped 1 previous similar message [ 3226.520842] Lustre: lustre-OST0000: Client 2334086a-d9b9-46aa-b8af-be07c8ec2467 (at 192.168.204.49@tcp) reconnecting [ 3226.526179] Lustre: Skipped 2 previous similar messages [ 3226.532764] LustreError: 21850:0:(tgt_handler.c:2780:tgt_brw_write()) lustre-OST0000: Dropping timed-out write from 12345-192.168.204.49@tcp because locking object 0x240000bd0:25347 took 0 seconds (limit was 11). [ 3226.532866] Lustre: lustre-OST0000: Bulk IO write error with 2334086a-d9b9-46aa-b8af-be07c8ec2467 (at 192.168.204.49@tcp), client will retry: rc = -110 [ 3226.532868] Lustre: Skipped 7 previous similar messages [ 3226.552614] LustreError: 21850:0:(tgt_handler.c:2780:tgt_brw_write()) Skipped 16 previous similar messages [ 3242.536344] LustreError: 21850:0:(tgt_handler.c:2780:tgt_brw_write()) lustre-OST0000: Dropping timed-out write from 12345-192.168.204.49@tcp because locking object 0x240000bd0:25347 took 0 seconds (limit was 11). 
[ 3242.536630] Lustre: lustre-OST0000: Bulk IO write error with 2334086a-d9b9-46aa-b8af-be07c8ec2467 (at 192.168.204.49@tcp), client will retry: rc = -110 [ 3242.536632] Lustre: Skipped 8 previous similar messages [ 3242.552627] LustreError: 21850:0:(tgt_handler.c:2780:tgt_brw_write()) Skipped 8 previous similar messages [ 3258.548387] LustreError: 30242:0:(tgt_handler.c:2780:tgt_brw_write()) lustre-OST0000: Dropping timed-out write from 12345-192.168.204.49@tcp because locking object 0x240000bd0:25347 took 0 seconds (limit was 11). [ 3258.548422] Lustre: lustre-OST0000: Bulk IO write error with 2334086a-d9b9-46aa-b8af-be07c8ec2467 (at 192.168.204.49@tcp), client will retry: rc = -110 [ 3258.548424] Lustre: Skipped 8 previous similar messages [ 3258.565463] LustreError: 30242:0:(tgt_handler.c:2780:tgt_brw_write()) Skipped 8 previous similar messages [ 3274.552907] Lustre: *** cfs_fail_loc=21d, val=0*** [ 3274.552986] LustreError: 30242:0:(tgt_handler.c:2780:tgt_brw_write()) lustre-OST0000: Dropping timed-out write from 12345-192.168.204.49@tcp because locking object 0x240000bd0:25347 took 0 seconds (limit was 11). [ 3274.552989] LustreError: 30242:0:(tgt_handler.c:2780:tgt_brw_write()) Skipped 11 previous similar messages [ 3274.553009] Lustre: lustre-OST0000: Bulk IO write error with 2334086a-d9b9-46aa-b8af-be07c8ec2467 (at 192.168.204.49@tcp), client will retry: rc = -110 [ 3274.553011] Lustre: Skipped 19 previous similar messages [ 3274.575768] Lustre: Skipped 64 previous similar messages [ 3290.556027] LustreError: 8422:0:(tgt_handler.c:2780:tgt_brw_write()) lustre-OST0000: Dropping timed-out write from 12345-192.168.204.49@tcp because locking object 0x240000bd0:25347 took 0 seconds (limit was 11). 
[ 3290.556158] Lustre: lustre-OST0000: Bulk IO write error with 2334086a-d9b9-46aa-b8af-be07c8ec2467 (at 192.168.204.49@tcp), client will retry: rc = -110 [ 3290.556160] Lustre: Skipped 19 previous similar messages [ 3290.572508] LustreError: 8422:0:(tgt_handler.c:2780:tgt_brw_write()) Skipped 38 previous similar messages [ 3306.561060] LustreError: 8422:0:(tgt_handler.c:2780:tgt_brw_write()) lustre-OST0000: Dropping timed-out write from 12345-192.168.204.49@tcp because locking object 0x240000bd0:25347 took 0 seconds (limit was 11). [ 3306.561166] Lustre: lustre-OST0000: Bulk IO write error with 2334086a-d9b9-46aa-b8af-be07c8ec2467 (at 192.168.204.49@tcp), client will retry: rc = -110 [ 3306.561168] Lustre: Skipped 19 previous similar messages [ 3306.580975] LustreError: 8422:0:(tgt_handler.c:2780:tgt_brw_write()) Skipped 19 previous similar messages [ 3338.569506] LustreError: 8421:0:(tgt_handler.c:2780:tgt_brw_write()) lustre-OST0000: Dropping timed-out write from 12345-192.168.204.49@tcp because locking object 0x240000bd0:25347 took 0 seconds (limit was 11). [ 3338.570752] Lustre: lustre-OST0000: Bulk IO write error with 2334086a-d9b9-46aa-b8af-be07c8ec2467 (at 192.168.204.49@tcp), client will retry: rc = -110 [ 3338.570754] Lustre: Skipped 39 previous similar messages [ 3338.585512] LustreError: 8421:0:(tgt_handler.c:2780:tgt_brw_write()) Skipped 39 previous similar messages [ 3386.592858] Lustre: lustre-OST0000: Client 2334086a-d9b9-46aa-b8af-be07c8ec2467 (at 192.168.204.49@tcp) reconnecting [ 3386.598398] Lustre: Skipped 9 previous similar messages [ 3402.609534] Lustre: *** cfs_fail_loc=21d, val=0*** [ 3402.609910] LustreError: 30242:0:(tgt_handler.c:2780:tgt_brw_write()) lustre-OST0000: Dropping timed-out write from 12345-192.168.204.49@tcp because locking object 0x240000bd0:25347 took 0 seconds (limit was 11). 
[ 3402.609914] LustreError: 30242:0:(tgt_handler.c:2780:tgt_brw_write()) Skipped 60 previous similar messages [ 3402.609933] Lustre: lustre-OST0000: Bulk IO write error with 2334086a-d9b9-46aa-b8af-be07c8ec2467 (at 192.168.204.49@tcp), client will retry: rc = -110 [ 3402.609935] Lustre: Skipped 79 previous similar messages [ 3402.631782] Lustre: Skipped 159 previous similar messages [ 3491.000378] Lustre: DEBUG MARKER: == recovery-small test 56: do not fail on getattr resend ========================================================== 22:07:46 (1716257266) [ 3491.398517] LustreError: 20362:0:(mdt_handler.c:2320:mdt_getattr_name_lock()) cfs_fail_timeout id 136 sleeping for 40000ms [ 3531.405204] LustreError: 20362:0:(mdt_handler.c:2320:mdt_getattr_name_lock()) cfs_fail_timeout id 136 awake [ 3536.151727] Lustre: DEBUG MARKER: == recovery-small test 57: read procfs entries causes kernel crash ========================================================== 22:08:31 (1716257311) [ 3538.063636] Lustre: Failing over lustre-MDT0000 [ 3538.226923] Lustre: server umount lustre-MDT0000 complete [ 3540.959160] LustreError: 166-1: MGC192.168.204.149@tcp: Connection to MGS (at 0@lo) was lost; in progress operations using this service will fail [ 3541.201185] Lustre: lustre-MDT0000: Aborting client recovery [ 3541.201427] Lustre: lustre-OST0001: new connection from lustre-MDT0000-mdtlov (cleaning up unused objects from 0x280000401:27035 to 0x280000401:27105) [ 3541.202348] Lustre: lustre-OST0000: new connection from lustre-MDT0000-mdtlov (cleaning up unused objects from 0x240000bd0:25349 to 0x240000bd0:25377) [ 3542.255634] Lustre: DEBUG MARKER: oleg449-server.virtnet: executing set_default_debug -1 all [ 3549.995622] Lustre: DEBUG MARKER: == recovery-small test 58: Eviction in the middle of open RPC reply processing ========================================================== 22:08:45 (1716257325) [ 3567.148276] Lustre: 25221:0:(client.c:2343:ptlrpc_expire_one_request()) @@@ Request 
sent has timed out for slow reply: [sent 1716257326/real 1716257326] req@ffff880135cc5500 x1799622587663104/t0(0) o104->lustre-MDT0000@192.168.204.49@tcp:15/16 lens 328/224 e 0 to 1 dl 1716257342 ref 1 fl Rpc:XQr/0/ffffffff rc 0/-1 job:'' uid:4294967295 gid:4294967295 [ 3567.165368] Lustre: 25221:0:(client.c:2343:ptlrpc_expire_one_request()) Skipped 3 previous similar messages [ 3571.546633] Lustre: DEBUG MARKER: == recovery-small test 59: Read cancel race on client eviction ========================================================== 22:09:06 (1716257346) [ 3582.026015] LustreError: 21857:0:(ldlm_lockd.c:780:ldlm_handle_ast_error()) ### client (nid 192.168.204.49@tcp) returned error from blocking AST (req@ffff88012c96e680 x1799622587666176 status -107 rc -107), evict it ns: filter-lustre-OST0000_UUID lock: ffff880133584480/0xc8a46ac833c03dc9 lrc: 4/0,0 mode: PW/PW res: [0x240000bd0:0x6322:0x0].0x0 rrc: 3 type: EXT [0->18446744073709551615] (req 0->4095) gid 0 flags: 0x60000400000020 nid: 192.168.204.49@tcp remote: 0x443fa76813816490 expref: 5 pid: 21857 timeout: 3681 lvb_type: 0 [ 3582.045473] LustreError: 138-a: lustre-OST0000: A client on nid 192.168.204.49@tcp was evicted due to a lock blocking callback time out: rc -107 [ 3582.051255] LustreError: 6963:0:(ldlm_lockd.c:261:expired_lock_main()) ### lock callback timer expired after 0s: evicting client at 192.168.204.49@tcp ns: filter-lustre-OST0000_UUID lock: ffff880133584480/0xc8a46ac833c03dc9 lrc: 3/0,0 mode: PW/PW res: [0x240000bd0:0x6322:0x0].0x0 rrc: 3 type: EXT [0->18446744073709551615] (req 0->4095) gid 0 flags: 0x60000400000020 nid: 192.168.204.49@tcp remote: 0x443fa76813816490 expref: 6 pid: 21857 timeout: 0 lvb_type: 0 [ 3586.284871] Lustre: DEBUG MARKER: == recovery-small test 60: Add Changelog entries during MDS failover ========================================================== 22:09:21 (1716257361) [ 3586.322789] LustreError: 26579:0:(ldlm_lockd.c:780:ldlm_handle_ast_error()) ### client (nid 
192.168.204.49@tcp) returned error from blocking AST (req@ffff88012cf32680 x1799622587666688 status -107 rc -107), evict it ns: mdt-lustre-MDT0000_UUID lock: ffff880133776880/0xc8a46ac833c03de5 lrc: 4/0,0 mode: PR/PR res: [0x200000007:0x1:0x0].0x0 bits 0x13/0x0 rrc: 3 type: IBT gid 0 flags: 0x60200400000020 nid: 192.168.204.49@tcp remote: 0x443fa7681381649e expref: 6 pid: 25221 timeout: 3685 lvb_type: 0 [ 3586.343884] LustreError: 138-a: lustre-MDT0000: A client on nid 192.168.204.49@tcp was evicted due to a lock blocking callback time out: rc -107 [ 3586.349964] LustreError: 6963:0:(ldlm_lockd.c:261:expired_lock_main()) ### lock callback timer expired after 0s: evicting client at 192.168.204.49@tcp ns: mdt-lustre-MDT0000_UUID lock: ffff880133776880/0xc8a46ac833c03de5 lrc: 3/0,0 mode: PR/PR res: [0x200000007:0x1:0x0].0x0 bits 0x13/0x0 rrc: 3 type: IBT gid 0 flags: 0x60200400000020 nid: 192.168.204.49@tcp remote: 0x443fa7681381649e expref: 7 pid: 25221 timeout: 0 lvb_type: 0 [ 3587.314794] Lustre: lustre-MDD0000: changelog on [ 3602.283918] Lustre: lustre-OST0001: haven't heard from client 0a567fcb-db78-42ce-953e-02796119ccfc (at 192.168.204.49@tcp) in 31 seconds. I think it's dead, and I am evicting it. 
exp ffff8800a71ca000, cur 1716257377 expire 1716257347 last 1716257346 [ 3617.763875] Lustre: Failing over lustre-MDT0000 [ 3617.963062] Lustre: server umount lustre-MDT0000 complete [ 3630.299454] LustreError: 166-1: MGC192.168.204.149@tcp: Connection to MGS (at 0@lo) was lost; in progress operations using this service will fail [ 3630.523164] Lustre: lustre-MDD0000: changelog on [ 3631.384301] Lustre: DEBUG MARKER: oleg449-server.virtnet: executing set_default_debug -1 all [ 3636.874258] Lustre: 30747:0:(mdt_recovery.c:148:mdt_req_from_lrd()) @@@ restoring transno req@ffff880132453b80 x1799622655085888/t73014459427(0) o36->36b8705c-07f5-4184-8ff0-707ace5868f9@192.168.204.49@tcp:521/0 lens 504/2888 e 0 to 0 dl 1716257461 ref 1 fl Interpret:/202/0 rc 0/0 job:'unlinkmany.0' uid:0 gid:0 [ 3636.887797] Lustre: lustre-OST0000: new connection from lustre-MDT0000-mdtlov (cleaning up unused objects from 0x240000bd0:27879 to 0x240000bd0:27905) [ 3636.887885] Lustre: lustre-OST0001: new connection from lustre-MDT0000-mdtlov (cleaning up unused objects from 0x280000401:29607 to 0x280000401:29633) [ 3676.535237] Lustre: 3229:0:(client.c:2343:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1716257396/real 1716257396] req@ffff88008fa57100 x1799622587679744/t0(0) o400->lustre-MDT0000-lwp-OST0001@0@lo:12/10 lens 224/224 e 0 to 1 dl 1716257451 ref 1 fl Rpc:XNQr/200/ffffffff rc 0/-1 job:'kworker.0' uid:0 gid:0 [ 3676.626275] Lustre: lustre-MDD0000: changelog off [ 3681.841536] Lustre: DEBUG MARKER: == recovery-small test 61: Verify to not reuse orphan objects - bug 17025 ========================================================== 22:10:56 (1716257456) [ 3683.311429] LustreError: 1904:0:(osd_handler.c:717:osd_ro()) lustre-MDT0000: *** setting device osd-zfs read-only *** [ 3683.637472] Lustre: DEBUG MARKER: mds1 REPLAY BARRIER on lustre-MDT0000 [ 3684.734827] Lustre: Failing over lustre-MDT0000 [ 3684.874209] Lustre: server umount lustre-MDT0000 
complete [ 3687.876707] Lustre: lustre-MDT0000: Aborting client recovery [ 3687.879124] LustreError: 2983:0:(ldlm_lib.c:2927:target_stop_recovery_thread()) lustre-MDT0000: Aborting recovery [ 3687.883306] Lustre: 3117:0:(ldlm_lib.c:2310:target_recovery_overseer()) recovery is aborted, evict exports in recovery [ 3687.888152] Lustre: 3117:0:(ldlm_lib.c:2310:target_recovery_overseer()) Skipped 2 previous similar messages [ 3687.892352] Lustre: 3117:0:(genops.c:1528:class_disconnect_stale_exports()) lustre-MDT0000: disconnect stale client 36b8705c-07f5-4184-8ff0-707ace5868f9@ [ 3687.898631] Lustre: 3117:0:(genops.c:1528:class_disconnect_stale_exports()) Skipped 1 previous similar message [ 3687.903044] Lustre: lustre-MDT0000: disconnecting 1 stale clients [ 3687.920607] Lustre: lustre-MDT0000-osd: cancel update llog [0x200004a50:0x1:0x0] [ 3687.974604] Lustre: lustre-OST0001: new connection from lustre-MDT0000-mdtlov (cleaning up unused objects from 0x280000401:29607 to 0x280000401:29665) [ 3687.974796] Lustre: lustre-OST0000: new connection from lustre-MDT0000-mdtlov (cleaning up unused objects from 0x240000bd0:27879 to 0x240000bd0:27937) [ 3689.030698] Lustre: DEBUG MARKER: oleg449-server.virtnet: executing set_default_debug -1 all [ 3696.748141] Lustre: DEBUG MARKER: == recovery-small test 65: lock enqueue for destroyed export ========================================================== 22:11:11 (1716257471) [ 3697.303388] LustreError: 22954:0:(ldlm_lockd.c:1477:ldlm_handle_enqueue()) cfs_fail_timeout id 31e sleeping for 6000ms [ 3697.331215] Lustre: *** cfs_fail_loc=31e, val=0*** [ 3697.333449] Lustre: Skipped 2 previous similar messages [ 3699.305584] LustreError: 22900:0:(ldlm_lockd.c:1477:ldlm_handle_enqueue()) cfs_fail_timeout id 31e sleeping for 6000ms [ 3701.625277] Lustre: 4695:0:(genops.c:1671:obd_export_evict_by_uuid()) lustre-OST0000: evicting 36b8705c-07f5-4184-8ff0-707ace5868f9 at adminstrative request [ 3701.631591] LustreError: 
6961:0:(ldlm_lockd.c:2996:ldlm_bl_thread_exports()) cfs_fail_timeout id 31e sleeping for 4000ms [ 3703.309199] LustreError: 22954:0:(ldlm_lockd.c:1477:ldlm_handle_enqueue()) cfs_fail_timeout id 31e awake [ 3703.313376] LustreError: 22954:0:(ldlm_lockd.c:1499:ldlm_handle_enqueue()) ### lock on destroyed export ffff8800a5b76800 ns: filter-lustre-OST0000_UUID lock: ffff88009e815680/0xc8a46ac833c799b0 lrc: 3/0,0 mode: --/PW res: [0x240000bd0:0x6d23:0x0].0x0 rrc: 4 type: EXT [0->4095] (req 0->4095) gid 0 flags: 0x70000000020020 nid: 192.168.204.49@tcp remote: 0x443fa76813826d5c expref: 4 pid: 22954 timeout: 0 lvb_type: 0 [ 3704.013281] LustreError: 22900:0:(ldlm_lockd.c:1477:ldlm_handle_enqueue()) cfs_fail_timeout interrupted [ 3713.349664] Lustre: lustre-OST0000: Client 731e5dbc-c38f-4739-96ac-ef83c848c202 (at 192.168.204.49@tcp) reconnecting [ 3713.354787] Lustre: Skipped 8 previous similar messages [ 3717.862525] Lustre: DEBUG MARKER: == recovery-small test 66: lock enqueue re-send vs client eviction ========================================================== 22:11:32 (1716257492) [ 3718.436536] Lustre: *** cfs_fail_loc=157, val=2147483648*** [ 3718.439130] LustreError: 3027:0:(ldlm_lib.c:3271:target_send_reply_msg()) @@@ dropping reply req@ffff88012df1b100 x1799622656012544/t0(0) o101->36b8705c-07f5-4184-8ff0-707ace5868f9@192.168.204.49@tcp:564/0 lens 576/688 e 0 to 0 dl 1716257504 ref 1 fl Interpret:/200/0 rc 0/0 job:'stat.0' uid:0 gid:0 [ 3720.379790] LustreError: 3027:0:(mdt_handler.c:2320:mdt_getattr_name_lock()) cfs_fail_timeout id 136 sleeping for 40000ms [ 3722.704751] Lustre: 5782:0:(genops.c:1671:obd_export_evict_by_uuid()) lustre-MDT0000: evicting 36b8705c-07f5-4184-8ff0-707ace5868f9 at adminstrative request [ 3723.184195] LustreError: 3027:0:(mdt_handler.c:2320:mdt_getattr_name_lock()) cfs_fail_timeout interrupted [ 3723.188802] LustreError: 3027:0:(mdt_handler.c:2320:mdt_getattr_name_lock()) Skipped 1 previous similar message [ 3727.395947] Lustre: DEBUG 
MARKER: == recovery-small test 67: connect vs import invalidate race ========================================================== 22:11:42 (1716257502) [ 3732.710625] Lustre: 6679:0:(genops.c:1671:obd_export_evict_by_uuid()) lustre-MDT0000: evicting 36b8705c-07f5-4184-8ff0-707ace5868f9 at adminstrative request [ 3748.645392] Lustre: DEBUG MARKER: == recovery-small test 100: IR: Make sure normal recovery still works w/o IR ========================================================== 22:12:03 (1716257523) [ 3750.335671] Lustre: Failing over lustre-OST0000 [ 3750.356514] Lustre: server umount lustre-OST0000 complete [ 3752.904123] LustreError: 11-0: lustre-OST0000-osc-MDT0000: operation ost_statfs to node 0@lo failed: rc = -107 [ 3752.909432] LustreError: 137-5: lustre-OST0000: not available for connect from 0@lo (no target). If you are running an HA pair check that the target is mounted on the other server. [ 3752.915661] LustreError: Skipped 4 previous similar messages [ 3757.944065] LustreError: 137-5: lustre-OST0000: not available for connect from 0@lo (no target). If you are running an HA pair check that the target is mounted on the other server. [ 3757.952639] LustreError: Skipped 1 previous similar message [ 3764.491172] Lustre: DEBUG MARKER: oleg449-server.virtnet: executing set_default_debug -1 all [ 3768.588210] Lustre: DEBUG MARKER: oleg449-client.virtnet: executing wait_import_state_mount (FULL|IDLE) osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid [ 3769.144131] Lustre: DEBUG MARKER: osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid in FULL state after 0 sec [ 3775.447723] Lustre: DEBUG MARKER: == recovery-small test 101a: IR: Make sure IR works w/o normal recovery ========================================================== 22:12:30 (1716257550) [ 3776.762226] Lustre: Failing over lustre-OST0000 [ 3776.779453] Lustre: server umount lustre-OST0000 complete [ 3777.784811] LustreError: 137-5: lustre-OST0000: not available for connect from 0@lo (no target). 
If you are running an HA pair check that the target is mounted on the other server. [ 3777.791693] LustreError: Skipped 1 previous similar message [ 3789.121502] Lustre: lustre-OST0000: Imperative Recovery enabled, recovery window shrunk from 60-180 down to 60-180 [ 3790.844618] Lustre: DEBUG MARKER: oleg449-server.virtnet: executing set_default_debug -1 all [ 3793.637000] Lustre: DEBUG MARKER: oleg449-client.virtnet: executing wait_import_state_mount (FULL|IDLE) osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid [ 3794.186305] Lustre: DEBUG MARKER: osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid in FULL state after 0 sec [ 3800.468884] Lustre: DEBUG MARKER: == recovery-small test 101b: IR: Make sure IR works w/o normal recovery and proceed EAGAIN ========================================================== 22:12:55 (1716257575) [ 3802.114387] Lustre: Failing over lustre-OST0000 [ 3802.132454] Lustre: server umount lustre-OST0000 complete [ 3804.136480] Lustre: lustre-OST0000-osc-MDT0000: Connection to lustre-OST0000 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete [ 3804.143391] Lustre: Skipped 9 previous similar messages [ 3804.146437] LustreError: 137-5: lustre-OST0000: not available for connect from 0@lo (no target). If you are running an HA pair check that the target is mounted on the other server. 
[ 3804.153518] LustreError: Skipped 3 previous similar messages [ 3814.473215] Lustre: lustre-OST0000: Imperative Recovery enabled, recovery window shrunk from 60-180 down to 60-180 [ 3814.481362] Lustre: lustre-OST0000: in recovery but waiting for the first client to connect [ 3814.481460] LustreError: 13494:0:(ofd_dev.c:651:ofd_prepare()) cfs_fail_timeout id 247 sleeping for 25000ms [ 3814.489187] Lustre: Skipped 6 previous similar messages [ 3839.581197] LustreError: 13494:0:(ofd_dev.c:651:ofd_prepare()) cfs_fail_timeout id 247 awake [ 3841.249228] Lustre: DEBUG MARKER: oleg449-server.virtnet: executing set_default_debug -1 all [ 3844.504120] Lustre: lustre-OST0000: Will be in recovery for at least 1:00, or until 2 clients reconnect [ 3844.508213] Lustre: Skipped 3 previous similar messages [ 3844.572861] Lustre: lustre-OST0000-osc-MDT0000: Connection restored to 192.168.204.149@tcp (at 0@lo) [ 3844.572888] Lustre: lustre-OST0000: Recovery over after 0:01, of 2 clients 2 recovered and 0 were evicted. [ 3844.572890] Lustre: Skipped 3 previous similar messages [ 3844.585196] Lustre: Skipped 9 previous similar messages [ 3845.398297] Lustre: DEBUG MARKER: oleg449-client.virtnet: executing wait_import_state_mount (FULL|IDLE) osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid [ 3845.963946] Lustre: DEBUG MARKER: osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid in FULL state after 0 sec [ 3851.647529] Lustre: DEBUG MARKER: == recovery-small test 102: IR: New client gets updated nidtbl after MGS restart ========================================================== 22:13:46 (1716257626) [ 3852.852839] Lustre: Failing over lustre-OST0000 [ 3852.864509] Lustre: server umount lustre-OST0000 complete [ 3854.520136] LustreError: 137-5: lustre-OST0000: not available for connect from 0@lo (no target). If you are running an HA pair check that the target is mounted on the other server. 
[ 3854.523071] LustreError: Skipped 3 previous similar messages [ 3864.163550] Lustre: lustre-OST0000: Imperative Recovery enabled, recovery window shrunk from 60-180 down to 60-180 [ 3865.089588] Lustre: DEBUG MARKER: oleg449-server.virtnet: executing set_default_debug -1 all [ 3867.188736] Lustre: DEBUG MARKER: oleg449-client.virtnet: executing wait_import_state_mount (FULL|IDLE) osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid [ 3867.511911] Lustre: DEBUG MARKER: osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid in FULL state after 0 sec [ 3869.761023] Lustre: Failing over lustre-MDT0000 [ 3869.865187] Lustre: server umount lustre-MDT0000 complete [ 3871.150959] LustreError: 166-1: MGC192.168.204.149@tcp: Connection to MGS (at 0@lo) was lost; in progress operations using this service will fail [ 3871.153542] LustreError: Skipped 1 previous similar message [ 3871.272537] Lustre: lustre-MDT0000: Imperative Recovery not enabled, recovery window 60-180 [ 3871.274138] Lustre: Skipped 4 previous similar messages [ 3871.296436] Lustre: lustre-OST0001: new connection from lustre-MDT0000-mdtlov (cleaning up unused objects from 0x280000401:29667 to 0x280000401:29697) [ 3871.296450] Lustre: lustre-OST0000: new connection from lustre-MDT0000-mdtlov (cleaning up unused objects from 0x240000bd0:27941 to 0x240000bd0:27969) [ 3871.910368] Lustre: DEBUG MARKER: oleg449-server.virtnet: executing set_default_debug -1 all [ 3872.753104] Lustre: Failing over lustre-OST0000 [ 3872.775480] Lustre: server umount lustre-OST0000 complete [ 3876.263661] LustreError: 11-0: lustre-OST0000-osc-MDT0000: operation ost_statfs to node 0@lo failed: rc = -107 [ 3885.148975] Lustre: DEBUG MARKER: oleg449-server.virtnet: executing set_default_debug -1 all [ 3887.193964] Lustre: DEBUG MARKER: oleg449-client.virtnet: executing wait_import_state_mount (FULL|IDLE) osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid [ 3891.034183] Lustre: DEBUG MARKER: == recovery-small test 103: IR: MDS can start w/o MGS 
and get updated nidtbl later ========================================================== 22:14:26 (1716257666) [ 3891.553801] Lustre: DEBUG MARKER: SKIP: recovery-small test_103 needs separate mgs and mds [ 3893.181367] Lustre: DEBUG MARKER: == recovery-small test 104: IR: ost can disable IR voluntarily ========================================================== 22:14:28 (1716257668) [ 3893.983928] Lustre: Failing over lustre-OST0000 [ 3893.993586] Lustre: server umount lustre-OST0000 complete [ 3896.531263] Lustre: DEBUG MARKER: oleg449-server.virtnet: executing set_default_debug -1 all [ 3902.013901] Lustre: DEBUG MARKER: == recovery-small test 105: IR: NON IR clients support === 22:14:37 (1716257677) [ 3902.528846] Lustre: DEBUG MARKER: SKIP: recovery-small test_105 Needs multiple clients [ 3905.272570] Lustre: DEBUG MARKER: == recovery-small test 106: lightweight connection support ========================================================== 22:14:40 (1716257680) [ 3907.258365] LustreError: 24270:0:(osd_handler.c:717:osd_ro()) lustre-MDT0000: *** setting device osd-zfs read-only *** [ 3907.583323] Lustre: DEBUG MARKER: mds1 REPLAY BARRIER on lustre-MDT0000 [ 3908.313542] Lustre: Failing over lustre-MDT0000 [ 3908.471407] Lustre: server umount lustre-MDT0000 complete [ 3922.106264] Lustre: DEBUG MARKER: oleg449-server.virtnet: executing set_default_debug -1 all [ 3922.895374] Lustre: lustre-OST0000: new connection from lustre-MDT0000-mdtlov (cleaning up unused objects from 0x240000bd0:27941 to 0x240000bd0:28001) [ 3922.895433] Lustre: lustre-OST0001: new connection from lustre-MDT0000-mdtlov (cleaning up unused objects from 0x280000401:29699 to 0x280000401:29729) [ 3923.085511] LustreError: 6959:0:(ldlm_lockd.c:2594:ldlm_cancel_handler()) ldlm_cancel from 192.168.204.49@tcp arrived at 1716257698 with bad export cookie 14457798111860859196 [ 3923.092065] LustreError: 6959:0:(ldlm_lockd.c:2594:ldlm_cancel_handler()) Skipped 2 previous similar messages [ 3927.006215] 
Lustre: 3231:0:(client.c:2343:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1716257685/real 1716257685] req@ffff88012cde0000 x1799622588049152/t0(0) o400->lustre-MDT0000-lwp-OST0000@0@lo:12/10 lens 224/224 e 0 to 1 dl 1716257701 ref 1 fl Rpc:XNQr/200/ffffffff rc 0/-1 job:'kworker.0' uid:0 gid:0 [ 3927.019481] Lustre: 3231:0:(client.c:2343:ptlrpc_expire_one_request()) Skipped 5 previous similar messages [ 3927.880739] Lustre: DEBUG MARKER: == recovery-small test 107: drop reint reply, then restart MDT ========================================================== 22:15:02 (1716257702) [ 3928.241965] Lustre: *** cfs_fail_loc=119, val=2147483648*** [ 3928.244561] LustreError: 25170:0:(ldlm_lib.c:3271:target_send_reply_msg()) @@@ dropping reply req@ffff88009da73800 x1799622656041344/t90194313220(0) o36->57183a4c-0744-4464-bd9a-e463ba6a5b37@192.168.204.49@tcp:19/0 lens 504/448 e 0 to 0 dl 1716257714 ref 1 fl Interpret:/200/0 rc 0/0 job:'mkdir.0' uid:0 gid:0 [ 3929.194754] Lustre: Failing over lustre-MDT0000 [ 3929.336634] Lustre: server umount lustre-MDT0000 complete [ 3942.980784] Lustre: DEBUG MARKER: oleg449-server.virtnet: executing set_default_debug -1 all [ 3944.292172] Lustre: 27600:0:(mdt_recovery.c:148:mdt_req_from_lrd()) @@@ restoring transno req@ffff88009da71180 x1799622656041344/t90194313220(0) o36->57183a4c-0744-4464-bd9a-e463ba6a5b37@192.168.204.49@tcp:35/0 lens 504/2880 e 0 to 0 dl 1716257730 ref 1 fl Interpret:/202/0 rc 0/0 job:'mkdir.0' uid:0 gid:0 [ 3944.308996] Lustre: lustre-OST0001: new connection from lustre-MDT0000-mdtlov (cleaning up unused objects from 0x280000401:29699 to 0x280000401:29761) [ 3944.309206] Lustre: lustre-OST0000: new connection from lustre-MDT0000-mdtlov (cleaning up unused objects from 0x240000bd0:27941 to 0x240000bd0:28033) [ 3945.786502] Lustre: DEBUG MARKER: oleg449-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid [ 3946.325117] Lustre: 
DEBUG MARKER: mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec [ 3951.902233] Lustre: DEBUG MARKER: == recovery-small test 108: client eviction don't crash == 22:15:26 (1716257726) [ 3952.273321] Lustre: 29121:0:(genops.c:1671:obd_export_evict_by_uuid()) lustre-OST0000: evicting 57183a4c-0744-4464-bd9a-e463ba6a5b37 at adminstrative request [ 3957.841863] Lustre: DEBUG MARKER: == recovery-small test 110a: create remote directory: drop client req ========================================================== 22:15:32 (1716257732) [ 3958.355075] Lustre: DEBUG MARKER: SKIP: recovery-small test_110a needs >= 2 MDTs [ 3961.056147] Lustre: DEBUG MARKER: == recovery-small test 110b: create remote directory: drop Master rep ========================================================== 22:15:36 (1716257736) [ 3961.593198] Lustre: DEBUG MARKER: SKIP: recovery-small test_110b needs >= 2 MDTs [ 3964.302745] Lustre: DEBUG MARKER: == recovery-small test 110c: create remote directory: drop update rep on slave MDT ========================================================== 22:15:39 (1716257739) [ 3964.815804] Lustre: DEBUG MARKER: SKIP: recovery-small test_110c needs >= 2 MDTs [ 3967.588574] Lustre: DEBUG MARKER: == recovery-small test 110d: remove remote directory: drop client req ========================================================== 22:15:42 (1716257742) [ 3968.123549] Lustre: DEBUG MARKER: SKIP: recovery-small test_110d needs >= 2 MDTs [ 3970.928904] Lustre: DEBUG MARKER: == recovery-small test 110e: remove remote directory: drop master rep ========================================================== 22:15:45 (1716257745) [ 3971.451504] Lustre: DEBUG MARKER: SKIP: recovery-small test_110e needs >= 2 MDTs [ 3974.240762] Lustre: DEBUG MARKER: == recovery-small test 110f: remove remote directory: drop slave rep ========================================================== 22:15:49 (1716257749) [ 3974.781682] Lustre: DEBUG MARKER: SKIP: recovery-small test_110f needs >= 2 
MDTs [ 3977.463529] Lustre: DEBUG MARKER: == recovery-small test 110g: drop reply during migration ========================================================== 22:15:52 (1716257752) [ 3977.980557] Lustre: DEBUG MARKER: SKIP: recovery-small test_110g needs >= 2 MDTs [ 3980.418176] Lustre: DEBUG MARKER: == recovery-small test 110h: drop update reply during cross-MDT file rename ========================================================== 22:15:55 (1716257755) [ 3980.726546] Lustre: DEBUG MARKER: SKIP: recovery-small test_110h needs >= 2 MDTs [ 3982.794298] Lustre: DEBUG MARKER: == recovery-small test 110i: drop update reply during cross-MDT dir rename ========================================================== 22:15:57 (1716257757) [ 3983.316528] Lustre: DEBUG MARKER: SKIP: recovery-small test_110i needs >= 2 MDTs [ 3985.991551] Lustre: DEBUG MARKER: == recovery-small test 110j: drop update reply during cross-MDT ln ========================================================== 22:16:01 (1716257761) [ 3986.515082] Lustre: DEBUG MARKER: SKIP: recovery-small test_110j needs >= 2 MDTs [ 3989.140723] Lustre: DEBUG MARKER: == recovery-small test 110k: FID_QUERY failed during recovery ========================================================== 22:16:04 (1716257764) [ 3989.646456] Lustre: DEBUG MARKER: SKIP: recovery-small test_110k needs >= 2 MDTS [ 3992.273410] Lustre: DEBUG MARKER: == recovery-small test 110m: update resent vs original RPC race ========================================================== 22:16:07 (1716257767) [ 3993.136540] Lustre: DEBUG MARKER: SKIP: recovery-small test_110m needs at least 2 MDTs [ 3995.821582] Lustre: DEBUG MARKER: == recovery-small test 111: mdd setup fail should not cause umount oops ========================================================== 22:16:10 (1716257770) [ 3996.826374] Lustre: Failing over lustre-MDT0000 [ 3996.951211] Lustre: server umount lustre-MDT0000 complete [ 3999.709062] Lustre: *** cfs_fail_loc=151, val=0*** [ 3999.711424] 
LustreError: 4668:0:(mdd_device.c:687:mdd_changelog_init()) lustre-MDD0000: changelog setup during init failed: rc = -5 [ 3999.716506] LustreError: 4668:0:(mdd_device.c:1402:mdd_prepare()) lustre-MDD0000: failed to initialize changelog: rc = -5 [ 3999.721351] LustreError: 4668:0:(tgt_mount.c:2223:server_fill_super()) Unable to start targets: -5 [ 3999.726356] Lustre: Failing over lustre-MDT0000 [ 3999.851897] Lustre: server umount lustre-MDT0000 complete [ 3999.853692] LustreError: 4668:0:(super25.c:189:lustre_fill_super()) llite: Unable to mount : rc = -5 [ 4001.609830] LustreError: 5163:0:(ldlm_resource.c:1128:ldlm_resource_complain()) MGC192.168.204.149@tcp: namespace resource [0x65727473756c:0x0:0x0].0x0 (ffff88012ce81500) refcount nonzero (1) after lock cleanup; forcing cleanup. [ 4001.616343] LustreError: 6971:0:(mgc_request.c:629:do_requeue()) failed processing log: -5 [ 4002.847346] Lustre: DEBUG MARKER: oleg449-server.virtnet: executing set_default_debug -1 all [ 4004.384029] Lustre: lustre-OST0001: new connection from lustre-MDT0000-mdtlov (cleaning up unused objects from 0x280000401:29699 to 0x280000401:29793) [ 4004.391621] Lustre: lustre-OST0000: new connection from lustre-MDT0000-mdtlov (cleaning up unused objects from 0x240000bd0:28035 to 0x240000bd0:28065) [ 4007.916221] Lustre: DEBUG MARKER: == recovery-small test 112a: bulk resend while orignal request is in progress ========================================================== 22:16:22 (1716257782) [ 4008.448993] LustreError: 22499:0:(tgt_handler.c:2714:tgt_brw_write()) cfs_fail_timeout id 214 sleeping for 20000ms [ 4028.453205] LustreError: 22499:0:(tgt_handler.c:2714:tgt_brw_write()) cfs_fail_timeout id 214 awake [ 4033.025358] Lustre: DEBUG MARKER: == recovery-small test 115a: read: late REQ MDunlink and no bulk ========================================================== 22:16:48 (1716257808) [ 4041.339913] Lustre: DEBUG MARKER: == recovery-small test 115b: write: late REQ MDunlink and no bulk 
========================================================== 22:16:56 (1716257816) [ 4045.439929] Lustre: *** cfs_fail_loc=215, val=2*** [ 4045.442119] Lustre: Skipped 80 previous similar messages [ 4049.575246] Lustre: DEBUG MARKER: == recovery-small test 115c: read: late Reply MDunlink and no bulk ========================================================== 22:17:04 (1716257824) [ 4054.921895] Lustre: DEBUG MARKER: == recovery-small test 115d: write: late Reply MDunlink and no bulk ========================================================== 22:17:10 (1716257830) [ 4056.839839] Lustre: *** cfs_fail_loc=215, val=0*** [ 4060.760459] Lustre: DEBUG MARKER: == recovery-small test 115e: read: late Bulk MDunlink and no reply ========================================================== 22:17:15 (1716257835) [ 4066.605360] Lustre: DEBUG MARKER: == recovery-small test 115f: read: late REQ MDunlink and no reply ========================================================== 22:17:21 (1716257841) [ 4069.447586] LustreError: 22945:0:(ldlm_lib.c:3271:target_send_reply_msg()) @@@ dropping reply req@ffff880133767480 x1799622656061888/t0(0) o400->57183a4c-0744-4464-bd9a-e463ba6a5b37@192.168.204.49@tcp:199/0 lens 224/224 e 0 to 0 dl 1716257894 ref 1 fl Interpret:H/200/0 rc 0/0 job:'kworker.0' uid:0 gid:0 [ 4074.970058] Lustre: DEBUG MARKER: == recovery-small test 115g: read: late REQ MDunlink and Reply MDunlink ========================================================== 22:17:30 (1716257850) [ 4136.691052] Lustre: DEBUG MARKER: == recovery-small test 120: flock race: completion vs. 
evict ========================================================== 22:18:31 (1716257911) [ 4138.941576] Lustre: 13130:0:(genops.c:1671:obd_export_evict_by_uuid()) lustre-MDT0000: evicting 57183a4c-0744-4464-bd9a-e463ba6a5b37 at adminstrative request [ 4153.117212] Lustre: 13435:0:(genops.c:1671:obd_export_evict_by_uuid()) lustre-MDT0000: evicting 57183a4c-0744-4464-bd9a-e463ba6a5b37 at adminstrative request [ 4153.123441] Lustre: 13435:0:(genops.c:1671:obd_export_evict_by_uuid()) Skipped 1 previous similar message [ 4173.891542] Lustre: 13810:0:(genops.c:1671:obd_export_evict_by_uuid()) lustre-MDT0000: evicting 57183a4c-0744-4464-bd9a-e463ba6a5b37 at adminstrative request [ 4173.900773] Lustre: 13810:0:(genops.c:1671:obd_export_evict_by_uuid()) Skipped 2 previous similar messages [ 4201.630351] Lustre: DEBUG MARKER: == recovery-small test 113: ldlm enqueue dropped reply should not cause deadlocks ========================================================== 22:19:36 (1716257976) [ 4202.119713] Lustre: *** cfs_fail_loc=157, val=2147483648*** [ 4202.122313] Lustre: Skipped 1 previous similar message [ 4202.124595] LustreError: 5170:0:(ldlm_lib.c:3271:target_send_reply_msg()) @@@ dropping reply req@ffff88009e4e9880 x1799622656095680/t0(0) o101->57183a4c-0744-4464-bd9a-e463ba6a5b37@192.168.204.49@tcp:293/0 lens 576/688 e 0 to 0 dl 1716257988 ref 1 fl Interpret:/200/0 rc 0/0 job:'stat.0' uid:0 gid:0 [ 4227.101672] Lustre: DEBUG MARKER: == recovery-small test 130a: enqueue resend on not existing file ========================================================== 22:20:02 (1716258002) [ 4227.751291] LustreError: 5171:0:(mdt_handler.c:5184:mdt_intent_opc()) cfs_fail_timeout id 160 sleeping for 10000ms [ 4237.758207] LustreError: 5171:0:(mdt_handler.c:5184:mdt_intent_opc()) cfs_fail_timeout id 160 awake [ 4272.999431] Lustre: DEBUG MARKER: == recovery-small test 130b: enqueue resend on a stale inode ========================================================== 22:20:48 (1716258048) [ 
4283.729208] LustreError: 5197:0:(mdt_handler.c:5184:mdt_intent_opc()) cfs_fail_timeout id 160 awake [ 4328.648138] Lustre: lustre-MDT0000: Client 57183a4c-0744-4464-bd9a-e463ba6a5b37 (at 192.168.204.49@tcp) reconnecting [ 4328.653101] Lustre: Skipped 5 previous similar messages [ 4328.658490] Lustre: *** cfs_fail_loc=217, val=0*** [ 4332.997236] Lustre: DEBUG MARKER: == recovery-small test 130c: layout intent resend on a stale inode ========================================================== 22:21:48 (1716258108) [ 4335.624881] LustreError: 5172:0:(mdt_handler.c:5184:mdt_intent_opc()) cfs_fail_timeout id 160 sleeping for 10000ms [ 4335.629338] LustreError: 5172:0:(mdt_handler.c:5184:mdt_intent_opc()) Skipped 1 previous similar message [ 4345.633195] LustreError: 5172:0:(mdt_handler.c:5184:mdt_intent_opc()) cfs_fail_timeout id 160 awake [ 4360.870785] Lustre: DEBUG MARKER: == recovery-small test 132: long punch =================== 22:22:16 (1716258136) [ 4433.575245] Lustre: ll_ost_io00_007: service thread pid 22499 was inactive for 72.056 seconds. The thread might be hung, or it might only be slow and will resume later. 
Dumping the stack trace for debugging purposes: [ 4433.585126] Pid: 22499, comm: ll_ost_io00_007 3.10.0-7.9-debug #1 SMP Sat Mar 26 23:28:42 EDT 2022 [ 4433.590695] Call Trace: [ 4433.592077] [<0>] __cfs_fail_timeout_set+0xe9/0x210 [libcfs] [ 4433.594887] [<0>] ofd_punch_hdl+0xa8c/0xb40 [ofd] [ 4433.597580] [<0>] tgt_request_handle+0x74e/0x1a50 [ptlrpc] [ 4433.601135] [<0>] ptlrpc_server_handle_request+0x26c/0xcb0 [ptlrpc] [ 4433.604623] [<0>] ptlrpc_main+0xc76/0x1690 [ptlrpc] [ 4433.607198] [<0>] kthread+0xe4/0xf0 [ 4433.609574] [<0>] ret_from_fork_nospec_begin+0x7/0x21 [ 4433.612060] [<0>] 0xfffffffffffffffe [ 4481.618185] LustreError: 22499:0:(ofd_dev.c:2089:ofd_punch_hdl()) cfs_fail_timeout id 236 awake [ 4485.676384] Lustre: DEBUG MARKER: == recovery-small test 131: IO vs evict results to IO under staled lock ========================================================== 22:24:20 (1716258260) [ 4487.630911] Lustre: 19920:0:(genops.c:1671:obd_export_evict_by_uuid()) lustre-OST0000: evicting 57183a4c-0744-4464-bd9a-e463ba6a5b37 at adminstrative request [ 4487.637011] Lustre: 19920:0:(genops.c:1671:obd_export_evict_by_uuid()) Skipped 3 previous similar messages [ 4487.641281] LustreError: 9988:0:(ldlm_lockd.c:2996:ldlm_bl_thread_exports()) cfs_fail_timeout id 31e sleeping for 4000ms [ 4487.646572] LustreError: 9988:0:(ldlm_lockd.c:2996:ldlm_bl_thread_exports()) Skipped 1 previous similar message [ 4490.251132] LustreError: 9988:0:(ldlm_lockd.c:2996:ldlm_bl_thread_exports()) cfs_fail_timeout interrupted [ 4493.365077] Lustre: DEBUG MARKER: == recovery-small test 133: don't fail on flock resend === 22:24:28 (1716258268) [ 4494.823405] LustreError: 8231:0:(ldlm_lib.c:3271:target_send_reply_msg()) @@@ dropping reply req@ffff88012ff81500 x1799622656127104/t0(0) o101->57183a4c-0744-4464-bd9a-e463ba6a5b37@192.168.204.49@tcp:624/0 lens 328/344 e 0 to 0 dl 1716258319 ref 1 fl Interpret:/200/0 rc 0/0 job:'multiop.0' uid:0 gid:0 [ 4494.834835] LustreError: 
8231:0:(ldlm_lib.c:3271:target_send_reply_msg()) Skipped 2 previous similar messages [ 4554.228344] Lustre: DEBUG MARKER: == recovery-small test 134: race between failover and search for reply data free slot ========================================================== 22:25:29 (1716258329) [ 4554.727539] Lustre: DEBUG MARKER: SKIP: recovery-small test_134 Need 2+ clients, have 1 [ 4557.058961] Lustre: DEBUG MARKER: == recovery-small test 135: DOM: open/create resend to return size ========================================================== 22:25:32 (1716258332) [ 4612.604733] Lustre: 5172:0:(mdt_recovery.c:148:mdt_req_from_lrd()) @@@ restoring transno req@ffff880098132a00 x1799622656132992/t98784247974(0) o101->57183a4c-0744-4464-bd9a-e463ba6a5b37@192.168.204.49@tcp:742/0 lens 648/3488 e 0 to 0 dl 1716258437 ref 1 fl Interpret:/202/0 rc 0/0 job:'openfile.0' uid:0 gid:0 [ 4615.184696] Lustre: DEBUG MARKER: SKIP: recovery-small test_136 skipping excluded test 136 [ 4617.218978] Lustre: DEBUG MARKER: == recovery-small test 137: late resend must be skipped if already applied ========================================================== 22:26:32 (1716258392) [ 4618.614477] LustreError: 5197:0:(mdt_reint.c:923:mdt_reint_setattr()) cfs_race id 525 sleeping [ 4623.616193] LustreError: 5197:0:(mdt_reint.c:923:mdt_reint_setattr()) cfs_fail_race id 525 awake: rc=0 [ 4623.641530] LustreError: 5197:0:(mdt_reint.c:923:mdt_reint_setattr()) cfs_fail_race id 525 waking [ 4675.345750] Lustre: DEBUG MARKER: == recovery-small test 138: Umount MDT during recovery === 22:27:30 (1716258450) [ 4675.935376] Lustre: DEBUG MARKER: SKIP: recovery-small test_138 needs >= 2 MDTs [ 4678.224970] Lustre: DEBUG MARKER: == recovery-small test 139: corrupted catid won't cause crash ========================================================== 22:27:33 (1716258453) [ 4678.773711] Lustre: DEBUG MARKER: SKIP: recovery-small test_139 needs >= 2 MDTs [ 4681.474886] Lustre: DEBUG MARKER: == recovery-small test 
140a: local mount is flagged properly ========================================================== 22:27:36 (1716258456) [ 4682.759847] Lustre: lustre-MDT0000: local client fb5cbcb7-08d3-465c-9ff7-cdc802c2e242 w/o recovery [ 4682.764152] Lustre: Mounted lustre-client [ 4683.221285] Lustre: DEBUG MARKER: oleg449-server.virtnet: executing set_default_debug -1 all [ 4684.588214] Lustre: Unmounted lustre-client [ 4685.757504] Lustre: Mounted lustre-client [ 4686.177503] Lustre: DEBUG MARKER: oleg449-server.virtnet: executing set_default_debug -1 all [ 4687.466208] Lustre: Unmounted lustre-client [ 4692.386667] Lustre: DEBUG MARKER: == recovery-small test 140b: local mount is excluded from recovery ========================================================== 22:27:47 (1716258467) [ 4693.253925] Lustre: lustre-MDT0000: local client a2266b3e-9587-4a0c-85a4-a55e51b4bfad w/o recovery [ 4693.260400] Lustre: Mounted lustre-client [ 4693.991571] Lustre: DEBUG MARKER: oleg449-server.virtnet: executing set_default_debug -1 all [ 4695.064690] LustreError: 28064:0:(osd_handler.c:717:osd_ro()) lustre-MDT0000: *** setting device osd-zfs read-only *** [ 4695.359940] Lustre: DEBUG MARKER: mds1 REPLAY BARRIER on lustre-MDT0000 [ 4696.092189] Lustre: Unmounted lustre-client [ 4697.031692] Lustre: Failing over lustre-MDT0000 [ 4697.171787] Lustre: server umount lustre-MDT0000 complete [ 4709.585404] LustreError: 166-1: MGC192.168.204.149@tcp: Connection to MGS (at 0@lo) was lost; in progress operations using this service will fail [ 4709.590039] LustreError: Skipped 4 previous similar messages [ 4709.718763] Lustre: lustre-MDT0000-lwp-OST0000: Connection to lustre-MDT0000 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete [ 4709.726666] Lustre: Skipped 12 previous similar messages [ 4709.759460] Lustre: lustre-MDT0000: Imperative Recovery not enabled, recovery window 60-180 [ 4709.762327] Lustre: Skipped 6 previous similar messages [ 4709.792352] 
Lustre: lustre-MDT0000: in recovery but waiting for the first client to connect [ 4709.795145] Lustre: Skipped 6 previous similar messages [ 4710.757290] Lustre: DEBUG MARKER: oleg449-server.virtnet: executing set_default_debug -1 all [ 4712.681364] Lustre: lustre-MDT0000: Will be in recovery for at least 1:00, or until 1 client reconnects [ 4712.684864] Lustre: Skipped 6 previous similar messages [ 4712.708270] Lustre: lustre-MDT0000: Recovery over after 0:01, of 1 clients 1 recovered and 0 were evicted. [ 4712.712732] Lustre: Skipped 6 previous similar messages [ 4712.735219] Lustre: lustre-OST0000: new connection from lustre-MDT0000-mdtlov (cleaning up unused objects from 0x240000bd0:28079 to 0x240000bd0:28097) [ 4712.735296] Lustre: lustre-OST0001: new connection from lustre-MDT0000-mdtlov (cleaning up unused objects from 0x280000401:29807 to 0x280000401:29825) [ 4713.523910] Lustre: DEBUG MARKER: oleg449-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid [ 4714.079889] Lustre: DEBUG MARKER: mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec [ 4714.760698] Lustre: lustre-MDT0000-lwp-OST0001: Connection restored to 192.168.204.149@tcp (at 0@lo) [ 4714.763128] Lustre: Skipped 10 previous similar messages [ 4719.947840] Lustre: DEBUG MARKER: == recovery-small test 141: do not lose locks on MGS restart ========================================================== 22:28:15 (1716258495) [ 4720.791741] Lustre: DEBUG MARKER: SKIP: recovery-small test_141 cannot run in local mode or from build tree [ 4722.930526] Lustre: DEBUG MARKER: == recovery-small test 142: orphan name stub can be cleaned up in startup ========================================================== 22:28:17 (1716258497) [ 4723.288904] Lustre: *** cfs_fail_loc=165, val=0*** [ 4723.958947] Lustre: Failing over lustre-MDT0000 [ 4724.109767] Lustre: server umount lustre-MDT0000 complete [ 4727.744221] Lustre: lustre-OST0001: new connection 
from lustre-MDT0000-mdtlov (cleaning up unused objects from 0x280000401:29827 to 0x280000401:29857) [ 4727.744374] Lustre: lustre-OST0000: new connection from lustre-MDT0000-mdtlov (cleaning up unused objects from 0x240000bd0:28079 to 0x240000bd0:28129) [ 4728.026943] Lustre: DEBUG MARKER: oleg449-server.virtnet: executing set_default_debug -1 all [ 4732.853241] Lustre: DEBUG MARKER: == recovery-small test 143: orphan cleanup thread shouldn't be blocked even delete failed ========================================================== 22:28:27 (1716258507) [ 4733.541140] Lustre: Failing over lustre-MDT0000 [ 4733.681585] Lustre: server umount lustre-MDT0000 complete [ 4741.067382] Lustre: DEBUG MARKER: oleg449-server.virtnet: executing set_default_debug -1 all [ 4742.166681] Lustre: DEBUG MARKER: oleg449-server.virtnet: executing _wait_recovery_complete *.lustre-MDT0000.recovery_status 1475 [ 4742.781443] Lustre: lustre-OST0001: new connection from lustre-MDT0000-mdtlov (cleaning up unused objects from 0x280000401:29827 to 0x280000401:29889) [ 4742.781475] Lustre: lustre-OST0000: new connection from lustre-MDT0000-mdtlov (cleaning up unused objects from 0x240000bd0:28079 to 0x240000bd0:28161) [ 4742.782199] LustreError: 3968:0:(mdd_orphans.c:452:mdd_orphan_index_iterate()) lustre-MDD0000: bad FID [0x0:0x0:0x0] cleaning 'PENDING' [ 4751.362155] Lustre: DEBUG MARKER: == recovery-small test 144a: MDT failover should stop precreation threads ========================================================== 22:28:46 (1716258526) [ 4752.966894] Lustre: Failing over lustre-OST0000 [ 4752.980724] LustreError: 11-0: lustre-OST0000-osc-MDT0000: operation ost_create to node 0@lo failed: rc = -19 [ 4753.966236] Lustre: 3231:0:(client.c:2343:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1716258473/real 1716258473] req@ffff8800a8549180 x1799622588133696/t0(0) o400->lustre-MDT0000-lwp-OST0001@0@lo:12/10 lens 224/224 e 0 to 1 dl 1716258528 ref 1 fl 
Rpc:XNQr/200/ffffffff rc 0/-1 job:'kworker.0' uid:0 gid:0 [ 4753.978826] Lustre: 3231:0:(client.c:2343:ptlrpc_expire_one_request()) Skipped 12 previous similar messages [ 4755.046453] Lustre: server umount lustre-OST0000 complete [ 4757.751899] LustreError: 137-5: lustre-OST0000: not available for connect from 192.168.204.49@tcp (no target). If you are running an HA pair check that the target is mounted on the other server. [ 4757.755056] LustreError: Skipped 6 previous similar messages [ 4769.176796] Lustre: DEBUG MARKER: oleg449-server.virtnet: executing set_default_debug -1 all [ 4771.528376] Lustre: DEBUG MARKER: oleg449-client.virtnet: executing wait_import_state_mount (FULL|IDLE) osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid [ 4771.902239] Lustre: DEBUG MARKER: osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid in FULL state after 0 sec [ 4833.953503] Lustre: Failing over lustre-MDT0000 [ 4834.160142] Lustre: server umount lustre-MDT0000 complete [ 4847.955710] Lustre: lustre-OST0001: new connection from lustre-MDT0000-mdtlov (cleaning up unused objects from 0x280000401:55178 to 0x280000401:55265) [ 4847.955735] Lustre: lustre-OST0000: new connection from lustre-MDT0000-mdtlov (cleaning up unused objects from 0x240000bd0:52874 to 0x240000bd0:52897) [ 4847.961952] LustreError: 8871:0:(mdd_orphans.c:452:mdd_orphan_index_iterate()) lustre-MDD0000: bad FID [0x0:0x0:0x0] cleaning 'PENDING' [ 4848.097551] Lustre: DEBUG MARKER: oleg449-server.virtnet: executing set_default_debug -1 all [ 4851.050653] Lustre: DEBUG MARKER: oleg449-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid [ 4851.628244] Lustre: DEBUG MARKER: mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec [ 4853.677149] Lustre: Failing over lustre-MDT0000 [ 4853.834751] Lustre: server umount lustre-MDT0000 complete [ 4867.875558] Lustre: DEBUG MARKER: oleg449-server.virtnet: executing set_default_debug -1 all [ 4867.975416] Lustre: 
lustre-OST0001: new connection from lustre-MDT0000-mdtlov (cleaning up unused objects from 0x280000401:55178 to 0x280000401:55297) [ 4867.975435] Lustre: lustre-OST0000: new connection from lustre-MDT0000-mdtlov (cleaning up unused objects from 0x240000bd0:52874 to 0x240000bd0:52929) [ 4867.982064] LustreError: 10625:0:(mdd_orphans.c:452:mdd_orphan_index_iterate()) lustre-MDD0000: bad FID [0x0:0x0:0x0] cleaning 'PENDING' [ 4870.802609] Lustre: DEBUG MARKER: oleg449-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid [ 4871.365498] Lustre: DEBUG MARKER: mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec [ 4890.039712] Lustre: DEBUG MARKER: == recovery-small test 144b: orphan cleanup shouldn't be blocked for no objects+failover situation ========================================================== 22:31:05 (1716258665) [ 4891.413391] Lustre: Failing over lustre-OST0000 [ 4891.422630] LustreError: 11-0: lustre-OST0000-osc-MDT0000: operation ost_create to node 0@lo failed: rc = -19 [ 4892.967942] Lustre: lustre-OST0000: Not available for connect from 192.168.204.49@tcp (stopping) [ 4892.972282] Lustre: Skipped 1 previous similar message [ 4893.640137] Lustre: server umount lustre-OST0000 complete [ 4897.975640] LustreError: 137-5: lustre-OST0000: not available for connect from 192.168.204.49@tcp (no target). If you are running an HA pair check that the target is mounted on the other server. 
[ 4897.979181] LustreError: Skipped 1 previous similar message [ 4908.001671] Lustre: DEBUG MARKER: oleg449-server.virtnet: executing set_default_debug -1 all [ 4910.343282] Lustre: DEBUG MARKER: oleg449-client.virtnet: executing wait_import_state_mount (FULL|IDLE) osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid [ 4910.815120] Lustre: DEBUG MARKER: osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid in FULL state after 0 sec [ 4914.789655] Lustre: lustre-OST0001-osc-MDT0000: update sequence from 0x280000401 to 0x280000402 [ 4916.241776] Lustre: lustre-OST0000-osc-MDT0000: update sequence from 0x240000bd0 to 0x2400013a0 [ 5048.758158] Lustre: DEBUG MARKER: == recovery-small test 144c: reconnection during orphan cleanup shouldn't lose LAST_ID synchronization ========================================================== 22:33:43 (1716258823) [ 5085.739331] Lustre: Failing over lustre-MDT0000 [ 5086.199814] Lustre: server umount lustre-MDT0000 complete [ 5090.233438] Lustre: DEBUG MARKER: oleg449-server.virtnet: executing set_default_debug -1 all [ 5090.638307] LustreError: 6393:0:(ofd_dev.c:1523:ofd_create_hdl()) cfs_fail_timeout id 254 sleeping for 5000ms [ 5090.642357] LustreError: 6393:0:(ofd_dev.c:1523:ofd_create_hdl()) Skipped 1 previous similar message [ 5090.643309] LustreError: 21144:0:(mdd_orphans.c:452:mdd_orphan_index_iterate()) lustre-MDD0000: bad FID [0x0:0x0:0x0] cleaning 'PENDING' [ 5091.432036] Lustre: DEBUG MARKER: oleg449-server.virtnet: executing _wait_recovery_complete *.lustre-MDT0000.recovery_status 1475 [ 5092.772028] Lustre: lustre-OST0000: Client lustre-MDT0000-mdtlov_UUID (at 0@lo) reconnecting [ 5092.775745] Lustre: Skipped 3 previous similar messages [ 5093.138163] LustreError: 6541:0:(ofd_dev.c:1523:ofd_create_hdl()) cfs_fail_timeout interrupted [ 5093.142149] Lustre: lustre-OST0001: new connection from lustre-MDT0000-mdtlov (cleaning up unused objects from 0x280000402:14859 to 0x280000402:14977) [ 5093.150267] LustreError: 
6393:0:(ofd_dev.c:1528:ofd_create_hdl()) lustre-OST0000: dropping old orphan cleanup request [ 5093.155395] LustreError: 20666:0:(osp_precreate.c:992:osp_precreate_cleanup_orphans()) lustre-OST0000-osc-MDT0000: cannot cleanup orphans: rc = -116 [ 5094.162859] Lustre: lustre-OST0000: new connection from lustre-MDT0000-mdtlov (cleaning up unused objects from 0x2400013a0:21299 to 0x2400013a0:23937) [ 5109.810437] Lustre: DEBUG MARKER: == recovery-small test 145: connect mdtlovs and process update logs after recovery expire ========================================================== 22:34:44 (1716258884) [ 5110.343667] Lustre: DEBUG MARKER: SKIP: recovery-small test_145 needs >= 3 MDTs [ 5113.081877] Lustre: DEBUG MARKER: == recovery-small test 146: test eviction is counted properly ========================================================== 22:34:48 (1716258888) [ 5113.764366] Lustre: 23031:0:(genops.c:1671:obd_export_evict_by_uuid()) lustre-MDT0000: evicting 57183a4c-0744-4464-bd9a-e463ba6a5b37 at adminstrative request [ 5118.503069] Lustre: DEBUG MARKER: == recovery-small test 147: Check client reconnect ======= 22:34:53 (1716258893) [ 5119.257957] Lustre: *** cfs_fail_loc=225, val=0*** [ 5274.455749] Lustre: lustre-OST0000: haven't heard from client 57183a4c-0744-4464-bd9a-e463ba6a5b37 (at 192.168.204.49@tcp) in 155 seconds. I think it's dead, and I am evicting it. exp ffff88009cd84800, cur 1716259049 expire 1716259019 last 1716258894 [ 5287.003031] Lustre: DEBUG MARKER: == recovery-small test 148: data corruption through resend ========================================================== 22:37:42 (1716259062) [ 5304.295848] Lustre: lustre-MDT0000: haven't heard from client lustre-MDT0000-lwp-OST0001_UUID (at 0@lo) in 35 seconds. I think it's dead, and I am evicting it. 
exp ffff88009329f000, cur 1716259079 expire 1716259049 last 1716259044 [ 5314.727891] LustreError: 166-1: MGC192.168.204.149@tcp: Connection to MGS (at 0@lo) was lost; in progress operations using this service will fail [ 5314.728586] Lustre: lustre-MDT0000-lwp-OST0000: Connection to lustre-MDT0000 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete [ 5314.728589] Lustre: Skipped 13 previous similar messages [ 5314.742202] LustreError: Skipped 5 previous similar messages [ 5314.745834] Lustre: Evicted from MGS (at 192.168.204.149@tcp) after server handle changed from 0xc8a46ac833d783da to 0xc8a46ac833d7d3b9 [ 5315.759243] LustreError: 22499:0:(tgt_handler.c:2880:tgt_brw_write()) cfs_fail_timeout id 227 awake [ 5322.575663] Lustre: DEBUG MARKER: == recovery-small test 149: skip orphan removal at umount ========================================================== 22:38:17 (1716259097) [ 5322.890614] Lustre: DEBUG MARKER: SKIP: recovery-small test_149 needs >= 2 MDTs [ 5325.193191] Lustre: DEBUG MARKER: == recovery-small test 150: statfs when MDT0 offline with lazystatfs option ========================================================== 22:38:20 (1716259100) [ 5325.732183] Lustre: DEBUG MARKER: SKIP: recovery-small test_150 needs >= 2 MDTs [ 5328.358011] Lustre: DEBUG MARKER: == recovery-small test 152: QoS object allocation could be awakened in case of OST failover ========================================================== 22:38:23 (1716259103) [ 5329.254814] Lustre: DEBUG MARKER: SKIP: recovery-small test_152 MDS Linux kernel does not support killable semaphore [ 5331.905066] Lustre: DEBUG MARKER: == recovery-small test 153: evict vs reconnect race ====== 22:38:26 (1716259106) [ 5355.668638] Lustre: Failing over lustre-MDT0000 [ 5355.782540] Lustre: server umount lustre-MDT0000 complete [ 5358.788163] Lustre: lustre-MDT0000: Not available for connect from 192.168.204.49@tcp (not set up) [ 5358.866487] Lustre: lustre-MDT0000: 
Imperative Recovery not enabled, recovery window 60-180 [ 5358.869935] Lustre: Skipped 7 previous similar messages [ 5358.905413] Lustre: lustre-MDT0000: in recovery but waiting for the first client to connect [ 5358.909059] Lustre: Skipped 7 previous similar messages [ 5359.969973] Lustre: DEBUG MARKER: oleg449-server.virtnet: executing set_default_debug -1 all [ 5360.147474] Lustre: lustre-MDT0000: Will be in recovery for at least 1:00, or until 1 client reconnects [ 5360.149490] Lustre: Skipped 7 previous similar messages [ 5360.160507] Lustre: lustre-MDT0000: Recovery over after 0:01, of 1 clients 1 recovered and 0 were evicted. [ 5360.162657] Lustre: Skipped 7 previous similar messages [ 5360.174559] Lustre: lustre-OST0000: new connection from lustre-MDT0000-mdtlov (cleaning up unused objects from 0x2400013a0:23941 to 0x2400013a0:23969) [ 5360.174561] Lustre: lustre-OST0001: new connection from lustre-MDT0000-mdtlov (cleaning up unused objects from 0x280000402:14979 to 0x280000402:15009) [ 5360.175635] LustreError: 29264:0:(mdd_orphans.c:452:mdd_orphan_index_iterate()) lustre-MDD0000: bad FID [0x0:0x0:0x0] cleaning 'PENDING' [ 5361.402830] Lustre: DEBUG MARKER: oleg449-server.virtnet: executing _wait_recovery_complete *.lustre-MDT0000.recovery_status 1475 [ 5363.865996] Lustre: lustre-MDT0000-lwp-OST0000: Connection restored to 192.168.204.149@tcp (at 0@lo) [ 5363.870102] Lustre: Skipped 17 previous similar messages [ 5365.609154] Lustre: DEBUG MARKER: == recovery-small test 154a: corruption update llog can be skipped ========================================================== 22:39:00 (1716259140) [ 5366.128384] Lustre: DEBUG MARKER: SKIP: recovery-small test_154a needs >= 2 MDTs [ 5368.834000] Lustre: DEBUG MARKER: == recovery-small test 154b: restore update llog after failed recovery ========================================================== 22:39:03 (1716259143) [ 5369.361349] Lustre: DEBUG MARKER: SKIP: recovery-small test_154b needs >= 2 MDTs [ 
5371.994044] Lustre: DEBUG MARKER: == recovery-small test 155: failover after client remount ========================================================== 22:39:07 (1716259147) [ 5374.874300] LustreError: 31618:0:(osd_handler.c:717:osd_ro()) lustre-MDT0000: *** setting device osd-zfs read-only *** [ 5375.191623] Lustre: DEBUG MARKER: mds1 REPLAY BARRIER on lustre-MDT0000 [ 5375.886763] Lustre: Failing over lustre-MDT0000 [ 5376.033349] Lustre: server umount lustre-MDT0000 complete [ 5389.307695] Lustre: lustre-OST0001: new connection from lustre-MDT0000-mdtlov (cleaning up unused objects from 0x280000402:14979 to 0x280000402:15041) [ 5389.308421] Lustre: lustre-OST0000: new connection from lustre-MDT0000-mdtlov (cleaning up unused objects from 0x2400013a0:23971 to 0x2400013a0:24001) [ 5389.308453] LustreError: 310:0:(mdd_orphans.c:452:mdd_orphan_index_iterate()) lustre-MDD0000: bad FID [0x0:0x0:0x0] cleaning 'PENDING' [ 5389.616494] Lustre: DEBUG MARKER: oleg449-server.virtnet: executing set_default_debug -1 all [ 5394.820101] Lustre: DEBUG MARKER: == recovery-small test 156: tot_granted miscount after client eviction ========================================================== 22:39:29 (1716259169) [ 5395.467016] Lustre: Setting parameter general.timeout in log params [ 5396.870422] LustreError: 1780:0:(osd_handler.c:717:osd_ro()) lustre-OST0000: *** setting device osd-zfs read-only *** [ 5397.183311] Lustre: DEBUG MARKER: ost1 REPLAY BARRIER on lustre-OST0000 [ 5398.158610] Lustre: Failing over lustre-OST0000 [ 5398.631985] LustreError: 11-0: lustre-OST0000-osc-MDT0000: operation ost_statfs to node 0@lo failed: rc = -107 [ 5398.636261] Lustre: lustre-OST0000: Not available for connect from 0@lo (stopping) [ 5400.343298] Lustre: server umount lustre-OST0000 complete [ 5400.668198] LustreError: 137-5: lustre-OST0000: not available for connect from 0@lo (no target). If you are running an HA pair check that the target is mounted on the other server. 
[ 5400.675113] LustreError: Skipped 1 previous similar message [ 5414.364144] Lustre: DEBUG MARKER: oleg449-server.virtnet: executing set_default_debug -1 all [ 5434.713218] Lustre: 3229:0:(client.c:2343:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1716259153/real 1716259153] req@ffff8800935e1c00 x1799622594769024/t0(0) o400->lustre-MDT0000-lwp-OST0001@0@lo:12/10 lens 224/224 e 0 to 1 dl 1716259208 ref 1 fl Rpc:XNQr/200/ffffffff rc 0/-1 job:'kworker.0' uid:0 gid:0 [ 5434.726243] Lustre: 3229:0:(client.c:2343:ptlrpc_expire_one_request()) Skipped 10 previous similar messages [ 5452.712212] Lustre: lustre-OST0000: recovery is timed out, evict stale exports [ 5452.715504] Lustre: 2747:0:(genops.c:1528:class_disconnect_stale_exports()) lustre-OST0000: disconnect stale client 76cf1662-4e82-4feb-9ee6-863b25f176d8@192.168.204.49@tcp [ 5452.722119] Lustre: lustre-OST0000: disconnecting 1 stale clients [ 5452.725585] Lustre: 2747:0:(ldlm_lib.c:1992:extend_recovery_timer()) lustre-OST0000: extended recovery timer reached hard limit: 45, extend: 1 [ 5452.750360] Lustre: 2747:0:(ldlm_lib.c:2874:target_recovery_thread()) too long recovery - read logs [ 5452.754916] LustreError: dumping log to /tmp/lustre-log.1716259227.2747 [ 5459.131202] Lustre: DEBUG MARKER: oleg449-client.virtnet: executing wait_import_state_mount (FULL|IDLE) osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid [ 5459.671733] Lustre: DEBUG MARKER: osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid in FULL state after 0 sec [ 5463.582174] Lustre: Modifying parameter general.timeout in log params [ 5466.087044] Lustre: DEBUG MARKER: == recovery-small test 157: eviction during mmaped i/o === 22:40:41 (1716259241) [ 5467.426933] Lustre: 4486:0:(genops.c:1671:obd_export_evict_by_uuid()) lustre-OST0000: evicting 76cf1662-4e82-4feb-9ee6-863b25f176d8 at adminstrative request [ 5467.430503] Lustre: 4486:0:(genops.c:1671:obd_export_evict_by_uuid()) Skipped 1 previous similar message [ 
5471.004488] Lustre: DEBUG MARKER: == recovery-small test complete, duration 5367 sec ======= 22:40:46 (1716259246) [ 5542.340293] Lustre: Failing over lustre-MDT0000 [ 5542.529266] Lustre: server umount lustre-MDT0000 complete [ 5555.612295] Lustre: lustre-OST0000: new connection from lustre-MDT0000-mdtlov (cleaning up unused objects from 0x2400013a0:24004 to 0x2400013a0:24033) [ 5555.612308] Lustre: lustre-OST0001: new connection from lustre-MDT0000-mdtlov (cleaning up unused objects from 0x280000402:14979 to 0x280000402:15073) [ 5555.612762] LustreError: 9928:0:(mdd_orphans.c:452:mdd_orphan_index_iterate()) lustre-MDD0000: bad FID [0x0:0x0:0x0] cleaning 'PENDING' [ 5556.160805] Lustre: DEBUG MARKER: oleg449-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all [ 5558.942661] Lustre: DEBUG MARKER: oleg449-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid [ 5559.509245] Lustre: DEBUG MARKER: mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec [ 5565.079839] Lustre: lustre-MDT0000: Not available for connect from 0@lo (stopping) [ 5565.083216] Lustre: Skipped 5 previous similar messages [ 5567.835708] Lustre: server umount lustre-MDT0000 complete [ 5569.256968] LustreError: 27526:0:(ldlm_lockd.c:2594:ldlm_cancel_handler()) ldlm_cancel from 0@lo arrived at 1716259344 with bad export cookie 14457798111862496454 [ 5569.263192] LustreError: 27526:0:(ldlm_lockd.c:2594:ldlm_cancel_handler()) Skipped 2 previous similar messages [ 5569.270800] Lustre: server umount lustre-OST0000 complete [ 5570.648377] Lustre: server umount lustre-OST0001 complete [ 5574.459741] Lustre: DEBUG MARKER: oleg449-server.virtnet: executing unload_modules_local [ 5575.199394] Key type lgssc unregistered [ 5575.285870] LNet: 12006:0:(lib-ptl.c:966:lnet_clear_lazy_portal()) Active lazy portal 0 on exit [ 5576.289691] LNet: Removed LNI 192.168.204.149@tcp