[ 0.000000] Linux version 4.18.0rh8.10-debug (green@maintenance) (gcc version 8.5.0 20210514 (Red Hat 8.5.0-22) (GCC)) #7 SMP Sat Jan 18 21:01:29 EST 2025 [ 0.000000] Command line: rd.shell root=nbd:192.168.200.253:rocky8.10:ext4:ro:-p,-b4096 ro crashkernel=256M panic=1 nomodeset ipmtu=9000 ip=dhcp rd.neednet=1 init_on_free=off mitigations=off console=ttyS1,115200 audit=0 [ 0.000000] x86/fpu: Supporting XSAVE feature 0x001: 'x87 floating point registers' [ 0.000000] x86/fpu: Supporting XSAVE feature 0x002: 'SSE registers' [ 0.000000] x86/fpu: Supporting XSAVE feature 0x004: 'AVX registers' [ 0.000000] x86/fpu: xstate_offset[2]: 576, xstate_sizes[2]: 256 [ 0.000000] x86/fpu: Enabled xstate features 0x7, context size is 832 bytes, using 'standard' format. [ 0.000000] signal: max sigframe size: 1776 [ 0.000000] BIOS-provided physical RAM map: [ 0.000000] BIOS-e820: [mem 0x0000000000000000-0x000000000009fbff] usable [ 0.000000] BIOS-e820: [mem 0x000000000009fc00-0x000000000009ffff] reserved [ 0.000000] BIOS-e820: [mem 0x00000000000f0000-0x00000000000fffff] reserved [ 0.000000] BIOS-e820: [mem 0x0000000000100000-0x00000000bffcdfff] usable [ 0.000000] BIOS-e820: [mem 0x00000000bffce000-0x00000000bfffffff] reserved [ 0.000000] BIOS-e820: [mem 0x00000000feffc000-0x00000000feffffff] reserved [ 0.000000] BIOS-e820: [mem 0x00000000fffc0000-0x00000000ffffffff] reserved [ 0.000000] BIOS-e820: [mem 0x0000000100000000-0x0000000146dfffff] usable [ 0.000000] NX (Execute Disable) protection: active [ 0.000000] SMBIOS 3.0.0 present. 
[ 0.000000] DMI: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-1.fc39 04/01/2014 [ 0.000000] Hypervisor detected: KVM [ 0.000000] kvm-clock: Using msrs 4b564d01 and 4b564d00 [ 0.000000] kvm-clock: using sched offset of 605520856 cycles [ 0.000000] clocksource: kvm-clock: mask: 0xffffffffffffffff max_cycles: 0x1cd42e4dffb, max_idle_ns: 881590591483 ns [ 0.000000] tsc: Detected 2399.998 MHz processor [ 0.000000] last_pfn = 0x146e00 max_arch_pfn = 0x400000000 [ 0.000000] x86/PAT: Configuration [0-7]: WB WC UC- UC WB WP UC- WT [ 0.000000] last_pfn = 0xbffce max_arch_pfn = 0x400000000 [ 0.000000] found SMP MP-table at [mem 0x000f53f0-0x000f53ff] [ 0.000000] RAMDISK: [mem 0xbcbe3000-0xbffbffff] [ 0.000000] ACPI: Early table checksum verification disabled [ 0.000000] ACPI: RSDP 0x00000000000F5200 000014 (v00 BOCHS ) [ 0.000000] ACPI: RSDT 0x00000000BFFE1D87 000034 (v01 BOCHS BXPC 00000001 BXPC 00000001) [ 0.000000] ACPI: FACP 0x00000000BFFE1C23 000074 (v01 BOCHS BXPC 00000001 BXPC 00000001) [ 0.000000] ACPI: DSDT 0x00000000BFFE0040 001BE3 (v01 BOCHS BXPC 00000001 BXPC 00000001) [ 0.000000] ACPI: FACS 0x00000000BFFE0000 000040 [ 0.000000] ACPI: APIC 0x00000000BFFE1C97 000090 (v03 BOCHS BXPC 00000001 BXPC 00000001) [ 0.000000] ACPI: HPET 0x00000000BFFE1D27 000038 (v01 BOCHS BXPC 00000001 BXPC 00000001) [ 0.000000] ACPI: WAET 0x00000000BFFE1D5F 000028 (v01 BOCHS BXPC 00000001 BXPC 00000001) [ 0.000000] ACPI: Reserving FACP table memory at [mem 0xbffe1c23-0xbffe1c96] [ 0.000000] ACPI: Reserving DSDT table memory at [mem 0xbffe0040-0xbffe1c22] [ 0.000000] ACPI: Reserving FACS table memory at [mem 0xbffe0000-0xbffe003f] [ 0.000000] ACPI: Reserving APIC table memory at [mem 0xbffe1c97-0xbffe1d26] [ 0.000000] ACPI: Reserving HPET table memory at [mem 0xbffe1d27-0xbffe1d5e] [ 0.000000] ACPI: Reserving WAET table memory at [mem 0xbffe1d5f-0xbffe1d86] [ 0.000000] No NUMA configuration found [ 0.000000] Faking a node at [mem 0x0000000000000000-0x0000000146dfffff] [ 0.000000] 
NODE_DATA(0) allocated [mem 0x1465a3000-0x1465cdfff] [ 0.000000] Reserving 256MB of memory at 2752MB for crashkernel (System RAM: 4205MB) [ 0.000000] Zone ranges: [ 0.000000] DMA [mem 0x0000000000001000-0x0000000000ffffff] [ 0.000000] DMA32 [mem 0x0000000001000000-0x00000000ffffffff] [ 0.000000] Normal [mem 0x0000000100000000-0x0000000146dfffff] [ 0.000000] Device empty [ 0.000000] Movable zone start for each node [ 0.000000] Early memory node ranges [ 0.000000] node 0: [mem 0x0000000000001000-0x000000000009efff] [ 0.000000] node 0: [mem 0x0000000000100000-0x00000000bffcdfff] [ 0.000000] node 0: [mem 0x0000000100000000-0x0000000146dfffff] [ 0.000000] Zeroed struct page in unavailable ranges: 4756 pages [ 0.000000] Initmem setup node 0 [mem 0x0000000000001000-0x0000000146dfffff] [ 0.000000] ACPI: PM-Timer IO Port: 0x608 [ 0.000000] ACPI: LAPIC_NMI (acpi_id[0xff] dfl dfl lint[0x1]) [ 0.000000] IOAPIC[0]: apic_id 0, version 17, address 0xfec00000, GSI 0-23 [ 0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl) [ 0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 5 global_irq 5 high level) [ 0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level) [ 0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 10 global_irq 10 high level) [ 0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 11 global_irq 11 high level) [ 0.000000] Using ACPI (MADT) for SMP configuration information [ 0.000000] ACPI: HPET id: 0x8086a201 base: 0xfed00000 [ 0.000000] TSC deadline timer available [ 0.000000] smpboot: Allowing 4 CPUs, 0 hotplug CPUs [ 0.000000] kvm-guest: KVM setup pv remote TLB flush [ 0.000000] kvm-guest: setup PV sched yield [ 0.000000] PM: Registered nosave memory: [mem 0x00000000-0x00000fff] [ 0.000000] PM: Registered nosave memory: [mem 0x0009f000-0x0009ffff] [ 0.000000] PM: Registered nosave memory: [mem 0x000a0000-0x000effff] [ 0.000000] PM: Registered nosave memory: [mem 0x000f0000-0x000fffff] [ 0.000000] PM: Registered nosave memory: [mem 0xbffce000-0xbfffffff] [ 
0.000000] PM: Registered nosave memory: [mem 0xc0000000-0xfeffbfff] [ 0.000000] PM: Registered nosave memory: [mem 0xfeffc000-0xfeffffff] [ 0.000000] PM: Registered nosave memory: [mem 0xff000000-0xfffbffff] [ 0.000000] PM: Registered nosave memory: [mem 0xfffc0000-0xffffffff] [ 0.000000] [mem 0xc0000000-0xfeffbfff] available for PCI devices [ 0.000000] Booting paravirtualized kernel on KVM [ 0.000000] clocksource: refined-jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 1910969940391419 ns [ 0.000000] setup_percpu: NR_CPUS:8192 nr_cpumask_bits:4 nr_cpu_ids:4 nr_node_ids:1 [ 0.000000] percpu: Embedded 513 pages/cpu s2064384 r8192 d28672 u4194304 [ 0.000000] kvm-guest: PV spinlocks enabled [ 0.000000] PV qspinlock hash table entries: 256 (order: 0, 4096 bytes, linear) [ 0.000000] Built 1 zonelists, mobility grouping on. Total pages: 1059606 [ 0.000000] Policy zone: Normal [ 0.000000] Kernel command line: rd.shell root=nbd:192.168.200.253:rocky8.10:ext4:ro:-p,-b4096 ro crashkernel=256M panic=1 nomodeset ipmtu=9000 ip=dhcp rd.neednet=1 init_on_free=off mitigations=off console=ttyS1,115200 audit=0 [ 0.000000] Specific versions of hardware are certified with Red Hat Enterprise Linux 8. Please see the list of hardware certified with Red Hat Enterprise Linux 8 at https://catalog.redhat.com. [ 0.000000] audit: disabled (until reboot) [ 0.000000] software IO TLB: area num 4. [ 0.000000] Memory: 2818960K/4306352K available (20483K kernel code, 12066K rwdata, 7356K rodata, 4680K init, 23504K bss, 542472K reserved, 0K cma-reserved) [ 0.000000] SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=4, Nodes=1 [ 0.000000] kmemleak: Kernel memory leak detector disabled [ 0.000000] ftrace: allocating 41388 entries in 162 pages [ 0.000000] ftrace: allocated 162 pages with 3 groups [ 0.000000] rcu: Hierarchical RCU implementation. [ 0.000000] rcu: RCU event tracing is enabled. [ 0.000000] rcu: RCU restricting CPUs from NR_CPUS=8192 to nr_cpu_ids=4. 
[ 0.000000] Rude variant of Tasks RCU enabled. [ 0.000000] Tracing variant of Tasks RCU enabled. [ 0.000000] rcu: RCU calculated value of scheduler-enlistment delay is 100 jiffies. [ 0.000000] rcu: Adjusting geometry for rcu_fanout_leaf=16, nr_cpu_ids=4 [ 0.000000] NR_IRQS: 524544, nr_irqs: 456, preallocated irqs: 16 [ 0.000000] random: get_random_bytes called from start_kernel+0x616/0x99a with crng_init=0 [ 0.001000] Console: colour *CGA 80x25 [ 0.001000] printk: console [ttyS1] enabled [ 0.001000] Lock dependency validator: Copyright (c) 2006 Red Hat, Inc., Ingo Molnar [ 0.001000] ... MAX_LOCKDEP_SUBCLASSES: 8 [ 0.001000] ... MAX_LOCK_DEPTH: 48 [ 0.001000] ... MAX_LOCKDEP_KEYS: 8192 [ 0.001000] ... CLASSHASH_SIZE: 4096 [ 0.001000] ... MAX_LOCKDEP_ENTRIES: 32768 [ 0.001000] ... MAX_LOCKDEP_CHAINS: 65536 [ 0.001000] ... CHAINHASH_SIZE: 32768 [ 0.001000] memory used by lock dependency info: 4149 kB [ 0.001000] per task-struct memory footprint: 2688 bytes [ 0.001000] ACPI: Core revision 20220331 [ 0.001000] clocksource: hpet: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 19112604467 ns [ 0.001016] APIC: Switch to symmetric I/O mode setup [ 0.002163] x2apic enabled [ 0.003010] Switched APIC routing to physical x2apic. [ 0.004018] kvm-guest: setup PV IPIs [ 0.008000] ..TIMER: vector=0x30 apic1=0 pin1=2 apic2=-1 pin2=-1 [ 0.008000] clocksource: tsc-early: mask: 0xffffffffffffffff max_cycles: 0x229835b7123, max_idle_ns: 440795242976 ns [ 0.008031] Calibrating delay loop (skipped) preset value.. 4799.99 BogoMIPS (lpj=2399998) [ 0.009016] pid_max: default: 32768 minimum: 301 [ 0.010391] LSM: Security Framework initializing [ 0.011138] Yama: becoming mindful. [ 0.012146] SELinux: Initializing. 
[ 0.014146] *** VALIDATE selinux *** [ 0.021016] Dentry cache hash table entries: 1048576 (order: 11, 8388608 bytes, vmalloc) [ 0.023000] Inode-cache hash table entries: 524288 (order: 10, 4194304 bytes, vmalloc) [ 0.024037] Mount-cache hash table entries: 16384 (order: 5, 131072 bytes, vmalloc) [ 0.025146] Mountpoint-cache hash table entries: 16384 (order: 5, 131072 bytes, vmalloc) [ 0.026259] *** VALIDATE tmpfs *** [ 0.029839] *** VALIDATE proc *** [ 0.031190] *** VALIDATE cgroup *** [ 0.032014] *** VALIDATE cgroup2 *** [ 0.034214] x86/cpu: User Mode Instruction Prevention (UMIP) activated [ 0.035215] Last level iTLB entries: 4KB 0, 2MB 0, 4MB 0 [ 0.036013] Last level dTLB entries: 4KB 0, 2MB 0, 4MB 0, 1GB 0 [ 0.038017] Spectre V2 : User space: Vulnerable [ 0.039010] Speculative Store Bypass: Vulnerable [ 0.042143] debug: unmapping init [mem 0xffffffffb4703000-0xffffffffb470afff] [ 0.044704] smpboot: CPU0: Intel(R) Xeon(R) CPU E5-2695 v2 @ 2.40GHz (family: 0x6, model: 0x3e, stepping: 0x4) [ 0.047203] Performance Events: IvyBridge events, full-width counters, Intel PMU driver. [ 0.048033] ... version: 2 [ 0.049014] ... bit width: 48 [ 0.050018] ... generic registers: 4 [ 0.051013] ... value mask: 0000ffffffffffff [ 0.052017] ... max period: 00007fffffffffff [ 0.053018] ... fixed-purpose events: 3 [ 0.054014] ... event mask: 000000070000000f [ 0.056101] rcu: Hierarchical SRCU implementation. [ 0.061929] smp: Bringing up secondary CPUs ... [ 0.063594] x86: Booting SMP configuration: [ 0.064019] .... 
node #0, CPUs: #1 [ 0.071709] #2 [ 0.083766] #3 [ 0.088899] smp: Brought up 1 node, 4 CPUs [ 0.089037] smpboot: Max logical packages: 1 [ 0.090017] smpboot: Total of 4 processors activated (19199.98 BogoMIPS) [ 0.120023] node 0 deferred pages initialised in 25ms [ 0.121000] pgdatinit0 (35) used greatest stack depth: 14528 bytes left [ 0.127465] devtmpfs: initialized [ 0.132051] x86/mm: Memory block size: 128MB [ 0.147887] gcov: version magic: 0x41383552 [ 0.150903] clocksource: jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 1911260446275000 ns [ 0.155153] futex hash table entries: 1024 (order: 5, 131072 bytes, vmalloc) [ 0.158819] pinctrl core: initialized pinctrl subsystem [ 0.161847] [ 0.162011] ************************************************************* [ 0.164018] ** NOTICE NOTICE NOTICE NOTICE NOTICE NOTICE NOTICE ** [ 0.167017] ** ** [ 0.169018] ** IOMMU DebugFS SUPPORT HAS BEEN ENABLED IN THIS KERNEL ** [ 0.171015] ** ** [ 0.174022] ** This means that this kernel is built to expose internal ** [ 0.176014] ** IOMMU data structures, which may compromise security on ** [ 0.179017] ** your system. ** [ 0.181015] ** ** [ 0.183019] ** If you see this message and you are not debugging the ** [ 0.186023] ** kernel, report this immediately to your vendor! 
** [ 0.188023] ** ** [ 0.191016] ** NOTICE NOTICE NOTICE NOTICE NOTICE NOTICE NOTICE ** [ 0.193016] ************************************************************* [ 0.199228] NET: Registered protocol family 16 [ 0.203083] DMA: preallocated 512 KiB GFP_KERNEL pool for atomic allocations [ 0.204169] DMA: preallocated 512 KiB GFP_KERNEL|GFP_DMA pool for atomic allocations [ 0.205113] DMA: preallocated 512 KiB GFP_KERNEL|GFP_DMA32 pool for atomic allocations [ 0.208534] cpuidle: using governor menu [ 0.210724] acpiphp: ACPI Hot Plug PCI Controller Driver version: 0.5 [ 0.213685] PCI: Using configuration type 1 for base access [ 0.216132] core: PMU erratum BJ122, BV98, HSD29 worked around, HT is on [ 0.267138] HugeTLB registered 1.00 GiB page size, pre-allocated 0 pages [ 0.268032] HugeTLB registered 2.00 MiB page size, pre-allocated 0 pages [ 0.283126] cryptd: max_cpu_qlen set to 1000 [ 0.289980] ACPI: Added _OSI(Module Device) [ 0.290029] ACPI: Added _OSI(Processor Device) [ 0.291021] ACPI: Added _OSI(3.0 _SCP Extensions) [ 0.292024] ACPI: Added _OSI(Processor Aggregator Device) [ 0.332717] ACPI: 1 ACPI AML tables successfully acquired and loaded [ 0.348727] ACPI: Interpreter enabled [ 0.350396] ACPI: PM: (supports S0 S3 S4 S5) [ 0.351015] ACPI: Using IOAPIC for interrupt routing [ 0.353427] PCI: Using host bridge windows from ACPI; if necessary, use "pci=nocrs" and report a bug [ 0.360151] ACPI: Enabled 2 GPEs in block 00 to 0F [ 0.439000] ACPI: PCI Root Bridge [PCI0] (domain 0000 [bus 00-ff]) [ 0.442073] acpi PNP0A03:00: _OSC: OS supports [ASPM ClockPM Segments MSI HPX-Type3] [ 0.445098] acpi PNP0A03:00: _OSC: not requesting OS control; OS requires [ExtendedConfig ASPM ClockPM MSI] [ 0.449355] acpi PNP0A03:00: fail to add MMCONFIG information, can't access extended PCI configuration space under this bridge. 
[ 0.459240] acpiphp: Slot [2] registered [ 0.460426] acpiphp: Slot [5] registered [ 0.462382] acpiphp: Slot [6] registered [ 0.464289] acpiphp: Slot [7] registered [ 0.466353] acpiphp: Slot [8] registered [ 0.468291] acpiphp: Slot [9] registered [ 0.469357] acpiphp: Slot [10] registered [ 0.471337] acpiphp: Slot [3] registered [ 0.473279] acpiphp: Slot [4] registered [ 0.475304] acpiphp: Slot [11] registered [ 0.477414] acpiphp: Slot [12] registered [ 0.479316] acpiphp: Slot [13] registered [ 0.480425] acpiphp: Slot [14] registered [ 0.482354] acpiphp: Slot [15] registered [ 0.484285] acpiphp: Slot [16] registered [ 0.486296] acpiphp: Slot [17] registered [ 0.488304] acpiphp: Slot [18] registered [ 0.490357] acpiphp: Slot [19] registered [ 0.491239] acpiphp: Slot [20] registered [ 0.493290] acpiphp: Slot [21] registered [ 0.494401] acpiphp: Slot [22] registered [ 0.496319] acpiphp: Slot [23] registered [ 0.498276] acpiphp: Slot [24] registered [ 0.499286] acpiphp: Slot [25] registered [ 0.501302] acpiphp: Slot [26] registered [ 0.503316] acpiphp: Slot [27] registered [ 0.505129] acpiphp: Slot [28] registered [ 0.506321] acpiphp: Slot [29] registered [ 0.508260] acpiphp: Slot [30] registered [ 0.510148] acpiphp: Slot [31] registered [ 0.511185] PCI host bridge to bus 0000:00 [ 0.512085] pci_bus 0000:00: root bus resource [io 0x0000-0x0cf7 window] [ 0.513033] pci_bus 0000:00: root bus resource [io 0x0d00-0xffff window] [ 0.514048] pci_bus 0000:00: root bus resource [mem 0x000a0000-0x000bffff window] [ 0.515073] pci_bus 0000:00: root bus resource [mem 0xc0000000-0xfebfffff window] [ 0.516048] pci_bus 0000:00: root bus resource [mem 0x380000000000-0x38007fffffff window] [ 0.517051] pci_bus 0000:00: root bus resource [bus 00-ff] [ 0.518411] pci 0000:00:00.0: [8086:1237] type 00 class 0x060000 [ 0.521564] pci 0000:00:01.0: [8086:7000] type 00 class 0x060100 [ 0.525343] pci 0000:00:01.1: [8086:7010] type 00 class 0x010180 [ 0.531017] pci 0000:00:01.1: reg 0x20: [io 
0xc320-0xc32f] [ 0.534539] pci 0000:00:01.1: legacy IDE quirk: reg 0x10: [io 0x01f0-0x01f7] [ 0.535020] pci 0000:00:01.1: legacy IDE quirk: reg 0x14: [io 0x03f6] [ 0.536020] pci 0000:00:01.1: legacy IDE quirk: reg 0x18: [io 0x0170-0x0177] [ 0.537024] pci 0000:00:01.1: legacy IDE quirk: reg 0x1c: [io 0x0376] [ 0.539984] pci 0000:00:01.3: [8086:7113] type 00 class 0x068000 [ 0.541020] pci 0000:00:01.3: quirk: [io 0x0600-0x063f] claimed by PIIX4 ACPI [ 0.542049] pci 0000:00:01.3: quirk: [io 0x0700-0x070f] claimed by PIIX4 SMB [ 0.545016] pci 0000:00:02.0: [1af4:1000] type 00 class 0x020000 [ 0.548035] pci 0000:00:02.0: reg 0x10: [io 0xc300-0xc31f] [ 0.556019] pci 0000:00:02.0: reg 0x20: [mem 0x380000000000-0x380000003fff 64bit pref] [ 0.558020] pci 0000:00:02.0: reg 0x30: [mem 0xfeb80000-0xfebbffff pref] [ 0.580614] pci 0000:00:05.0: [1af4:1001] type 00 class 0x010000 [ 0.592021] pci 0000:00:05.0: reg 0x10: [io 0xc000-0xc07f] [ 0.600025] pci 0000:00:05.0: reg 0x14: [mem 0xfebc0000-0xfebc0fff] [ 0.622016] pci 0000:00:05.0: reg 0x20: [mem 0x380000004000-0x380000007fff 64bit pref] [ 0.653000] pci 0000:00:06.0: [1af4:1001] type 00 class 0x010000 [ 0.656023] pci 0000:00:06.0: reg 0x10: [io 0xc080-0xc0ff] [ 0.659023] pci 0000:00:06.0: reg 0x14: [mem 0xfebc1000-0xfebc1fff] [ 0.666017] pci 0000:00:06.0: reg 0x20: [mem 0x380000008000-0x38000000bfff 64bit pref] [ 0.698303] pci 0000:00:07.0: [1af4:1001] type 00 class 0x010000 [ 0.709023] pci 0000:00:07.0: reg 0x10: [io 0xc100-0xc17f] [ 0.722019] pci 0000:00:07.0: reg 0x14: [mem 0xfebc2000-0xfebc2fff] [ 0.741018] pci 0000:00:07.0: reg 0x20: [mem 0x38000000c000-0x38000000ffff 64bit pref] [ 0.768509] pci 0000:00:08.0: [1af4:1001] type 00 class 0x010000 [ 0.777020] pci 0000:00:08.0: reg 0x10: [io 0xc180-0xc1ff] [ 0.786024] pci 0000:00:08.0: reg 0x14: [mem 0xfebc3000-0xfebc3fff] [ 0.803024] pci 0000:00:08.0: reg 0x20: [mem 0x380000010000-0x380000013fff 64bit pref] [ 0.829000] pci 0000:00:09.0: [1af4:1001] type 00 class 0x010000 [ 
0.835018] pci 0000:00:09.0: reg 0x10: [io 0xc200-0xc27f] [ 0.841025] pci 0000:00:09.0: reg 0x14: [mem 0xfebc4000-0xfebc4fff] [ 0.860024] pci 0000:00:09.0: reg 0x20: [mem 0x380000014000-0x380000017fff 64bit pref] [ 0.886696] pci 0000:00:0a.0: [1af4:1001] type 00 class 0x010000 [ 0.889018] pci 0000:00:0a.0: reg 0x10: [io 0xc280-0xc2ff] [ 0.892023] pci 0000:00:0a.0: reg 0x14: [mem 0xfebc5000-0xfebc5fff] [ 0.899021] pci 0000:00:0a.0: reg 0x20: [mem 0x380000018000-0x38000001bfff 64bit pref] [ 0.927909] ACPI: PCI: Interrupt link LNKA configured for IRQ 10 [ 0.935027] ACPI: PCI: Interrupt link LNKB configured for IRQ 10 [ 0.938734] ACPI: PCI: Interrupt link LNKC configured for IRQ 11 [ 0.943091] ACPI: PCI: Interrupt link LNKD configured for IRQ 11 [ 0.944962] ACPI: PCI: Interrupt link LNKS configured for IRQ 9 [ 0.954336] iommu: Default domain type: Passthrough [ 0.957423] SCSI subsystem initialized [ 0.958604] ACPI: bus type USB registered [ 0.959430] usbcore: registered new interface driver usbfs [ 0.960239] usbcore: registered new interface driver hub [ 0.961201] usbcore: registered new device driver usb [ 0.962930] pps_core: LinuxPPS API ver. 1 registered [ 0.963018] pps_core: Software ver. 
5.3.6 - Copyright 2005-2007 Rodolfo Giometti [ 0.964146] PTP clock support registered [ 0.966258] EDAC MC: Ver: 3.0.0 [ 0.970163] PCI: Using ACPI for IRQ routing [ 0.973614] NetLabel: Initializing [ 0.974015] NetLabel: domain hash size = 128 [ 0.975015] NetLabel: protocols = UNLABELED CIPSOv4 CALIPSO [ 0.976368] NetLabel: unlabeled traffic allowed by default [ 0.978092] vgaarb: loaded [ 0.980111] hpet0: at MMIO 0xfed00000, IRQs 2, 8, 0 [ 0.981013] hpet0: 3 comparators, 64-bit 100.000000 MHz counter [ 0.989254] clocksource: Switched to clocksource kvm-clock [ 1.578292] VFS: Disk quotas dquot_6.6.0 [ 1.579962] VFS: Dquot-cache hash table entries: 512 (order 0, 4096 bytes) [ 1.582916] *** VALIDATE ramfs *** [ 1.584430] *** VALIDATE hugetlbfs *** [ 1.587097] pnp: PnP ACPI init [ 1.596597] pnp: PnP ACPI: found 6 devices [ 1.634819] clocksource: acpi_pm: mask: 0xffffff max_cycles: 0xffffff, max_idle_ns: 2085701024 ns [ 1.638403] pci_bus 0000:00: resource 4 [io 0x0000-0x0cf7 window] [ 1.640478] pci_bus 0000:00: resource 5 [io 0x0d00-0xffff window] [ 1.642471] pci_bus 0000:00: resource 6 [mem 0x000a0000-0x000bffff window] [ 1.644875] pci_bus 0000:00: resource 7 [mem 0xc0000000-0xfebfffff window] [ 1.647378] pci_bus 0000:00: resource 8 [mem 0x380000000000-0x38007fffffff window] [ 1.650893] NET: Registered protocol family 2 [ 1.654579] IP idents hash table entries: 131072 (order: 8, 1048576 bytes, vmalloc) [ 1.660829] tcp_listen_portaddr_hash hash table entries: 4096 (order: 6, 360448 bytes, vmalloc) [ 1.665030] TCP established hash table entries: 65536 (order: 7, 524288 bytes, vmalloc) [ 1.672549] TCP bind hash table entries: 65536 (order: 10, 5242880 bytes, vmalloc) [ 1.679850] TCP: Hash tables configured (established 65536 bind 65536) [ 1.685267] MPTCP token hash table entries: 8192 (order: 7, 786432 bytes, vmalloc) [ 1.689520] UDP hash table entries: 4096 (order: 7, 786432 bytes, vmalloc) [ 1.693202] UDP-Lite hash table entries: 4096 (order: 7, 786432 bytes, vmalloc) [ 
1.698371] NET: Registered protocol family 1 [ 1.705900] RPC: Registered named UNIX socket transport module. [ 1.708403] RPC: Registered udp transport module. [ 1.710229] RPC: Registered tcp transport module. [ 1.713269] RPC: Registered tcp NFSv4.1 backchannel transport module. [ 1.715989] NET: Registered protocol family 44 [ 1.719319] pci 0000:00:00.0: Limiting direct PCI/PCI transfers [ 1.721749] pci 0000:00:01.0: PIIX3: Enabling Passive Release [ 1.726080] pci 0000:00:01.0: Activating ISA DMA hang workarounds [ 1.729716] PCI: CLS 0 bytes, default 64 [ 1.733788] Unpacking initramfs... [ 4.221545] debug: unmapping init [mem 0xffff94bb7cbe3000-0xffff94bb7ffbffff] [ 4.226227] PCI-DMA: Using software bounce buffering for IO (SWIOTLB) [ 4.229125] software IO TLB: mapped [mem 0x00000000a8000000-0x00000000ac000000] (64MB) [ 4.232487] clocksource: tsc: mask: 0xffffffffffffffff max_cycles: 0x229835b7123, max_idle_ns: 440795242976 ns [ 4.248896] cryptomgr_test (65) used greatest stack depth: 14248 bytes left [ 5.091504] Initialise system trusted keyrings [ 5.093337] Key type blacklist registered [ 5.095721] workingset: timestamp_bits=36 max_order=20 bucket_order=0 [ 5.160863] zbud: loaded [ 5.178340] *** VALIDATE nfs *** [ 5.179813] *** VALIDATE nfs4 *** [ 5.182082] pstore: using deflate compression [ 5.187733] Platform Keyring initialized [ 5.194109] cryptomgr_test (73) used greatest stack depth: 14024 bytes left [ 5.221330] cryptomgr_test (81) used greatest stack depth: 14008 bytes left [ 5.246401] cryptomgr_test (86) used greatest stack depth: 13800 bytes left [ 5.297081] modprobe (92) used greatest stack depth: 13768 bytes left [ 5.310335] cryptomgr_test (94) used greatest stack depth: 13640 bytes left [ 5.474503] NET: Registered protocol family 38 [ 5.476357] Key type asymmetric registered [ 5.477623] Asymmetric key parser 'x509' registered [ 5.479791] Block layer SCSI generic (bsg) driver version 0.4 loaded (major 247) [ 5.482885] io scheduler mq-deadline registered [ 
5.485526] io scheduler kyber registered [ 5.488349] io scheduler bfq registered [ 5.494106] atomic64_test: passed for x86-64 platform with CX8 and with SSE [ 5.503380] shpchp: Standard Hot Plug PCI Controller Driver version: 0.4 [ 5.507814] input: Power Button as /devices/LNXSYSTM:00/LNXPWRBN:00/input/input0 [ 5.512299] ACPI: Power Button [PWRF] [ 7.049862] ACPI: \_SB_.LNKB: Enabled at IRQ 10 [ 9.940186] ACPI: \_SB_.LNKA: Enabled at IRQ 11 [ 18.662624] ACPI: \_SB_.LNKC: Enabled at IRQ 11 [ 22.208204] ACPI: \_SB_.LNKD: Enabled at IRQ 10 [ 31.091400] Serial: 8250/16550 driver, 4 ports, IRQ sharing enabled [ 31.160968] 00:03: ttyS1 at I/O 0x2f8 (irq = 3, base_baud = 115200) is a 16550A [ 31.228932] 00:04: ttyS0 at I/O 0x3f8 (irq = 4, base_baud = 115200) is a 16550A [ 31.248970] Non-volatile memory driver v1.3 [ 31.253553] Linux agpgart interface v0.103 [ 31.551358] virtio_blk virtio1: [vda] 131896 512-byte logical blocks (67.5 MB/64.4 MiB) [ 31.564185] vda: detected capacity change from 0 to 67530752 [ 31.631561] virtio_blk virtio2: [vdb] 2097152 512-byte logical blocks (1.07 GB/1.00 GiB) [ 31.638136] vdb: detected capacity change from 0 to 1073741824 [ 31.705075] virtio_blk virtio3: [vdc] 5120000 512-byte logical blocks (2.62 GB/2.44 GiB) [ 31.708296] vdc: detected capacity change from 0 to 2621440000 [ 31.754436] virtio_blk virtio4: [vdd] 5120000 512-byte logical blocks (2.62 GB/2.44 GiB) [ 31.757534] vdd: detected capacity change from 0 to 2621440000 [ 31.827425] virtio_blk virtio5: [vde] 8388608 512-byte logical blocks (4.29 GB/4.00 GiB) [ 31.835409] vde: detected capacity change from 0 to 4294967296 [ 31.884173] virtio_blk virtio6: [vdf] 8388608 512-byte logical blocks (4.29 GB/4.00 GiB) [ 31.887242] vdf: detected capacity change from 0 to 4294967296 [ 31.916818] libphy: Fixed MDIO Bus: probed [ 31.942557] usbcore: registered new interface driver usbserial_generic [ 31.956584] usbserial: USB Serial support registered for generic [ 31.959100] i8042: PNP: PS/2 
Controller [PNP0303:KBD,PNP0f13:MOU] at 0x60,0x64 irq 1,12 [ 31.979808] serio: i8042 KBD port at 0x60,0x64 irq 1 [ 31.982199] serio: i8042 AUX port at 0x60,0x64 irq 12 [ 31.997864] mousedev: PS/2 mouse device common for all mice [ 32.014201] rtc_cmos 00:05: RTC can wake from S4 [ 32.032655] rtc_cmos 00:05: registered as rtc0 [ 32.034412] rtc_cmos 00:05: alarms up to one day, y3k, 242 bytes nvram, hpet irqs [ 32.037326] intel_pstate: CPU model not supported [ 32.040877] input: AT Translated Set 2 keyboard as /devices/platform/i8042/serio0/input/input1 [ 32.070289] input: VirtualPS/2 VMware VMMouse as /devices/platform/i8042/serio1/input/input4 [ 32.079424] hid: raw HID events driver (C) Jiri Kosina [ 32.084750] usbcore: registered new interface driver usbhid [ 32.101984] usbhid: USB HID core driver [ 32.103873] drop_monitor: Initializing network drop monitor service [ 32.121578] Initializing XFRM netlink socket [ 32.128956] input: VirtualPS/2 VMware VMMouse as /devices/platform/i8042/serio1/input/input3 [ 32.146191] NET: Registered protocol family 10 [ 32.193605] Segment Routing with IPv6 [ 32.194988] NET: Registered protocol family 17 [ 32.216449] mpls_gso: MPLS GSO support [ 32.228627] RAS: Correctable Errors collector initialized. [ 32.231308] AVX version of gcm_enc/dec engaged. [ 32.238153] AES CTR mode by8 optimization enabled [ 32.634545] sched_clock: Marking stable (32634334571, 0)->(33915336180, -1281001609) [ 32.650620] registered taskstats version 1 [ 32.656411] Loading compiled-in X.509 certificates [ 32.659097] zswap: loaded using pool lzo/zbud [ 32.777889] Key type big_key registered [ 32.799963] Key type encrypted registered [ 32.801665] ima: No TPM chip found, activating TPM-bypass! 
[ 32.803736] ima: Allocated hash algorithm: sha1 [ 32.807738] ima: No architecture policies found [ 32.809420] evm: Initialising EVM extended attributes: [ 32.810882] evm: security.selinux [ 32.811859] evm: security.ima [ 32.812700] evm: security.capability [ 32.813755] evm: HMAC attrs: 0x1 [ 32.834715] rtc_cmos 00:05: setting system clock to 2025-04-01 07:38:50 UTC (1743493130) [ 32.906528] debug: unmapping init [mem 0xffffffffb5c03000-0xffffffffb5dfffff] [ 32.913646] debug: unmapping init [mem 0xffffffffb4271000-0xffffffffb4702fff] [ 32.919122] Write protecting the kernel read-only data: 30720k [ 32.928519] debug: unmapping init [mem 0xffffffffb2803000-0xffffffffb29fffff] [ 32.932760] debug: unmapping init [mem 0xffffffffb312f000-0xffffffffb31fffff] [ 33.166763] systemd[1]: systemd 239 (239-82.el8_10.3) running in system mode. (+PAM +AUDIT +SELINUX +IMA -APPARMOR +SMACK +SYSVINIT +UTMP +LIBCRYPTSETUP +GCRYPT +GNUTLS +ACL +XZ +LZ4 +SECCOMP +BLKID +ELFUTILS +KMOD +IDN2 -IDN +PCRE2 default-hierarchy=legacy) [ 33.179970] systemd[1]: Detected virtualization kvm. [ 33.182222] systemd[1]: Detected architecture x86-64. [ 33.184354] systemd[1]: Running in initial RAM disk. Welcome to Rocky Linux 8.10 (Green Obsidian) dracut-049-233.git20240115.el8 (Initramfs)! [ 33.243453] systemd[1]: No hostname configured. [ 33.246845] systemd[1]: Set hostname to . [ 33.249356] random: systemd: uninitialized urandom read (16 bytes read) [ 33.252666] systemd[1]: Initializing machine ID from random generator. [ 33.451955] random: ln: uninitialized urandom read (6 bytes read) [ 33.633343] systemd-hiberna (205) used greatest stack depth: 13608 bytes left [ 33.884835] random: systemd: uninitialized urandom read (16 bytes read) [ 33.887277] systemd[1]: Reached target Swap. [ OK ] Reached target Swap. [ 33.899449] systemd[1]: Listening on udev Kernel Socket. [ OK ] Listening on udev Kernel Socket. [ 33.910713] systemd[1]: Listening on udev Control Socket. 
[ OK ] Listening on udev Control Socket. [ OK ] Started Dispatch Password Requests to Console Directory Watch. [ OK ] Reached target Paths. [ OK ] Reached target Slices. [ OK ] Listening on Journal Socket. Starting Setup Virtual Console... Starting Apply Kernel Variables... [ OK ] Reached target Local File Systems. [ 34.174368] systemd-vconsol (231) used greatest stack depth: 13528 bytes left Starting Create Volatile Files and Directories... Starting Create list of required st…ce nodes for the current kernel... [ OK ] Reached target Local Encrypted Volumes. [ OK ] Listening on Journal Socket (/dev/log). Starting Journal Service... [ OK ] Started Memstrack Anylazing Service. [ OK ] Reached target Timers. [ OK ] Reached target Initrd Root Device. [ OK ] Reached target Sockets. [ OK ] Started Setup Virtual Console. [ OK ] Started Apply Kernel Variables. [ OK ] Started Create Volatile Files and Directories. [ OK ] Started Create list of required sta…vice nodes for the current kernel. Starting Create Static Device Nodes in /dev... Starting dracut cmdline hook... [ OK ] Started Create Static Device Nodes in /dev. [ OK ] Started Journal Service. [ OK ] Started dracut cmdline hook. Starting dracut pre-udev hook... [ 36.982396] device-mapper: uevent: version 1.0.3 [ 36.987277] device-mapper: ioctl: 4.46.0-ioctl (2022-02-22) initialised: dm-devel@redhat.com [ OK ] Started dracut pre-udev hook. Starting udev Kernel Device Manager... [ OK ] Started udev Kernel Device Manager. Starting dracut pre-trigger hook... [ OK ] Started dracut pre-trigger hook. Starting udev Coldplug all Devices... [ 39.411593] udevadm (417) used greatest stack depth: 13424 bytes left Mounting Kernel Configuration File System... [ OK ] Mounted Kernel Configuration File System. [ OK ] Started udev Coldplug all Devices. Starting dracut initqueue hook... [ OK ] Reached target System Initialization. [ OK ] Reached target Basic System. [ OK ] Started Hardware RNG Entropy Gatherer Daemon. 
[ 41.656501] virtio_net virtio0 ens2: renamed from eth0 [ 41.915965] random: fast init done [ 42.313609] scsi host0: ata_piix [ 42.368289] scsi host1: ata_piix [ 42.370382] ata1: PATA max MWDMA2 cmd 0x1f0 ctl 0x3f6 bmdma 0xc320 irq 14 [ 42.372715] ata2: PATA max MWDMA2 cmd 0x170 ctl 0x376 bmdma 0xc328 irq 15 [ 44.906592] systemd-udevd (448) used greatest stack depth: 13048 bytes left [ 44.984914] systemd-udevd (445) used greatest stack depth: 12648 bytes left [ 46.221361] ip (531) used greatest stack depth: 11496 bytes left [ 47.393463] random: crng init done [ 47.394818] random: 7 urandom warning(s) missed due to ratelimiting [ 51.680047] dracut-initqueue[593]: RTNETLINK answers: File exists Starting nbd nbd0... [ OK ] Started nbd nbd0. [ OK ] Started dracut initqueue hook. [ OK ] Reached target Remote File Systems (Pre). [ OK ] Reached target Remote File Systems. Mounting /sysroot... [ 55.112574] EXT4-fs (nbd0): mounted filesystem with ordered data mode. Opts: (null) [ OK ] Mounted /sysroot. [ OK ] Reached target Initrd Root File System. Starting Reload Configuration from the Real Root... [ OK ] Started Reload Configuration from the Real Root. [ OK ] Reached target Initrd File Systems. [ OK ] Reached target Initrd Default Target. Starting dracut pre-pivot and cleanup hook... [ OK ] Started dracut pre-pivot and cleanup hook. Starting Cleaning Up and Shutting Down Daemons... [ OK ] Stopped target Timers. [ OK ] Stopped dracut pre-pivot and cleanup hook. [ OK ] Stopped target Remote File Systems. [ OK ] Stopped target Remote File Systems (Pre). [ OK ] Stopped target Initrd Default Target. [ OK ] Stopped dracut initqueue hook. Stopping Hardware RNG Entropy Gatherer Daemon... [ OK ] Stopped target Initrd Root Device. [ OK ] Stopped Hardware RNG Entropy Gatherer Daemon. [ OK ] Stopped target Basic System. [ OK ] Stopped target Slices. [ OK ] Stopped target Sockets. [ OK ] Stopped target Paths. [ OK ] Stopped target System Initialization. 
[ OK ] Stopped Create Volatile Files and Directories. [ OK ] Stopped target Local File Systems. [ OK ] Stopped target Local Encrypted Volumes. [ OK ] Stopped Dispatch Password Requests to Console Directory Watch. [ OK ] Stopped target Swap. [ OK ] Stopped udev Coldplug all Devices. [ OK ] Stopped dracut pre-trigger hook. [ OK ] Stopped Apply Kernel Variables. Stopping udev Kernel Device Manager... [ OK ] Started Cleaning Up and Shutting Down Daemons. [ OK ] Stopped udev Kernel Device Manager. [ OK ] Stopped dracut pre-udev hook. [ OK ] Stopped dracut cmdline hook. [ OK ] Stopped Create Static Device Nodes in /dev. [ OK ] Stopped Create list of required sta…vice nodes for the current kernel. [ OK ] Closed udev Control Socket. [ OK ] Closed udev Kernel Socket. Starting Cleanup udevd DB... [ OK ] Started Cleanup udevd DB. [ OK ] Reached target Switch Root. Starting Switch Root... [ 60.997883] printk: systemd: 26 output lines suppressed due to ratelimiting [ 62.394680] SELinux: Disabled at runtime. [ 62.608170] systemd[1]: systemd 239 (239-82.el8_10.3) running in system mode. (+PAM +AUDIT +SELINUX +IMA -APPARMOR +SMACK +SYSVINIT +UTMP +LIBCRYPTSETUP +GCRYPT +GNUTLS +ACL +XZ +LZ4 +SECCOMP +BLKID +ELFUTILS +KMOD +IDN2 -IDN +PCRE2 default-hierarchy=legacy) [ 62.630931] systemd[1]: Detected virtualization kvm. [ 62.632959] systemd[1]: Detected architecture x86-64. Welcome to Rocky Linux 8.10 (Green Obsidian)! [ 65.913150] systemd[1]: initrd-switch-root.service: Succeeded. [ 65.921664] systemd[1]: Stopped Switch Root. [ OK ] Stopped Switch Root. [ 65.957920] systemd[1]: systemd-journald.service: Service has no hold-off time (RestartSec=0), scheduling restart. [ 65.964280] systemd[1]: systemd-journald.service: Scheduled restart job, restart counter is at 1. [ 65.970971] systemd[1]: Stopped Journal Service. [ OK ] Stopped Journal Service. [ 65.994422] systemd[1]: Starting Journal Service... Starting Journal Service... [ 66.022569] systemd[1]: Created slice system-getty.slice. 
[ OK ] Created slice system-getty.slice. Mounting Huge Pages File System... [ OK ] Listening on initctl Compatibility Named Pipe. Starting Apply Kernel Variables... [ OK ] Listening on udev Control Socket. [ OK ] Listening on RPCbind Server Activation Socket. [ OK ] Reached target rpc_pipefs.target. [ OK ] Listening on Process Core Dump Socket. Starting Remount Root and Kernel File Systems... Starting Create list of required st…ce nodes for the current kernel... [ OK ] Stopped target Switch Root. [ OK ] Stopped target Initrd Root File System. Mounting Kernel Debug File System... [ OK ] Started Forward Password Requests to Wall Directory Watch. [ OK ] Started Dispatch Password Requests to Console Directory Watch. [ OK ] Reached target Local Encrypted Volumes. [ OK ] Reached target Paths. Mounting POSIX Message Queue File System... [ OK ] Listening on udev Kernel Socket. Starting udev Coldplug all Devices... [ OK ] Created slice system-sshd\x2dkeygen.slice. [FAILED] Failed to set up automount Arbitrar…rmats File System Automount Point. See 'systemctl status proc-sys-fs-binfmt_misc.automount' for details. [ OK ] Created slice User and Session Slice. [ OK ] Reached target Slices. [ OK ] Created slice system-serial\x2dgetty.slice. Activating swap /dev/disk/by-label/SWAP... [ OK ] Stopped target Initrd File Systems. [ OK ] Reached target RPC Port Mapper. [ 67.352263] Adding 1048572k swap on /dev/vdb. Priority:-2 extents:1 across:1048572k FS [ 67.636166] systemd[1]: Started Journal Service. [ OK ] Started Journal Service. [ OK ] Mounted Huge Pages File System. [ OK ] Started Apply Kernel Variables. [FAILED] Failed to start Remount Root and Kernel File Systems. See 'systemctl status systemd-remount-fs.service' for details. [ OK ] Started Create list of required sta…vice nodes for the current kernel. [ OK ] Mounted Kernel Debug File System. [ OK ] Mounted POSIX Message Queue File System. [ OK ] Activated swap /dev/disk/by-label/SWAP. [ OK ] Reached target Swap. 
Starting Configure read-only root support... Starting Create Static Device Nodes in /dev... Starting Flush Journal to Persistent Storage... [ OK ] Started Flush Journal to Persistent Storage. [ OK ] Started Create Static Device Nodes in /dev. Starting udev Kernel Device Manager... [ OK ] Reached target Local File Systems (Pre). Mounting /home/green/git/lustre-release... Mounting /mnt... [ 69.874514] squashfs: version 4.0 (2009/01/31) Phillip Lougher [ OK ] Mounted /mnt. [ OK ] Mounted /home/green/git/lustre-release. [ OK ] Started udev Coldplug all Devices. [ OK ] Started udev Kernel Device Manager. [ 72.665062] piix4_smbus 0000:00:01.3: SMBus Host Controller at 0x700, revision 0 [ 72.771437] input: PC Speaker as /devices/platform/pcspkr/input/input5 [ 75.579923] RAPL PMU: API unit is 2^-32 Joules, 0 fixed counters, 10737418240 ms ovfl timer [ 76.290935] EDAC sbridge: Ver: 1.1.2
[ 86.598607] Key type dns_resolver registered [ 87.516595] NFS: Registering the id_resolver key type [ 87.518496] Key type id_resolver registered [ 87.520042] Key type id_legacy registered [ 88.500996] mount.nfs (977) used greatest stack depth: 10376 bytes left [ **] A start job is running for Configur…only root support (23s / no limit) [ OK ] Started Configure read-only root support. [ OK ] Reached target Local File Systems. Starting Mark the need to relabel after reboot... Starting Rebuild Dynamic Linker Cache... Starting Create Volatile Files and Directories... Starting Load/Save Random Seed... [ OK ] Started Mark the need to relabel after reboot. [ OK ] Started Create Volatile Files and Directories. [ OK ] Started Load/Save Random Seed. Starting RPC Bind... Starting Update UTMP about System Boot/Shutdown... [ OK ] Started RPC Bind. [ OK ] Started Update UTMP about System Boot/Shutdown. [ OK ] Started Rebuild Dynamic Linker Cache. Starting Update is Completed... [ OK ] Started Update is Completed. [ OK ] Reached target System Initialization. [ OK ] Started daily update of the root trust anchor for DNSSEC. [ OK ] Started dnf makecache --timer. [ OK ] Listening on D-Bus System Message Bus Socket. [ OK ] Reached target Sockets. [ OK ] Started Daily Cleanup of Temporary Directories. [ OK ] Reached target Timers. [ OK ] Reached target Basic System.
[ OK ] Started D-Bus System Message Bus. [ OK ] Started Hardware RNG Entropy Gatherer Daemon. Starting Restore /run/initramfs on shutdown... Starting Login Service... [ OK ] Reached target sshd-keygen.target. [ OK ] Started irqbalance daemon. Starting Network Manager... [ OK ] Started Restore /run/initramfs on shutdown. [ OK ] Started Login Service. [ OK ] Started Network Manager. [ OK ] Reached target Network. Starting GSSAPI Proxy Daemon... Starting Dynamic System Tuning Daemon... Starting OpenSSH server daemon... Starting Network Manager Wait Online... [ OK ] Started GSSAPI Proxy Daemon. [ OK ] Started OpenSSH server daemon. Starting Hostname Service... [ OK ] Reached target NFS client services. [ OK ] Reached target Remote File Systems (Pre). [ OK ] Reached target Remote File Systems. Starting Permit User Sessions... [ OK ] Started Permit User Sessions. [ OK ] Started Command Scheduler. [ OK ] Started Serial Getty on ttyS1. [ OK ] Started Serial Getty on ttyS0. [ OK ] Started Getty on tty1. [ OK ] Reached target Login Prompts. [ OK ] Started Hostname Service. Starting Network Manager Script Dispatcher Service... [ OK ] Started Network Manager Script Dispatcher Service. [ OK ] Started Network Manager Wait Online. [ OK ] Reached target Network is Online. Starting Crash recovery kernel arming... Starting System Logging Service... Starting Notify NFS peers of a restart... [ OK ] Started Notify NFS peers of a restart. [ OK ] Started System Logging Service. Rocky Linux 8.10 (Green Obsidian) Kernel 4.18.0rh8.10-debug on an x86_64 oleg653-server login: [ 199.541924] libcfs: loading out-of-tree module taints kernel. 
[ 199.567954] Key type ._llcrypt registered [ 199.569593] Key type .llcrypt registered [ 199.894148] Lustre: DEBUG MARKER: oleg653-server.virtnet: executing set_hostid [ 240.546428] Lustre: DEBUG MARKER: oleg653-server.virtnet: executing load_modules_local [ 245.748990] libcfs: HW NUMA nodes: 1, HW CPU cores: 4, npartitions: 1 [ 245.788954] alg: No test for adler32 (adler32-zlib) [ 247.681642] Lustre: Lustre: Build Version: 2.16.52_73_g6bb624e [ 249.243857] LNet: Added LNI 192.168.206.153@tcp [8/256/0/180] [ 249.249280] LNet: Accept secure, port 988 [ 251.223381] Key type lgssc registered [ 254.485385] Lustre: Echo OBD driver; http://www.lustre.org/ [ 288.808020] hrtimer: interrupt took 5005762 ns [ 289.052339] ZFS: Loaded module v2.3.0-1, ZFS pool version 5000, ZFS filesystem version 5 [ 289.063264] modprobe (4347) used greatest stack depth: 5584 bytes left [ 294.606939] LDISKFS-fs (vdc): mounted filesystem with ordered data mode. Opts: errors=remount-ro [ 312.974784] LDISKFS-fs (vdd): mounted filesystem with ordered data mode. Opts: errors=remount-ro [ 329.624296] LDISKFS-fs (vde): mounted filesystem with ordered data mode. Opts: errors=remount-ro [ 348.204689] LDISKFS-fs (vdf): mounted filesystem with ordered data mode. Opts: errors=remount-ro [ 392.501268] Lustre: DEBUG MARKER: oleg653-server.virtnet: executing load_modules_local [ 427.806757] LDISKFS-fs (dm-0): mounted filesystem with ordered data mode. 
Opts: errors=remount-ro [ 427.951758] Lustre: lustre-MDT0000: mounting server target with '-t lustre' deprecated, use '-t lustre_tgt' [ 427.988681] ------------[ cut here ]------------ [ 427.994331] DEBUG_LOCKS_WARN_ON(!lockdep_enabled()) [ 427.994360] WARNING: CPU: 0 PID: 6532 at kernel/locking/lockdep.c:4713 lockdep_init_map_type+0x29d/0x410 [ 428.005323] Modules linked in: zfs(O) spl(O) lustre(O) osp(O) ofd(O) lod(O) mdt(O) mdd(O) mgs(O) osd_ldiskfs(O) ldiskfs(O) lquota(O) lfsck(O) obdecho(O) mgc(O) mdc(O) lov(O) osc(O) lmv(O) fid(O) fld(O) ptlrpc_gss(O) ptlrpc(O) obdclass(O) ksocklnd(O) lnet(O) libcfs(O) dm_flakey rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver intel_rapl_msr intel_rapl_common sb_edac rapl pcspkr i2c_piix4 squashfs crct10dif_pclmul crc32_pclmul crc32c_intel ata_generic ghash_clmulni_intel ata_piix serio_raw libata dm_mirror dm_region_hash dm_log dm_mod sha512_ssse3 sha512_generic [ 428.022955] CPU: 0 PID: 6532 Comm: mount.lustre Kdump: loaded Tainted: G O -------- - - 4.18.0rh8.10-debug #7 [ 428.027291] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-1.fc39 04/01/2014 [ 428.031102] RIP: 0010:lockdep_init_map_type+0x29d/0x410 [ 428.032675] Code: c0 0f 85 db fe ff ff 48 c7 c6 66 64 f9 b2 48 c7 c7 a7 4e f7 b2 48 83 05 f0 df 25 03 01 e8 27 14 f5 ff 48 83 05 eb df 25 03 01 <0f> 0b 48 83 05 e9 df 25 03 01 48 83 05 e9 df 25 03 01 e9 a1 fe ff [ 428.038986] RSP: 0018:ffffba21424f7748 EFLAGS: 00010202 [ 428.040667] RAX: 0000000000000000 RBX: ffff94bbfe1ef188 RCX: 0000000000000001 [ 428.043429] RDX: 0000000000000001 RSI: 00000000ffff7fff RDI: ffff94bc013de800 [ 428.046432] RBP: ffffffffc157d7c0 R08: 0000000000000000 R09: c0000000ffff7fff [ 428.049306] R10: 0000000000000001 R11: ffffba21424f7538 R12: 0000000000000002 [ 428.051872] R13: ffff94bbf23e1000 R14: 0000000000000000 R15: 0000000000000001 [ 428.054221] FS: 00007f51ff6abb40(0000) GS:ffff94bc01200000(0000) knlGS:0000000000000000 [ 428.057445] CS: 0010 DS: 0000 ES: 0000 CR0: 
0000000080050033 [ 428.059693] CR2: 0000555a0307d79e CR3: 00000001257a6003 CR4: 0000000000170ef0 [ 428.062214] Call Trace: [ 428.062885] ? show_regs.cold.9+0x22/0x2f [ 428.064960] ? __warn+0xc8/0x150 [ 428.066136] ? lockdep_init_map_type+0x29d/0x410 [ 428.068140] ? report_bug+0x113/0x140 [ 428.069438] ? do_error_trap+0xb6/0x130 [ 428.070881] ? do_invalid_op+0x46/0x60 [ 428.071927] ? lockdep_init_map_type+0x29d/0x410 [ 428.073765] ? invalid_op+0x14/0x20 [ 428.075156] ? lockdep_init_map_type+0x29d/0x410 [ 428.076839] ? lockdep_init_map_type+0x295/0x410 [ 428.078402] ldiskfs_enable_quotas+0x1b9/0x4a0 [ldiskfs] [ 428.080555] ldiskfs_fill_super+0x3a56/0x43c0 [ldiskfs] [ 428.082821] ? ldiskfs_calculate_overhead+0x670/0x670 [ldiskfs] [ 428.085314] ? mount_bdev+0x226/0x270 [ 428.086799] mount_bdev+0x226/0x270 [ 428.088270] ldiskfs_mount+0x19/0x30 [ldiskfs] [ 428.089748] legacy_get_tree+0x35/0x90 [ 428.091116] vfs_get_tree+0x2a/0x140 [ 428.092282] fc_mount+0x16/0x60 [ 428.093737] vfs_kern_mount+0x91/0x100 [ 428.095169] osd_mount+0x5c4/0x1080 [osd_ldiskfs] [ 428.097089] osd_device_init0+0x2e1/0xc20 [osd_ldiskfs] [ 428.099450] osd_device_alloc+0x22a/0x290 [osd_ldiskfs] [ 428.101285] obd_setup+0x196/0x430 [obdclass] [ 428.102732] class_setup+0x6f5/0x9f0 [obdclass] [ 428.104223] class_process_config+0x1658/0x2b60 [obdclass] [ 428.106084] do_lcfg+0x376/0x740 [obdclass] [ 428.108528] lustre_start_simple+0x8f/0x220 [obdclass] [ 428.110911] osd_start+0x6aa/0xb60 [ptlrpc] [ 428.112562] ? server_name2index+0x79/0xe0 [obdclass] [ 428.114527] ? lsi_prepare+0x2e7/0x690 [ptlrpc] [ 428.116532] server_fill_super+0x99/0x1190 [ptlrpc] [ 428.118638] ? obd_zombie_barrier+0x63/0x120 [obdclass] [ 428.120912] ? debug_mutex_init+0x43/0x60 [ 428.122200] lustre_fill_super+0x4a6/0x5e0 [lustre] [ 428.124240] ? 
lustre_mount+0x30/0x30 [lustre] [ 428.125989] mount_nodev+0x56/0xf0 [ 428.127598] lustre_mount+0x1c/0x30 [lustre] [ 428.129416] legacy_get_tree+0x35/0x90 [ 428.130910] vfs_get_tree+0x2a/0x140 [ 428.132017] do_mount+0xd84/0x1190 [ 428.133370] ksys_mount+0x11d/0x150 [ 428.134785] __x64_sys_mount+0x29/0x40 [ 428.136112] do_syscall_64+0xc1/0x450 [ 428.137937] entry_SYSCALL_64_after_hwframe+0x49/0xae [ 428.140074] RIP: 0033:0x7f51fb7dfdbe [ 428.141668] Code: 48 8b 0d cd 60 39 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 49 89 ca b8 a5 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 9a 60 39 00 f7 d8 64 89 01 48 [ 428.152437] RSP: 002b:00007ffd11792978 EFLAGS: 00000286 ORIG_RAX: 00000000000000a5 [ 428.155634] RAX: ffffffffffffffda RBX: 0000000000430cf6 RCX: 00007f51fb7dfdbe [ 428.158131] RDX: 0000000000430cf6 RSI: 00007ffd11799020 RDI: 0000000001703940 [ 428.160760] RBP: 00007ffd11799020 R08: 0000000001703960 R09: 0000000001703010 [ 428.165713] R10: 0000000001000000 R11: 0000000000000286 R12: 0000000000000000 [ 428.170217] R13: 0000000000654920 R14: 00000000fffffff5 R15: 00000000fffffff5 [ 428.174508] ---[ end trace 096113db9c0ee76c ]--- [ 428.181321] LDISKFS-fs (dm-0): mounted filesystem with ordered data mode. Opts: user_xattr,errors=remount-ro,no_mbcache,nodelalloc [ 429.838738] Lustre: Setting parameter lustre-MDT0000.mdt.identity_upcall in log lustre-MDT0000 [ 429.927596] Lustre: ctl-lustre-MDT0000: No data found on store. Initialize space. 
[ 430.055284] Lustre: lustre-MDT0000: new disk, initializing [ 430.276997] Lustre: lustre-MDT0000: Imperative Recovery not enabled, recovery window 60-180 [ 430.314603] Lustre: ctl-lustre-MDT0000: super-sequence allocation rc = 0 [0x0000000200000400-0x0000000240000400]:0:mdt [ 436.701518] Lustre: DEBUG MARKER: oleg653-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all [ 458.865061] LDISKFS-fs (dm-1): mounted filesystem with ordered data mode. Opts: errors=remount-ro [ 459.063958] LDISKFS-fs (dm-1): mounted filesystem with ordered data mode. Opts: user_xattr,acl,no_mbcache,nodelalloc [ 459.328424] Lustre: Modifying parameter lustre-MDT0001.mdt.identity_upcall in log lustre-MDT0001 [ 459.377166] Lustre: 7467:0:(mgc_request_server.c:553:mgc_llog_local_copy()) MGC192.168.206.153@tcp: no remote llog for lustre-sptlrpc, check MGS config [ 459.444255] Lustre: srv-lustre-MDT0001: No data found on store. Initialize space. [ 459.448698] Lustre: Skipped 1 previous similar message [ 459.564128] Lustre: lustre-MDT0001: new disk, initializing [ 459.727052] Lustre: lustre-MDT0001: Imperative Recovery not enabled, recovery window 60-180 [ 459.783970] Lustre: ctl-lustre-MDT0000: super-sequence allocation rc = 0 [0x0000000240000400-0x0000000280000400]:1:mdt [ 459.807457] Lustre: cli-ctl-lustre-MDT0001: Allocated super-sequence [0x0000000240000400-0x0000000280000400]:1:mdt] [ 466.091849] Lustre: DEBUG MARKER: oleg653-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all [ 473.445574] Lustre: Modifying parameter general.debug_raw_pointers in log params [ 490.176660] LDISKFS-fs (dm-2): mounted filesystem with ordered data mode. Opts: errors=remount-ro [ 490.313126] LDISKFS-fs (dm-2): mounted filesystem with ordered data mode. 
Opts: user_xattr,acl,no_mbcache,nodelalloc [ 490.741875] Lustre: 8425:0:(mgc_request_server.c:553:mgc_llog_local_copy()) MGC192.168.206.153@tcp: no remote llog for lustre-sptlrpc, check MGS config [ 490.785371] Lustre: lustre-OST0000: new disk, initializing [ 490.789147] Lustre: srv-lustre-OST0000: No data found on store. Initialize space. [ 490.961469] Lustre: lustre-OST0000: Imperative Recovery not enabled, recovery window 60-180 [ 499.840105] Lustre: ctl-lustre-MDT0000: super-sequence allocation rc = 0 [0x0000000280000400-0x00000002c0000400]:0:ost [ 499.867324] Lustre: cli-lustre-OST0000-super: Allocated super-sequence [0x0000000280000400-0x00000002c0000400]:0:ost] [ 500.093656] Lustre: lustre-OST0000-osc-MDT0000: update sequence from 0x100000000 to 0x280000401 [ 501.503793] Lustre: DEBUG MARKER: oleg653-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all [ 525.747457] LDISKFS-fs (dm-3): mounted filesystem with ordered data mode. Opts: errors=remount-ro [ 525.940584] LDISKFS-fs (dm-3): mounted filesystem with ordered data mode. Opts: user_xattr,acl,no_mbcache,nodelalloc [ 526.088036] Lustre: 9441:0:(mgc_request_server.c:553:mgc_llog_local_copy()) MGC192.168.206.153@tcp: no remote llog for lustre-sptlrpc, check MGS config [ 526.118122] Lustre: lustre-OST0001: new disk, initializing [ 526.129331] Lustre: srv-lustre-OST0001: No data found on store. Initialize space. 
[ 526.258487] Lustre: lustre-OST0001: Imperative Recovery not enabled, recovery window 60-180 [ 531.567206] Lustre: ctl-lustre-MDT0000: super-sequence allocation rc = 0 [0x00000002c0000400-0x0000000300000400]:1:ost [ 531.582157] Lustre: cli-lustre-OST0001-super: Allocated super-sequence [0x00000002c0000400-0x0000000300000400]:1:ost] [ 531.788456] Lustre: lustre-OST0001-osc-MDT0000: update sequence from 0x100010000 to 0x2c0000401 [ 535.794230] Lustre: DEBUG MARKER: oleg653-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all [ 555.406304] Lustre: DEBUG MARKER: Using TIMEOUT=20 [ 569.627922] Lustre: Setting parameter general.lod.*.mdt_hash in log params [ 587.249041] Lustre: DEBUG MARKER: oleg653-server.virtnet: executing check_logdir /tmp/testlogs/ [ 596.532156] Lustre: DEBUG MARKER: oleg653-server.virtnet: executing yml_node [ 612.848048] Lustre: DEBUG MARKER: Client: 2.16.52.73 [ 618.580311] Lustre: DEBUG MARKER: MDS: 2.16.52.73 [ 624.438599] Lustre: DEBUG MARKER: OSS: 2.16.52.73 [ 629.079617] Lustre: DEBUG MARKER: -----============= acceptance-small: recovery-small ============----- Tue Apr 1 03:48:43 EDT 2025 [ 666.142371] Lustre: DEBUG MARKER: excepting tests: 136 [ 671.986630] Lustre: DEBUG MARKER: === recovery-small: start setup 03:49:25 (1743493765) === [ 680.504236] Lustre: DEBUG MARKER: oleg653-client.virtnet: executing check_config_client /mnt/lustre [ 722.051193] Lustre: DEBUG MARKER: Using TIMEOUT=20 [ 729.537456] Lustre: Modifying parameter general.lod.*.mdt_hash in log params [ 736.796878] Lustre: DEBUG MARKER: oleg653-server.virtnet: executing set_default_debug -1 all [ 750.050136] Lustre: DEBUG MARKER: === recovery-small: finish setup 03:50:44 (1743493844) === [ 754.200463] Lustre: DEBUG MARKER: == recovery-small test 1: create, chmod, stat: drop req, drop rep ========================================================== 03:50:49 (1743493849) [ 757.138498] Lustre: *** cfs_fail_loc=123, 
val=2147483648*** [ 773.637881] Lustre: lustre-MDT0000: Client 9addf4d1-8d81-46bd-bf45-0f6375e05d72 (at 192.168.206.53@tcp) reconnecting [ 778.266626] Lustre: *** cfs_fail_loc=119, val=2147483648*** [ 778.282426] LustreError: 6558:0:(ldlm_lib.c:3251:target_send_reply_msg()) @@@ dropping reply req@ffff94bad02e67c0 x1828185238955392/t4294967300(0) o36->9addf4d1-8d81-46bd-bf45-0f6375e05d72@192.168.206.53@tcp:321/0 lens 520/448 e 0 to 0 dl 1743493886 ref 1 fl Interpret:/200/0 rc 0/0 job:'mcreate.0' uid:0 gid:0 [ 794.591234] Lustre: lustre-MDT0000: Client 9addf4d1-8d81-46bd-bf45-0f6375e05d72 (at 192.168.206.53@tcp) reconnecting [ 794.637188] Lustre: 9962:0:(mdt_recovery.c:128:mdt_req_from_lrd()) @@@ restoring transno req@ffff94bad01a4540 x1828185238955392/t4294967300(0) o36->9addf4d1-8d81-46bd-bf45-0f6375e05d72@192.168.206.53@tcp:338/0 lens 520/2880 e 0 to 0 dl 1743493903 ref 1 fl Interpret:/202/0 rc 0/0 job:'mcreate.0' uid:0 gid:0 [ 800.276989] Lustre: *** cfs_fail_loc=123, val=2147483648*** [ 816.674773] Lustre: lustre-MDT0000: Client 9addf4d1-8d81-46bd-bf45-0f6375e05d72 (at 192.168.206.53@tcp) reconnecting [ 820.643965] Lustre: *** cfs_fail_loc=119, val=2147483648*** [ 820.646197] LustreError: 9962:0:(ldlm_lib.c:3251:target_send_reply_msg()) @@@ dropping reply req@ffff94bad0aa9740 x1828185238960384/t4294967302(0) o36->9addf4d1-8d81-46bd-bf45-0f6375e05d72@192.168.206.53@tcp:364/0 lens 488/456 e 0 to 0 dl 1743493929 ref 1 fl Interpret:/200/0 rc 0/0 job:'tchmod.0' uid:0 gid:0 [ 836.123274] Lustre: lustre-MDT0000: Client 9addf4d1-8d81-46bd-bf45-0f6375e05d72 (at 192.168.206.53@tcp) reconnecting [ 836.162096] Lustre: 6556:0:(mdt_recovery.c:128:mdt_req_from_lrd()) @@@ restoring transno req@ffff94bbff68f340 x1828185238960384/t4294967302(0) o36->9addf4d1-8d81-46bd-bf45-0f6375e05d72@192.168.206.53@tcp:379/0 lens 488/3152 e 0 to 0 dl 1743493944 ref 1 fl Interpret:/202/0 rc 0/0 job:'tchmod.0' uid:0 gid:0 [ 841.127514] Lustre: *** cfs_fail_loc=123, val=2147483648*** [ 845.056615] 
Lustre: *** cfs_fail_loc=122, val=2147483648*** [ 845.065332] LustreError: 9961:0:(ldlm_lib.c:3251:target_send_reply_msg()) @@@ dropping reply req@ffff94bad0c6cb00 x1828185238962688/t0(0) o34->9addf4d1-8d81-46bd-bf45-0f6375e05d72@192.168.206.53@tcp:388/0 lens 472/464 e 0 to 0 dl 1743493953 ref 1 fl Interpret:/200/0 rc 0/0 job:'statone.0' uid:0 gid:0 [ 861.639367] Lustre: lustre-MDT0000: Client 9addf4d1-8d81-46bd-bf45-0f6375e05d72 (at 192.168.206.53@tcp) reconnecting [ 881.816556] Lustre: DEBUG MARKER: == recovery-small test 4: open: drop req, drop rep ======= 03:52:56 (1743493976) [ 884.710259] Lustre: *** cfs_fail_loc=123, val=2147483648*** [ 888.720422] Lustre: *** cfs_fail_loc=122, val=2147483648*** [ 888.722805] LustreError: 6560:0:(ldlm_lib.c:3251:target_send_reply_msg()) @@@ dropping reply req@ffff94bad01a50c0 x1828185238968704/t4294967308(0) o35->9addf4d1-8d81-46bd-bf45-0f6375e05d72@192.168.206.53@tcp:432/0 lens 392/456 e 0 to 0 dl 1743493997 ref 1 fl Interpret:/200/0 rc 0/0 job:'cat.0' uid:0 gid:0 [ 900.578940] Lustre: 3684:0:(client.c:2346:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1743493982/real 1743493982] req@ffff94bad0d20600 x1828185278999040/t0(0) o41->lustre-MDT0000-osp-MDT0001@0@lo:24/4 lens 224/368 e 0 to 1 dl 1743493998 ref 1 fl Rpc:XQr/200/ffffffff rc 0/-1 job:'osp-pre-0-1.0' uid:0 gid:0 [ 900.635972] Lustre: lustre-MDT0000-osp-MDT0001: Connection to lustre-MDT0000 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete [ 900.654633] Lustre: lustre-MDT0000: Received new MDS connection from 0@lo, keep former export from same NID [ 900.675360] Lustre: lustre-MDT0000-osp-MDT0001: Connection restored to 192.168.206.153@tcp (at 0@lo) [ 904.201045] Lustre: lustre-MDT0000: Client 9addf4d1-8d81-46bd-bf45-0f6375e05d72 (at 192.168.206.53@tcp) reconnecting [ 904.247816] Lustre: 6560:0:(mdt_recovery.c:128:mdt_req_from_lrd()) @@@ restoring transno req@ffff94bad07a1740 
x1828185238968704/t4294967308(0) o35->9addf4d1-8d81-46bd-bf45-0f6375e05d72@192.168.206.53@tcp:447/0 lens 392/456 e 0 to 0 dl 1743494012 ref 1 fl Interpret:/202/0 rc 0/0 job:'cat.0' uid:0 gid:0 [ 922.724763] Lustre: DEBUG MARKER: == recovery-small test 5: rename: drop req, drop rep ===== 03:53:37 (1743494017) [ 925.031972] Lustre: *** cfs_fail_loc=123, val=2147483648*** [ 941.569989] Lustre: lustre-MDT0000: Client 9addf4d1-8d81-46bd-bf45-0f6375e05d72 (at 192.168.206.53@tcp) reconnecting [ 947.183962] Lustre: *** cfs_fail_loc=119, val=2147483648*** [ 947.189370] LustreError: 6571:0:(ldlm_lib.c:3251:target_send_reply_msg()) @@@ dropping reply req@ffff94bad1010bc0 x1828185238978432/t4294967312(0) o36->9addf4d1-8d81-46bd-bf45-0f6375e05d72@192.168.206.53@tcp:490/0 lens 552/456 e 0 to 0 dl 1743494055 ref 1 fl Interpret:/200/0 rc 0/0 job:'mv.0' uid:0 gid:0 [ 963.527641] Lustre: 6571:0:(mdt_recovery.c:128:mdt_req_from_lrd()) @@@ restoring transno req@ffff94bacf8bbf80 x1828185238978432/t4294967312(0) o36->9addf4d1-8d81-46bd-bf45-0f6375e05d72@192.168.206.53@tcp:507/0 lens 552/2888 e 0 to 0 dl 1743494072 ref 1 fl Interpret:/202/0 rc 0/0 job:'mv.0' uid:0 gid:0 [ 983.959458] Lustre: DEBUG MARKER: == recovery-small test 6: link, unlink: drop req, drop rep ========================================================== 03:54:38 (1743494078) [ 986.569789] Lustre: *** cfs_fail_loc=123, val=2147483648*** [ 1003.008759] Lustre: lustre-MDT0000: Client 9addf4d1-8d81-46bd-bf45-0f6375e05d72 (at 192.168.206.53@tcp) reconnecting [ 1003.034943] Lustre: Skipped 1 previous similar message [ 1008.344325] Lustre: *** cfs_fail_loc=119, val=2147483648*** [ 1008.354389] LustreError: 6557:0:(ldlm_lib.c:3251:target_send_reply_msg()) @@@ dropping reply req@ffff94bad0c93f80 x1828185238988928/t4294967317(0) o36->9addf4d1-8d81-46bd-bf45-0f6375e05d72@192.168.206.53@tcp:552/0 lens 512/440 e 0 to 0 dl 1743494117 ref 1 fl Interpret:/200/0 rc 0/0 job:'link.0' uid:0 gid:0 [ 1024.947392] Lustre: 
6556:0:(mdt_recovery.c:128:mdt_req_from_lrd()) @@@ restoring transno req@ffff94bad026b400 x1828185238988928/t4294967317(0) o36->9addf4d1-8d81-46bd-bf45-0f6375e05d72@192.168.206.53@tcp:568/0 lens 512/440 e 0 to 0 dl 1743494133 ref 1 fl Interpret:/202/0 rc 0/0 job:'link.0' uid:0 gid:0 [ 1029.487266] Lustre: *** cfs_fail_loc=123, val=2147483648*** [ 1051.463907] Lustre: *** cfs_fail_loc=119, val=2147483648*** [ 1051.465961] LustreError: 6558:0:(ldlm_lib.c:3251:target_send_reply_msg()) @@@ dropping reply req@ffff94bad023e7c0 x1828185238995456/t4294967319(0) o36->9addf4d1-8d81-46bd-bf45-0f6375e05d72@192.168.206.53@tcp:595/0 lens 504/456 e 0 to 0 dl 1743494160 ref 1 fl Interpret:/200/0 rc 0/0 job:'unlink.0' uid:0 gid:0 [ 1068.048885] Lustre: lustre-MDT0000: Client 9addf4d1-8d81-46bd-bf45-0f6375e05d72 (at 192.168.206.53@tcp) reconnecting [ 1068.072584] Lustre: Skipped 2 previous similar messages [ 1068.129437] Lustre: 6557:0:(mdt_recovery.c:128:mdt_req_from_lrd()) @@@ restoring transno req@ffff94bac660e7c0 x1828185238995456/t4294967319(0) o36->9addf4d1-8d81-46bd-bf45-0f6375e05d72@192.168.206.53@tcp:611/0 lens 504/2888 e 0 to 0 dl 1743494176 ref 1 fl Interpret:/202/0 rc 0/0 job:'unlink.0' uid:0 gid:0 [ 1090.016858] Lustre: DEBUG MARKER: == recovery-small test 8: touch: drop rep (bug 1423) ===== 03:56:23 (1743494183) [ 1092.058332] Lustre: *** cfs_fail_loc=119, val=2147483648*** [ 1092.060430] LustreError: 9961:0:(ldlm_lib.c:3251:target_send_reply_msg()) @@@ dropping reply req@ffff94baceea1180 x1828185239000960/t4294967322(0) o36->9addf4d1-8d81-46bd-bf45-0f6375e05d72@192.168.206.53@tcp:635/0 lens 488/456 e 0 to 0 dl 1743494200 ref 1 fl Interpret:/200/0 rc 0/0 job:'touch.0' uid:0 gid:0 [ 1108.419617] Lustre: 6557:0:(mdt_recovery.c:128:mdt_req_from_lrd()) @@@ restoring transno req@ffff94bad09c0600 x1828185239000960/t4294967322(0) o36->9addf4d1-8d81-46bd-bf45-0f6375e05d72@192.168.206.53@tcp:652/0 lens 488/3152 e 0 to 0 dl 1743494217 ref 1 fl Interpret:/202/0 rc 0/0 
job:'touch.0' uid:0 gid:0 [ 1128.734776] Lustre: DEBUG MARKER: == recovery-small test 9: pause bulk on OST (bug 1420) === 03:57:03 (1743494223) [ 1133.799102] LustreError: 8439:0:(tgt_handler.c:2694:tgt_brw_write()) cfs_fail_timeout id 214 sleeping for 5000ms [ 1138.839215] LustreError: 8439:0:(tgt_handler.c:2694:tgt_brw_write()) cfs_fail_timeout id 214 awake [ 1159.315489] Lustre: DEBUG MARKER: == recovery-small test 10a: finish request on server after client eviction (bug 1521) ========================================================== 03:57:33 (1743494253) [ 1177.058894] Lustre: 9962:0:(client.c:2346:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1743494258/real 1743494258] req@ffff94bad0c94540 x1828185279123840/t0(0) o104->lustre-MDT0000@192.168.206.53@tcp:15/16 lens 328/224 e 0 to 1 dl 1743494274 ref 1 fl Rpc:XQr/0/ffffffff rc 0/-1 job:'' uid:4294967295 gid:4294967295 [ 1192.416271] Lustre: 9962:0:(client.c:2346:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1743494274/real 1743494274] req@ffff94bad0c94540 x1828185279123840/t0(0) o104->lustre-MDT0000@192.168.206.53@tcp:15/16 lens 328/224 e 0 to 1 dl 1743494290 ref 1 fl Rpc:XQr/2/ffffffff rc 0/-1 job:'' uid:4294967295 gid:4294967295 [ 1201.119211] Lustre: mdt00_004: service thread pid 9962 was inactive for 40.595 seconds. The thread might be hung, or it might only be slow and will resume later. 
Dumping the stack trace for debugging purposes: [ 1201.125470] Pid: 9962, comm: mdt00_004 4.18.0rh8.10-debug #7 SMP Sat Jan 18 21:01:29 EST 2025 [ 1201.141942] Call Trace TBD: [ 1201.147413] [<0>] wait_woken+0x9c/0xd0 [ 1201.149100] [<0>] ptlrpc_set_wait+0x3c1/0xa70 [ptlrpc] [ 1201.153745] [<0>] ldlm_run_ast_work+0x17d/0x500 [ptlrpc] [ 1201.162627] [<0>] ldlm_handle_conflict_lock+0x97/0x490 [ptlrpc] [ 1201.174408] [<0>] ldlm_lock_enqueue+0x321/0xcd0 [ptlrpc] [ 1201.184030] [<0>] ldlm_cli_enqueue_local+0x709/0xc20 [ptlrpc] [ 1201.206065] [<0>] mdt_object_lock_internal+0x20b/0x5a0 [mdt] [ 1201.216460] [<0>] mdt_object_lock+0x9e/0x240 [mdt] [ 1201.221126] [<0>] mdt_object_stripes_lock+0x28b/0x670 [mdt] [ 1201.225373] [<0>] mdt_reint_setattr+0xdd5/0x1f80 [mdt] [ 1201.227821] [<0>] mdt_reint_rec+0x139/0x2c0 [mdt] [ 1201.230880] [<0>] mdt_reint_internal+0x6a0/0xbf0 [mdt] [ 1201.234198] [<0>] mdt_reint+0x163/0x190 [mdt] [ 1201.236420] [<0>] tgt_handle_request0+0x137/0xaf0 [ptlrpc] [ 1201.241356] [<0>] tgt_request_handle+0x351/0x1c10 [ptlrpc] [ 1201.250290] [<0>] ptlrpc_server_handle_request+0x374/0x1320 [ptlrpc] [ 1201.260111] [<0>] ptlrpc_main+0xd2a/0x1450 [ptlrpc] [ 1201.261778] [<0>] kthread+0x1d7/0x210 [ 1201.262894] [<0>] ret_from_fork+0x1f/0x30 [ 1208.805161] Lustre: 9962:0:(client.c:2346:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1743494290/real 1743494290] req@ffff94bad0c94540 x1828185279123840/t0(0) o104->lustre-MDT0000@192.168.206.53@tcp:15/16 lens 328/224 e 0 to 1 dl 1743494306 ref 1 fl Rpc:XQr/2/ffffffff rc 0/-1 job:'' uid:4294967295 gid:4294967295 [ 1225.184909] Lustre: 9962:0:(client.c:2346:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1743494306/real 1743494306] req@ffff94bad0c94540 x1828185279123840/t0(0) o104->lustre-MDT0000@192.168.206.53@tcp:15/16 lens 328/224 e 0 to 1 dl 1743494322 ref 1 fl Rpc:XQr/2/ffffffff rc 0/-1 job:'' uid:4294967295 gid:4294967295 [ 1240.545883] Lustre: 
9962:0:(client.c:2346:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1743494322/real 1743494322] req@ffff94bad0c94540 x1828185279123840/t0(0) o104->lustre-MDT0000@192.168.206.53@tcp:15/16 lens 328/224 e 0 to 1 dl 1743494338 ref 1 fl Rpc:XQr/2/ffffffff rc 0/-1 job:'' uid:4294967295 gid:4294967295 [ 1256.927949] Lustre: 9962:0:(client.c:2346:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1743494338/real 1743494338] req@ffff94bad0c94540 x1828185279123840/t0(0) o104->lustre-MDT0000@192.168.206.53@tcp:15/16 lens 328/224 e 0 to 1 dl 1743494354 ref 1 fl Rpc:XQr/2/ffffffff rc 0/-1 job:'' uid:4294967295 gid:4294967295 [ 1273.311393] LustreError: 9962:0:(ldlm_lockd.c:773:ldlm_handle_ast_error()) ### client (nid 192.168.206.53@tcp) failed to reply to blocking AST (req@0000000096711295 x1828185279123840 status 0 rc -110), evict it ns: mdt-lustre-MDT0000_UUID lock: ffff94bbff63bc40/0x623b88e7ff149af2 lrc: 4/0,0 mode: PR/PR res: [0x200000007:0x1:0x0].0x0 bits 0x13/0x0 rrc: 3 type: IBT gid 0 flags: 0x60200400000020 nid: 192.168.206.53@tcp remote: 0xbb53ec904ba27e97 expref: 9 pid: 9962 timeout: 1356 lvb_type: 0 [ 1273.369966] LustreError: lustre-MDT0000: A client on nid 192.168.206.53@tcp was evicted due to a lock blocking callback time out: rc -110 [ 1273.388327] LustreError: 6546:0:(ldlm_lockd.c:252:expired_lock_main()) ### lock callback timer expired after 17s: evicting client at 192.168.206.53@tcp ns: mdt-lustre-MDT0000_UUID lock: ffff94bbff63bc40/0x623b88e7ff149af2 lrc: 3/0,0 mode: PR/PR res: [0x200000007:0x1:0x0].0x0 bits 0x13/0x0 rrc: 3 type: IBT gid 0 flags: 0x60200400000020 nid: 192.168.206.53@tcp remote: 0xbb53ec904ba27e97 expref: 10 pid: 9962 timeout: 0 lvb_type: 0 [ 1273.455651] Lustre: mdt00_004: service thread pid 9962 completed after 112.931s. This likely indicates the system was overloaded (too many service threads, or not enough hardware resources). 
[ 1296.329532] Lustre: DEBUG MARKER: == recovery-small test 10b: re-send BL AST =============== 03:59:51 (1743494391) [ 1312.735364] Lustre: 9962:0:(client.c:2346:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1743494394/real 1743494394] req@ffff94bbfdf950c0 x1828185279186048/t0(0) o104->lustre-MDT0000@192.168.206.53@tcp:15/16 lens 328/224 e 0 to 1 dl 1743494410 ref 1 fl Rpc:XQr/0/ffffffff rc 0/-1 job:'' uid:4294967295 gid:4294967295 [ 1312.761021] Lustre: 9962:0:(client.c:2346:ptlrpc_expire_one_request()) Skipped 1 previous similar message [ 1332.118544] Lustre: DEBUG MARKER: == recovery-small test 10c: re-send BL AST vs reconnect race (LU-5569) ========================================================== 04:00:26 (1743494426) [ 1334.489622] Lustre: lustre-MDT0000: Client 9addf4d1-8d81-46bd-bf45-0f6375e05d72 (at 192.168.206.53@tcp) reconnecting [ 1334.506100] Lustre: Skipped 1 previous similar message [ 1351.377422] Lustre: DEBUG MARKER: == recovery-small test 10d: test failed blocking ast ===== 04:00:46 (1743494446) [ 1360.462267] LustreError: 8434:0:(ldlm_lockd.c:773:ldlm_handle_ast_error()) ### client (nid 192.168.206.53@tcp) returned error from blocking AST (req@000000004bf30d7c x1828185279215616 status -71 rc -71), evict it ns: filter-lustre-OST0000_UUID lock: ffff94bbff638040/0x623b88e7ff149e95 lrc: 4/0,0 mode: PW/PW res: [0x280000401:0x7:0x0].0x0 rrc: 3 type: EXT [0->18446744073709551615] (req 0->18446744073709551615) gid 0 flags: 0x60000480000020 nid: 192.168.206.53@tcp remote: 0xbb53ec904ba28088 expref: 5 pid: 8434 timeout: 1460 lvb_type: 0 [ 1360.538678] LustreError: lustre-OST0000: A client on nid 192.168.206.53@tcp was evicted due to a lock blocking callback time out: rc -71 [ 1360.550909] LustreError: 6546:0:(ldlm_lockd.c:252:expired_lock_main()) ### lock callback timer expired after 0s: evicting client at 192.168.206.53@tcp ns: filter-lustre-OST0000_UUID lock: ffff94bbff638040/0x623b88e7ff149e95 lrc: 3/0,0 mode: PW/PW 
res: [0x280000401:0x7:0x0].0x0 rrc: 3 type: EXT [0->18446744073709551615] (req 0->18446744073709551615) gid 0 flags: 0x60000480000020 nid: 192.168.206.53@tcp remote: 0xbb53ec904ba28088 expref: 6 pid: 8434 timeout: 0 lvb_type: 0 [ 1383.598405] Lustre: DEBUG MARKER: == recovery-small test 10e: re-send BL AST vs reconnect race 2 ========================================================== 04:01:18 (1743494478) [ 1387.890698] Lustre: DEBUG MARKER: SKIP: recovery-small test_10e need two clients [ 1392.264992] Lustre: DEBUG MARKER: == recovery-small test 11: wake up a thread waiting for completion after eviction (b=2460) ========================================================== 04:01:26 (1743494486) [ 1409.568057] Lustre: 14897:0:(client.c:2346:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1743494491/real 1743494491] req@ffff94bbff6e1740 x1828185279232384/t0(0) o104->lustre-OST0000@192.168.206.53@tcp:15/16 lens 328/224 e 0 to 1 dl 1743494507 ref 1 fl Rpc:XQr/0/ffffffff rc 0/-1 job:'' uid:4294967295 gid:4294967295 [ 1430.568239] Lustre: DEBUG MARKER: == recovery-small test 12: recover from timed out resend in ptlrpcd (b=2494) ========================================================== 04:02:05 (1743494525) [ 1432.758627] Lustre: *** cfs_fail_loc=115, val=2147483648*** [ 1478.189756] Lustre: DEBUG MARKER: == recovery-small test 13: mdc_readpage restart test (bug 1138) ========================================================== 04:02:52 (1743494572) [ 1514.223629] Lustre: DEBUG MARKER: == recovery-small test 14: mdc_readpage resend test (bug 1138) ========================================================== 04:03:28 (1743494608) [ 1516.528942] Lustre: *** cfs_fail_loc=106, val=0*** [ 1516.531228] LustreError: 3676:0:(events.c:456:server_bulk_callback()) event type 5, status -110, desc ffff94bbc721a000 [ 1534.116388] Lustre: DEBUG MARKER: == recovery-small test 15: failed open (-ENOMEM) ========= 04:03:48 (1743494628) [ 1536.442979] Lustre: *** 
cfs_fail_loc=128, val=0*** [ 1553.320248] Lustre: DEBUG MARKER: == recovery-small test 16: timeout bulk put, don't evict client (2732) ========================================================== 04:04:08 (1743494648) [ 1557.283488] Lustre: *** cfs_fail_loc=504, val=0*** [ 1557.285225] LustreError: 8438:0:(ldlm_lib.c:3581:target_bulk_io()) @@@ truncated bulk READ 0(102400) req@ffff94bbff604540 x1828185239110144/t0(0) o3->439ca163-a653-428d-9575-7b14f4b0f667@192.168.206.53@tcp:345/0 lens 488/440 e 0 to 0 dl 1743494665 ref 1 fl Interpret:/200/0 rc 0/0 job:'cmp.0' uid:0 gid:0 [ 1557.310594] Lustre: lustre-OST0000: Bulk IO read error with 439ca163-a653-428d-9575-7b14f4b0f667 (at 192.168.206.53@tcp), client will retry: rc -110 [ 1615.286220] Lustre: DEBUG MARKER: == recovery-small test 17a: timeout bulk get, don't evict client (2732) ========================================================== 04:05:09 (1743494709) [ 1688.662981] Lustre: DEBUG MARKER: == recovery-small test 17b: timeout bulk get, dont evict client (3582) ========================================================== 04:06:22 (1743494782) [ 1692.590377] Lustre: DEBUG MARKER: SKIP: recovery-small test_17b Needs multiple clients [ 1696.625220] Lustre: DEBUG MARKER: == recovery-small test 18a: manual ost invalidate clears page cache immediately ========================================================== 04:06:31 (1743494791) [ 1698.609182] Lustre: lustre-OST0001: Client 439ca163-a653-428d-9575-7b14f4b0f667 (at 192.168.206.53@tcp) reconnecting [ 1698.623568] Lustre: Skipped 3 previous similar messages [ 1714.837350] Lustre: DEBUG MARKER: == recovery-small test 18b: eviction and reconnect clears page cache (2766) ========================================================== 04:06:49 (1743494809) [ 1719.552963] Lustre: 20225:0:(genops.c:1678:obd_export_evict_by_uuid()) lustre-OST0000: evicting 439ca163-a653-428d-9575-7b14f4b0f667 at adminstrative request [ 1759.476580] Lustre: DEBUG MARKER: == recovery-small test 18c: 
Dropped connect reply after eviction handing (14755) ========================================================== 04:07:34 (1743494854) [ 1763.849836] Lustre: 20496:0:(genops.c:1678:obd_export_evict_by_uuid()) lustre-OST0000: evicting 439ca163-a653-428d-9575-7b14f4b0f667 at adminstrative request [ 1766.687871] Lustre: *** cfs_fail_loc=225, val=0*** [ 1766.696211] Lustre: Skipped 1 previous similar message [ 1794.576982] Lustre: DEBUG MARKER: == recovery-small test 19a: test expired_lock_main on mds (2867) ========================================================== 04:08:09 (1743494889) [ 1799.489116] Lustre: *** cfs_fail_loc=304, val=0*** [ 1815.033231] Lustre: *** cfs_fail_loc=304, val=0*** [ 1831.359318] Lustre: *** cfs_fail_loc=304, val=0*** [ 1840.095185] Lustre: mdt00_004: service thread pid 9962 was inactive for 40.625 seconds. The thread might be hung, or it might only be slow and will resume later. Dumping the stack trace for debugging purposes: [ 1840.121523] Pid: 9962, comm: mdt00_004 4.18.0rh8.10-debug #7 SMP Sat Jan 18 21:01:29 EST 2025 [ 1840.142564] Call Trace TBD: [ 1840.145952] [<0>] ldlm_completion_ast+0xbc9/0x1240 [ptlrpc] [ 1840.150556] [<0>] ldlm_cli_enqueue_local+0x60d/0xc20 [ptlrpc] [ 1840.169040] [<0>] mdt_object_lock_internal+0x20b/0x5a0 [mdt] [ 1840.171055] [<0>] mdt_object_lock+0x9e/0x240 [mdt] [ 1840.173901] [<0>] mdt_object_stripes_lock+0x28b/0x670 [mdt] [ 1840.176502] [<0>] mdt_reint_setattr+0xdd5/0x1f80 [mdt] [ 1840.178648] [<0>] mdt_reint_rec+0x139/0x2c0 [mdt] [ 1840.181112] [<0>] mdt_reint_internal+0x6a0/0xbf0 [mdt] [ 1840.184148] [<0>] mdt_reint+0x163/0x190 [mdt] [ 1840.185682] [<0>] tgt_handle_request0+0x137/0xaf0 [ptlrpc] [ 1840.187915] [<0>] tgt_request_handle+0x351/0x1c10 [ptlrpc] [ 1840.195991] [<0>] ptlrpc_server_handle_request+0x374/0x1320 [ptlrpc] [ 1840.207973] [<0>] ptlrpc_main+0xd2a/0x1450 [ptlrpc] [ 1840.218536] [<0>] kthread+0x1d7/0x210 [ 1840.224995] [<0>] ret_from_fork+0x1f/0x30 [ 1847.747053] Lustre: *** 
cfs_fail_loc=304, val=0*** [ 1863.107277] Lustre: *** cfs_fail_loc=304, val=0*** [ 1879.493731] Lustre: *** cfs_fail_loc=304, val=0*** [ 1895.883526] Lustre: *** cfs_fail_loc=304, val=0*** [ 1901.535586] LustreError: 6546:0:(ldlm_lockd.c:252:expired_lock_main()) ### lock callback timer expired after 102s: evicting client at 192.168.206.53@tcp ns: mdt-lustre-MDT0000_UUID lock: ffff94bbf6dd8f40/0x623b88e7ff14a7b7 lrc: 3/0,0 mode: PR/PR res: [0x200000007:0x1:0x0].0x0 bits 0x13/0x0 rrc: 4 type: IBT gid 0 flags: 0x60200400000020 nid: 192.168.206.53@tcp remote: 0xbb53ec904ba28367 expref: 17 pid: 9961 timeout: 1899 lvb_type: 0 [ 1901.582932] Lustre: mdt00_004: service thread pid 9962 completed after 102.113s. This likely indicates the system was overloaded (too many service threads, or not enough hardware resources). [ 1925.728042] Lustre: DEBUG MARKER: == recovery-small test 19b: test expired_lock_main on ost (2867) ========================================================== 04:10:21 (1743495021) [ 1931.965128] Lustre: *** cfs_fail_loc=304, val=0*** [ 1996.215805] Lustre: *** cfs_fail_loc=304, val=0*** [ 1996.219656] Lustre: Skipped 3 previous similar messages [ 2032.607669] LustreError: 6546:0:(ldlm_lockd.c:252:expired_lock_main()) ### lock callback timer expired after 101s: evicting client at 192.168.206.53@tcp ns: filter-lustre-OST0001_UUID lock: ffff94bac602f100/0x623b88e7ff14aa88 lrc: 3/0,0 mode: PW/PW res: [0x2c0000401:0xc:0x0].0x0 rrc: 3 type: EXT [0->18446744073709551615] (req 0->4095) gid 0 flags: 0x60000400000020 nid: 192.168.206.53@tcp remote: 0xbb53ec904ba28504 expref: 6 pid: 14897 timeout: 2031 lvb_type: 0 [ 2057.456304] Lustre: DEBUG MARKER: == recovery-small test 19c: check reconnect and lock resend do not trigger expired_lock_main ========================================================== 04:12:31 (1743495151) [ 2088.204713] Lustre: DEBUG MARKER: == recovery-small test 20a: ldlm_handle_enqueue error (should return error) 
========================================================== 04:13:02 (1743495182) [ 2104.211023] Lustre: DEBUG MARKER: == recovery-small test 20b: ldlm_handle_enqueue error (should return error) ========================================================== 04:13:19 (1743495199) [ 2122.203638] Lustre: DEBUG MARKER: == recovery-small test 21a: drop close request while close and open are both in flight ========================================================== 04:13:36 (1743495216) [ 2124.476457] LustreError: 9962:0:(mdt_open.c:1428:mdt_reint_open()) cfs_fail_timeout id 129 sleeping for 5000ms [ 2127.362090] LustreError: 9962:0:(mdt_open.c:1428:mdt_reint_open()) cfs_fail_timeout interrupted [ 2128.819976] Lustre: *** cfs_fail_loc=115, val=2147483648*** [ 2165.047225] Lustre: DEBUG MARKER: == recovery-small test 21b: drop open request while close and open are both in flight ========================================================== 04:14:19 (1743495259) [ 2317.225466] Lustre: lustre-MDT0000: Client 439ca163-a653-428d-9575-7b14f4b0f667 (at 192.168.206.53@tcp) reconnecting [ 2317.238891] Lustre: Skipped 14 previous similar messages [ 2336.427415] Lustre: DEBUG MARKER: == recovery-small test 21c: drop both request while close and open are both in flight ========================================================== 04:17:10 (1743495430) [ 2379.827965] Lustre: DEBUG MARKER: == recovery-small test 21d: drop close reply while close and open are both in flight ========================================================== 04:17:54 (1743495474) [ 2382.447782] LustreError: 6557:0:(mdt_open.c:1428:mdt_reint_open()) cfs_fail_timeout id 129 sleeping for 5000ms [ 2385.114100] LustreError: 6557:0:(mdt_open.c:1428:mdt_reint_open()) cfs_fail_timeout interrupted [ 2386.639319] Lustre: *** cfs_fail_loc=122, val=2147483648*** [ 2386.657895] LustreError: 6560:0:(ldlm_lib.c:3251:target_send_reply_msg()) @@@ dropping reply req@ffff94bad09bd680 x1828185239292032/t4294967548(0) 
o35->439ca163-a653-428d-9575-7b14f4b0f667@192.168.206.53@tcp:420/0 lens 392/456 e 0 to 0 dl 1743495495 ref 1 fl Interpret:/200/0 rc 0/0 job:'multiop.0' uid:0 gid:0 [ 2402.257710] Lustre: 6560:0:(mdt_recovery.c:128:mdt_req_from_lrd()) @@@ restoring transno req@ffff94bbf5665050 x1828185239292032/t4294967548(0) o35->439ca163-a653-428d-9575-7b14f4b0f667@192.168.206.53@tcp:435/0 lens 392/456 e 0 to 0 dl 1743495510 ref 1 fl Interpret:/202/0 rc 0/0 job:'multiop.0' uid:0 gid:0 [ 2421.672181] Lustre: DEBUG MARKER: == recovery-small test 21e: drop open reply while close and open are both in flight ========================================================== 04:18:36 (1743495516) [ 2423.849033] Lustre: *** cfs_fail_loc=119, val=2147483648*** [ 2423.857581] LustreError: 9961:0:(ldlm_lib.c:3251:target_send_reply_msg()) @@@ dropping reply req@ffff94baceea5c40 x1828185239304064/t4294967565(0) o36->439ca163-a653-428d-9575-7b14f4b0f667@192.168.206.53@tcp:457/0 lens 488/456 e 0 to 0 dl 1743495532 ref 1 fl Interpret:/200/0 rc 0/0 job:'touch.0' uid:0 gid:0 [ 2439.117711] Lustre: 16003:0:(mdt_recovery.c:128:mdt_req_from_lrd()) @@@ restoring transno req@ffff94bad0c40600 x1828185239304064/t4294967565(0) o36->439ca163-a653-428d-9575-7b14f4b0f667@192.168.206.53@tcp:472/0 lens 488/3152 e 0 to 0 dl 1743495547 ref 1 fl Interpret:/202/0 rc 0/0 job:'touch.0' uid:0 gid:0 [ 2462.559148] Lustre: DEBUG MARKER: == recovery-small test 21f: drop both reply while close and open are both in flight ========================================================== 04:19:17 (1743495557) [ 2464.796474] Lustre: *** cfs_fail_loc=119, val=2147483648*** [ 2464.802687] LustreError: 6558:0:(ldlm_lib.c:3251:target_send_reply_msg()) @@@ dropping reply req@ffff94bad09e6d80 x1828185239317760/t4294967584(0) o36->439ca163-a653-428d-9575-7b14f4b0f667@192.168.206.53@tcp:498/0 lens 488/456 e 0 to 0 dl 1743495573 ref 1 fl Interpret:/200/0 rc 0/0 job:'touch.0' uid:0 gid:0 [ 2480.048497] Lustre: 
6559:0:(mdt_recovery.c:128:mdt_req_from_lrd()) @@@ restoring transno req@ffff94bbff72bf80 x1828185239318016/t4294967585(0) o35->439ca163-a653-428d-9575-7b14f4b0f667@192.168.206.53@tcp:513/0 lens 392/456 e 0 to 0 dl 1743495588 ref 1 fl Interpret:/202/0 rc 0/0 job:'multiop.0' uid:0 gid:0 [ 2480.091541] Lustre: 6559:0:(mdt_recovery.c:128:mdt_req_from_lrd()) Skipped 1 previous similar message [ 2498.634251] Lustre: DEBUG MARKER: == recovery-small test 21g: drop open reply and close request while close and open are both in flight ========================================================== 04:19:53 (1743495593) [ 2501.419618] LustreError: 6556:0:(ldlm_lib.c:3251:target_send_reply_msg()) @@@ dropping reply req@ffff94bbc7e4e7c0 x1828185239330688/t4294967603(0) o36->439ca163-a653-428d-9575-7b14f4b0f667@192.168.206.53@tcp:535/0 lens 488/456 e 0 to 0 dl 1743495610 ref 1 fl Interpret:/200/0 rc 0/0 job:'touch.0' uid:0 gid:0 [ 2501.453964] LustreError: 6556:0:(ldlm_lib.c:3251:target_send_reply_msg()) Skipped 1 previous similar message [ 2505.528712] Lustre: *** cfs_fail_loc=115, val=2147483648*** [ 2505.536267] Lustre: Skipped 3 previous similar messages [ 2517.951804] Lustre: 16003:0:(mdt_recovery.c:128:mdt_req_from_lrd()) @@@ restoring transno req@ffff94bad089a880 x1828185239330688/t4294967603(0) o36->439ca163-a653-428d-9575-7b14f4b0f667@192.168.206.53@tcp:551/0 lens 488/3152 e 0 to 0 dl 1743495626 ref 1 fl Interpret:/202/0 rc 0/0 job:'touch.0' uid:0 gid:0 [ 2538.760131] Lustre: DEBUG MARKER: == recovery-small test 21h: drop open request and close reply while close and open are both in flight ========================================================== 04:20:32 (1743495632) [ 2544.986355] Lustre: *** cfs_fail_loc=122, val=2147483648*** [ 2544.995600] Lustre: Skipped 2 previous similar messages [ 2557.890868] Lustre: 6559:0:(mdt_recovery.c:128:mdt_req_from_lrd()) @@@ restoring transno req@ffff94bbff71a2c0 x1828185239344640/t4294967622(0) 
o35->439ca163-a653-428d-9575-7b14f4b0f667@192.168.206.53@tcp:591/0 lens 392/456 e 0 to 0 dl 1743495666 ref 1 fl Interpret:/202/0 rc 0/0 job:'multiop.0' uid:0 gid:0 [ 2576.283661] Lustre: DEBUG MARKER: == recovery-small test 22: drop close request and do mknod ========================================================== 04:21:11 (1743495671) [ 2611.393967] Lustre: DEBUG MARKER: == recovery-small test 23: client hang when close a file after mds crash ========================================================== 04:21:46 (1743495706) [ 2623.818461] Lustre: Failing over lustre-MDT0000 [ 2624.291973] Lustre: server umount lustre-MDT0000 complete [ 2626.021701] LustreError: lustre-MDT0000-osp-MDT0001: operation mds_statfs to node 0@lo failed: rc = -107 [ 2626.029284] Lustre: lustre-MDT0000-osp-MDT0001: Connection to lustre-MDT0000 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete [ 2630.561482] LustreError: 9962:0:(ldlm_lib.c:1095:target_handle_connect()) lustre-MDT0000: not available for connect from 192.168.206.53@tcp (no target). If you are running an HA pair check that the target is mounted on the other server. [ 2630.586802] LustreError: 9962:0:(ldlm_lib.c:1095:target_handle_connect()) Skipped 6 previous similar messages [ 2631.140085] LustreError: 6561:0:(ldlm_lib.c:1095:target_handle_connect()) lustre-MDT0000: not available for connect from 0@lo (no target). If you are running an HA pair check that the target is mounted on the other server. [ 2631.160113] LustreError: 6561:0:(ldlm_lib.c:1095:target_handle_connect()) Skipped 3 previous similar messages [ 2635.681518] LustreError: 16003:0:(ldlm_lib.c:1095:target_handle_connect()) lustre-MDT0000: not available for connect from 192.168.206.53@tcp (no target). If you are running an HA pair check that the target is mounted on the other server. 
[ 2640.802286] LustreError: 6557:0:(ldlm_lib.c:1095:target_handle_connect()) lustre-MDT0000: not available for connect from 192.168.206.53@tcp (no target). If you are running an HA pair check that the target is mounted on the other server. [ 2640.822768] LustreError: 6557:0:(ldlm_lib.c:1095:target_handle_connect()) Skipped 4 previous similar messages [ 2641.376723] Lustre: 3683:0:(client.c:2346:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1743495723/real 1743495723] req@ffff94bace096200 x1828185279795456/t0(0) o400->MGC192.168.206.153@tcp@0@lo:26/25 lens 224/224 e 0 to 1 dl 1743495739 ref 1 fl Rpc:XNQr/200/ffffffff rc 0/-1 job:'kworker.0' uid:0 gid:0 [ 2641.429153] LustreError: MGC192.168.206.153@tcp: Connection to MGS (at 0@lo) was lost; in progress operations using this service will fail [ 2646.499858] LustreError: 14569:0:(ldlm_lib.c:1095:target_handle_connect()) lustre-MDT0000: not available for connect from 0@lo (no target). If you are running an HA pair check that the target is mounted on the other server. [ 2646.516907] LustreError: 14569:0:(ldlm_lib.c:1095:target_handle_connect()) Skipped 8 previous similar messages [ 2649.741309] LDISKFS-fs (dm-0): mounted filesystem with ordered data mode. 
Opts: user_xattr,errors=remount-ro,no_mbcache,nodelalloc [ 2651.637624] LustreError: 3681:0:(client.c:1292:ptlrpc_import_delay_req()) @@@ invalidate in flight req@ffff94bad089e7c0 x1828185279803392/t0(0) o250->MGC192.168.206.153@tcp@0@lo:26/25 lens 520/544 e 0 to 0 dl 0 ref 1 fl Rpc:NQU/200/ffffffff rc 0/-1 job:'kworker.0' uid:0 gid:0 [ 2652.029154] Lustre: lustre-MDT0000: Imperative Recovery not enabled, recovery window 60-180 [ 2652.101477] Lustre: lustre-MDT0000: in recovery but waiting for the first client to connect [ 2652.346381] Lustre: lustre-MDT0000: Will be in recovery for at least 1:00, or until 2 clients reconnect [ 2657.300827] Lustre: lustre-MDT0000-lwp-MDT0001: Connection restored to 192.168.206.153@tcp (at 0@lo) [ 2657.373632] Lustre: lustre-MDT0000: Recovery over after 0:05, of 2 clients 2 recovered and 0 were evicted. [ 2657.429282] Lustre: lustre-OST0001: new connection from lustre-MDT0000-mdtlov (cleaning up unused objects from 0x2c0000401:21 to 0x2c0000401:65) [ 2657.434464] Lustre: lustre-OST0000: new connection from lustre-MDT0000-mdtlov (cleaning up unused objects from 0x280000401:23 to 0x280000401:65) [ 2658.040889] Lustre: DEBUG MARKER: oleg653-server.virtnet: executing set_default_debug -1 all [ 2671.350527] Lustre: DEBUG MARKER: oleg653-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid [ 2676.401322] Lustre: DEBUG MARKER: mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec [ 2697.970704] Lustre: DEBUG MARKER: == recovery-small test 24a: fsync error (should return error) ========================================================== 04:23:12 (1743495792) [ 2700.929482] Lustre: 27013:0:(genops.c:1678:obd_export_evict_by_uuid()) lustre-OST0000: evicting 439ca163-a653-428d-9575-7b14f4b0f667 at adminstrative request [ 2717.405425] Lustre: DEBUG MARKER: == recovery-small test 24b: test dirty page discard due to client eviction 
========================================================== 04:23:32 (1743495812) [ 2720.844339] Lustre: 27258:0:(genops.c:1678:obd_export_evict_by_uuid()) lustre-OST0000: evicting 439ca163-a653-428d-9575-7b14f4b0f667 at adminstrative request [ 2736.754522] Lustre: DEBUG MARKER: == recovery-small test 26a: evict dead exports =========== 04:23:51 (1743495831) [ 2743.375075] Lustre: DEBUG MARKER: SKIP: recovery-small test_26a msg and ost1 are at the same node [ 2747.629778] Lustre: DEBUG MARKER: == recovery-small test 26b: evict dead exports =========== 04:24:02 (1743495842) [ 2752.055782] Lustre: DEBUG MARKER: SKIP: recovery-small test_26b msg and ost1 are at the same node [ 2756.203949] Lustre: DEBUG MARKER: == recovery-small test 27: fail LOV while using OSC's ==== 04:24:10 (1743495850) [ 2762.062927] Lustre: Failing over lustre-MDT0000 [ 2762.279344] Lustre: lustre-MDT0000: Not available for connect from 192.168.206.53@tcp (stopping) [ 2762.662787] Lustre: server umount lustre-MDT0000 complete [ 2764.747668] LustreError: 9962:0:(ldlm_lib.c:1095:target_handle_connect()) lustre-MDT0000: not available for connect from 192.168.206.53@tcp (no target). If you are running an HA pair check that the target is mounted on the other server. 
[ 2764.775431] Lustre: lustre-MDT0000-lwp-OST0000: Connection to lustre-MDT0000 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete [ 2764.782909] LustreError: 9962:0:(ldlm_lib.c:1095:target_handle_connect()) Skipped 6 previous similar messages [ 2764.818324] Lustre: Skipped 4 previous similar messages [ 2781.152943] Lustre: 3682:0:(client.c:2346:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1743495862/real 1743495862] req@ffff94bad0ee1180 x1828185279876608/t0(0) o400->MGC192.168.206.153@tcp@0@lo:26/25 lens 224/224 e 0 to 1 dl 1743495878 ref 1 fl Rpc:XNQr/200/ffffffff rc 0/-1 job:'kworker.0' uid:0 gid:0 [ 2781.168609] LustreError: MGC192.168.206.153@tcp: Connection to MGS (at 0@lo) was lost; in progress operations using this service will fail [ 2781.179321] LustreError: 16003:0:(ldlm_lib.c:1095:target_handle_connect()) lustre-MDT0000: not available for connect from 0@lo (no target). If you are running an HA pair check that the target is mounted on the other server. [ 2781.189580] LustreError: 16003:0:(ldlm_lib.c:1095:target_handle_connect()) Skipped 22 previous similar messages [ 2789.430671] LDISKFS-fs (dm-0): mounted filesystem with ordered data mode. 
Opts: user_xattr,errors=remount-ro,no_mbcache,nodelalloc [ 2790.460914] Lustre: Evicted from MGS (at 192.168.206.153@tcp) after server handle changed from 0x0 to 0x623b88e7ff14dfa2 [ 2790.467464] Lustre: MGC192.168.206.153@tcp: Connection restored to 192.168.206.153@tcp (at 0@lo) [ 2790.470592] Lustre: Skipped 3 previous similar messages [ 2791.263513] Lustre: lustre-MDT0000: Imperative Recovery not enabled, recovery window 60-180 [ 2791.408458] Lustre: lustre-MDT0000: in recovery but waiting for the first client to connect [ 2792.455930] Lustre: lustre-MDT0000: Will be in recovery for at least 1:00, or until 2 clients reconnect [ 2796.567408] Lustre: lustre-MDT0000-lwp-MDT0001: Connection restored to 192.168.206.153@tcp (at 0@lo) [ 2796.583450] Lustre: lustre-MDT0000: Recovery over after 0:04, of 2 clients 2 recovered and 0 were evicted. [ 2796.663906] Lustre: lustre-OST0000: new connection from lustre-MDT0000-mdtlov (cleaning up unused objects from 0x280000401:102 to 0x280000401:129) [ 2796.668362] Lustre: lustre-OST0001: new connection from lustre-MDT0000-mdtlov (cleaning up unused objects from 0x2c0000401:101 to 0x2c0000401:129) [ 2798.000304] Lustre: DEBUG MARKER: oleg653-server.virtnet: executing set_default_debug -1 all [ 2900.413824] Lustre: Failing over lustre-MDT0000 [ 2900.838796] Lustre: server umount lustre-MDT0000 complete [ 2904.037639] Lustre: lustre-MDT0000-osp-MDT0001: Connection to lustre-MDT0000 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete [ 2904.043650] LustreError: 9961:0:(ldlm_lib.c:1095:target_handle_connect()) lustre-MDT0000: not available for connect from 0@lo (no target). If you are running an HA pair check that the target is mounted on the other server. 
[ 2904.052747] Lustre: Skipped 5 previous similar messages [ 2904.084492] LustreError: 9961:0:(ldlm_lib.c:1095:target_handle_connect()) Skipped 13 previous similar messages [ 2920.416417] Lustre: 3683:0:(client.c:2346:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1743496001/real 1743496001] req@ffff94bad13c8040 x1828185280235264/t0(0) o400->MGC192.168.206.153@tcp@0@lo:26/25 lens 224/224 e 0 to 1 dl 1743496017 ref 1 fl Rpc:XNQr/200/ffffffff rc 0/-1 job:'kworker.0' uid:0 gid:0 [ 2920.484334] LustreError: MGC192.168.206.153@tcp: Connection to MGS (at 0@lo) was lost; in progress operations using this service will fail [ 2925.762943] LDISKFS-fs (dm-0): mounted filesystem with ordered data mode. Opts: user_xattr,errors=remount-ro,no_mbcache,nodelalloc [ 2931.546183] Lustre: lustre-MDT0000: Imperative Recovery not enabled, recovery window 60-180 [ 2931.639207] Lustre: lustre-MDT0000: in recovery but waiting for the first client to connect [ 2932.920440] Lustre: lustre-MDT0000: Will be in recovery for at least 1:00, or until 2 clients reconnect [ 2936.816717] Lustre: lustre-MDT0000-lwp-OST0001: Connection restored to 192.168.206.153@tcp (at 0@lo) [ 2936.832621] Lustre: Skipped 3 previous similar messages [ 2936.906772] Lustre: lustre-MDT0000: Recovery over after 0:04, of 2 clients 2 recovered and 0 were evicted. 
[ 2937.002356] Lustre: lustre-OST0000: new connection from lustre-MDT0000-mdtlov (cleaning up unused objects from 0x280000401:790 to 0x280000401:833) [ 2937.004596] Lustre: lustre-OST0001: new connection from lustre-MDT0000-mdtlov (cleaning up unused objects from 0x2c0000401:790 to 0x2c0000401:833) [ 2938.172175] Lustre: DEBUG MARKER: oleg653-server.virtnet: executing set_default_debug -1 all [ 2958.273593] Lustre: DEBUG MARKER: == recovery-small test 28: handle error adding new clients (bug 6086) ========================================================== 04:27:33 (1743496053) [ 2984.084083] Lustre: Failing over lustre-MDT0000 [ 2984.420887] Lustre: server umount lustre-MDT0000 complete [ 2984.882411] LustreError: 6558:0:(ldlm_lib.c:1095:target_handle_connect()) lustre-MDT0000: not available for connect from 192.168.206.53@tcp (no target). If you are running an HA pair check that the target is mounted on the other server. [ 2984.895472] LustreError: 6558:0:(ldlm_lib.c:1095:target_handle_connect()) Skipped 30 previous similar messages [ 2988.022516] Lustre: lustre-MDT0000-lwp-MDT0001: Connection to lustre-MDT0000 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete [ 2988.035312] LustreError: lustre-MDT0000-osp-MDT0001: operation mds_statfs to node 0@lo failed: rc = -107 [ 2988.041264] Lustre: Skipped 1 previous similar message [ 3004.383135] Lustre: 3682:0:(client.c:2346:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1743496085/real 1743496085] req@ffff94bacf72b400 x1828185280311936/t0(0) o400->MGC192.168.206.153@tcp@0@lo:26/25 lens 224/224 e 0 to 1 dl 1743496101 ref 1 fl Rpc:XNQr/200/ffffffff rc 0/-1 job:'kworker.0' uid:0 gid:0 [ 3004.414251] Lustre: 3682:0:(client.c:2346:ptlrpc_expire_one_request()) Skipped 1 previous similar message [ 3004.430089] LustreError: MGC192.168.206.153@tcp: Connection to MGS (at 0@lo) was lost; in progress operations using this service will fail [ 
3008.714645] LDISKFS-fs (dm-0): mounted filesystem with ordered data mode. Opts: user_xattr,errors=remount-ro,no_mbcache,nodelalloc [ 3014.630705] LustreError: 3681:0:(client.c:1292:ptlrpc_import_delay_req()) @@@ invalidate in flight req@ffff94bbfdf92e40 x1828185280321664/t0(0) o250->MGC192.168.206.153@tcp@0@lo:26/25 lens 520/544 e 0 to 0 dl 0 ref 1 fl Rpc:NQU/200/ffffffff rc 0/-1 job:'kworker.0' uid:0 gid:0 [ 3015.063662] Lustre: *** cfs_fail_loc=12f, val=0*** [ 3015.075043] LustreError: 14569:0:(tgt_lastrcvd.c:1052:tgt_client_new()) lustre-MDT0001: no room for 0 clients - fix LR_MAX_CLIENTS [ 3015.098850] LustreError: lustre-MDT0001-osp-MDT0000: operation mds_connect to node 0@lo failed: rc = -75 [ 3015.289611] Lustre: lustre-MDT0000: Imperative Recovery not enabled, recovery window 60-180 [ 3015.415594] Lustre: lustre-MDT0000: in recovery but waiting for the first client to connect [ 3015.596238] Lustre: lustre-MDT0000: Will be in recovery for at least 1:00, or until 2 clients reconnect [ 3020.823491] Lustre: lustre-MDT0000-lwp-MDT0001: Connection restored to 192.168.206.153@tcp (at 0@lo) [ 3020.827451] Lustre: Skipped 3 previous similar messages [ 3020.921325] Lustre: lustre-MDT0000: Recovery over after 0:05, of 2 clients 2 recovered and 0 were evicted. 
[ 3020.993213] Lustre: lustre-OST0001: new connection from lustre-MDT0000-mdtlov (cleaning up unused objects from 0x2c0000401:876 to 0x2c0000401:897) [ 3020.996416] Lustre: lustre-OST0000: new connection from lustre-MDT0000-mdtlov (cleaning up unused objects from 0x280000401:877 to 0x280000401:897) [ 3021.723862] Lustre: DEBUG MARKER: oleg653-server.virtnet: executing set_default_debug -1 all [ 3035.447918] Lustre: DEBUG MARKER: oleg653-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid [ 3039.395041] Lustre: DEBUG MARKER: mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec [ 3057.151082] Lustre: DEBUG MARKER: == recovery-small test 29a: error adding new clients doesn't cause LBUG (bug 22273) ========================================================== 04:29:11 (1743496151) [ 3061.976824] Lustre: Failing over lustre-MDT0000 [ 3062.241630] LustreError: lustre-MDT0000-osp-MDT0001: operation mds_statfs to node 0@lo failed: rc = -107 [ 3062.250939] Lustre: lustre-MDT0000-osp-MDT0001: Connection to lustre-MDT0000 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete [ 3062.258523] Lustre: Skipped 2 previous similar messages [ 3062.289135] Lustre: server umount lustre-MDT0000 complete [ 3076.212559] LDISKFS-fs (dm-0): mounted filesystem with ordered data mode. 
Opts: user_xattr,errors=remount-ro,no_mbcache,nodelalloc [ 3076.602749] LustreError: MGC192.168.206.153@tcp: Connection to MGS (at 0@lo) was lost; in progress operations using this service will fail [ 3076.902768] Lustre: *** cfs_fail_loc=711, val=0*** [ 3077.139996] Lustre: lustre-MDT0000: Imperative Recovery not enabled, recovery window 60-180 [ 3077.211119] Lustre: lustre-MDT0000: in recovery but waiting for the first client to connect [ 3077.214956] Lustre: lustre-MDT0000: Aborting client recovery [ 3077.216597] LustreError: 33045:0:(ldlm_lib.c:2907:target_stop_recovery_thread()) lustre-MDT0000: Aborting recovery [ 3077.244627] Lustre: 33077:0:(ldlm_lib.c:2289:target_recovery_overseer()) recovery is aborted, evict exports in recovery [ 3077.252780] Lustre: 33077:0:(genops.c:1508:class_disconnect_stale_exports()) lustre-MDT0000: disconnect stale client 439ca163-a653-428d-9575-7b14f4b0f667@ [ 3077.258974] Lustre: lustre-MDT0000: disconnecting 2 stale clients [ 3077.268313] Lustre: lustre-MDT0000-osd: cancel update llog [0x200000400:0x1:0x0] [ 3077.301697] Lustre: lustre-MDT0001-osp-MDT0000: cancel update llog [0x240000401:0x1:0x0] [ 3077.388218] Lustre: lustre-OST0000: new connection from lustre-MDT0000-mdtlov (cleaning up unused objects from 0x280000401:877 to 0x280000401:929) [ 3077.391397] Lustre: lustre-OST0001: new connection from lustre-MDT0000-mdtlov (cleaning up unused objects from 0x2c0000401:876 to 0x2c0000401:929) [ 3082.252039] LustreError: lustre-MDT0000-osp-MDT0001: This client was evicted by lustre-MDT0000; in progress operations using this service will fail. 
[ 3082.274293] Lustre: lustre-MDT0000-lwp-MDT0001: Connection restored to 192.168.206.153@tcp (at 0@lo) [ 3082.280428] Lustre: Skipped 3 previous similar messages [ 3083.759599] Lustre: DEBUG MARKER: oleg653-server.virtnet: executing set_default_debug -1 all [ 3116.677245] Lustre: DEBUG MARKER: oleg653-server.virtnet: executing wait_import_state FULL os[cp].lustre-OST0000-osc-MDT0000.ost_server_uuid 50 [ 3117.320963] Lustre: DEBUG MARKER: os[cp].lustre-OST0000-osc-MDT0000.ost_server_uuid in FULL state after 0 sec [ 3134.344126] Lustre: DEBUG MARKER: == recovery-small test 29b: error adding new clients doesn't cause LBUG (bug 22273) ========================================================== 04:30:28 (1743496228) [ 3139.810838] Lustre: Failing over lustre-OST0000 [ 3140.047788] Lustre: server umount lustre-OST0000 complete [ 3142.115895] LustreError: lustre-OST0000-osc-MDT0001: operation ost_statfs to node 0@lo failed: rc = -107 [ 3142.121327] Lustre: lustre-OST0000-osc-MDT0001: Connection to lustre-OST0000 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete [ 3142.126744] Lustre: Skipped 3 previous similar messages [ 3142.132326] LustreError: 27754:0:(ldlm_lib.c:1095:target_handle_connect()) lustre-OST0000: not available for connect from 0@lo (no target). If you are running an HA pair check that the target is mounted on the other server. [ 3142.168028] LustreError: 27754:0:(ldlm_lib.c:1095:target_handle_connect()) Skipped 44 previous similar messages [ 3155.515957] LDISKFS-fs (dm-2): mounted filesystem with ordered data mode. 
Opts: user_xattr,acl,no_mbcache,nodelalloc [ 3155.837378] Lustre: 34595:0:(mgc_request_server.c:553:mgc_llog_local_copy()) MGC192.168.206.153@tcp: no remote llog for lustre-sptlrpc, check MGS config [ 3156.350384] Lustre: lustre-OST0000: Imperative Recovery not enabled, recovery window 60-180 [ 3156.368177] Lustre: lustre-OST0000: in recovery but waiting for the first client to connect [ 3156.368953] Lustre: lustre-OST0000: Aborting recovery [ 3156.372950] Lustre: Skipped 2 previous similar messages [ 3156.377764] LustreError: 34595:0:(ldlm_lib.c:2907:target_stop_recovery_thread()) lustre-OST0000: Aborting recovery [ 3156.399756] Lustre: 34611:0:(ldlm_lib.c:2289:target_recovery_overseer()) recovery is aborted, evict exports in recovery [ 3156.403447] Lustre: 34611:0:(ldlm_lib.c:2289:target_recovery_overseer()) Skipped 2 previous similar messages [ 3156.422285] Lustre: 34611:0:(genops.c:1508:class_disconnect_stale_exports()) lustre-OST0000: disconnect stale client 439ca163-a653-428d-9575-7b14f4b0f667@ [ 3156.432579] Lustre: 34611:0:(genops.c:1508:class_disconnect_stale_exports()) Skipped 1 previous similar message [ 3156.436665] Lustre: lustre-OST0000: disconnecting 3 stale clients [ 3156.462576] LustreError: 34611:0:(ofd_obd.c:1298:ofd_iocontrol()) lustre-OST0000: iocontrol from 'tgt_recover_0' cmd=c00866c1 _IOWR('f', 193, 8) unrecognized: rc = -25 [ 3158.251210] Lustre: *** cfs_fail_loc=711, val=0*** [ 3158.262268] LustreError: lustre-OST0000-osc-MDT0000: This client was evicted by lustre-OST0000; in progress operations using this service will fail. [ 3158.274271] Lustre: lustre-OST0000-osc-MDT0000: Connection restored to 192.168.206.153@tcp (at 0@lo) [ 3158.277460] Lustre: Skipped 2 previous similar messages [ 3161.591549] LustreError: lustre-OST0000-osc-MDT0001: This client was evicted by lustre-OST0000; in progress operations using this service will fail. 
[ 3165.139308] Lustre: DEBUG MARKER: oleg653-server.virtnet: executing set_default_debug -1 all [ 3205.005937] Lustre: DEBUG MARKER: == recovery-small test 50: failover MDS under load ======= 04:31:39 (1743496299) [ 3221.506057] Lustre: Failing over lustre-MDT0000 [ 3221.559459] Lustre: lustre-MDT0000: Not available for connect from 192.168.206.53@tcp (stopping) [ 3221.999359] Lustre: server umount lustre-MDT0000 complete [ 3223.019094] Lustre: lustre-MDT0000-lwp-OST0001: Connection to lustre-MDT0000 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete [ 3223.037103] Lustre: Skipped 2 previous similar messages [ 3238.371034] Lustre: 3683:0:(client.c:2346:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1743496320/real 1743496320] req@ffff94bac6609d00 x1828185280486272/t0(0) o400->MGC192.168.206.153@tcp@0@lo:26/25 lens 224/224 e 0 to 1 dl 1743496336 ref 1 fl Rpc:XNQr/200/ffffffff rc 0/-1 job:'kworker.0' uid:0 gid:0 [ 3238.430666] LustreError: MGC192.168.206.153@tcp: Connection to MGS (at 0@lo) was lost; in progress operations using this service will fail [ 3247.227818] LDISKFS-fs (dm-0): mounted filesystem with ordered data mode. Opts: user_xattr,errors=remount-ro,no_mbcache,nodelalloc [ 3248.415162] LustreError: 35930:0:(import.c:333:ptlrpc_invalidate_import()) MGS: timeout waiting for callback (1 != 0) [ 3248.428368] LustreError: 35930:0:(import.c:357:ptlrpc_invalidate_import()) @@@ still on sending list req@ffff94bbf4cb7340 x1828185280491776/t0(0) o250->MGC192.168.206.153@tcp@0@lo:26/25 lens 520/544 e 0 to 0 dl 1743496346 ref 1 fl Rpc:NQr/200/ffffffff rc 0/-1 job:'kworker.0' uid:0 gid:0 [ 3248.448083] LustreError: 35930:0:(import.c:367:ptlrpc_invalidate_import()) MGS: Unregistering RPCs found (0). Network is sluggish? Waiting for them to error out. 
[ 3249.195198] Lustre: lustre-MDT0000: Imperative Recovery not enabled, recovery window 60-180 [ 3249.269757] Lustre: lustre-MDT0000: in recovery but waiting for the first client to connect [ 3249.272990] Lustre: Skipped 2 previous similar messages [ 3251.120637] Lustre: lustre-MDT0000: Will be in recovery for at least 1:00, or until 2 clients reconnect [ 3254.286828] Lustre: lustre-MDT0000-lwp-MDT0001: Connection restored to 192.168.206.153@tcp (at 0@lo) [ 3254.299383] Lustre: Skipped 1 previous similar message [ 3254.344810] Lustre: lustre-MDT0000: Recovery over after 0:03, of 2 clients 2 recovered and 0 were evicted. [ 3254.413951] Lustre: 9961:0:(mdt_recovery.c:128:mdt_req_from_lrd()) @@@ restoring transno req@ffff94bad0d24540 x1828185241059456/t25769804504(0) o36->439ca163-a653-428d-9575-7b14f4b0f667@192.168.206.53@tcp:573/0 lens 504/2888 e 0 to 0 dl 1743496403 ref 1 fl Interpret:/202/0 rc 0/0 job:'writemany.0' uid:0 gid:0 [ 3254.429292] Lustre: lustre-OST0001: new connection from lustre-MDT0000-mdtlov (cleaning up unused objects from 0x2c0000401:1048 to 0x2c0000401:1089) [ 3254.432524] Lustre: lustre-OST0000: new connection from lustre-MDT0000-mdtlov (cleaning up unused objects from 0x280000401:1048 to 0x280000401:1089) [ 3256.208137] Lustre: DEBUG MARKER: oleg653-server.virtnet: executing set_default_debug -1 all [ 3274.042112] Lustre: DEBUG MARKER: oleg653-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid [ 3281.566828] Lustre: DEBUG MARKER: mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec [ 3352.192861] Lustre: Failing over lustre-MDT0000 [ 3352.471664] Lustre: server umount lustre-MDT0000 complete [ 3356.643202] Lustre: lustre-MDT0000-osp-MDT0001: Connection to lustre-MDT0000 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete [ 3356.664374] Lustre: Skipped 3 previous similar messages [ 3373.037165] LustreError: MGC192.168.206.153@tcp: 
Connection to MGS (at 0@lo) was lost; in progress operations using this service will fail [ 3377.489836] LDISKFS-fs (dm-0): mounted filesystem with ordered data mode. Opts: user_xattr,errors=remount-ro,no_mbcache,nodelalloc [ 3383.269712] LustreError: 3681:0:(client.c:1292:ptlrpc_import_delay_req()) @@@ invalidate in flight req@ffff94bad0ac4b00 x1828185280862336/t0(0) o250->MGC192.168.206.153@tcp@0@lo:26/25 lens 520/544 e 0 to 0 dl 0 ref 1 fl Rpc:NQU/200/ffffffff rc 0/-1 job:'kworker.0' uid:0 gid:0 [ 3384.208258] Lustre: lustre-MDT0000: Imperative Recovery not enabled, recovery window 60-180 [ 3384.361562] Lustre: lustre-MDT0000: in recovery but waiting for the first client to connect [ 3385.803074] Lustre: lustre-MDT0000: Will be in recovery for at least 1:00, or until 2 clients reconnect [ 3389.445183] Lustre: lustre-MDT0000-lwp-MDT0001: Connection restored to 192.168.206.153@tcp (at 0@lo) [ 3389.460865] Lustre: Skipped 3 previous similar messages [ 3389.498815] Lustre: lustre-MDT0000: Recovery over after 0:04, of 2 clients 2 recovered and 0 were evicted. 
[ 3389.593195] Lustre: lustre-OST0001: new connection from lustre-MDT0000-mdtlov (cleaning up unused objects from 0x2c0000401:1780 to 0x2c0000401:1825) [ 3389.594559] Lustre: lustre-OST0000: new connection from lustre-MDT0000-mdtlov (cleaning up unused objects from 0x280000401:1780 to 0x280000401:1825) [ 3390.680725] Lustre: DEBUG MARKER: oleg653-server.virtnet: executing set_default_debug -1 all [ 3411.212704] Lustre: DEBUG MARKER: oleg653-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid [ 3418.718631] Lustre: DEBUG MARKER: mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec [ 3490.183454] Lustre: Failing over lustre-MDT0000 [ 3490.282724] Lustre: lustre-MDT0000: Not available for connect from 192.168.206.53@tcp (stopping) [ 3490.319599] LustreError: 3682:0:(client.c:1282:ptlrpc_import_delay_req()) @@@ IMP_CLOSED req@ffff94bad0d27340 x1828185281228160/t0(0) o6->lustre-OST0001-osc-MDT0000@0@lo:28/4 lens 544/432 e 0 to 0 dl 0 ref 1 fl Rpc:QU/200/ffffffff rc 0/-1 job:'osp-syn-1-0.0' uid:0 gid:0 [ 3490.808267] Lustre: server umount lustre-MDT0000 complete [ 3491.843858] Lustre: lustre-MDT0000-osp-MDT0001: Connection to lustre-MDT0000 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete [ 3491.850338] LustreError: 6557:0:(ldlm_lib.c:1095:target_handle_connect()) lustre-MDT0000: not available for connect from 0@lo (no target). If you are running an HA pair check that the target is mounted on the other server. 
[ 3491.857131] Lustre: Skipped 3 previous similar messages [ 3491.866942] LustreError: 6557:0:(ldlm_lib.c:1095:target_handle_connect()) Skipped 75 previous similar messages [ 3508.204694] Lustre: 3684:0:(client.c:2346:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1743496589/real 1743496589] req@ffff94bbff6322c0 x1828185281228800/t0(0) o400->MGC192.168.206.153@tcp@0@lo:26/25 lens 224/224 e 0 to 1 dl 1743496605 ref 1 fl Rpc:XNQr/200/ffffffff rc 0/-1 job:'kworker.0' uid:0 gid:0 [ 3508.234489] Lustre: 3684:0:(client.c:2346:ptlrpc_expire_one_request()) Skipped 1 previous similar message [ 3508.245667] LustreError: MGC192.168.206.153@tcp: Connection to MGS (at 0@lo) was lost; in progress operations using this service will fail [ 3516.469557] LDISKFS-fs (dm-0): mounted filesystem with ordered data mode. Opts: user_xattr,errors=remount-ro,no_mbcache,nodelalloc [ 3518.641194] Lustre: lustre-MDT0000: Imperative Recovery not enabled, recovery window 60-180 [ 3518.828890] Lustre: lustre-MDT0000: in recovery but waiting for the first client to connect [ 3520.460051] Lustre: lustre-MDT0000: Will be in recovery for at least 1:00, or until 2 clients reconnect [ 3523.599575] Lustre: lustre-MDT0000-lwp-MDT0001: Connection restored to 192.168.206.153@tcp (at 0@lo) [ 3523.609634] Lustre: Skipped 3 previous similar messages [ 3523.647702] Lustre: lustre-MDT0000: Recovery over after 0:03, of 2 clients 2 recovered and 0 were evicted. 
[ 3523.751849] Lustre: lustre-OST0001: new connection from lustre-MDT0000-mdtlov (cleaning up unused objects from 0x2c0000401:2535 to 0x2c0000401:2561) [ 3523.759414] Lustre: lustre-OST0000: new connection from lustre-MDT0000-mdtlov (cleaning up unused objects from 0x280000401:2534 to 0x280000401:2561) [ 3525.823586] Lustre: DEBUG MARKER: oleg653-server.virtnet: executing set_default_debug -1 all [ 3543.969919] Lustre: DEBUG MARKER: oleg653-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid [ 3551.316665] Lustre: DEBUG MARKER: mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec [ 3592.174106] Lustre: DEBUG MARKER: == recovery-small test 51: failover MDS during recovery == 04:38:07 (1743496687) [ 3599.595628] Lustre: Failing over lustre-MDT0000 [ 3600.081688] Lustre: server umount lustre-MDT0000 complete [ 3600.864176] LustreError: lustre-MDT0000-osp-MDT0001: operation mds_statfs to node 0@lo failed: rc = -107 [ 3600.876894] LustreError: Skipped 1 previous similar message [ 3620.319516] LustreError: MGC192.168.206.153@tcp: Connection to MGS (at 0@lo) was lost; in progress operations using this service will fail [ 3625.471473] LDISKFS-fs (dm-0): mounted filesystem with ordered data mode. Opts: user_xattr,errors=remount-ro,no_mbcache,nodelalloc [ 3631.534797] Lustre: lustre-MDT0000: Will be in recovery for at least 1:00, or until 2 clients reconnect [ 3636.807782] Lustre: lustre-MDT0000: Recovery over after 0:05, of 2 clients 2 recovered and 0 were evicted. 
[ 3636.860748] Lustre: lustre-OST0001: new connection from lustre-MDT0000-mdtlov (cleaning up unused objects from 0x2c0000401:2943 to 0x2c0000401:2977) [ 3636.865829] Lustre: lustre-OST0000: new connection from lustre-MDT0000-mdtlov (cleaning up unused objects from 0x280000401:2943 to 0x280000401:2977) [ 3636.921532] Lustre: 6558:0:(mdt_recovery.c:128:mdt_req_from_lrd()) @@@ restoring transno req@ffff94bad0239180 x1828185244426880/t38654708038(0) o36->439ca163-a653-428d-9575-7b14f4b0f667@192.168.206.53@tcp:199/0 lens 504/2888 e 0 to 0 dl 1743496784 ref 1 fl Interpret:/202/0 rc 0/0 job:'writemany.0' uid:0 gid:0 [ 3638.321252] Lustre: DEBUG MARKER: oleg653-server.virtnet: executing set_default_debug -1 all [ 3648.189211] Lustre: DEBUG MARKER: test_51: failover in 1 sec [ 3655.493226] Lustre: Failing over lustre-MDT0000 [ 3656.236321] Lustre: server umount lustre-MDT0000 complete [ 3657.196312] Lustre: lustre-MDT0000-osp-MDT0001: Connection to lustre-MDT0000 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete [ 3657.208197] Lustre: Skipped 7 previous similar messages [ 3682.438370] LDISKFS-fs (dm-0): mounted filesystem with ordered data mode. 
Opts: user_xattr,errors=remount-ro,no_mbcache,nodelalloc [ 3684.586333] Lustre: lustre-MDT0000: Imperative Recovery not enabled, recovery window 60-180 [ 3684.590947] Lustre: Skipped 1 previous similar message [ 3684.712610] Lustre: lustre-MDT0000: in recovery but waiting for the first client to connect [ 3684.721642] Lustre: Skipped 1 previous similar message [ 3690.075165] Lustre: lustre-OST0001: new connection from lustre-MDT0000-mdtlov (cleaning up unused objects from 0x2c0000401:3075 to 0x2c0000401:3105) [ 3690.080507] Lustre: lustre-OST0000: new connection from lustre-MDT0000-mdtlov (cleaning up unused objects from 0x280000401:3076 to 0x280000401:3105) [ 3692.034874] Lustre: DEBUG MARKER: oleg653-server.virtnet: executing set_default_debug -1 all [ 3702.120741] Lustre: DEBUG MARKER: test_51: failover in 5 sec [ 3711.992466] Lustre: Failing over lustre-MDT0000 [ 3712.484534] Lustre: server umount lustre-MDT0000 complete [ 3738.637504] LDISKFS-fs (dm-0): mounted filesystem with ordered data mode. Opts: user_xattr,errors=remount-ro,no_mbcache,nodelalloc [ 3742.186581] LustreError: 3681:0:(client.c:1292:ptlrpc_import_delay_req()) @@@ invalidate in flight req@ffff94bbfdf967c0 x1828185281653888/t0(0) o250->MGC192.168.206.153@tcp@0@lo:26/25 lens 520/544 e 0 to 0 dl 0 ref 1 fl Rpc:NQU/200/ffffffff rc 0/-1 job:'kworker.0' uid:0 gid:0 [ 3744.174673] Lustre: lustre-MDT0000: Will be in recovery for at least 1:00, or until 2 clients reconnect [ 3744.177716] Lustre: Skipped 1 previous similar message [ 3747.878716] Lustre: lustre-MDT0000: Recovery over after 0:03, of 2 clients 2 recovered and 0 were evicted. 
[ 3747.902306] Lustre: Skipped 1 previous similar message [ 3747.972776] Lustre: lustre-OST0000: new connection from lustre-MDT0000-mdtlov (cleaning up unused objects from 0x280000401:3235 to 0x280000401:3265) [ 3748.003508] Lustre: lustre-OST0001: new connection from lustre-MDT0000-mdtlov (cleaning up unused objects from 0x2c0000401:3235 to 0x2c0000401:3265) [ 3750.571477] Lustre: DEBUG MARKER: oleg653-server.virtnet: executing set_default_debug -1 all [ 3760.141092] Lustre: DEBUG MARKER: test_51: failover in 10 sec [ 3776.450277] Lustre: Failing over lustre-MDT0000 [ 3776.489722] Lustre: lustre-MDT0000: Not available for connect from 192.168.206.53@tcp (stopping) [ 3776.628677] LustreError: 3683:0:(client.c:1282:ptlrpc_import_delay_req()) @@@ IMP_CLOSED req@ffff94bad023ae40 x1828185281763456/t0(0) o6->lustre-OST0001-osc-MDT0000@0@lo:28/4 lens 544/432 e 0 to 0 dl 0 ref 1 fl Rpc:QU/200/ffffffff rc 0/-1 job:'osp-syn-1-0.0' uid:0 gid:0 [ 3777.100536] Lustre: server umount lustre-MDT0000 complete [ 3778.542415] LustreError: lustre-MDT0000-osp-MDT0001: operation mds_statfs to node 0@lo failed: rc = -107 [ 3794.914588] LustreError: MGC192.168.206.153@tcp: Connection to MGS (at 0@lo) was lost; in progress operations using this service will fail [ 3794.922705] LustreError: Skipped 2 previous similar messages [ 3803.266271] LDISKFS-fs (dm-0): mounted filesystem with ordered data mode. Opts: user_xattr,errors=remount-ro,no_mbcache,nodelalloc [ 3804.767239] LustreError: 42209:0:(import.c:333:ptlrpc_invalidate_import()) MGS: timeout waiting for callback (1 != 0) [ 3804.780095] LustreError: 42209:0:(import.c:357:ptlrpc_invalidate_import()) @@@ still on sending list req@ffff94bbff6e4b00 x1828185281769728/t0(0) o250->MGC192.168.206.153@tcp@0@lo:26/25 lens 520/544 e 0 to 0 dl 1743496902 ref 1 fl Rpc:NQr/200/ffffffff rc 0/-1 job:'kworker.0' uid:0 gid:0 [ 3804.796630] LustreError: 42209:0:(import.c:367:ptlrpc_invalidate_import()) MGS: Unregistering RPCs found (0). 
Network is sluggish? Waiting for them to error out. [ 3805.168550] Lustre: Evicted from MGS (at 192.168.206.153@tcp) after server handle changed from 0x0 to 0x623b88e7ff1be5a6 [ 3805.188323] Lustre: MGC192.168.206.153@tcp: Connection restored to 192.168.206.153@tcp (at 0@lo) [ 3805.194413] Lustre: Skipped 15 previous similar messages [ 3811.422384] Lustre: lustre-OST0000: new connection from lustre-MDT0000-mdtlov (cleaning up unused objects from 0x280000401:3447 to 0x280000401:3489) [ 3811.433847] Lustre: lustre-OST0001: new connection from lustre-MDT0000-mdtlov (cleaning up unused objects from 0x2c0000401:3448 to 0x2c0000401:3489) [ 3812.797066] Lustre: DEBUG MARKER: oleg653-server.virtnet: executing set_default_debug -1 all [ 3823.238722] Lustre: DEBUG MARKER: test_51: failover in 20 sec [ 3849.232237] Lustre: Failing over lustre-MDT0000 [ 3849.699437] Lustre: server umount lustre-MDT0000 complete [ 3852.266410] LustreError: lustre-MDT0000-osp-MDT0001: operation mds_statfs to node 0@lo failed: rc = -107 [ 3875.097494] LDISKFS-fs (dm-0): mounted filesystem with ordered data mode. Opts: user_xattr,errors=remount-ro,no_mbcache,nodelalloc [ 3877.471134] LustreError: 43169:0:(import.c:333:ptlrpc_invalidate_import()) MGS: timeout waiting for callback (1 != 0) [ 3877.482339] LustreError: 43169:0:(import.c:357:ptlrpc_invalidate_import()) @@@ still on sending list req@ffff94bad0ee1740 x1828185281928960/t0(0) o250->MGC192.168.206.153@tcp@0@lo:26/25 lens 520/544 e 0 to 0 dl 1743496975 ref 1 fl Rpc:NQr/200/ffffffff rc 0/-1 job:'kworker.0' uid:0 gid:0 [ 3877.506658] LustreError: 43169:0:(import.c:367:ptlrpc_invalidate_import()) MGS: Unregistering RPCs found (0). Network is sluggish? Waiting for them to error out. 
[ 3877.855929] LustreError: 3681:0:(client.c:1292:ptlrpc_import_delay_req()) @@@ invalidate in flight req@ffff94bac61d4540 x1828185281930624/t0(0) o250->MGC192.168.206.153@tcp@0@lo:26/25 lens 520/544 e 0 to 0 dl 0 ref 1 fl Rpc:NQU/200/ffffffff rc 0/-1 job:'kworker.0' uid:0 gid:0 [ 3877.898629] LustreError: 3681:0:(client.c:1292:ptlrpc_import_delay_req()) Skipped 2 previous similar messages [ 3880.047021] Lustre: lustre-MDT0000: Will be in recovery for at least 1:00, or until 2 clients reconnect [ 3880.062550] Lustre: Skipped 1 previous similar message [ 3884.074078] Lustre: lustre-MDT0000: Recovery over after 0:04, of 2 clients 2 recovered and 0 were evicted. [ 3884.080921] Lustre: Skipped 1 previous similar message [ 3884.122051] Lustre: lustre-OST0000: new connection from lustre-MDT0000-mdtlov (cleaning up unused objects from 0x280000401:3747 to 0x280000401:3777) [ 3884.131425] Lustre: lustre-OST0001: new connection from lustre-MDT0000-mdtlov (cleaning up unused objects from 0x2c0000401:3746 to 0x2c0000401:3777) [ 3884.193135] Lustre: 9961:0:(mdt_recovery.c:128:mdt_req_from_lrd()) @@@ restoring transno req@ffff94bad138d680 x1828185245705600/t55834576452(0) o36->439ca163-a653-428d-9575-7b14f4b0f667@192.168.206.53@tcp:446/0 lens 512/2888 e 0 to 0 dl 1743497031 ref 1 fl Interpret:/202/0 rc 0/0 job:'writemany.0' uid:0 gid:0 [ 3884.879300] Lustre: DEBUG MARKER: oleg653-server.virtnet: executing set_default_debug -1 all [ 3893.822367] Lustre: DEBUG MARKER: test_51: failover in 25 sec [ 3925.999903] Lustre: Failing over lustre-MDT0000 [ 3926.415355] Lustre: server umount lustre-MDT0000 complete [ 3930.085660] Lustre: lustre-MDT0000-osp-MDT0001: Connection to lustre-MDT0000 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete [ 3930.091490] Lustre: Skipped 12 previous similar messages [ 3930.098357] LustreError: lustre-MDT0000-osp-MDT0001: operation mds_statfs to node 0@lo failed: rc = -107 [ 3952.899551] LDISKFS-fs (dm-0): 
mounted filesystem with ordered data mode. Opts: user_xattr,errors=remount-ro,no_mbcache,nodelalloc [ 3957.225876] LustreError: 3681:0:(client.c:1292:ptlrpc_import_delay_req()) @@@ invalidate in flight req@ffff94bace0967c0 x1828185282106240/t0(0) o250->MGC192.168.206.153@tcp@0@lo:26/25 lens 520/544 e 0 to 0 dl 0 ref 1 fl Rpc:NQU/200/ffffffff rc 0/-1 job:'kworker.0' uid:0 gid:0 [ 3958.166830] Lustre: lustre-MDT0000: Imperative Recovery not enabled, recovery window 60-180 [ 3958.178641] Lustre: Skipped 3 previous similar messages [ 3958.369451] Lustre: lustre-MDT0000: in recovery but waiting for the first client to connect [ 3958.385864] Lustre: Skipped 3 previous similar messages [ 3963.504672] Lustre: lustre-OST0000: new connection from lustre-MDT0000-mdtlov (cleaning up unused objects from 0x280000401:4067 to 0x280000401:4129) [ 3963.524433] Lustre: lustre-OST0001: new connection from lustre-MDT0000-mdtlov (cleaning up unused objects from 0x2c0000401:4067 to 0x2c0000401:4129) [ 3965.839728] Lustre: DEBUG MARKER: oleg653-server.virtnet: executing set_default_debug -1 all [ 3975.913255] Lustre: DEBUG MARKER: test_51: failover in 30 sec [ 4013.805566] Lustre: Failing over lustre-MDT0000 [ 4014.534240] Lustre: server umount lustre-MDT0000 complete [ 4014.600282] LustreError: 6557:0:(ldlm_lib.c:1095:target_handle_connect()) lustre-MDT0000: not available for connect from 0@lo (no target). If you are running an HA pair check that the target is mounted on the other server. 
[ 4014.638819] LustreError: 6557:0:(ldlm_lib.c:1095:target_handle_connect()) Skipped 234 previous similar messages [ 4030.944550] Lustre: 3683:0:(client.c:2346:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1743497112/real 1743497112] req@ffff94bad166d0c0 x1828185282295552/t0(0) o400->MGC192.168.206.153@tcp@0@lo:26/25 lens 224/224 e 0 to 1 dl 1743497128 ref 1 fl Rpc:XNQr/200/ffffffff rc 0/-1 job:'kworker.0' uid:0 gid:0 [ 4030.959241] Lustre: 3683:0:(client.c:2346:ptlrpc_expire_one_request()) Skipped 6 previous similar messages [ 4039.153342] LDISKFS-fs (dm-0): mounted filesystem with ordered data mode. Opts: user_xattr,errors=remount-ro,no_mbcache,nodelalloc [ 4041.185587] LustreError: 3681:0:(client.c:1292:ptlrpc_import_delay_req()) @@@ invalidate in flight req@ffff94bac19d5c40 x1828185282305152/t0(0) o250->MGC192.168.206.153@tcp@0@lo:26/25 lens 520/544 e 0 to 0 dl 0 ref 1 fl Rpc:NQU/200/ffffffff rc 0/-1 job:'kworker.0' uid:0 gid:0 [ 4047.463144] Lustre: 6558:0:(mdt_recovery.c:128:mdt_req_from_lrd()) @@@ restoring transno req@ffff94bad166d0c0 x1828185246917760/t64424511602(0) o36->439ca163-a653-428d-9575-7b14f4b0f667@192.168.206.53@tcp:610/0 lens 512/2888 e 0 to 0 dl 1743497195 ref 1 fl Interpret:/202/0 rc 0/0 job:'writemany.0' uid:0 gid:0 [ 4047.469612] Lustre: lustre-OST0000: new connection from lustre-MDT0000-mdtlov (cleaning up unused objects from 0x280000401:4479 to 0x280000401:4513) [ 4047.473689] Lustre: lustre-OST0001: new connection from lustre-MDT0000-mdtlov (cleaning up unused objects from 0x2c0000401:4479 to 0x2c0000401:4513) [ 4047.825870] Lustre: DEBUG MARKER: oleg653-server.virtnet: executing set_default_debug -1 all [ 4087.893257] Lustre: DEBUG MARKER: == recovery-small test 52: failover OST under load ======= 04:46:22 (1743497182) [ 4105.233431] Lustre: Failing over lustre-OST0000 [ 4105.311321] Lustre: lustre-OST0000: Not available for connect from 192.168.206.53@tcp (stopping) [ 4105.581495] Lustre: server 
umount lustre-OST0000 complete [ 4108.793765] LustreError: lustre-OST0000-osc-MDT0001: operation ost_statfs to node 0@lo failed: rc = -107 [ 4130.205096] LDISKFS-fs (dm-2): mounted filesystem with ordered data mode. Opts: user_xattr,acl,no_mbcache,nodelalloc [ 4130.388713] Lustre: 46223:0:(mgc_request_server.c:553:mgc_llog_local_copy()) MGC192.168.206.153@tcp: no remote llog for lustre-sptlrpc, check MGS config [ 4141.411364] Lustre: DEBUG MARKER: oleg653-server.virtnet: executing set_default_debug -1 all [ 4159.806228] Lustre: DEBUG MARKER: oleg653-client.virtnet: executing wait_import_state_mount (FULL|IDLE) osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid [ 4166.893896] Lustre: DEBUG MARKER: osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid in FULL state after 0 sec [ 4435.153881] Lustre: Failing over lustre-OST0000 [ 4435.327757] Lustre: server umount lustre-OST0000 complete [ 4435.798302] LustreError: lustre-OST0000-osc-MDT0001: operation ost_destroy to node 0@lo failed: rc = -107 [ 4459.252518] LDISKFS-fs (dm-2): mounted filesystem with ordered data mode. Opts: user_xattr,acl,no_mbcache,nodelalloc [ 4459.446460] Lustre: 47394:0:(mgc_request_server.c:553:mgc_llog_local_copy()) MGC192.168.206.153@tcp: no remote llog for lustre-sptlrpc, check MGS config [ 4459.774176] Lustre: lustre-OST0000: Imperative Recovery enabled, recovery window shrunk from 60-180 down to 60-180 [ 4461.710699] Lustre: lustre-OST0000: Will be in recovery for at least 1:00, or until 3 clients reconnect [ 4461.714057] Lustre: Skipped 3 previous similar messages [ 4461.881515] Lustre: lustre-OST0000: Recovery over after 0:01, of 3 clients 3 recovered and 0 were evicted. 
[ 4461.884813] Lustre: lustre-OST0000-osc-MDT0001: Connection restored to 192.168.206.153@tcp (at 0@lo) [ 4461.893734] Lustre: Skipped 3 previous similar messages [ 4461.925966] Lustre: Skipped 18 previous similar messages [ 4470.468408] Lustre: DEBUG MARKER: oleg653-server.virtnet: executing set_default_debug -1 all [ 4487.746742] Lustre: DEBUG MARKER: oleg653-client.virtnet: executing wait_import_state_mount (FULL|IDLE) osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid [ 4494.047394] Lustre: DEBUG MARKER: osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid in FULL state after 0 sec [ 4766.763835] Lustre: Failing over lustre-OST0000 [ 4766.966298] Lustre: lustre-OST0000: Not available for connect from 192.168.206.53@tcp (stopping) [ 4767.222886] Lustre: lustre-OST0000-osc-MDT0001: Connection to lustre-OST0000 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete [ 4767.230828] Lustre: Skipped 11 previous similar messages [ 4769.083421] Lustre: server umount lustre-OST0000 complete [ 4769.251352] LustreError: 27754:0:(ldlm_lib.c:1095:target_handle_connect()) lustre-OST0000: not available for connect from 0@lo (no target). If you are running an HA pair check that the target is mounted on the other server. [ 4769.259286] LustreError: 27754:0:(ldlm_lib.c:1095:target_handle_connect()) Skipped 63 previous similar messages [ 4792.604284] LDISKFS-fs (dm-2): mounted filesystem with ordered data mode. 
Opts: user_xattr,acl,no_mbcache,nodelalloc [ 4792.761614] Lustre: 48568:0:(mgc_request_server.c:553:mgc_llog_local_copy()) MGC192.168.206.153@tcp: no remote llog for lustre-sptlrpc, check MGS config [ 4793.048049] Lustre: lustre-OST0000: Imperative Recovery enabled, recovery window shrunk from 60-180 down to 60-180 [ 4793.078940] Lustre: lustre-OST0000: in recovery but waiting for the first client to connect [ 4793.084902] Lustre: Skipped 3 previous similar messages [ 4801.572772] Lustre: DEBUG MARKER: oleg653-server.virtnet: executing set_default_debug -1 all [ 4817.312739] Lustre: DEBUG MARKER: oleg653-client.virtnet: executing wait_import_state_mount (FULL|IDLE) osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid [ 4822.941528] Lustre: DEBUG MARKER: osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid in FULL state after 0 sec [ 5066.415705] Lustre: DEBUG MARKER: == recovery-small test 53a: touch: drop rep ============== 05:02:42 (1743498162) [ 5069.010421] Lustre: *** cfs_fail_loc=157, val=2147483648*** [ 5069.012984] LustreError: 16003:0:(ldlm_lib.c:3251:target_send_reply_msg()) @@@ dropping reply req@ffff94bbf4cb2880 x1828185261704960/t0(0) o101->439ca163-a653-428d-9575-7b14f4b0f667@192.168.206.53@tcp:82/0 lens 576/688 e 0 to 0 dl 1743498177 ref 1 fl Interpret:/200/0 rc 0/0 job:'openfile.0' uid:0 gid:0 [ 5069.020135] LustreError: 16003:0:(ldlm_lib.c:3251:target_send_reply_msg()) Skipped 1 previous similar message [ 5085.640342] Lustre: lustre-MDT0000: Client 439ca163-a653-428d-9575-7b14f4b0f667 (at 192.168.206.53@tcp) reconnecting [ 5085.665084] Lustre: Skipped 7 previous similar messages [ 5098.637557] Lustre: DEBUG MARKER: == recovery-small test 53b: touch: drop rep ============== 05:03:14 (1743498194) [ 5101.023602] Lustre: *** cfs_fail_loc=157, val=2147483648*** [ 5101.025308] LustreError: 6558:0:(ldlm_lib.c:3251:target_send_reply_msg()) @@@ dropping reply req@ffff94bacf72ed80 x1828185261710848/t0(0) 
o101->439ca163-a653-428d-9575-7b14f4b0f667@192.168.206.53@tcp:114/0 lens 576/688 e 0 to 0 dl 1743498209 ref 1 fl Interpret:/200/0 rc 0/0 job:'openfile.0' uid:0 gid:0 [ 5131.163407] Lustre: DEBUG MARKER: == recovery-small test 53c: touch: drop rep ============== 05:03:46 (1743498226) [ 5133.357678] Lustre: *** cfs_fail_loc=157, val=2147483648*** [ 5133.360491] LustreError: 31167:0:(ldlm_lib.c:3251:target_send_reply_msg()) @@@ dropping reply req@ffff94bad0b9a880 x1828185261715328/t68719477954(0) o101->439ca163-a653-428d-9575-7b14f4b0f667@192.168.206.53@tcp:147/0 lens 664/664 e 0 to 0 dl 1743498242 ref 1 fl Interpret:/200/0 rc 0/0 job:'openfile.0' uid:0 gid:0 [ 5149.626136] Lustre: 31167:0:(mdt_recovery.c:128:mdt_req_from_lrd()) @@@ restoring transno req@ffff94bad166a2c0 x1828185261715328/t68719477954(0) o101->439ca163-a653-428d-9575-7b14f4b0f667@192.168.206.53@tcp:163/0 lens 664/3488 e 0 to 0 dl 1743498258 ref 1 fl Interpret:H/202/0 rc 0/0 job:'openfile.0' uid:0 gid:0 [ 5164.501393] Lustre: DEBUG MARKER: == recovery-small test 54: back in time ================== 05:04:20 (1743498260) [ 5178.927777] Lustre: Failing over lustre-MDT0000 [ 5179.307614] Lustre: server umount lustre-MDT0000 complete [ 5196.767996] Lustre: 3684:0:(client.c:2346:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1743498278/real 1743498278] req@ffff94bad1010600 x1828185286169600/t0(0) o400->MGC192.168.206.153@tcp@0@lo:26/25 lens 224/224 e 0 to 1 dl 1743498294 ref 1 fl Rpc:XNQr/200/ffffffff rc 0/-1 job:'kworker.0' uid:0 gid:0 [ 5196.788400] LustreError: MGC192.168.206.153@tcp: Connection to MGS (at 0@lo) was lost; in progress operations using this service will fail [ 5196.793876] LustreError: Skipped 3 previous similar messages [ 5201.129763] LDISKFS-fs (dm-0): mounted filesystem with ordered data mode. 
Opts: user_xattr,errors=remount-ro,no_mbcache,nodelalloc [ 5207.012926] LustreError: 3681:0:(client.c:1292:ptlrpc_import_delay_req()) @@@ invalidate in flight req@ffff94bbc7d922c0 x1828185286178048/t0(0) o250->MGC192.168.206.153@tcp@0@lo:26/25 lens 520/544 e 0 to 0 dl 0 ref 1 fl Rpc:NQU/200/ffffffff rc 0/-1 job:'kworker.0' uid:0 gid:0 [ 5207.429464] Lustre: lustre-MDT0000: Imperative Recovery not enabled, recovery window 60-180 [ 5207.433425] Lustre: Skipped 2 previous similar messages [ 5208.318119] Lustre: lustre-MDT0000: Will be in recovery for at least 1:00, or until 3 clients reconnect [ 5208.322867] Lustre: Skipped 1 previous similar message [ 5211.535505] Lustre: DEBUG MARKER: oleg653-server.virtnet: executing set_default_debug -1 all [ 5212.654910] Lustre: lustre-MDT0000-lwp-MDT0001: Connection restored to 192.168.206.153@tcp (at 0@lo) [ 5212.659519] Lustre: Skipped 3 previous similar messages [ 5212.677517] Lustre: lustre-MDT0000: Recovery over after 0:04, of 3 clients 3 recovered and 0 were evicted. 
[ 5212.682042] Lustre: Skipped 1 previous similar message [ 5212.701892] Lustre: lustre-OST0000: new connection from lustre-MDT0000-mdtlov (cleaning up unused objects from 0x280000401:4711 to 0x280000401:4737) [ 5212.702559] Lustre: lustre-OST0001: new connection from lustre-MDT0000-mdtlov (cleaning up unused objects from 0x2c0000401:4711 to 0x2c0000401:4737) [ 5220.630952] Lustre: DEBUG MARKER: oleg653-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid [ 5223.575752] Lustre: DEBUG MARKER: mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec [ 5237.513514] Lustre: DEBUG MARKER: == recovery-small test 55: ost_brw_read/write drops timed-out read/write request ========================================================== 05:05:33 (1743498333) [ 5263.087092] Lustre: *** cfs_fail_loc=21d, val=0*** [ 5263.089034] Lustre: Skipped 3 previous similar messages [ 5263.102333] LustreError: 29740:0:(tgt_handler.c:2766:tgt_brw_write()) lustre-OST0000: Dropping timed-out write from 12345-192.168.206.53@tcp because locking object 0x280000400:7228 took 0 seconds (limit was 11). [ 5263.123114] Lustre: lustre-OST0000: Bulk IO write error with 439ca163-a653-428d-9575-7b14f4b0f667 (at 192.168.206.53@tcp), client will retry: rc = -110 [ 5263.639290] LustreError: 8438:0:(tgt_handler.c:2766:tgt_brw_write()) lustre-OST0000: Dropping timed-out write from 12345-192.168.206.53@tcp because locking object 0x280000400:7228 took 0 seconds (limit was 11). 
[ 5263.647731] LustreError: 8438:0:(tgt_handler.c:2766:tgt_brw_write()) Skipped 2 previous similar messages [ 5263.652847] Lustre: lustre-OST0000: Bulk IO write error with 439ca163-a653-428d-9575-7b14f4b0f667 (at 192.168.206.53@tcp), client will retry: rc = -110 [ 5263.660919] Lustre: Skipped 2 previous similar messages [ 5279.091658] Lustre: lustre-OST0000: Client 439ca163-a653-428d-9575-7b14f4b0f667 (at 192.168.206.53@tcp) reconnecting [ 5279.104403] Lustre: Skipped 2 previous similar messages [ 5279.118613] LustreError: 8437:0:(tgt_handler.c:2766:tgt_brw_write()) lustre-OST0000: Dropping timed-out write from 12345-192.168.206.53@tcp because locking object 0x280000400:7228 took 0 seconds (limit was 11). [ 5279.119723] Lustre: lustre-OST0000: Bulk IO write error with 439ca163-a653-428d-9575-7b14f4b0f667 (at 192.168.206.53@tcp), client will retry: rc = -110 [ 5279.128320] LustreError: 8437:0:(tgt_handler.c:2766:tgt_brw_write()) Skipped 11 previous similar messages [ 5279.144624] Lustre: Skipped 10 previous similar messages [ 5295.544736] LustreError: 8437:0:(tgt_handler.c:2766:tgt_brw_write()) lustre-OST0000: Dropping timed-out write from 12345-192.168.206.53@tcp because locking object 0x280000400:7228 took 0 seconds (limit was 11). [ 5295.551413] Lustre: lustre-OST0000: Bulk IO write error with 439ca163-a653-428d-9575-7b14f4b0f667 (at 192.168.206.53@tcp), client will retry: rc = -110 [ 5295.556143] Lustre: Skipped 2 previous similar messages [ 5311.867745] LustreError: 8439:0:(tgt_handler.c:2766:tgt_brw_write()) lustre-OST0000: Dropping timed-out write from 12345-192.168.206.53@tcp because locking object 0x280000400:7228 took 0 seconds (limit was 11). 
[ 5311.871139] Lustre: lustre-OST0000: Bulk IO write error with 439ca163-a653-428d-9575-7b14f4b0f667 (at 192.168.206.53@tcp), client will retry: rc = -110 [ 5311.883305] LustreError: 8439:0:(tgt_handler.c:2766:tgt_brw_write()) Skipped 14 previous similar messages [ 5311.902491] Lustre: Skipped 7 previous similar messages [ 5320.771089] LustreError: 8437:0:(tgt_handler.c:2766:tgt_brw_write()) lustre-OST0000: Dropping timed-out write from 12345-192.168.206.53@tcp because locking object 0x280000400:7228 took 0 seconds (limit was 11). [ 5320.778994] Lustre: lustre-OST0000: Bulk IO write error with 439ca163-a653-428d-9575-7b14f4b0f667 (at 192.168.206.53@tcp), client will retry: rc = -110 [ 5327.259110] Lustre: *** cfs_fail_loc=21d, val=0*** [ 5327.261018] Lustre: Skipped 39 previous similar messages [ 5375.138540] Lustre: DEBUG MARKER: == recovery-small test 56: do not fail on getattr resend ========================================================== 05:07:51 (1743498471) [ 5376.548590] LustreError: 9961:0:(mdt_handler.c:2326:mdt_getattr_name_lock()) cfs_fail_timeout id 136 sleeping for 40000ms [ 5416.647153] LustreError: 9961:0:(mdt_handler.c:2326:mdt_getattr_name_lock()) cfs_fail_timeout id 136 awake [ 5429.193818] Lustre: DEBUG MARKER: == recovery-small test 57: read procfs entries causes kernel crash ========================================================== 05:08:44 (1743498524) [ 5435.123631] Lustre: Failing over lustre-MDT0000 [ 5435.442800] Lustre: server umount lustre-MDT0000 complete [ 5436.914174] Lustre: lustre-MDT0000-lwp-OST0001: Connection to lustre-MDT0000 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete [ 5436.924422] LustreError: 6561:0:(ldlm_lib.c:1095:target_handle_connect()) lustre-MDT0000: not available for connect from 0@lo (no target). If you are running an HA pair check that the target is mounted on the other server. 
[ 5436.928332] Lustre: Skipped 7 previous similar messages [ 5436.956448] LustreError: 6561:0:(ldlm_lib.c:1095:target_handle_connect()) Skipped 58 previous similar messages [ 5446.165791] LDISKFS-fs (dm-0): mounted filesystem with ordered data mode. Opts: user_xattr,errors=remount-ro,no_mbcache,nodelalloc [ 5446.349145] LustreError: MGC192.168.206.153@tcp: Connection to MGS (at 0@lo) was lost; in progress operations using this service will fail [ 5446.696426] Lustre: lustre-MDT0000: Imperative Recovery not enabled, recovery window 60-180 [ 5446.767733] Lustre: lustre-MDT0000: in recovery but waiting for the first client to connect [ 5446.771175] Lustre: lustre-MDT0000: Aborting client recovery [ 5446.773494] Lustre: Skipped 1 previous similar message [ 5446.780882] LustreError: 52852:0:(ldlm_lib.c:2907:target_stop_recovery_thread()) lustre-MDT0000: Aborting recovery [ 5446.789362] Lustre: 52884:0:(ldlm_lib.c:2289:target_recovery_overseer()) recovery is aborted, evict exports in recovery [ 5446.796230] Lustre: 52884:0:(ldlm_lib.c:2289:target_recovery_overseer()) Skipped 2 previous similar messages [ 5446.801668] Lustre: 52884:0:(genops.c:1508:class_disconnect_stale_exports()) lustre-MDT0000: disconnect stale client lustre-MDT0001-mdtlov_UUID@ [ 5446.808146] Lustre: 52884:0:(genops.c:1508:class_disconnect_stale_exports()) Skipped 2 previous similar messages [ 5446.815189] Lustre: lustre-MDT0000: disconnecting 1 stale clients [ 5446.826969] Lustre: lustre-MDT0000-osd: cancel update llog [0x200002b10:0x1:0x0] [ 5446.843820] Lustre: lustre-MDT0001-osp-MDT0000: cancel update llog [0x240000404:0x1:0x0] [ 5446.887921] Lustre: lustre-OST0000: new connection from lustre-MDT0000-mdtlov (cleaning up unused objects from 0x280000401:4739 to 0x280000401:4769) [ 5446.897624] Lustre: lustre-OST0001: new connection from lustre-MDT0000-mdtlov (cleaning up unused objects from 0x2c0000401:4711 to 0x2c0000401:4769) [ 5451.802391] LustreError: lustre-MDT0000-osp-MDT0001: This client was 
evicted by lustre-MDT0000; in progress operations using this service will fail. [ 5451.932206] Lustre: DEBUG MARKER: oleg653-server.virtnet: executing set_default_debug -1 all [ 5481.048574] Lustre: DEBUG MARKER: == recovery-small test 58: Eviction in the middle of open RPC reply processing ========================================================== 05:09:36 (1743498576) [ 5512.357738] Lustre: DEBUG MARKER: == recovery-small test 59: Read cancel race on client eviction ========================================================== 05:10:08 (1743498608) [ 5526.486672] LustreError: 8434:0:(ldlm_lockd.c:773:ldlm_handle_ast_error()) ### client (nid 192.168.206.53@tcp) returned error from blocking AST (req@0000000050578412 x1828185286348800 status -107 rc -107), evict it ns: filter-lustre-OST0000_UUID lock: ffff94bbde436d40/0x623b88e7ff302b46 lrc: 4/0,0 mode: PW/PW res: [0x280000401:0x12a2:0x0].0x0 rrc: 3 type: EXT [0->18446744073709551615] (req 0->4095) gid 0 flags: 0x60000400000020 nid: 192.168.206.53@tcp remote: 0xbb53ec904ba6afc0 expref: 5 pid: 28736 timeout: 5626 lvb_type: 0 [ 5526.509089] LustreError: lustre-OST0000: A client on nid 192.168.206.53@tcp was evicted due to a lock blocking callback time out: rc -107 [ 5526.515702] LustreError: 6546:0:(ldlm_lockd.c:252:expired_lock_main()) ### lock callback timer expired after 0s: evicting client at 192.168.206.53@tcp ns: filter-lustre-OST0000_UUID lock: ffff94bbde436d40/0x623b88e7ff302b46 lrc: 3/0,0 mode: PW/PW res: [0x280000401:0x12a2:0x0].0x0 rrc: 3 type: EXT [0->18446744073709551615] (req 0->4095) gid 0 flags: 0x60000400000020 nid: 192.168.206.53@tcp remote: 0xbb53ec904ba6afc0 expref: 6 pid: 28736 timeout: 0 lvb_type: 0 [ 5538.276264] Lustre: DEBUG MARKER: == recovery-small test 60: Add Changelog entries during MDS failover ========================================================== 05:10:34 (1743498634) [ 5538.835259] LustreError: 9962:0:(ldlm_lockd.c:773:ldlm_handle_ast_error()) ### client (nid 192.168.206.53@tcp) 
returned error from blocking AST (req@00000000208fa4fb x1828185286355584 status -107 rc -107), evict it ns: mdt-lustre-MDT0000_UUID lock: ffff94bbde435300/0x623b88e7ff302b62 lrc: 4/0,0 mode: PR/PR res: [0x200000007:0x1:0x0].0x0 bits 0x13/0x0 rrc: 3 type: IBT gid 0 flags: 0x60200400000020 nid: 192.168.206.53@tcp remote: 0xbb53ec904ba6afce expref: 6 pid: 9962 timeout: 5638 lvb_type: 0 [ 5538.866977] LustreError: lustre-MDT0000: A client on nid 192.168.206.53@tcp was evicted due to a lock blocking callback time out: rc -107 [ 5538.878335] LustreError: 6546:0:(ldlm_lockd.c:252:expired_lock_main()) ### lock callback timer expired after 0s: evicting client at 192.168.206.53@tcp ns: mdt-lustre-MDT0000_UUID lock: ffff94bbde435300/0x623b88e7ff302b62 lrc: 3/0,0 mode: PR/PR res: [0x200000007:0x1:0x0].0x0 bits 0x13/0x0 rrc: 3 type: IBT gid 0 flags: 0x60200400000020 nid: 192.168.206.53@tcp remote: 0xbb53ec904ba6afce expref: 7 pid: 9962 timeout: 0 lvb_type: 0 [ 5542.025800] Lustre: lustre-MDD0000: changelog on [ 5545.124621] Lustre: lustre-MDD0001: changelog on [ 5545.382538] Lustre: lustre-MDT0001: haven't heard from client 863acf86-b4a5-41e3-92ee-0fd609136a8b (at 192.168.206.53@tcp) in 32 seconds. I think it's dead, and I am evicting it. exp ffff94bace1db800, cur 1743498643 expire 1743498613 last 1743498611 [ 5549.030801] Lustre: lustre-OST0001: haven't heard from client 863acf86-b4a5-41e3-92ee-0fd609136a8b (at 192.168.206.53@tcp) in 35 seconds. I think it's dead, and I am evicting it. 
exp ffff94bad1768800, cur 1743498646 expire 1743498616 last 1743498611 [ 5642.016756] Lustre: Failing over lustre-MDT0000 [ 5642.059440] Lustre: lustre-MDT0000: Not available for connect from 192.168.206.53@tcp (stopping) [ 5642.062991] Lustre: Skipped 1 previous similar message [ 5642.711285] Lustre: server umount lustre-MDT0000 complete [ 5646.306176] LustreError: lustre-MDT0000-osp-MDT0001: operation mds_statfs to node 0@lo failed: rc = -107 [ 5646.315603] LustreError: Skipped 13 previous similar messages [ 5661.665378] LustreError: MGC192.168.206.153@tcp: Connection to MGS (at 0@lo) was lost; in progress operations using this service will fail [ 5662.234942] LDISKFS-fs (dm-0): mounted filesystem with ordered data mode. Opts: user_xattr,errors=remount-ro,no_mbcache,nodelalloc [ 5671.904310] LustreError: 3681:0:(client.c:1292:ptlrpc_import_delay_req()) @@@ invalidate in flight req@ffff94bad13cbf80 x1828185286427136/t0(0) o250->MGC192.168.206.153@tcp@0@lo:26/25 lens 520/544 e 0 to 0 dl 0 ref 1 fl Rpc:NQU/200/ffffffff rc 0/-1 job:'kworker.0' uid:0 gid:0 [ 5672.211166] Lustre: lustre-MDT0000: Imperative Recovery not enabled, recovery window 60-180 [ 5672.270544] Lustre: lustre-MDD0000: changelog on [ 5676.822281] Lustre: DEBUG MARKER: oleg653-server.virtnet: executing set_default_debug -1 all [ 5677.549732] LustreError: 3681:0:(import.c:1334:ptlrpc_connect_interpret()) lustre-MDT0000_UUID: went back in time (transno 73014444040 was previously committed, server now claims 68719477960)! 
[ 5677.561588] LustreError: 3681:0:(import.c:1336:ptlrpc_connect_interpret()) For further information, see http://doc.lustre.org/lustre_manual.xhtml#went_back_in_time [ 5677.607231] Lustre: lustre-OST0001: new connection from lustre-MDT0000-mdtlov (cleaning up unused objects from 0x2c0000401:6021 to 0x2c0000401:6049) [ 5677.607423] Lustre: lustre-OST0000: new connection from lustre-MDT0000-mdtlov (cleaning up unused objects from 0x280000401:6021 to 0x280000401:6049) [ 5783.829363] Lustre: lustre-MDD0000: changelog off [ 5787.003212] Lustre: lustre-MDD0001: changelog off [ 5801.795346] Lustre: DEBUG MARKER: == recovery-small test 61: Verify to not reuse orphan objects - bug 17025 ========================================================== 05:14:57 (1743498897) [ 5810.452186] Lustre: DEBUG MARKER: mds1 REPLAY BARRIER on lustre-MDT0000 [ 5813.376620] Lustre: Failing over lustre-MDT0000 [ 5813.704756] Lustre: server umount lustre-MDT0000 complete [ 5815.787265] LustreError: lustre-MDT0000-osp-MDT0001: operation mds_statfs to node 0@lo failed: rc = -107 [ 5826.108602] LDISKFS-fs (dm-0): recovery complete [ 5826.110656] LDISKFS-fs (dm-0): mounted filesystem with ordered data mode. 
Opts: user_xattr,errors=remount-ro,no_mbcache,nodelalloc [ 5826.246443] LustreError: MGC192.168.206.153@tcp: Connection to MGS (at 0@lo) was lost; in progress operations using this service will fail [ 5826.594586] Lustre: lustre-MDT0000: Aborting client recovery [ 5826.596626] LustreError: 57401:0:(ldlm_lib.c:2907:target_stop_recovery_thread()) lustre-MDT0000: Aborting recovery [ 5826.603582] Lustre: 57432:0:(ldlm_lib.c:2289:target_recovery_overseer()) recovery is aborted, evict exports in recovery [ 5826.607397] Lustre: 57432:0:(ldlm_lib.c:2289:target_recovery_overseer()) Skipped 2 previous similar messages [ 5826.611712] Lustre: 57432:0:(genops.c:1508:class_disconnect_stale_exports()) lustre-MDT0000: disconnect stale client feaceb2d-dc87-4156-9bc6-241e59797d8c@ [ 5826.619514] Lustre: lustre-MDT0000: disconnecting 2 stale clients [ 5826.628698] Lustre: lustre-MDT0000-osd: cancel update llog [0x2000088d0:0x1:0x0] [ 5826.644792] Lustre: lustre-MDT0001-osp-MDT0000: cancel update llog [0x240000405:0x1:0x0] [ 5826.689530] Lustre: lustre-OST0000: new connection from lustre-MDT0000-mdtlov (cleaning up unused objects from 0x280000401:6021 to 0x280000401:6081) [ 5826.693315] Lustre: lustre-OST0001: new connection from lustre-MDT0000-mdtlov (cleaning up unused objects from 0x2c0000401:6021 to 0x2c0000401:6081) [ 5830.611093] Lustre: DEBUG MARKER: oleg653-server.virtnet: executing set_default_debug -1 all [ 5831.663659] LustreError: lustre-MDT0000-osp-MDT0001: This client was evicted by lustre-MDT0000; in progress operations using this service will fail. 
[ 5831.673528] Lustre: lustre-MDT0000-osp-MDT0001: Connection restored to 192.168.206.153@tcp (at 0@lo) [ 5831.678451] Lustre: Skipped 12 previous similar messages [ 5855.357910] Lustre: DEBUG MARKER: == recovery-small test 65: lock enqueue for destroyed export ========================================================== 05:15:51 (1743498951) [ 5857.466322] LustreError: 28659:0:(ldlm_lockd.c:1470:ldlm_handle_enqueue()) cfs_fail_timeout id 31e sleeping for 6000ms [ 5857.486958] Lustre: *** cfs_fail_loc=31e, val=0*** [ 5857.488636] Lustre: Skipped 4 previous similar messages [ 5859.488934] LustreError: 27755:0:(ldlm_lockd.c:1470:ldlm_handle_enqueue()) cfs_fail_timeout id 31e sleeping for 6000ms [ 5862.283153] Lustre: 58204:0:(genops.c:1678:obd_export_evict_by_uuid()) lustre-OST0000: evicting feaceb2d-dc87-4156-9bc6-241e59797d8c at adminstrative request [ 5862.288728] LustreError: 6544:0:(ldlm_lockd.c:2993:ldlm_bl_thread_exports()) cfs_fail_timeout id 31e sleeping for 4000ms [ 5863.513137] LustreError: 28659:0:(ldlm_lockd.c:1470:ldlm_handle_enqueue()) cfs_fail_timeout id 31e awake [ 5863.521471] LustreError: 28659:0:(ldlm_lockd.c:1492:ldlm_handle_enqueue()) ### lock on destroyed export 00000000c8df97f6 ns: filter-lustre-OST0000_UUID lock: ffff94bad73b34c0/0x623b88e7ff36a042 lrc: 3/0,0 mode: --/PW res: [0x280000401:0x17c3:0x0].0x0 rrc: 4 type: EXT [0->4095] (req 0->4095) gid 0 flags: 0x70000000020020 nid: 192.168.206.53@tcp remote: 0xbb53ec904ba77ea2 expref: 4 pid: 28659 timeout: 0 lvb_type: 0 [ 5865.313283] LustreError: 6544:0:(ldlm_lockd.c:2993:ldlm_bl_thread_exports()) cfs_fail_timeout interrupted [ 5873.612885] Lustre: lustre-OST0000: Client 8342a6be-aad7-4d25-96e1-2140a51e7c1e (at 192.168.206.53@tcp) reconnecting [ 5873.618798] Lustre: Skipped 6 previous similar messages [ 5886.152626] Lustre: DEBUG MARKER: == recovery-small test 66: lock enqueue re-send vs client eviction ========================================================== 05:16:22 (1743498982) [ 
5888.340620] Lustre: *** cfs_fail_loc=157, val=2147483648*** [ 5888.344123] LustreError: 6556:0:(ldlm_lib.c:3251:target_send_reply_msg()) @@@ dropping reply req@ffff94bac61d3f80 x1828185266083072/t0(0) o101->feaceb2d-dc87-4156-9bc6-241e59797d8c@192.168.206.53@tcp:191/0 lens 576/688 e 0 to 0 dl 1743499041 ref 1 fl Interpret:/200/0 rc 0/0 job:'stat.0' uid:0 gid:0 [ 5890.549518] LustreError: 16003:0:(mdt_handler.c:2326:mdt_getattr_name_lock()) cfs_fail_timeout id 136 sleeping for 40000ms [ 5893.395125] Lustre: 58615:0:(genops.c:1678:obd_export_evict_by_uuid()) lustre-MDT0000: evicting feaceb2d-dc87-4156-9bc6-241e59797d8c at adminstrative request [ 5895.058236] LustreError: 16003:0:(mdt_handler.c:2326:mdt_getattr_name_lock()) cfs_fail_timeout interrupted [ 5895.062642] LustreError: 16003:0:(mdt_handler.c:2326:mdt_getattr_name_lock()) Skipped 1 previous similar message [ 5904.665173] Lustre: DEBUG MARKER: == recovery-small test 67: connect vs import invalidate race ========================================================== 05:16:40 (1743499000) [ 5910.414984] Lustre: 58907:0:(genops.c:1678:obd_export_evict_by_uuid()) lustre-MDT0000: evicting feaceb2d-dc87-4156-9bc6-241e59797d8c at adminstrative request [ 5933.931238] Lustre: DEBUG MARKER: == recovery-small test 100: IR: Make sure normal recovery still works w/o IR ========================================================== 05:17:09 (1743499029) [ 5939.679471] Lustre: Failing over lustre-OST0000 [ 5939.804648] Lustre: server umount lustre-OST0000 complete [ 5944.315053] LustreError: lustre-OST0000-osc-MDT0000: operation ost_statfs to node 0@lo failed: rc = -107 [ 5944.319397] LustreError: Skipped 1 previous similar message [ 5959.147899] LDISKFS-fs (dm-2): mounted filesystem with ordered data mode. 
Opts: user_xattr,acl,no_mbcache,nodelalloc [ 5959.244898] Lustre: 59789:0:(mgc_request_server.c:553:mgc_llog_local_copy()) MGC192.168.206.153@tcp: no remote llog for lustre-sptlrpc, check MGS config [ 5959.481087] Lustre: lustre-OST0000: Imperative Recovery not enabled, recovery window 60-180 [ 5959.484900] Lustre: Skipped 1 previous similar message [ 5960.620219] Lustre: lustre-OST0000: Will be in recovery for at least 1:00, or until 3 clients reconnect [ 5960.625187] Lustre: Skipped 1 previous similar message [ 5964.823568] Lustre: lustre-OST0000: Recovery over after 0:04, of 3 clients 3 recovered and 0 were evicted. [ 5964.828423] Lustre: Skipped 1 previous similar message [ 5965.323935] Lustre: DEBUG MARKER: oleg653-server.virtnet: executing set_default_debug -1 all [ 5974.453854] Lustre: DEBUG MARKER: oleg653-client.virtnet: executing wait_import_state_mount (FULL|IDLE) osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid [ 5977.127821] Lustre: DEBUG MARKER: osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid in FULL state after 0 sec [ 5992.382638] Lustre: DEBUG MARKER: == recovery-small test 101a: IR: Make sure IR works w/o normal recovery ========================================================== 05:18:08 (1743499088) [ 5997.298618] Lustre: Failing over lustre-OST0000 [ 5997.397510] Lustre: server umount lustre-OST0000 complete [ 6016.642150] LDISKFS-fs (dm-2): mounted filesystem with ordered data mode. 
Opts: user_xattr,acl,no_mbcache,nodelalloc [ 6016.715112] Lustre: 61281:0:(mgc_request_server.c:553:mgc_llog_local_copy()) MGC192.168.206.153@tcp: no remote llog for lustre-sptlrpc, check MGS config [ 6016.892687] Lustre: lustre-OST0000: Imperative Recovery enabled, recovery window shrunk from 60-180 down to 60-180 [ 6022.882716] Lustre: DEBUG MARKER: oleg653-server.virtnet: executing set_default_debug -1 all [ 6031.924815] Lustre: DEBUG MARKER: oleg653-client.virtnet: executing wait_import_state_mount (FULL|IDLE) osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid [ 6034.760943] Lustre: DEBUG MARKER: osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid in FULL state after 0 sec [ 6049.793183] Lustre: DEBUG MARKER: == recovery-small test 101b: IR: Make sure IR works w/o normal recovery and proceed EAGAIN ========================================================== 05:19:05 (1743499145) [ 6055.184253] Lustre: Failing over lustre-OST0000 [ 6055.253412] Lustre: server umount lustre-OST0000 complete [ 6057.957565] Lustre: lustre-OST0000-osc-MDT0000: Connection to lustre-OST0000 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete [ 6057.966134] Lustre: Skipped 14 previous similar messages [ 6057.971095] LustreError: 8733:0:(ldlm_lib.c:1095:target_handle_connect()) lustre-OST0000: not available for connect from 0@lo (no target). If you are running an HA pair check that the target is mounted on the other server. [ 6057.991469] LustreError: 8733:0:(ldlm_lib.c:1095:target_handle_connect()) Skipped 67 previous similar messages [ 6074.565113] LDISKFS-fs (dm-2): mounted filesystem with ordered data mode. 
Opts: user_xattr,acl,no_mbcache,nodelalloc [ 6074.687456] Lustre: 62825:0:(mgc_request_server.c:553:mgc_llog_local_copy()) MGC192.168.206.153@tcp: no remote llog for lustre-sptlrpc, check MGS config [ 6074.907459] Lustre: lustre-OST0000: Imperative Recovery enabled, recovery window shrunk from 60-180 down to 60-180 [ 6074.922871] LustreError: 62825:0:(ofd_dev.c:631:ofd_prepare()) cfs_fail_timeout id 247 sleeping for 25000ms [ 6074.923428] Lustre: lustre-OST0000: in recovery but waiting for the first client to connect [ 6074.930669] Lustre: Skipped 8 previous similar messages [ 6100.023171] LustreError: 62825:0:(ofd_dev.c:631:ofd_prepare()) cfs_fail_timeout id 247 awake [ 6105.606866] Lustre: DEBUG MARKER: oleg653-server.virtnet: executing set_default_debug -1 all [ 6114.087524] Lustre: DEBUG MARKER: oleg653-client.virtnet: executing wait_import_state_mount (FULL|IDLE) osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid [ 6116.511268] Lustre: DEBUG MARKER: osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid in FULL state after 0 sec [ 6128.439077] Lustre: DEBUG MARKER: == recovery-small test 102: IR: New client gets updated nidtbl after MGS restart ========================================================== 05:20:24 (1743499224) [ 6132.256589] Lustre: Failing over lustre-OST0000 [ 6132.341252] Lustre: server umount lustre-OST0000 complete [ 6150.729434] LDISKFS-fs (dm-2): mounted filesystem with ordered data mode. 
Opts: user_xattr,acl,no_mbcache,nodelalloc [ 6150.835176] Lustre: 64227:0:(mgc_request_server.c:553:mgc_llog_local_copy()) MGC192.168.206.153@tcp: no remote llog for lustre-sptlrpc, check MGS config [ 6151.046486] Lustre: lustre-OST0000: Imperative Recovery enabled, recovery window shrunk from 60-180 down to 60-180 [ 6156.172858] Lustre: DEBUG MARKER: oleg653-server.virtnet: executing set_default_debug -1 all [ 6163.669652] Lustre: DEBUG MARKER: oleg653-client.virtnet: executing wait_import_state_mount (FULL|IDLE) osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid [ 6166.045828] Lustre: DEBUG MARKER: osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid in FULL state after 0 sec [ 6173.556156] Lustre: Failing over lustre-MDT0000 [ 6173.791439] Lustre: server umount lustre-MDT0000 complete [ 6180.463036] LDISKFS-fs (dm-0): mounted filesystem with ordered data mode. Opts: user_xattr,errors=remount-ro,no_mbcache,nodelalloc [ 6180.630393] LustreError: MGC192.168.206.153@tcp: Connection to MGS (at 0@lo) was lost; in progress operations using this service will fail [ 6184.449166] Lustre: DEBUG MARKER: oleg653-server.virtnet: executing set_default_debug -1 all [ 6186.015430] Lustre: lustre-OST0000: new connection from lustre-MDT0000-mdtlov (cleaning up unused objects from 0x280000401:6085 to 0x280000401:6113) [ 6186.015429] Lustre: lustre-OST0001: new connection from lustre-MDT0000-mdtlov (cleaning up unused objects from 0x2c0000401:6083 to 0x2c0000401:6113) [ 6187.836180] Lustre: Failing over lustre-OST0000 [ 6187.938719] Lustre: server umount lustre-OST0000 complete [ 6206.087348] LDISKFS-fs (dm-2): mounted filesystem with ordered data mode. 
Opts: user_xattr,acl,no_mbcache,nodelalloc [ 6206.162333] Lustre: 66297:0:(mgc_request_server.c:553:mgc_llog_local_copy()) MGC192.168.206.153@tcp: no remote llog for lustre-sptlrpc, check MGS config [ 6211.855096] Lustre: DEBUG MARKER: oleg653-server.virtnet: executing set_default_debug -1 all [ 6220.201439] Lustre: DEBUG MARKER: oleg653-client.virtnet: executing wait_import_state_mount (FULL|IDLE) osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid [ 6233.283777] Lustre: DEBUG MARKER: == recovery-small test 103: IR: MDS can start w/o MGS and get updated nidtbl later ========================================================== 05:22:09 (1743499329) [ 6236.400870] Lustre: DEBUG MARKER: SKIP: recovery-small test_103 needs separate mgs and mds [ 6238.982420] Lustre: DEBUG MARKER: == recovery-small test 104: IR: ost can disable IR voluntarily ========================================================== 05:22:15 (1743499335) [ 6242.616762] Lustre: Failing over lustre-OST0000 [ 6242.680736] Lustre: server umount lustre-OST0000 complete [ 6244.321860] LustreError: lustre-OST0000-osc-MDT0000: operation ost_statfs to node 0@lo failed: rc = -107 [ 6244.328059] LustreError: Skipped 9 previous similar messages [ 6250.720773] LDISKFS-fs (dm-2): mounted filesystem with ordered data mode. 
Opts: user_xattr,acl,no_mbcache,nodelalloc [ 6250.797326] Lustre: 67818:0:(mgc_request_server.c:553:mgc_llog_local_copy()) MGC192.168.206.153@tcp: no remote llog for lustre-sptlrpc, check MGS config [ 6255.983965] Lustre: DEBUG MARKER: oleg653-server.virtnet: executing set_default_debug -1 all [ 6270.690350] Lustre: DEBUG MARKER: == recovery-small test 105: IR: NON IR clients support === 05:22:46 (1743499366) [ 6272.802872] Lustre: DEBUG MARKER: SKIP: recovery-small test_105 Needs multiple clients [ 6275.013894] Lustre: DEBUG MARKER: == recovery-small test 106: lightweight connection support ========================================================== 05:22:51 (1743499371) [ 6284.070707] Lustre: DEBUG MARKER: mds1 REPLAY BARRIER on lustre-MDT0000 [ 6285.969335] Lustre: Failing over lustre-MDT0000 [ 6286.199383] Lustre: server umount lustre-MDT0000 complete [ 6304.224086] Lustre: 3682:0:(client.c:2346:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1743499385/real 1743499385] req@ffff94badd948bc0 x1828185287394688/t0(0) o400->MGC192.168.206.153@tcp@0@lo:26/25 lens 224/224 e 0 to 1 dl 1743499401 ref 1 fl Rpc:XNQr/200/ffffffff rc 0/-1 job:'kworker.0' uid:0 gid:0 [ 6304.235342] Lustre: 3682:0:(client.c:2346:ptlrpc_expire_one_request()) Skipped 2 previous similar messages [ 6307.007577] LDISKFS-fs (dm-0): recovery complete [ 6307.009573] LDISKFS-fs (dm-0): mounted filesystem with ordered data mode. 
Opts: user_xattr,errors=remount-ro,no_mbcache,nodelalloc [ 6313.447649] Lustre: Evicted from MGS (at 192.168.206.153@tcp) after server handle changed from 0x0 to 0x623b88e7ff36aba2 [ 6317.256293] Lustre: DEBUG MARKER: oleg653-server.virtnet: executing set_default_debug -1 all [ 6319.105516] LustreError: 69894:0:(ldlm_lockd.c:961:ldlm_server_blocking_ast()) ### BUG 6063: lock collide during recovery ns: mdt-lustre-MDT0000_UUID lock: ffff94bad735c040/0x623b88e7ff36adcb lrc: 3/0,0 mode: PR/PR res: [0x200000007:0x1:0x0].0x0 bits 0x13/0x0 rrc: 3 type: IBT gid 0 flags: 0x40200000000020 nid: 192.168.206.53@tcp remote: 0xbb53ec904ba78292 expref: 5 pid: 16003 timeout: 0 lvb_type: 0 [ 6319.171821] Lustre: lustre-OST0000: new connection from lustre-MDT0000-mdtlov (cleaning up unused objects from 0x280000401:6115 to 0x280000401:6145) [ 6319.171907] Lustre: lustre-OST0001: new connection from lustre-MDT0000-mdtlov (cleaning up unused objects from 0x2c0000401:6083 to 0x2c0000401:6145) [ 6330.428861] Lustre: DEBUG MARKER: == recovery-small test 107: drop reint reply, then restart MDT ========================================================== 05:23:46 (1743499426) [ 6331.770621] Lustre: *** cfs_fail_loc=119, val=2147483648*** [ 6331.773703] LustreError: 31167:0:(ldlm_lib.c:3251:target_send_reply_msg()) @@@ dropping reply req@ffff94bade5d8600 x1828185266171648/t94489280516(0) o36->7b426693-5d9e-479a-ad8d-f86c3133e122@192.168.206.53@tcp:590/0 lens 552/448 e 0 to 0 dl 1743499440 ref 1 fl Interpret:/200/0 rc 0/0 job:'mkdir.0' uid:0 gid:0 [ 6334.291415] Lustre: Failing over lustre-MDT0000 [ 6334.451266] Lustre: lustre-MDT0000: Not available for connect from 0@lo (stopping) [ 6334.459307] Lustre: Skipped 3 previous similar messages [ 6334.539797] Lustre: server umount lustre-MDT0000 complete [ 6352.609105] LDISKFS-fs (dm-0): mounted filesystem with ordered data mode. 
Opts: user_xattr,errors=remount-ro,no_mbcache,nodelalloc [ 6356.641978] Lustre: DEBUG MARKER: oleg653-server.virtnet: executing set_default_debug -1 all [ 6358.027786] Lustre: 6558:0:(mdt_recovery.c:128:mdt_req_from_lrd()) @@@ restoring transno req@ffff94bad13cbf80 x1828185266171648/t94489280516(0) o36->7b426693-5d9e-479a-ad8d-f86c3133e122@192.168.206.53@tcp:616/0 lens 552/2880 e 0 to 0 dl 1743499466 ref 1 fl Interpret:/202/0 rc 0/0 job:'mkdir.0' uid:0 gid:0 [ 6358.045447] Lustre: lustre-OST0001: new connection from lustre-MDT0000-mdtlov (cleaning up unused objects from 0x2c0000401:6083 to 0x2c0000401:6177) [ 6358.046345] Lustre: lustre-OST0000: new connection from lustre-MDT0000-mdtlov (cleaning up unused objects from 0x280000401:6115 to 0x280000401:6177) [ 6364.701813] Lustre: DEBUG MARKER: oleg653-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid [ 6367.334606] Lustre: DEBUG MARKER: mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec [ 6378.000849] Lustre: DEBUG MARKER: == recovery-small test 108: client eviction don't crash == 05:24:34 (1743499474) [ 6379.383415] Lustre: 71847:0:(genops.c:1678:obd_export_evict_by_uuid()) lustre-OST0000: evicting 7b426693-5d9e-479a-ad8d-f86c3133e122 at adminstrative request [ 6392.689133] Lustre: DEBUG MARKER: == recovery-small test 110a: create remote directory: drop client req ========================================================== 05:24:48 (1743499488) [ 6395.544816] Lustre: *** cfs_fail_loc=123, val=2147483648*** [ 6395.546729] Lustre: Skipped 12 previous similar messages [ 6411.681774] Lustre: lustre-MDT0000: Client 7b426693-5d9e-479a-ad8d-f86c3133e122 (at 192.168.206.53@tcp) reconnecting [ 6411.688655] Lustre: Skipped 2 previous similar messages [ 6421.769840] Lustre: DEBUG MARKER: == recovery-small test 110b: create remote directory: drop Master rep ========================================================== 05:25:18 (1743499518) [ 6423.164111] Lustre: 
*** cfs_fail_loc=119, val=2147483648*** [ 6423.170140] LustreError: 6558:0:(ldlm_lib.c:3251:target_send_reply_msg()) @@@ dropping reply req@ffff94bade213400 x1828185266199424/t4295025521(0) o36->7b426693-5d9e-479a-ad8d-f86c3133e122@192.168.206.53@tcp:681/0 lens 560/536 e 0 to 0 dl 1743499531 ref 1 fl Interpret:/200/0 rc 0/0 job:'lfs.0' uid:0 gid:0 [ 6439.358140] Lustre: 9961:0:(mdt_recovery.c:128:mdt_req_from_lrd()) @@@ restoring transno req@ffff94bae23adc40 x1828185266199424/t4295025521(0) o36->7b426693-5d9e-479a-ad8d-f86c3133e122@192.168.206.53@tcp:698/0 lens 560/2880 e 0 to 0 dl 1743499548 ref 1 fl Interpret:/202/0 rc 0/0 job:'lfs.0' uid:0 gid:0 [ 6449.381360] Lustre: DEBUG MARKER: == recovery-small test 110c: create remote directory: drop update rep on slave MDT ========================================================== 05:25:45 (1743499545) [ 6450.669041] LustreError: 6561:0:(ldlm_lib.c:3251:target_send_reply_msg()) @@@ dropping reply req@ffff94bbdf4250c0 x1828185287497472/t98784247837(0) o1000->lustre-MDT0001-mdtlov_UUID@0@lo:709/0 lens 264/4320 e 0 to 0 dl 1743499559 ref 1 fl Interpret:/200/0 rc 0/0 job:'osp_up0-1.0' uid:0 gid:0 [ 6467.045902] Lustre: lustre-MDT0000: Received new MDS connection from 0@lo, keep former export from same NID [ 6467.053094] Lustre: lustre-MDT0000-osp-MDT0001: Connection restored to 192.168.206.153@tcp (at 0@lo) [ 6467.059317] Lustre: Skipped 26 previous similar messages [ 6477.175990] Lustre: DEBUG MARKER: == recovery-small test 110d: remove remote directory: drop client req ========================================================== 05:26:13 (1743499573) [ 6478.425268] Lustre: *** cfs_fail_loc=123, val=2147483648*** [ 6544.014559] Lustre: DEBUG MARKER: == recovery-small test 110e: remove remote directory: drop master rep ========================================================== 05:27:20 (1743499640) [ 6545.610755] Lustre: *** cfs_fail_loc=119, val=2147483648*** [ 6545.616231] Lustre: Skipped 1 previous similar message [ 
6545.618833] LustreError: 9962:0:(ldlm_lib.c:3251:target_send_reply_msg()) @@@ dropping reply req@ffff94bade5de7c0 x1828185266228864/t4295025540(0) o36->7b426693-5d9e-479a-ad8d-f86c3133e122@192.168.206.53@tcp:88/0 lens 496/456 e 0 to 0 dl 1743499693 ref 1 fl Interpret:/200/0 rc 0/0 job:'rm.0' uid:0 gid:0 [ 6600.654426] Lustre: 6556:0:(mdt_recovery.c:128:mdt_req_from_lrd()) @@@ restoring transno req@ffff94bad9ba9d00 x1828185266228864/t4295025540(0) o36->7b426693-5d9e-479a-ad8d-f86c3133e122@192.168.206.53@tcp:143/0 lens 496/2888 e 0 to 0 dl 1743499748 ref 1 fl Interpret:/202/0 rc 0/0 job:'rm.0' uid:0 gid:0 [ 6610.387711] Lustre: DEBUG MARKER: == recovery-small test 110f: remove remote directory: drop slave rep ========================================================== 05:28:26 (1743499706) [ 6611.826474] Lustre: *** cfs_fail_loc=1701, val=2147483648*** [ 6611.828716] LustreError: 6561:0:(ldlm_lib.c:3251:target_send_reply_msg()) @@@ dropping reply req@ffff94bad0ac3f80 x1828185287591680/t98784247872(0) o1000->lustre-MDT0001-mdtlov_UUID@0@lo:115/0 lens 2592/4320 e 0 to 0 dl 1743499720 ref 1 fl Interpret:/200/0 rc 0/0 job:'osp_up0-1.0' uid:0 gid:0 [ 6628.333543] Lustre: lustre-MDT0000: Received new MDS connection from 0@lo, keep former export from same NID [ 6638.551709] Lustre: DEBUG MARKER: == recovery-small test 110g: drop reply during migration ========================================================== 05:28:54 (1743499734) [ 6694.838964] Lustre: 6571:0:(mdt_recovery.c:128:mdt_req_from_lrd()) @@@ restoring transno req@ffff94bad0cbbf80 x1828185266245248/t4295025545(0) o36->7b426693-5d9e-479a-ad8d-f86c3133e122@192.168.206.53@tcp:237/0 lens 704/2888 e 0 to 0 dl 1743499842 ref 1 fl Interpret:/202/0 rc 0/0 job:'lfs.0' uid:0 gid:0 [ 6704.375150] Lustre: DEBUG MARKER: == recovery-small test 110h: drop update reply during cross-MDT file rename ========================================================== 05:30:00 (1743499800) [ 6721.503210] Lustre: lustre-MDT0000-osp-MDT0001: 
Connection to lustre-MDT0000 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete [ 6721.517407] Lustre: Skipped 20 previous similar messages [ 6721.526921] Lustre: lustre-MDT0000: Received new MDS connection from 0@lo, keep former export from same NID [ 6731.882336] Lustre: DEBUG MARKER: == recovery-small test 110i: drop update reply during cross-MDT dir rename ========================================================== 05:30:28 (1743499828) [ 6750.181460] Lustre: lustre-MDT0000: Received new MDS connection from 0@lo, keep former export from same NID [ 6760.938900] Lustre: DEBUG MARKER: == recovery-small test 110j: drop update reply during cross-MDT ln ========================================================== 05:30:57 (1743499857) [ 6762.578799] Lustre: *** cfs_fail_loc=1701, val=2147483648*** [ 6762.585062] Lustre: Skipped 3 previous similar messages [ 6762.587485] LustreError: 6561:0:(ldlm_lib.c:3251:target_send_reply_msg()) @@@ dropping reply req@ffff94bbde529d00 x1828185287695232/t98784247920(0) o1000->lustre-MDT0001-mdtlov_UUID@0@lo:266/0 lens 2240/4320 e 0 to 0 dl 1743499871 ref 1 fl Interpret:/200/0 rc 0/0 job:'osp_up0-1.0' uid:0 gid:0 [ 6762.600544] LustreError: 6561:0:(ldlm_lib.c:3251:target_send_reply_msg()) Skipped 3 previous similar messages [ 6778.855593] Lustre: lustre-MDT0000: Received new MDS connection from 0@lo, keep former export from same NID [ 6789.559444] Lustre: DEBUG MARKER: == recovery-small test 110k: FID_QUERY failed during recovery ========================================================== 05:31:25 (1743499885) [ 6791.970240] Lustre: Failing over lustre-MDT0001 [ 6792.248896] Lustre: server umount lustre-MDT0001 complete [ 6793.192542] LustreError: lustre-MDT0001-osp-MDT0000: operation mds_statfs to node 0@lo failed: rc = -107 [ 6793.199308] LustreError: Skipped 1 previous similar message [ 6793.218788] LustreError: 14569:0:(ldlm_lib.c:1095:target_handle_connect()) lustre-MDT0001: not 
available for connect from 0@lo (no target). If you are running an HA pair check that the target is mounted on the other server. [ 6793.227271] LustreError: 14569:0:(ldlm_lib.c:1095:target_handle_connect()) Skipped 92 previous similar messages [ 6801.691294] LDISKFS-fs (dm-1): mounted filesystem with ordered data mode. Opts: user_xattr,acl,no_mbcache,nodelalloc [ 6801.775372] Lustre: 75898:0:(mgc_request_server.c:553:mgc_llog_local_copy()) MGC192.168.206.153@tcp: no remote llog for lustre-sptlrpc, check MGS config [ 6802.071672] Lustre: lustre-MDT0001: Imperative Recovery enabled, recovery window shrunk from 60-180 down to 60-180 [ 6802.097291] Lustre: *** cfs_fail_loc=1103, val=0*** [ 6802.120111] Lustre: lustre-MDT0001: in recovery but waiting for the first client to connect [ 6802.120767] Lustre: lustre-MDT0001: Aborting client recovery [ 6802.125821] Lustre: Skipped 6 previous similar messages [ 6802.130615] LustreError: 75898:0:(ldlm_lib.c:2907:target_stop_recovery_thread()) lustre-MDT0001: Aborting recovery [ 6802.133723] Lustre: 75922:0:(ldlm_lib.c:2289:target_recovery_overseer()) recovery is aborted, evict exports in recovery [ 6802.133733] Lustre: 75922:0:(ldlm_lib.c:2289:target_recovery_overseer()) Skipped 2 previous similar messages [ 6804.171493] Lustre: 75922:0:(genops.c:1508:class_disconnect_stale_exports()) lustre-MDT0001: disconnect stale client lustre-MDT0000-mdtlov_UUID@ [ 6804.175910] Lustre: 75922:0:(genops.c:1508:class_disconnect_stale_exports()) Skipped 1 previous similar message [ 6804.180105] Lustre: lustre-MDT0001: disconnecting 1 stale clients [ 6804.184379] Lustre: 75922:0:(ldlm_lib.c:2289:target_recovery_overseer()) recovery is aborted, evict exports in recovery [ 6804.199278] Lustre: lustre-MDT0001-osd: cancel update llog [0x240000400:0x1:0x0] [ 6804.215847] Lustre: lustre-MDT0000-osp-MDT0001: cancel update llog [0x200000401:0x1:0x0] [ 6804.285461] Lustre: lustre-OST0001: new connection from lustre-MDT0001-mdtlov (cleaning up unused 
objects from 0x2c0000400:9632 to 0x2c0000400:9665) [ 6804.286826] Lustre: lustre-OST0000: new connection from lustre-MDT0001-mdtlov (cleaning up unused objects from 0x280000400:8481 to 0x280000400:8737) [ 6807.544884] LustreError: lustre-MDT0001-osp-MDT0000: This client was evicted by lustre-MDT0001; in progress operations using this service will fail. [ 6808.107368] Lustre: DEBUG MARKER: oleg653-server.virtnet: executing set_default_debug -1 all [ 6812.428588] Lustre: Failing over lustre-MDT0001 [ 6812.702947] Lustre: server umount lustre-MDT0001 complete [ 6820.808095] LDISKFS-fs (dm-1): mounted filesystem with ordered data mode. Opts: user_xattr,acl,no_mbcache,nodelalloc [ 6821.218159] Lustre: lustre-MDT0001: Imperative Recovery enabled, recovery window shrunk from 60-180 down to 60-180 [ 6824.752790] Lustre: DEBUG MARKER: oleg653-server.virtnet: executing set_default_debug -1 all [ 6826.475514] Lustre: lustre-MDT0001: Will be in recovery for at least 1:00, or until 1 client reconnects [ 6826.480188] Lustre: Skipped 8 previous similar messages [ 6826.509761] Lustre: lustre-MDT0001: Recovery over after 0:01, of 1 clients 1 recovered and 0 were evicted. 
[ 6826.513318] Lustre: Skipped 8 previous similar messages [ 6826.543534] Lustre: lustre-OST0000: new connection from lustre-MDT0001-mdtlov (cleaning up unused objects from 0x280000400:8481 to 0x280000400:8769) [ 6826.543574] Lustre: lustre-OST0001: new connection from lustre-MDT0001-mdtlov (cleaning up unused objects from 0x2c0000400:9632 to 0x2c0000400:9697) [ 6847.464047] Lustre: DEBUG MARKER: == recovery-small test 110m: update resent vs original RPC race ========================================================== 05:32:23 (1743499943) [ 6849.945931] LustreError: 14569:0:(out_handler.c:1207:out_handle()) cfs_race id 525 sleeping [ 6855.142126] LustreError: 14569:0:(out_handler.c:1207:out_handle()) cfs_fail_race id 525 awake: rc=0 [ 6855.165749] LustreError: 14569:0:(out_handler.c:1207:out_handle()) cfs_fail_race id 525 waking [ 6855.473985] Lustre: lustre-MDT0000: Received new MDS connection from 0@lo, keep former export from same NID [ 6855.833085] LustreError: 6561:0:(out_handler.c:1207:out_handle()) cfs_fail_race id 525 waking [ 6855.836256] LustreError: 6561:0:(out_handler.c:1207:out_handle()) Skipped 1 previous similar message [ 6865.813661] Lustre: DEBUG MARKER: == recovery-small test 111: mdd setup fail should not cause umount oops ========================================================== 05:32:41 (1743499961) [ 6868.916807] Lustre: Failing over lustre-MDT0000 [ 6869.134924] Lustre: server umount lustre-MDT0000 complete [ 6876.995604] LDISKFS-fs (dm-0): mounted filesystem with ordered data mode. 
Opts: user_xattr,errors=remount-ro,no_mbcache,nodelalloc [ 6877.111787] LustreError: MGC192.168.206.153@tcp: Connection to MGS (at 0@lo) was lost; in progress operations using this service will fail [ 6877.118752] LustreError: Skipped 2 previous similar messages [ 6877.350110] Lustre: lustre-MDT0000: Imperative Recovery not enabled, recovery window 60-180 [ 6877.354118] Lustre: Skipped 5 previous similar messages [ 6877.386158] Lustre: *** cfs_fail_loc=151, val=0*** [ 6877.388121] LustreError: 78562:0:(mdd_device.c:666:mdd_changelog_init()) lustre-MDD0000: changelog setup during init failed: rc = -5 [ 6877.394279] LustreError: 78562:0:(mdd_device.c:1383:mdd_prepare()) lustre-MDD0000: failed to initialize changelog: rc = -5 [ 6877.400663] LustreError: 78562:0:(tgt_mount.c:2266:server_fill_super()) Unable to start targets: -5 [ 6877.411830] Lustre: Failing over lustre-MDT0000 [ 6877.418611] LustreError: 78593:0:(llog_osd.c:1183:llog_osd_next_block()) lustre-MDT0001-osp-MDT0000: can't read llog block from log [0x240000407:0x1:0x0] offset 32768: rc = -108 [ 6877.427624] LustreError: 78593:0:(lod_dev.c:506:lod_sub_recovery_thread()) lustre-MDT0001-osp-MDT0000: get update log duration 0, retries 0, failed: rc = -108 [ 6877.563773] Lustre: server umount lustre-MDT0000 complete [ 6877.566112] LustreError: 78562:0:(super25.c:181:lustre_fill_super()) llite: Unable to mount : rc = -5 [ 6883.746030] LDISKFS-fs (dm-0): mounted filesystem with ordered data mode. 
Opts: user_xattr,errors=remount-ro,no_mbcache,nodelalloc [ 6883.867413] LustreError: 6553:0:(mgc_request.c:616:do_requeue()) failed processing log: -5 [ 6884.156907] LustreError: 79025:0:(llog_cat.c:396:llog_cat_id2handle()) lustre-MDT0001-osp-MDT0000: error opening log id [0x240000409:0x1:0x0]: rc = -2 [ 6887.139601] Lustre: DEBUG MARKER: oleg653-server.virtnet: executing set_default_debug -1 all [ 6889.484083] Lustre: lustre-OST0000: new connection from lustre-MDT0000-mdtlov (cleaning up unused objects from 0x280000401:6180 to 0x280000401:6209) [ 6889.484322] Lustre: lustre-OST0001: new connection from lustre-MDT0000-mdtlov (cleaning up unused objects from 0x2c0000401:6179 to 0x2c0000401:6209) [ 6897.162354] Lustre: DEBUG MARKER: == recovery-small test 112a: bulk resend while orignal request is in progress ========================================================== 05:33:13 (1743499993) [ 6899.060421] LustreError: 8438:0:(tgt_handler.c:2694:tgt_brw_write()) cfs_fail_timeout id 214 sleeping for 20000ms [ 6919.151201] LustreError: 8438:0:(tgt_handler.c:2694:tgt_brw_write()) cfs_fail_timeout id 214 awake [ 6930.640379] Lustre: DEBUG MARKER: == recovery-small test 115a: read: late REQ MDunlink and no bulk ========================================================== 05:33:46 (1743500026) [ 6944.222777] Lustre: DEBUG MARKER: == recovery-small test 115b: write: late REQ MDunlink and no bulk ========================================================== 05:34:00 (1743500040) [ 6948.770414] Lustre: *** cfs_fail_loc=215, val=2*** [ 6956.986311] Lustre: DEBUG MARKER: == recovery-small test 115c: read: late Reply MDunlink and no bulk ========================================================== 05:34:13 (1743500053) [ 6968.464226] Lustre: DEBUG MARKER: == recovery-small test 115d: write: late Reply MDunlink and no bulk ========================================================== 05:34:24 (1743500064) [ 6980.029120] Lustre: DEBUG MARKER: == recovery-small test 115e: read: late Bulk 
MDunlink and no reply ========================================================== 05:34:36 (1743500076) [ 6992.392546] Lustre: DEBUG MARKER: == recovery-small test 115f: read: late REQ MDunlink and no reply ========================================================== 05:34:48 (1743500088) [ 7006.485229] Lustre: DEBUG MARKER: == recovery-small test 115g: read: late REQ MDunlink and Reply MDunlink ========================================================== 05:35:02 (1743500102) [ 7012.320722] Lustre: 3682:0:(client.c:2346:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1743500093/real 1743500093] req@ffff94bbc84d6d80 x1828185287843712/t0(0) o13->lustre-OST0000-osc-MDT0001@0@lo:7/4 lens 224/368 e 0 to 1 dl 1743500109 ref 1 fl Rpc:XQr/200/ffffffff rc 0/-1 job:'osp-pre-0-1.0' uid:0 gid:0 [ 7012.337668] Lustre: 3682:0:(client.c:2346:ptlrpc_expire_one_request()) Skipped 5 previous similar messages [ 7012.349231] Lustre: lustre-OST0000: Client lustre-MDT0001-mdtlov_UUID (at 0@lo) reconnecting [ 7012.352790] Lustre: Skipped 4 previous similar messages [ 7076.052142] Lustre: DEBUG MARKER: == recovery-small test 120: flock race: completion vs. 
evict ========================================================== 05:36:12 (1743500172) [ 7079.241468] Lustre: 81647:0:(genops.c:1678:obd_export_evict_by_uuid()) lustre-MDT0000: evicting 5980e814-fcb1-4059-b5f1-60da4bd91f27 at adminstrative request [ 7085.473601] Lustre: 81696:0:(genops.c:1678:obd_export_evict_by_uuid()) lustre-MDT0000: evicting 5980e814-fcb1-4059-b5f1-60da4bd91f27 at adminstrative request [ 7093.740547] Lustre: 81745:0:(genops.c:1678:obd_export_evict_by_uuid()) lustre-MDT0000: evicting 5980e814-fcb1-4059-b5f1-60da4bd91f27 at adminstrative request [ 7097.128924] Lustre: 81794:0:(genops.c:1678:obd_export_evict_by_uuid()) lustre-MDT0000: evicting 5980e814-fcb1-4059-b5f1-60da4bd91f27 at adminstrative request [ 7107.442821] Lustre: 81843:0:(genops.c:1678:obd_export_evict_by_uuid()) lustre-MDT0000: evicting 5980e814-fcb1-4059-b5f1-60da4bd91f27 at adminstrative request [ 7115.460909] Lustre: 81894:0:(genops.c:1678:obd_export_evict_by_uuid()) lustre-MDT0000: evicting 5980e814-fcb1-4059-b5f1-60da4bd91f27 at adminstrative request [ 7135.755133] Lustre: 82061:0:(genops.c:1678:obd_export_evict_by_uuid()) lustre-MDT0000: evicting 5980e814-fcb1-4059-b5f1-60da4bd91f27 at adminstrative request [ 7135.759585] Lustre: 82061:0:(genops.c:1678:obd_export_evict_by_uuid()) Skipped 2 previous similar messages [ 7149.911973] Lustre: DEBUG MARKER: == recovery-small test 113: ldlm enqueue dropped reply should not cause deadlocks ========================================================== 05:37:26 (1743500246) [ 7151.864942] Lustre: *** cfs_fail_loc=157, val=2147483648*** [ 7151.868079] Lustre: Skipped 2 previous similar messages [ 7151.870021] LustreError: 31167:0:(ldlm_lib.c:3251:target_send_reply_msg()) @@@ dropping reply req@ffff94bac61d7340 x1828185266410112/t0(0) o101->5980e814-fcb1-4059-b5f1-60da4bd91f27@192.168.206.53@tcp:694/0 lens 576/688 e 0 to 0 dl 1743500299 ref 1 fl Interpret:/200/0 rc 0/0 job:'stat.0' uid:0 gid:0 [ 7151.879790] LustreError: 
31167:0:(ldlm_lib.c:3251:target_send_reply_msg()) Skipped 2 previous similar messages [ 7221.964901] Lustre: DEBUG MARKER: == recovery-small test 130a: enqueue resend on not existing file ========================================================== 05:38:38 (1743500318) [ 7224.703464] LustreError: 9961:0:(mdt_handler.c:5274:mdt_intent_opc()) cfs_fail_timeout id 160 sleeping for 10000ms [ 7234.711137] LustreError: 9961:0:(mdt_handler.c:5274:mdt_intent_opc()) cfs_fail_timeout id 160 awake [ 7288.952768] Lustre: DEBUG MARKER: == recovery-small test 130b: enqueue resend on a stale inode ========================================================== 05:39:45 (1743500385) [ 7291.587319] LustreError: 31167:0:(mdt_handler.c:5274:mdt_intent_opc()) cfs_fail_timeout id 160 sleeping for 10000ms [ 7301.591112] LustreError: 31167:0:(mdt_handler.c:5274:mdt_intent_opc()) cfs_fail_timeout id 160 awake [ 7346.098815] Lustre: *** cfs_fail_loc=217, val=0*** [ 7354.425164] Lustre: DEBUG MARKER: == recovery-small test 130c: layout intent resend on a stale inode ========================================================== 05:40:50 (1743500450) [ 7358.774830] LustreError: 6558:0:(mdt_handler.c:5274:mdt_intent_opc()) cfs_fail_timeout id 160 sleeping for 10000ms [ 7368.791144] LustreError: 6558:0:(mdt_handler.c:5274:mdt_intent_opc()) cfs_fail_timeout id 160 awake [ 7391.040807] Lustre: DEBUG MARKER: == recovery-small test 132: long punch =================== 05:41:27 (1743500487) [ 7393.113150] LustreError: 8438:0:(ofd_dev.c:2062:ofd_punch_hdl()) cfs_fail_timeout id 236 sleeping for 120000ms [ 7435.231884] Lustre: ll_ost_io00_001: service thread pid 8438 was inactive for 42.118 seconds. The thread might be hung, or it might only be slow and will resume later. 
Dumping the stack trace for debugging purposes: [ 7435.238375] Pid: 8438, comm: ll_ost_io00_001 4.18.0rh8.10-debug #7 SMP Sat Jan 18 21:01:29 EST 2025 [ 7435.242416] Call Trace TBD: [ 7435.243392] [<0>] __cfs_fail_timeout_set+0x13b/0x240 [libcfs] [ 7435.246693] [<0>] ofd_punch_hdl+0x4ef/0xbb0 [ofd] [ 7435.249112] [<0>] tgt_handle_request0+0x137/0xaf0 [ptlrpc] [ 7435.252155] [<0>] tgt_request_handle+0x351/0x1c10 [ptlrpc] [ 7435.254552] [<0>] ptlrpc_server_handle_request+0x374/0x1320 [ptlrpc] [ 7435.257937] [<0>] ptlrpc_main+0xd2a/0x1450 [ptlrpc] [ 7435.260450] [<0>] kthread+0x1d7/0x210 [ 7435.262673] [<0>] ret_from_fork+0x1f/0x30 [ 7513.119100] LustreError: 8438:0:(ofd_dev.c:2062:ofd_punch_hdl()) cfs_fail_timeout id 236 awake [ 7513.123471] Lustre: ll_ost_io00_001: service thread pid 8438 completed after 120.010s. This likely indicates the system was overloaded (too many service threads, or not enough hardware resources). [ 7524.115752] Lustre: DEBUG MARKER: == recovery-small test 131: IO vs evict results to IO under staled lock ========================================================== 05:43:40 (1743500620) [ 7528.318840] Lustre: 84049:0:(genops.c:1678:obd_export_evict_by_uuid()) lustre-OST0000: evicting 5980e814-fcb1-4059-b5f1-60da4bd91f27 at adminstrative request [ 7528.325439] LustreError: 6544:0:(ldlm_lockd.c:2993:ldlm_bl_thread_exports()) cfs_fail_timeout id 31e sleeping for 4000ms [ 7531.136564] LustreError: 6544:0:(ldlm_lockd.c:2993:ldlm_bl_thread_exports()) cfs_fail_timeout interrupted [ 7538.051209] Lustre: DEBUG MARKER: == recovery-small test 133: don't fail on flock resend === 05:43:54 (1743500634) [ 7586.883400] Lustre: DEBUG MARKER: == recovery-small test 134: race between failover and search for reply data free slot ========================================================== 05:44:43 (1743500683) [ 7589.019329] Lustre: DEBUG MARKER: SKIP: recovery-small test_134 Need 2+ clients, have 1 [ 7591.255106] Lustre: DEBUG MARKER: == recovery-small test 135: 
DOM: open/create resend to return size ========================================================== 05:44:47 (1743500687) [ 7615.395857] Lustre: lustre-MDT0001: Client 5980e814-fcb1-4059-b5f1-60da4bd91f27 (at 192.168.206.53@tcp) reconnecting [ 7615.402082] Lustre: Skipped 5 previous similar messages [ 7615.411538] Lustre: 31167:0:(mdt_recovery.c:128:mdt_req_from_lrd()) @@@ restoring transno req@ffff94bad052bf80 x1828185266500608/t12884901909(0) o101->5980e814-fcb1-4059-b5f1-60da4bd91f27@192.168.206.53@tcp:370/0 lens 648/3488 e 0 to 0 dl 1743500730 ref 1 fl Interpret:/202/0 rc 0/0 job:'openfile.0' uid:0 gid:0 [ 7624.392922] Lustre: DEBUG MARKER: SKIP: recovery-small test_136 skipping excluded test 136 [ 7626.694691] Lustre: DEBUG MARKER: == recovery-small test 137: late resend must be skipped if already applied ========================================================== 05:45:23 (1743500723) [ 7629.094101] LustreError: 9962:0:(mdt_reint.c:917:mdt_reint_setattr()) cfs_race id 525 sleeping [ 7634.399485] LustreError: 9962:0:(mdt_reint.c:917:mdt_reint_setattr()) cfs_fail_race id 525 awake: rc=0 [ 7634.447518] LustreError: 16003:0:(mdt_reint.c:917:mdt_reint_setattr()) cfs_fail_race id 525 waking [ 7652.004073] Lustre: DEBUG MARKER: == recovery-small test 138: Umount MDT during recovery === 05:45:48 (1743500748) [ 7656.220803] Lustre: Failing over lustre-MDT0000 [ 7656.233264] LustreError: 85340:0:(lod_dev.c:1110:lod_process_config()) cfs_fail_timeout id 724 sleeping for 10000ms [ 7657.442358] LustreError: lustre-MDT0000-osp-MDT0001: operation mds_statfs to node 0@lo failed: rc = -107 [ 7657.449312] LustreError: Skipped 1 previous similar message [ 7657.451316] Lustre: lustre-MDT0000-osp-MDT0001: Connection to lustre-MDT0000 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete [ 7657.459351] Lustre: Skipped 14 previous similar messages [ 7657.466564] Lustre: lustre-MDT0000: Not available for connect from 0@lo (stopping) [ 
7658.472760] Lustre: lustre-MDT0000: Not available for connect from 0@lo (stopping) [ 7658.475790] Lustre: Skipped 3 previous similar messages [ 7663.586375] Lustre: lustre-MDT0000: Not available for connect from 0@lo (stopping) [ 7663.589077] Lustre: Skipped 2 previous similar messages [ 7666.336401] LustreError: 85340:0:(lod_dev.c:1110:lod_process_config()) cfs_fail_timeout id 724 awake [ 7666.502886] Lustre: server umount lustre-MDT0000 complete [ 7668.704664] LustreError: 16003:0:(ldlm_lib.c:1095:target_handle_connect()) lustre-MDT0000: not available for connect from 0@lo (no target). If you are running an HA pair check that the target is mounted on the other server. [ 7668.712557] LustreError: 16003:0:(ldlm_lib.c:1095:target_handle_connect()) Skipped 29 previous similar messages [ 7683.909055] LDISKFS-fs (dm-0): mounted filesystem with ordered data mode. Opts: user_xattr,errors=remount-ro,no_mbcache,nodelalloc [ 7684.030768] LustreError: MGC192.168.206.153@tcp: Connection to MGS (at 0@lo) was lost; in progress operations using this service will fail [ 7684.035964] LustreError: Skipped 1 previous similar message [ 7684.230575] Lustre: lustre-MDT0000: Imperative Recovery not enabled, recovery window 60-180 [ 7684.233752] Lustre: Skipped 1 previous similar message [ 7684.274796] Lustre: lustre-MDT0000: in recovery but waiting for the first client to connect [ 7684.277368] Lustre: Skipped 4 previous similar messages [ 7687.264755] Lustre: DEBUG MARKER: oleg653-server.virtnet: executing set_default_debug -1 all [ 7689.359169] LustreError: 85797:0:(lod_dev.c:456:lod_sub_recovery_thread()) cfs_fail_timeout id 724 awake [ 7689.709769] Lustre: lustre-MDT0000-lwp-OST0001: Connection restored to 192.168.206.153@tcp (at 0@lo) [ 7689.713772] Lustre: Skipped 16 previous similar messages [ 7724.946948] LustreError: 85797:0:(lod_dev.c:456:lod_sub_recovery_thread()) cfs_fail_timeout id 724 awake [ 7724.950438] LustreError: 85797:0:(lod_dev.c:456:lod_sub_recovery_thread()) 
Skipped 6 previous similar messages [ 7746.036244] Lustre: Failing over lustre-MDT0000 [ 7746.527219] Lustre: 85798:0:(ldlm_lib.c:2289:target_recovery_overseer()) recovery is aborted, evict exports in recovery [ 7746.531170] Lustre: 85798:0:(ldlm_lib.c:2289:target_recovery_overseer()) Skipped 1 previous similar message [ 7750.263140] LustreError: 85797:0:(lod_dev.c:506:lod_sub_recovery_thread()) lustre-MDT0001-osp-MDT0000: get update log duration 66, retries 12, failed: rc = -5 [ 7750.277745] Lustre: 85798:0:(ldlm_lib.c:2289:target_recovery_overseer()) recovery is aborted, evict exports in recovery [ 7751.154634] Lustre: lustre-MDT0000: Not available for connect from 0@lo (stopping) [ 7751.161076] Lustre: Skipped 3 previous similar messages [ 7756.213286] Lustre: server umount lustre-MDT0000 complete [ 7764.588909] LDISKFS-fs (dm-0): mounted filesystem with ordered data mode. Opts: user_xattr,errors=remount-ro,no_mbcache,nodelalloc [ 7768.195198] Lustre: DEBUG MARKER: oleg653-server.virtnet: executing set_default_debug -1 all [ 7770.086557] Lustre: lustre-MDT0000: Will be in recovery for at least 1:00, or until 1 client reconnects [ 7770.091062] Lustre: Skipped 1 previous similar message [ 7770.095956] Lustre: lustre-MDT0000-lwp-OST0001: Connection restored to 192.168.206.153@tcp (at 0@lo) [ 7770.106504] Lustre: Skipped 2 previous similar messages [ 7770.126246] Lustre: lustre-MDT0000: Recovery over after 0:01, of 1 clients 1 recovered and 0 were evicted. 
[ 7770.131697] Lustre: Skipped 1 previous similar message [ 7770.156917] Lustre: lustre-OST0000: new connection from lustre-MDT0000-mdtlov (cleaning up unused objects from 0x280000401:6224 to 0x280000401:6241) [ 7770.157142] Lustre: lustre-OST0001: new connection from lustre-MDT0000-mdtlov (cleaning up unused objects from 0x2c0000401:6222 to 0x2c0000401:6241) [ 7779.046070] Lustre: DEBUG MARKER: == recovery-small test 139: corrupted catid won't cause crash ========================================================== 05:47:55 (1743500875) [ 7780.846815] Lustre: Failing over lustre-MDT0000 [ 7781.059607] Lustre: server umount lustre-MDT0000 complete [ 7788.759849] LDISKFS-fs (dm-0): mounted filesystem with ordered data mode. Opts: user_xattr,errors=remount-ro,no_mbcache,nodelalloc [ 7789.056982] Lustre: *** cfs_fail_loc=2106, val=104*** [ 7789.059382] LustreError: 87919:0:(osp_sync.c:1565:osp_sync_llog_init()) lustre-OST0000-osc-MDT0000: the catid [0x0:0x68:0x0] for init llog 0 is bad [ 7792.661075] Lustre: DEBUG MARKER: oleg653-server.virtnet: executing set_default_debug -1 all [ 7794.706113] Lustre: lustre-OST0000: new connection from lustre-MDT0000-mdtlov (cleaning up unused objects from 0x280000401:6224 to 0x280000401:6273) [ 7794.710244] Lustre: lustre-OST0001: new connection from lustre-MDT0000-mdtlov (cleaning up unused objects from 0x2c0000401:6222 to 0x2c0000401:6273) [ 7802.545156] Lustre: DEBUG MARKER: == recovery-small test 140a: local mount is flagged properly ========================================================== 05:48:18 (1743500898) [ 7806.018116] Lustre: lustre-MDT0000: local client 7beda3ec-29dc-44ef-8308-9860f8b4ca4f w/o recovery [ 7806.028438] Lustre: Skipped 1 previous similar message [ 7806.083227] Lustre: Mounted lustre-client [ 7808.478783] Lustre: DEBUG MARKER: oleg653-server.virtnet: executing set_default_debug -1 all [ 7812.528895] Lustre: Unmounted lustre-client [ 7815.633272] Lustre: Mounted lustre-client [ 7818.152599] Lustre: DEBUG 
MARKER: oleg653-server.virtnet: executing set_default_debug -1 all [ 7822.185288] Lustre: Unmounted lustre-client [ 7832.647238] Lustre: DEBUG MARKER: == recovery-small test 140b: local mount is excluded from recovery ========================================================== 05:48:48 (1743500928) [ 7835.905572] Lustre: lustre-MDT0000: local client 572b0ab8-54ed-4850-8322-a977b54ab0a2 w/o recovery [ 7835.929878] Lustre: Mounted lustre-client [ 7838.368450] Lustre: DEBUG MARKER: oleg653-server.virtnet: executing set_default_debug -1 all [ 7844.137361] Lustre: DEBUG MARKER: mds1 REPLAY BARRIER on lustre-MDT0000 [ 7846.598292] Lustre: Unmounted lustre-client [ 7848.771433] Lustre: Failing over lustre-MDT0000 [ 7848.944113] Lustre: server umount lustre-MDT0000 complete [ 7867.344501] Lustre: 3685:0:(client.c:2346:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1743500949/real 1743500949] req@ffff94bad0528bc0 x1828185288276096/t0(0) o400->MGC192.168.206.153@tcp@0@lo:26/25 lens 224/224 e 0 to 1 dl 1743500965 ref 1 fl Rpc:XNQr/200/ffffffff rc 0/-1 job:'kworker.0' uid:0 gid:0 [ 7870.209333] LDISKFS-fs (dm-0): recovery complete [ 7870.215523] LDISKFS-fs (dm-0): mounted filesystem with ordered data mode. 
Opts: user_xattr,errors=remount-ro,no_mbcache,nodelalloc [ 7877.600310] LustreError: 3681:0:(client.c:1292:ptlrpc_import_delay_req()) @@@ invalidate in flight req@ffff94bad138dc40 x1828185288284032/t0(0) o250->MGC192.168.206.153@tcp@0@lo:26/25 lens 520/544 e 0 to 0 dl 0 ref 1 fl Rpc:NQU/200/ffffffff rc 0/-1 job:'kworker.0' uid:0 gid:0 [ 7881.450589] Lustre: DEBUG MARKER: oleg653-server.virtnet: executing set_default_debug -1 all [ 7883.358257] Lustre: lustre-OST0000: new connection from lustre-MDT0000-mdtlov (cleaning up unused objects from 0x280000401:6224 to 0x280000401:6305) [ 7883.364424] Lustre: lustre-OST0001: new connection from lustre-MDT0000-mdtlov (cleaning up unused objects from 0x2c0000401:6222 to 0x2c0000401:6305) [ 7889.324923] Lustre: DEBUG MARKER: oleg653-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid [ 7891.756678] Lustre: DEBUG MARKER: mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec [ 7904.568032] Lustre: DEBUG MARKER: == recovery-small test 141: do not lose locks on MGS restart ========================================================== 05:50:00 (1743501000) [ 7908.511284] Lustre: DEBUG MARKER: SKIP: recovery-small test_141 cannot run in local mode or from build tree [ 7910.993478] Lustre: DEBUG MARKER: == recovery-small test 142: orphan name stub can be cleaned up in startup ========================================================== 05:50:07 (1743501007) [ 7912.096521] Lustre: *** cfs_fail_loc=165, val=0*** [ 7913.677382] Lustre: Failing over lustre-MDT0000 [ 7913.911707] Lustre: server umount lustre-MDT0000 complete [ 7921.493414] LDISKFS-fs (dm-0): mounted filesystem with ordered data mode. 
Opts: user_xattr,errors=remount-ro,no_mbcache,nodelalloc [ 7925.283930] Lustre: DEBUG MARKER: oleg653-server.virtnet: executing set_default_debug -1 all [ 7927.277147] Lustre: lustre-MDT0000-lwp-OST0001: Connection restored to 192.168.206.153@tcp (at 0@lo) [ 7927.281421] Lustre: Skipped 11 previous similar messages [ 7927.328790] Lustre: lustre-OST0000: new connection from lustre-MDT0000-mdtlov (cleaning up unused objects from 0x280000401:6224 to 0x280000401:6337) [ 7927.328844] Lustre: lustre-OST0001: new connection from lustre-MDT0000-mdtlov (cleaning up unused objects from 0x2c0000401:6307 to 0x2c0000401:6337) [ 7927.330775] LustreError: 93487:0:(osd_handler.c:273:osd_idc_find_or_init()) can't lookup: rc = -2 [ 7936.132731] Lustre: DEBUG MARKER: == recovery-small test 143: orphan cleanup thread shouldn't be blocked even delete failed ========================================================== 05:50:32 (1743501032) [ 7937.976394] Lustre: Failing over lustre-MDT0000 [ 7938.205492] Lustre: server umount lustre-MDT0000 complete [ 7943.646160] LDISKFS-fs (dm-0): mounted filesystem with ordered data mode. Opts: (null) [ 7951.500161] LDISKFS-fs (dm-0): mounted filesystem with ordered data mode. 
Opts: user_xattr,errors=remount-ro,no_mbcache,nodelalloc [ 7955.202941] Lustre: DEBUG MARKER: oleg653-server.virtnet: executing set_default_debug -1 all [ 7957.042897] Lustre: lustre-OST0000: new connection from lustre-MDT0000-mdtlov (cleaning up unused objects from 0x280000401:6224 to 0x280000401:6369) [ 7957.043341] Lustre: lustre-OST0001: new connection from lustre-MDT0000-mdtlov (cleaning up unused objects from 0x2c0000401:6307 to 0x2c0000401:6369) [ 7959.769293] Lustre: DEBUG MARKER: oleg653-server.virtnet: executing _wait_recovery_complete *.lustre-MDT0000.recovery_status 1475 [ 7968.826128] Lustre: DEBUG MARKER: == recovery-small test 144a: MDT failover should stop precreation threads ========================================================== 05:51:05 (1743501065) [ 7975.819955] Lustre: Failing over lustre-OST0000 [ 7976.312754] Lustre: server umount lustre-OST0000 complete [ 7994.442408] LDISKFS-fs (dm-2): mounted filesystem with ordered data mode. Opts: user_xattr,acl,no_mbcache,nodelalloc [ 7994.509622] Lustre: 96056:0:(mgc_request_server.c:553:mgc_llog_local_copy()) MGC192.168.206.153@tcp: no remote llog for lustre-sptlrpc, check MGS config [ 7994.513435] Lustre: 96056:0:(mgc_request_server.c:553:mgc_llog_local_copy()) Skipped 1 previous similar message [ 8002.091906] Lustre: DEBUG MARKER: oleg653-server.virtnet: executing set_default_debug -1 all [ 8025.817017] Lustre: DEBUG MARKER: oleg653-client.virtnet: executing wait_import_state_mount (FULL|IDLE) osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid [ 8035.541471] Lustre: DEBUG MARKER: osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid in FULL state after 0 sec [ 8105.088733] Lustre: Failing over lustre-MDT0000 [ 8105.443374] Lustre: lustre-MDT0000: Not available for connect from 0@lo (stopping) [ 8105.941169] Lustre: server umount lustre-MDT0000 complete [ 8123.406752] LDISKFS-fs (dm-0): mounted filesystem with ordered data mode. 
Opts: user_xattr,errors=remount-ro,no_mbcache,nodelalloc [ 8126.434339] Lustre: DEBUG MARKER: oleg653-server.virtnet: executing set_default_debug -1 all [ 8129.038523] Lustre: lustre-OST0000: new connection from lustre-MDT0000-mdtlov (cleaning up unused objects from 0x280000401:31306 to 0x280000401:31329) [ 8129.040287] Lustre: lustre-OST0001: new connection from lustre-MDT0000-mdtlov (cleaning up unused objects from 0x2c0000401:31434 to 0x2c0000401:31457) [ 8133.282754] Lustre: DEBUG MARKER: oleg653-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid [ 8135.484854] Lustre: DEBUG MARKER: mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec [ 8139.113866] Lustre: Failing over lustre-MDT0000 [ 8139.246252] Lustre: lustre-MDT0000: Not available for connect from 0@lo (stopping) [ 8139.424246] Lustre: server umount lustre-MDT0000 complete [ 8155.895675] LDISKFS-fs (dm-0): mounted filesystem with ordered data mode. Opts: user_xattr,errors=remount-ro,no_mbcache,nodelalloc [ 8158.920933] Lustre: DEBUG MARKER: oleg653-server.virtnet: executing set_default_debug -1 all [ 8161.280983] Lustre: lustre-OST0000: new connection from lustre-MDT0000-mdtlov (cleaning up unused objects from 0x280000401:31306 to 0x280000401:31361) [ 8161.281422] Lustre: lustre-OST0001: new connection from lustre-MDT0000-mdtlov (cleaning up unused objects from 0x2c0000401:31434 to 0x2c0000401:31489) [ 8165.498103] Lustre: DEBUG MARKER: oleg653-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid [ 8167.658652] Lustre: DEBUG MARKER: mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec [ 8177.151074] Lustre: 31167:0:(osd_handler.c:2068:osd_trans_start()) lustre-MDT0000: credits 30372 > trans_max 3200 [ 8177.154663] Lustre: 31167:0:(osd_handler.c:1967:osd_trans_dump_creds()) create: 4/16/0, destroy: 1/4/0 [ 8177.158052] Lustre: 31167:0:(osd_handler.c:1974:osd_trans_dump_creds()) 
attr_set: 2003/2003/0, xattr_set: 3004/28148/0 [ 8177.162652] Lustre: 31167:0:(osd_handler.c:1984:osd_trans_dump_creds()) write: 20/109/0, punch: 0/0/0, quota 0/0/0 [ 8177.166638] Lustre: 31167:0:(osd_handler.c:1991:osd_trans_dump_creds()) insert: 5/84/0, delete: 2/5/0 [ 8177.169782] Lustre: 31167:0:(osd_handler.c:1998:osd_trans_dump_creds()) ref_add: 1/1/0, ref_del: 2/2/0 [ 8177.173368] CPU: 1 PID: 31167 Comm: mdt00_006 Kdump: loaded Tainted: G W O -------- - - 4.18.0rh8.10-debug #7 [ 8177.177126] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-1.fc39 04/01/2014 [ 8177.180060] Call Trace: [ 8177.180899] ? dump_stack+0xbb/0x10e [ 8177.182240] ? osd_trans_start+0x87f/0x8f0 [osd_ldiskfs] [ 8177.184237] ? top_trans_start+0x5a2/0xda0 [ptlrpc] [ 8177.186272] ? lod_trans_start+0x109/0x4c0 [lod] [ 8177.187822] ? mdd_trans_start+0x18/0x30 [mdd] [ 8177.189288] ? mdd_unlink+0x778/0x1350 [mdd] [ 8177.190982] ? mdt_reint_unlink+0x1588/0x1a90 [mdt] [ 8177.192665] ? mdt_reint_rec+0x139/0x2c0 [mdt] [ 8177.194372] ? mdt_reint_internal+0x6a0/0xbf0 [mdt] [ 8177.196084] ? mdt_reint+0x163/0x190 [mdt] [ 8177.197592] ? tgt_handle_request0+0x137/0xaf0 [ptlrpc] [ 8177.199744] ? tgt_request_handle+0x351/0x1c10 [ptlrpc] [ 8177.201745] ? ptlrpc_server_handle_request+0x374/0x1320 [ptlrpc] [ 8177.204135] ? lprocfs_counter_add+0x14d/0x220 [obdclass] [ 8177.206264] ? ptlrpc_main+0xd2a/0x1450 [ptlrpc] [ 8177.208030] ? ptlrpc_wait_event+0x9c0/0x9c0 [ptlrpc] [ 8177.209923] ? kthread+0x1d7/0x210 [ 8177.211182] ? set_kthread_struct+0x70/0x70 [ 8177.212539] ? 
ret_from_fork+0x1f/0x30 [ 8256.219511] Lustre: DEBUG MARKER: == recovery-small test 144b: orphan cleanup shouldn't be blocked for no objects+failover situation ========================================================== 05:55:51 (1743501351) [ 8263.178121] Lustre: Failing over lustre-OST0000 [ 8263.227314] LustreError: lustre-OST0000-osc-MDT0000: operation ost_create to node 0@lo failed: rc = -19 [ 8263.230540] LustreError: Skipped 8 previous similar messages [ 8263.232640] Lustre: lustre-OST0000-osc-MDT0000: Connection to lustre-OST0000 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete [ 8263.238141] Lustre: Skipped 32 previous similar messages [ 8263.698956] Lustre: server umount lustre-OST0000 complete [ 8272.815400] LustreError: 28537:0:(ldlm_lib.c:1095:target_handle_connect()) lustre-OST0000: not available for connect from 192.168.206.53@tcp (no target). If you are running an HA pair check that the target is mounted on the other server. [ 8272.823347] LustreError: 28537:0:(ldlm_lib.c:1095:target_handle_connect()) Skipped 125 previous similar messages [ 8280.936272] LDISKFS-fs (dm-2): mounted filesystem with ordered data mode. 
Opts: user_xattr,acl,no_mbcache,nodelalloc [ 8281.015689] Lustre: 99700:0:(mgc_request_server.c:553:mgc_llog_local_copy()) MGC192.168.206.153@tcp: no remote llog for lustre-sptlrpc, check MGS config [ 8283.114918] Lustre: lustre-OST0000-osc-MDT0000: Connection restored to 192.168.206.153@tcp (at 0@lo) [ 8283.118672] Lustre: Skipped 17 previous similar messages [ 8288.520948] Lustre: DEBUG MARKER: oleg653-server.virtnet: executing set_default_debug -1 all [ 8307.534525] Lustre: DEBUG MARKER: oleg653-client.virtnet: executing wait_import_state_mount (FULL|IDLE) osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid [ 8314.978369] Lustre: DEBUG MARKER: osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid in FULL state after 0 sec [ 8461.821367] Lustre: DEBUG MARKER: == recovery-small test 144c: reconnection during orphan cleanup shouldn't lose LAST_ID synchronization ========================================================== 05:59:16 (1743501556) [ 8517.877463] Lustre: Failing over lustre-MDT0000 [ 8519.589776] Lustre: lustre-MDT0000: Not available for connect from 192.168.206.53@tcp (stopping) [ 8519.592996] Lustre: Skipped 3 previous similar messages [ 8520.161921] Lustre: server umount lustre-MDT0000 complete [ 8526.013056] LDISKFS-fs (dm-0): mounted filesystem with ordered data mode. 
Opts: user_xattr,errors=remount-ro,no_mbcache,nodelalloc [ 8526.107888] LustreError: MGC192.168.206.153@tcp: Connection to MGS (at 0@lo) was lost; in progress operations using this service will fail [ 8526.112387] LustreError: Skipped 7 previous similar messages [ 8526.302657] Lustre: lustre-MDT0000: Imperative Recovery not enabled, recovery window 60-180 [ 8526.305654] Lustre: Skipped 9 previous similar messages [ 8526.344094] Lustre: lustre-MDT0000: in recovery but waiting for the first client to connect [ 8526.346734] Lustre: Skipped 11 previous similar messages [ 8528.791774] Lustre: DEBUG MARKER: oleg653-server.virtnet: executing set_default_debug -1 all [ 8529.825136] Lustre: lustre-MDT0000: Will be in recovery for at least 1:00, or until 2 clients reconnect [ 8529.828595] Lustre: Skipped 8 previous similar messages [ 8532.361707] Lustre: DEBUG MARKER: oleg653-server.virtnet: executing _wait_recovery_complete *.lustre-MDT0000.recovery_status 1475 [ 8533.076393] Lustre: lustre-MDT0000: Recovery over after 0:04, of 2 clients 2 recovered and 0 were evicted. 
[ 8533.080223] Lustre: Skipped 8 previous similar messages [ 8533.095724] LustreError: 100157:0:(ofd_dev.c:1504:ofd_create_hdl()) cfs_fail_timeout id 254 sleeping for 5000ms [ 8533.099595] LustreError: 100157:0:(ofd_dev.c:1504:ofd_create_hdl()) Skipped 15 previous similar messages [ 8538.191210] LustreError: 27754:0:(ofd_dev.c:1504:ofd_create_hdl()) cfs_fail_timeout id 254 awake [ 8538.194151] LustreError: 27754:0:(ofd_dev.c:1504:ofd_create_hdl()) Skipped 6 previous similar messages [ 8538.197081] Lustre: lustre-OST0000: new connection from lustre-MDT0000-mdtlov (cleaning up unused objects from 0x280000401:65442 to 0x280000401:65473) [ 8538.199135] Lustre: lustre-OST0001: new connection from lustre-MDT0000-mdtlov (cleaning up unused objects from 0x2c0000401:56410 to 0x2c0000401:56449) [ 8539.026375] Lustre: lustre-OST0000: Client lustre-MDT0000-mdtlov_UUID (at 0@lo) reconnecting [ 8539.028834] Lustre: Skipped 1 previous similar message [ 8559.623124] Lustre: DEBUG MARKER: == recovery-small test 145: connect mdtlovs and process update logs after recovery expire ========================================================== 06:00:56 (1743501656) [ 8561.655378] Lustre: DEBUG MARKER: SKIP: recovery-small test_145 needs >= 3 MDTs [ 8563.620237] Lustre: DEBUG MARKER: == recovery-small test 146: test eviction is counted properly ========================================================== 06:01:00 (1743501660) [ 8565.192070] Lustre: 102485:0:(genops.c:1678:obd_export_evict_by_uuid()) lustre-MDT0000: evicting b8661d80-1707-4d14-ac38-c8e51562e581 at adminstrative request [ 8572.665145] Lustre: DEBUG MARKER: == recovery-small test 147: Check client reconnect ======= 06:01:09 (1743501669) [ 8574.208081] Lustre: 102824:0:(genops.c:1678:obd_export_evict_by_uuid()) lustre-OST0000: evicting b8661d80-1707-4d14-ac38-c8e51562e581 at adminstrative request [ 8574.280953] Lustre: *** cfs_fail_loc=225, val=0*** [ 8589.729629] Lustre: *** cfs_fail_loc=225, val=0*** [ 8609.183687] Lustre: *** 
cfs_fail_loc=225, val=0*** [ 8664.480189] Lustre: *** cfs_fail_loc=225, val=0*** [ 8664.481756] Lustre: Skipped 1 previous similar message [ 8733.664335] Lustre: lustre-OST0000: haven't heard from client b8661d80-1707-4d14-ac38-c8e51562e581 (at 192.168.206.53@tcp) in 160 seconds. I think it's dead, and I am evicting it. exp ffff94badd01e800, cur 1743501831 expire 1743501801 last 1743501671 [ 8741.856955] Lustre: DEBUG MARKER: == recovery-small test 148: data corruption through resend ========================================================== 06:03:58 (1743501838) [ 8746.262745] LustreError: 51787:0:(tgt_handler.c:2868:tgt_brw_write()) cfs_fail_timeout id 227 sleeping for 27000ms [ 8758.241214] Lustre: lustre-MDT0000: haven't heard from client lustre-MDT0000-lwp-OST0001_UUID (at 0@lo) in 31 seconds. I think it's dead, and I am evicting it. exp ffff94bace2db800, cur 1743501855 expire 1743501825 last 1743501824 [ 8760.226145] Lustre: MGS: haven't heard from client 39195cf1-576a-4061-a1d9-feecb89eb989 (at 0@lo) in 33 seconds. I think it's dead, and I am evicting it. 
exp ffff94bbd0e4a000, cur 1743501857 expire 1743501827 last 1743501824 [ 8760.234187] Lustre: Skipped 1 previous similar message [ 8773.303134] LustreError: 51787:0:(tgt_handler.c:2868:tgt_brw_write()) cfs_fail_timeout id 227 awake [ 8773.306328] LustreError: 51787:0:(tgt_handler.c:2868:tgt_brw_write()) Skipped 1 previous similar message [ 8774.640206] Lustre: Evicted from MGS (at 192.168.206.153@tcp) after server handle changed from 0x623b88e7ff46bc68 to 0x623b88e7ff46c3a0 [ 8785.325716] Lustre: DEBUG MARKER: == recovery-small test 149: skip orphan removal at umount ========================================================== 06:04:41 (1743501881) [ 8787.424776] Lustre: lustre-MDT0001: Not available for connect from 0@lo (stopping) [ 8793.523172] Lustre: server umount lustre-MDT0001 complete [ 8797.364052] Lustre: server umount lustre-MDT0000 complete [ 8802.313219] LDISKFS-fs (dm-0): mounted filesystem with ordered data mode. Opts: user_xattr,errors=remount-ro,no_mbcache,nodelalloc [ 8802.654147] Lustre: lustre-OST0000: new connection from lustre-MDT0000-mdtlov (cleaning up unused objects from 0x280000401:65477 to 0x280000401:65505) [ 8802.654465] Lustre: lustre-OST0001: new connection from lustre-MDT0000-mdtlov (cleaning up unused objects from 0x2c0000401:56451 to 0x2c0000401:56481) [ 8804.873529] Lustre: DEBUG MARKER: oleg653-server.virtnet: executing set_default_debug -1 all [ 8810.106975] LDISKFS-fs (dm-1): mounted filesystem with ordered data mode. 
Opts: user_xattr,acl,no_mbcache,nodelalloc [ 8810.167048] Lustre: 105223:0:(mgc_request_server.c:553:mgc_llog_local_copy()) MGC192.168.206.153@tcp: no remote llog for lustre-sptlrpc, check MGS config [ 8810.401183] Lustre: lustre-OST0000: new connection from lustre-MDT0001-mdtlov (cleaning up unused objects from 0x280000400:8481 to 0x280000400:8801) [ 8810.401598] Lustre: lustre-OST0001: new connection from lustre-MDT0001-mdtlov (cleaning up unused objects from 0x2c0000400:9632 to 0x2c0000400:9729) [ 8812.690842] Lustre: DEBUG MARKER: oleg653-server.virtnet: executing set_default_debug -1 all [ 8828.649319] Lustre: DEBUG MARKER: == recovery-small test 150: statfs when MDT0 offline with lazystatfs option ========================================================== 06:05:25 (1743501925) [ 8830.276511] Lustre: Failing over lustre-MDT0000 [ 8830.449844] Lustre: server umount lustre-MDT0000 complete [ 8840.804419] LDISKFS-fs (dm-0): mounted filesystem with ordered data mode. Opts: user_xattr,errors=remount-ro,no_mbcache,nodelalloc [ 8843.454994] Lustre: DEBUG MARKER: oleg653-server.virtnet: executing set_default_debug -1 all [ 8846.375842] Lustre: lustre-OST0000: new connection from lustre-MDT0000-mdtlov (cleaning up unused objects from 0x280000401:65477 to 0x280000401:65536) [ 8846.376132] Lustre: lustre-OST0001: new connection from lustre-MDT0000-mdtlov (cleaning up unused objects from 0x2c0000401:56451 to 0x2c0000401:56513) [ 8846.453553] Lustre: lustre-OST0000-osc-MDT0000: update sequence from 0x280000401 to 0x280000402 [ 8846.802786] Lustre: DEBUG MARKER: oleg653-server.virtnet: executing _wait_recovery_complete *.lustre-MDT0000.recovery_status 1475 [ 8850.207694] Lustre: DEBUG MARKER: == recovery-small test 152: QoS object allocation could be awakened in case of OST failover ========================================================== 06:05:46 (1743501946) [ 8852.667195] LustreError: 104842:0:(lod_qos.c:789:lod_ost_alloc_rr()) cfs_fail_timeout id 173 sleeping for 
20000ms [ 8860.639138] Lustre: 3682:0:(client.c:2346:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1743501897/real 1743501897] req@ffff94bad138dc40 x1828185301935744/t0(0) o400->lustre-MDT0000-lwp-OST0000@0@lo:12/10 lens 224/224 e 0 to 1 dl 1743501958 ref 1 fl Rpc:XNQr/200/ffffffff rc 0/-1 job:'kworker.0' uid:0 gid:0 [ 8860.639138] Lustre: 3685:0:(client.c:2346:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1743501897/real 1743501897] req@ffff94bad138c540 x1828185301935488/t0(0) o400->lustre-MDT0000-lwp-OST0001@0@lo:12/10 lens 224/224 e 0 to 1 dl 1743501958 ref 1 fl Rpc:XNQr/200/ffffffff rc 0/-1 job:'kworker.0' uid:0 gid:0 [ 8872.743123] LustreError: 104842:0:(lod_qos.c:789:lod_ost_alloc_rr()) cfs_fail_timeout id 173 awake [ 8880.708303] Lustre: 104510:0:(osd_handler.c:2068:osd_trans_start()) lustre-MDT0000: credits 15372 > trans_max 3200 [ 8880.711488] Lustre: 104510:0:(osd_handler.c:1967:osd_trans_dump_creds()) create: 4/16/0, destroy: 1/4/0 [ 8880.714167] Lustre: 104510:0:(osd_handler.c:1974:osd_trans_dump_creds()) attr_set: 1003/1003/0, xattr_set: 1504/14148/0 [ 8880.717575] Lustre: 104510:0:(osd_handler.c:1984:osd_trans_dump_creds()) write: 20/109/0, punch: 0/0/0, quota 0/0/0 [ 8880.720148] Lustre: 104510:0:(osd_handler.c:1991:osd_trans_dump_creds()) insert: 5/84/0, delete: 2/5/0 [ 8880.723168] Lustre: 104510:0:(osd_handler.c:1998:osd_trans_dump_creds()) ref_add: 1/1/0, ref_del: 2/2/0 [ 8880.726141] CPU: 2 PID: 104510 Comm: mdt00_000 Kdump: loaded Tainted: G W O -------- - - 4.18.0rh8.10-debug #7 [ 8880.729878] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-1.fc39 04/01/2014 [ 8880.732756] Call Trace: [ 8880.733558] ? dump_stack+0xbb/0x10e [ 8880.734820] ? osd_trans_start+0x87f/0x8f0 [osd_ldiskfs] [ 8880.736549] ? top_trans_start+0x5a2/0xda0 [ptlrpc] [ 8880.738388] ? lod_trans_start+0x109/0x4c0 [lod] [ 8880.739875] ? mdd_trans_start+0x18/0x30 [mdd] [ 8880.741361] ? 
mdd_unlink+0x778/0x1350 [mdd] [ 8880.742806] ? mdt_reint_unlink+0x1588/0x1a90 [mdt] [ 8880.744464] ? mdt_reint_rec+0x139/0x2c0 [mdt] [ 8880.745989] ? mdt_reint_internal+0x6a0/0xbf0 [mdt] [ 8880.747663] ? mdt_reint+0x163/0x190 [mdt] [ 8880.749090] ? tgt_handle_request0+0x137/0xaf0 [ptlrpc] [ 8880.751084] ? tgt_request_handle+0x351/0x1c10 [ptlrpc] [ 8880.753045] ? ptlrpc_server_handle_request+0x374/0x1320 [ptlrpc] [ 8880.755192] ? lprocfs_counter_add+0x14d/0x220 [obdclass] [ 8880.757131] ? ptlrpc_main+0xd2a/0x1450 [ptlrpc] [ 8880.758914] ? ptlrpc_wait_event+0x9c0/0x9c0 [ptlrpc] [ 8880.760819] ? kthread+0x1d7/0x210 [ 8880.761908] ? set_kthread_struct+0x70/0x70 [ 8880.763262] ? ret_from_fork+0x1f/0x30 [ 8882.939725] Lustre: DEBUG MARKER: == recovery-small test 153: evict vs reconnect race ====== 06:06:19 (1743501979) [ 8887.268037] Lustre: *** cfs_fail_loc=174, val=0*** [ 8887.269793] Lustre: Skipped 3 previous similar messages [ 8908.509849] Lustre: Failing over lustre-MDT0000 [ 8908.629339] Lustre: server umount lustre-MDT0000 complete [ 8908.709840] LustreError: 107276:0:(ldlm_lib.c:1095:target_handle_connect()) lustre-MDT0000: not available for connect from 192.168.206.53@tcp (no target). If you are running an HA pair check that the target is mounted on the other server. [ 8908.715079] LustreError: 107276:0:(ldlm_lib.c:1095:target_handle_connect()) Skipped 24 previous similar messages [ 8912.870225] LustreError: lustre-MDT0000-osp-MDT0001: operation mds_statfs to node 0@lo failed: rc = -107 [ 8912.870538] Lustre: lustre-MDT0000-lwp-OST0000: Connection to lustre-MDT0000 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete [ 8912.873474] LustreError: Skipped 4 previous similar messages [ 8912.880211] Lustre: Skipped 22 previous similar messages [ 8914.185373] LDISKFS-fs (dm-0): mounted filesystem with ordered data mode. 
Opts: user_xattr,errors=remount-ro,no_mbcache,nodelalloc [ 8916.805970] Lustre: DEBUG MARKER: oleg653-server.virtnet: executing set_default_debug -1 all [ 8919.529473] Lustre: lustre-MDT0000-lwp-OST0001: Connection restored to 192.168.206.153@tcp (at 0@lo) [ 8919.532942] Lustre: Skipped 19 previous similar messages [ 8919.568109] Lustre: lustre-OST0001: new connection from lustre-MDT0000-mdtlov (cleaning up unused objects from 0x2c0000401:56996 to 0x2c0000401:57025) [ 8919.569062] Lustre: lustre-OST0000: new connection from lustre-MDT0000-mdtlov (cleaning up unused objects from 0x280000402:523 to 0x280000402:545) [ 8919.997962] Lustre: DEBUG MARKER: oleg653-server.virtnet: executing _wait_recovery_complete *.lustre-MDT0000.recovery_status 1475 [ 8927.406851] Lustre: DEBUG MARKER: == recovery-small test 154a: corruption update llog can be skipped ========================================================== 06:07:04 (1743502024) [ 8929.013185] Lustre: Failing over lustre-MDT0001 [ 8929.119273] Lustre: server umount lustre-MDT0001 complete [ 8933.690281] LDISKFS-fs (dm-1): mounted filesystem with ordered data mode. Opts: (null) [ 8939.988359] LDISKFS-fs (dm-1): mounted filesystem with ordered data mode. Opts: user_xattr,acl,no_mbcache,nodelalloc [ 8940.063990] Lustre: 109897:0:(mgc_request_server.c:553:mgc_llog_local_copy()) MGC192.168.206.153@tcp: no remote llog for lustre-sptlrpc, check MGS config [ 8942.633932] Lustre: DEBUG MARKER: oleg653-server.virtnet: executing set_default_debug -1 all [ 8944.899974] Lustre: Failing over lustre-MDT0000 [ 8945.028425] Lustre: server umount lustre-MDT0000 complete [ 8950.043364] LDISKFS-fs (dm-0): mounted filesystem with ordered data mode. 
Opts: user_xattr,errors=remount-ro,no_mbcache,nodelalloc [ 8952.638926] Lustre: DEBUG MARKER: oleg653-server.virtnet: executing set_default_debug -1 all [ 8955.406435] LustreError: 110820:0:(llog_osd.c:249:llog_osd_read_header()) lustre-MDT0001-osp-MDT0000: bad log [0x240000407:0x1:0x0] header magic: 0x54bef243 (expected 0x10645539) [ 8955.408794] Lustre: lustre-OST0000: new connection from lustre-MDT0001-mdtlov (cleaning up unused objects from 0x280000400:8481 to 0x280000400:8833) [ 8955.409827] Lustre: lustre-OST0001: new connection from lustre-MDT0001-mdtlov (cleaning up unused objects from 0x2c0000400:9632 to 0x2c0000400:9761) [ 8955.412520] Lustre: 110820:0:(lod_sub_object.c:966:lod_sub_prep_llog()) lustre-MDT0000-mdtlov: renew invalid update log [0x240000407:0x1:0x0]: rc = -22 [ 8955.486185] Lustre: lustre-OST0001: new connection from lustre-MDT0000-mdtlov (cleaning up unused objects from 0x2c0000401:56996 to 0x2c0000401:57057) [ 8955.490969] Lustre: lustre-OST0000: new connection from lustre-MDT0000-mdtlov (cleaning up unused objects from 0x280000402:523 to 0x280000402:577) [ 8955.685816] Lustre: DEBUG MARKER: oleg653-server.virtnet: executing _wait_recovery_complete *.lustre-MDT0000.recovery_status 20 [ 8963.265344] Lustre: DEBUG MARKER: == recovery-small test 154b: restore update llog after failed recovery ========================================================== 06:07:39 (1743502059) [ 8964.734243] Lustre: Failing over lustre-MDT0000 [ 8964.853663] Lustre: server umount lustre-MDT0000 complete [ 8970.591940] LDISKFS-fs (dm-0): mounted filesystem with ordered data mode. 
Opts: user_xattr,errors=remount-ro,no_mbcache,nodelalloc [ 8970.723226] Lustre: lustre-MDT0000: Not available for connect from 0@lo (not set up) [ 8970.726642] Lustre: Skipped 8 previous similar messages [ 8970.867326] LustreError: 112196:0:(lod_dev.c:456:lod_sub_recovery_thread()) cfs_fail_timeout id 724 sleeping for 5000ms [ 8970.884334] Lustre: lustre-MDT0000: Aborting client recovery [ 8970.885840] LustreError: 112164:0:(ldlm_lib.c:2907:target_stop_recovery_thread()) lustre-MDT0000: Aborting recovery [ 8970.888636] Lustre: 112197:0:(ldlm_lib.c:2289:target_recovery_overseer()) recovery is aborted, evict exports in recovery [ 8970.891502] Lustre: 112197:0:(ldlm_lib.c:2289:target_recovery_overseer()) Skipped 1 previous similar message [ 8975.967108] LustreError: 112196:0:(lod_dev.c:456:lod_sub_recovery_thread()) cfs_fail_timeout id 724 awake [ 8975.970452] LustreError: 112196:0:(lod_dev.c:506:lod_sub_recovery_thread()) lustre-MDT0001-osp-MDT0000: get update log duration 5, retries 0, failed: rc = -5 [ 8975.975936] Lustre: 112197:0:(genops.c:1508:class_disconnect_stale_exports()) lustre-MDT0000: disconnect stale client b8661d80-1707-4d14-ac38-c8e51562e581@ [ 8975.979827] Lustre: lustre-MDT0000: disconnecting 2 stale clients [ 8975.983214] Lustre: 112197:0:(ldlm_lib.c:2289:target_recovery_overseer()) recovery is aborted, evict exports in recovery [ 8975.988507] Lustre: lustre-MDT0000-osd: cancel update llog [0x200009870:0x1:0x0] [ 8976.019103] Lustre: lustre-OST0001: new connection from lustre-MDT0000-mdtlov (cleaning up unused objects from 0x2c0000401:56996 to 0x2c0000401:57089) [ 8976.019233] Lustre: lustre-OST0000: new connection from lustre-MDT0000-mdtlov (cleaning up unused objects from 0x280000402:523 to 0x280000402:609) [ 8978.417238] Lustre: DEBUG MARKER: oleg653-server.virtnet: executing set_default_debug -1 all [ 8981.454443] Lustre: DEBUG MARKER: oleg653-server.virtnet: executing _wait_recovery_complete *.lustre-MDT0000.recovery_status 30 [ 8997.210715] 
Lustre: DEBUG MARKER: == recovery-small test 155: failover after client remount ========================================================== 06:08:14 (1743502094) [ 9003.963398] Lustre: DEBUG MARKER: mds1 REPLAY BARRIER on lustre-MDT0000 [ 9005.133921] Lustre: Failing over lustre-MDT0000 [ 9005.284867] Lustre: server umount lustre-MDT0000 complete [ 9023.183029] LDISKFS-fs (dm-0): recovery complete [ 9023.184971] LDISKFS-fs (dm-0): mounted filesystem with ordered data mode. Opts: user_xattr,errors=remount-ro,no_mbcache,nodelalloc [ 9033.187127] LustreError: 3681:0:(client.c:1292:ptlrpc_import_delay_req()) @@@ invalidate in flight req@ffff94bade5d9180 x1828185302235520/t0(0) o250->MGC192.168.206.153@tcp@0@lo:26/25 lens 520/544 e 0 to 0 dl 0 ref 1 fl Rpc:NQU/200/ffffffff rc 0/-1 job:'kworker.0' uid:0 gid:0 [ 9035.335272] Lustre: lustre-MDT0000: Denying connection for new client 3a0de227-ae04-40e0-97e3-a72651a79b39 (at 192.168.206.53@tcp), waiting for 1 known clients (0 recovered, 0 in progress, and 0 evicted) to recover in 0:59 [ 9035.765514] Lustre: DEBUG MARKER: oleg653-server.virtnet: executing set_default_debug -1 all [ 9037.218238] Lustre: lustre-MDT0000: Denying connection for new client 3a0de227-ae04-40e0-97e3-a72651a79b39 (at 192.168.206.53@tcp), waiting for 1 known clients (0 recovered, 0 in progress, and 0 evicted) to recover in 0:57 [ 9038.859683] Lustre: lustre-OST0000: new connection from lustre-MDT0000-mdtlov (cleaning up unused objects from 0x280000402:523 to 0x280000402:641) [ 9038.859776] Lustre: lustre-OST0001: new connection from lustre-MDT0000-mdtlov (cleaning up unused objects from 0x2c0000401:57091 to 0x2c0000401:57121) [ 9050.113684] Lustre: DEBUG MARKER: == recovery-small test 156: tot_granted miscount after client eviction ========================================================== 06:09:06 (1743502146) [ 9051.464599] Lustre: Setting parameter general.timeout in log params [ 9056.830482] Lustre: DEBUG MARKER: ost1 REPLAY BARRIER on 
lustre-OST0000 [ 9058.502466] Lustre: Failing over lustre-OST0000 [ 9061.035821] Lustre: server umount lustre-OST0000 complete [ 9079.472812] LDISKFS-fs (dm-2): recovery complete [ 9079.475025] LDISKFS-fs (dm-2): mounted filesystem with ordered data mode. Opts: user_xattr,acl,no_mbcache,nodelalloc [ 9079.532670] Lustre: 115929:0:(mgc_request_server.c:553:mgc_llog_local_copy()) MGC192.168.206.153@tcp: no remote llog for lustre-sptlrpc, check MGS config [ 9083.123396] Lustre: DEBUG MARKER: oleg653-server.virtnet: executing set_default_debug -1 all [ 9120.000242] Lustre: lustre-OST0000: recovery is timed out, evict stale exports [ 9120.003211] Lustre: 115945:0:(genops.c:1508:class_disconnect_stale_exports()) lustre-OST0000: disconnect stale client 3a0de227-ae04-40e0-97e3-a72651a79b39@192.168.206.53@tcp [ 9120.009805] Lustre: 115945:0:(genops.c:1508:class_disconnect_stale_exports()) Skipped 1 previous similar message [ 9120.013379] Lustre: lustre-OST0000: disconnecting 1 stale clients [ 9120.015792] Lustre: 115945:0:(ldlm_lib.c:1971:extend_recovery_timer()) lustre-OST0000: extended recovery timer reached hard limit: 45, extend: 1 [ 9120.025959] Lustre: 115945:0:(ldlm_lib.c:2854:target_recovery_thread()) too long recovery - read logs [ 9120.030426] LustreError: dumping log to /tmp/lustre-log.1743502217.115945 [ 9129.187792] Lustre: DEBUG MARKER: oleg653-client.virtnet: executing wait_import_state_mount (FULL|IDLE) osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid [ 9131.112671] Lustre: DEBUG MARKER: osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid in FULL state after 0 sec [ 9140.758792] Lustre: Modifying parameter general.timeout in log params [ 9142.464594] Lustre: DEBUG MARKER: == recovery-small test 157: eviction during mmaped i/o === 06:10:39 (1743502239) [ 9144.610053] Lustre: 116938:0:(genops.c:1678:obd_export_evict_by_uuid()) lustre-OST0000: evicting 3a0de227-ae04-40e0-97e3-a72651a79b39 at adminstrative request [ 9153.589567] Lustre: DEBUG MARKER: == 
recovery-small test 158a: connect without access right ========================================================== 06:10:50 (1743502250) [ 9157.562493] Lustre: Failing over lustre-MDT0001 [ 9157.605661] Lustre: lustre-MDT0001: Not available for connect from 0@lo (stopping) [ 9157.608035] Lustre: Skipped 9 previous similar messages [ 9157.678610] Lustre: server umount lustre-MDT0001 complete [ 9173.391665] LDISKFS-fs (dm-1): mounted filesystem with ordered data mode. Opts: user_xattr,acl,no_mbcache,nodelalloc [ 9173.461487] Lustre: 117700:0:(mgc_request_server.c:553:mgc_llog_local_copy()) MGC192.168.206.153@tcp: no remote llog for lustre-sptlrpc, check MGS config [ 9173.558711] Lustre: *** cfs_fail_loc=2402, val=0*** [ 9173.560524] Lustre: Skipped 73 previous similar messages [ 9173.568137] Lustre: lustre-MDT0000-osp-MDT0001: connection denied by lustre-MDT0000_UUID: rc = -13 [ 9173.671499] Lustre: lustre-MDT0001: Imperative Recovery not enabled, recovery window 60-180 [ 9173.675354] Lustre: Skipped 9 previous similar messages [ 9173.708271] Lustre: lustre-MDT0001: in recovery but waiting for the first client to connect [ 9173.711482] Lustre: Skipped 9 previous similar messages [ 9175.301609] Lustre: lustre-MDT0001: Will be in recovery for at least 1:00, or until 2 clients reconnect [ 9175.304811] Lustre: Skipped 6 previous similar messages [ 9176.232163] Lustre: DEBUG MARKER: oleg653-server.virtnet: executing set_default_debug -1 all [ 9179.112478] Lustre: lustre-MDT0001-osp-MDT0000: connection denied by lustre-MDT0001_UUID: rc = -13 [ 9179.116179] Lustre: Skipped 1 previous similar message [ 9184.231463] Lustre: lustre-MDT0001: Denying connection for new client lustre-MDT0000-mdtlov_UUID (at 0@lo), waiting for 2 known clients (0 recovered, 0 in progress, and 2 evicted) to recover in 1:40 [ 9184.262363] Lustre: lustre-MDT0001: Recovery over after 0:09, of 2 clients 0 recovered and 2 were evicted. 
[ 9184.265380] Lustre: Skipped 6 previous similar messages [ 9184.284802] Lustre: lustre-OST0000: new connection from lustre-MDT0001-mdtlov (cleaning up unused objects from 0x280000400:8481 to 0x280000400:8865) [ 9184.285382] Lustre: lustre-OST0001: new connection from lustre-MDT0001-mdtlov (cleaning up unused objects from 0x2c0000400:9632 to 0x2c0000400:9793) [ 9185.019822] Lustre: DEBUG MARKER: oleg653-server.virtnet: executing _wait_recovery_complete *.lustre-MDT0001.recovery_status 1475 [ 9189.356866] LustreError: lustre-MDT0001-osp-MDT0000: This client was evicted by lustre-MDT0001; in progress operations using this service will fail. [ 9202.261426] Lustre: DEBUG MARKER: == recovery-small test 160: MDT destroys are blocked by grouplocks ========================================================== 06:11:38 (1743502298) [ 9205.502280] Lustre: 118567:0:(genops.c:1678:obd_export_evict_by_uuid()) lustre-MDT0000: evicting cca905fc-699b-4590-b84c-97c4175039f3 at adminstrative request [ 9205.769397] LustreError: 8733:0:(ofd_dev.c:1765:ofd_destroy_hdl()) lustre-OST0000: error destroying object [0x280000402:0x284:0x0]: -5 [ 9211.874602] LustreError: 99210:0:(ofd_dev.c:1765:ofd_destroy_hdl()) lustre-OST0000: error destroying object [0x280000402:0x285:0x0]: -5 [ 9211.878955] LustreError: 99210:0:(ofd_dev.c:1765:ofd_destroy_hdl()) Skipped 4 previous similar messages [ 9252.156542] Lustre: DEBUG MARKER: == recovery-small test complete, duration 8621 sec ======= 06:12:28 (1743502348) [ 9253.962963] Lustre: DEBUG MARKER: === recovery-small: start cleanup 06:12:30 (1743502350) === [ 9397.645409] Lustre: DEBUG MARKER: === recovery-small: finish cleanup 06:14:54 (1743502494) === [ 9402.031964] Lustre: server umount lustre-MDT0000 complete [ 9407.444931] LustreError: 7479:0:(ldlm_lockd.c:2591:ldlm_cancel_handler()) ldlm_cancel from 0@lo arrived at 1743502505 with bad export cookie 7078401769344534517 [ 9407.449036] LustreError: 7479:0:(ldlm_lockd.c:2591:ldlm_cancel_handler()) 
Skipped 2 previous similar messages [ 9407.449250] LustreError: MGC192.168.206.153@tcp: Connection to MGS (at 0@lo) was lost; in progress operations using this service will fail [ 9407.455068] LustreError: Skipped 7 previous similar messages [ 9407.646665] Lustre: server umount lustre-MDT0001 complete [ 9423.411960] Lustre: server umount lustre-OST0000 complete [ 9438.810750] Lustre: server umount lustre-OST0001 complete [ 9457.224549] Lustre: DEBUG MARKER: oleg653-server.virtnet: executing unload_modules_local [ 9460.356704] Key type lgssc unregistered [ 9460.713403] LNet: 121704:0:(lib-ptl.c:966:lnet_clear_lazy_portal()) Active lazy portal 0 on exit [ 9461.733986] LNet: Removed LNI 192.168.206.153@tcp [ 9462.735240] Key type .llcrypt unregistered [ 9462.736875] Key type ._llcrypt unregistered