From massimo at cedoc.mo.it Fri Apr 24 00:11:59 2009
From: massimo at cedoc.mo.it (Massimo Lusetti)
Date: Thu, 23 Apr 2009 16:11:59 +0200
Subject: [flashboot] Kernel panic 4.4-current
Message-ID: <20090423161159.3526b3f1@intanto>

On a LE565 board with "OpenBSD 4.4-current (GENERIC) #0: Fri Feb 6
16:55:11 CET 2009" with GENERIC-RD from flashboot i'm getting this
panic:

panic: pool_do_get(cryptop): free list modified page 0xd6bc2000; item
addr 0xd6bc2a84; offset 0x14-0x16

The machine is "very remote" so i was only able to have a "human
interface" to read the panic from the attached monitor.

I've already tried to substitute the hardware with a plain new one but
the panic still pops up.

What do you think? Any hint is really appreciated cause i'm sort of
lost in here.

Thanks, regards
--
Massimo

From stu at spacehopper.org Fri Apr 24 00:34:31 2009
From: stu at spacehopper.org (Stuart Henderson)
Date: Thu, 23 Apr 2009 15:34:31 +0100
Subject: [flashboot] Kernel panic 4.4-current
In-Reply-To: <20090423161159.3526b3f1@intanto>
References: <20090423161159.3526b3f1@intanto>
Message-ID: <20090423143431.GC23883@symphytum.spacehopper.org>

try a newer kernel, I think some pool corruption was fixed since then.

On 2009/04/23 16:11, Massimo Lusetti wrote:
> On a LE565 board with "OpenBSD 4.4-current (GENERIC) #0: Fri Feb 6
> 16:55:11 CET 2009" with GENERIC-RD from flashboot i'm getting this
> panic:
>
> panic: pool_do_get(cryptop): free list modified page 0xd6bc2000; item
> addr 0xd6bc2a84; offset 0x14-0x16
>
> The machine is "very remote" so i was only able to have a "human
> interface" to read the panic from the attached monitor.
>
> I've already tried to substitute the hardware with a plain new one but
> the panic still pops up.
>
> What do you think? Any hint is really appreciated cause i'm sort of
> lost in here.
>
> Thanks, regards
> --
> Massimo
> _______________________________________________
> flashboot mailing list
> flashboot at mindrot.org
> https://lists.mindrot.org/mailman/listinfo/flashboot

From massimo at cedoc.mo.it Fri Apr 24 20:11:49 2009
From: massimo at cedoc.mo.it (Massimo Lusetti)
Date: Fri, 24 Apr 2009 12:11:49 +0200
Subject: [flashboot] Kernel panic 4.4-current
In-Reply-To: <20090423143431.GC23883@symphytum.spacehopper.org>
References: <20090423161159.3526b3f1@intanto>
 <20090423143431.GC23883@symphytum.spacehopper.org>
Message-ID: <20090424121149.694a2150@intanto>

On Thu, 23 Apr 2009 15:34:31 +0100
Stuart Henderson wrote:

> try a newer kernel, I think some pool corruption was fixed since then.

I'm building a system from OPENBSD_4_5 cvs tag i'll try to report back.

The strange thing is that we have 42 boxes like the one having
problems, all boxes have the same kernel and hw. We already have
replaced the hw with a new box thinking about ram problems but the
failure still happens.

Thanks again. Ciao
--
Massimo

From massimo at cedoc.mo.it Wed Apr 29 17:36:26 2009
From: massimo at cedoc.mo.it (Massimo Lusetti)
Date: Wed, 29 Apr 2009 09:36:26 +0200
Subject: [flashboot] Kernel panic 4.4-current
In-Reply-To: <20090423143431.GC23883@symphytum.spacehopper.org>
References: <20090423161159.3526b3f1@intanto>
 <20090423143431.GC23883@symphytum.spacehopper.org>
Message-ID: <20090429093626.5834fc52@intanto>

On Thu, 23 Apr 2009 15:34:31 +0100
Stuart Henderson wrote:

> try a newer kernel, I think some pool corruption was fixed since then.

I'm preparing to upgrade a whole bunch of boxes to 4.5 in the mean time
i would like to ask if the panic could be caused by:

012: RELIABILITY FIX: April 8, 2009 All architectures
The OpenSSL ASN.1 handling code could be forced to perform invalid
memory accesses through the use of certain invalid strings
(CVE-2009-0590) or under certain error conditions triggerable by
invalid ASN.1 structures (CVE-2009-0789).
These vulnerabilities could be exploited to achieve a
denial-of-service. A more detailed description of these problems is
available in the OpenSSL security advisory, but note that the other
issue described there "Incorrect Error Checking During CMS
verification" relates to code not enabled in OpenBSD.
A source code patch exists which remedies this problem.

Regards
--
Massimo

From stu at spacehopper.org Wed Apr 29 19:16:23 2009
From: stu at spacehopper.org (Stuart Henderson)
Date: Wed, 29 Apr 2009 10:16:23 +0100
Subject: [flashboot] Kernel panic 4.4-current
In-Reply-To: <20090429093626.5834fc52@intanto>
References: <20090423161159.3526b3f1@intanto>
 <20090423143431.GC23883@symphytum.spacehopper.org>
 <20090429093626.5834fc52@intanto>
Message-ID: <20090429091623.GH5825@symphytum.spacehopper.org>

On 2009/04/29 09:36, Massimo Lusetti wrote:
> On Thu, 23 Apr 2009 15:34:31 +0100
> Stuart Henderson wrote:
>
> > try a newer kernel, I think some pool corruption was fixed since then.
>
> I'm preparing to upgrade a whole bunch of boxes to 4.5 in the mean time
> i would like to ask if the panic could be caused by:
>
> 012: RELIABILITY FIX: April 8, 2009 All architectures
> The OpenSSL ASN.1 handling code could be forced to perform invalid
> memory accesses through the use of certain invalid strings

it's not that; that would be userland processes only, not kernel.

From massimo at cedoc.mo.it Wed Apr 29 19:35:15 2009
From: massimo at cedoc.mo.it (Massimo Lusetti)
Date: Wed, 29 Apr 2009 11:35:15 +0200
Subject: [flashboot] Kernel panic 4.4-current
In-Reply-To: <20090423143431.GC23883@symphytum.spacehopper.org>
References: <20090423161159.3526b3f1@intanto>
 <20090423143431.GC23883@symphytum.spacehopper.org>
Message-ID: <20090429113515.6eca0e96@intanto>

On Thu, 23 Apr 2009 15:34:31 +0100
Stuart Henderson wrote:

> try a newer kernel, I think some pool corruption was fixed since then.

I was able to get panic, trace and ps.
The kernel is GENERIC-RD from flashboot sources with just DMA disabled
on wd0* (0x0ffc) and pcibios disabled cause on that board (Commell
LE-565) it would cause watchdog timeout on re(4)

Do you think i could also post to misc@ even if the kernel is not
GENERIC and the platform not a plain OpenBSD install?

Anyway here it is all info about the panic. Note that these are hand
copied since i only manage to get image (made with a cellular phone) of
the attached monitor ;)

panic: pool_do_get(cryptop): free list modified: page 0xd6c33000; item addr 0xd6c33600; offset 0x14=0x16
Stopped at Debugger+0x4; leave
ddb> trace
Debugger(da17db34,100,da17db78,d03bf66c,d083dde0) at Debugger+0x4
panic(d083dde0,d0851460,d6c33000,d6c33600,14) at panic+0x8b
pool_do_get(d1d6d240,0,d0201f3a,d03d986a) at pool_do_get+0x359
pool_get(d1d6d240,0,d0201f3a,da17dc60) at pool_get+0x22
crypto_getreq(2,9,0,da17dc5f) at crypto_getreq+0x79
esp_output(d6bdba00,d2777000,0,14,9,14,f,0) at esp_output+0x5f1
ipsp_process_packet(d6bdba00, d2777000,2,0) at ipsp_process_packet+0x746
ip_output(d6cbc900,0,d1d5dea4,1,0,0,3c,1) at ip_output+0x17cb
ip_forward(d6cbc900,0,0,0) at ip_forward+0x341
ipv4_input(d6cbc900,0,d26ab280,d26ab200) at ipv4_input+0x70c
ipintr(58,10,10,10,0) at ipintr+0x8a
Bad frame pointer: 0xda17df28
ddb> ps
   PID   PPID   PGRP  UID  S     FLAGS  WAIT      COMMAND
 21347   3791  21347    0  3    0x4082  poll      top
 13265   1402   1402   91  3     0x180  kqread    snmpd
  1402      1   1402    0  3      0x80  kqread    snmpd
  3791  17215   3791    0  3    0x4082  pause     ksh
 17215  24803  17215    0  3    0x4080  select    sshd
 18095      1      1    0  3    0x4080  ttyopn    getty
  6253      1   6253    0  3    0x4082  ttyin     getty
 16074      1  16074    0  3      0x80  mfsidl    mount_mfs
 16092      1  16092    0  3      0x80  select    cron
 24803      1  24803    0  3      0x80  select    sshd
 13177   5463   5463   68  3     0x180  select    isakmpd
  5463      1   5463    0  3      0x80  netio     isakmpd
 15901      1  15901    0  3     0x180  pause     inetd
 31658  23866  23866   73  3     0x180  poll      syslogd
 23866      1  23866    0  3      0x88  netio     syslogd
    18      0      0    0  3  0x100200  bored     crypto
    17      0      0    0  3  0x100200  aioddoned aiodoned
    16      0      0    0  3  0x100200  syncer    update
    15      0      0    0  3  0x100200  cleaner   cleaner
    14      0      0    0  3  0x100200  reaper    reaper
    13      0      0    0  3  0x100200  pgdaemon  pagedaemon
    12      0      0    0  3  0x100200  pftm      pfpurge
    11      0      0    0  3  0x100200  usbevt    usb4
    10      0      0    0  3  0x100200  usbevt    usb3
     9      0      0    0  3  0x100200  usbevt    usb2
     8      0      0    0  3  0x100200  usbevt    usb1
     7      0      0    0  3  0x100200  usbtsk    usbtask
     6      0      0    0  3  0x100200  usbevt    usb0
     5      0      0    0  3  0x100200  apmev     apm0
     4      0      0    0  3  0x100200  bored     syswp
*    3      0      0    0  3  0x100200            idle0
     2      0      0    0  3  0x100200  kmalloc   kmthread
     1      0      1    0  3    0x4080  wait      init
     0     -1      0    0  3   0x80200  scheduler swapper

Thanks for any hints you could give.

Regards
--
Massimo

From massimo at cedoc.mo.it Wed Apr 29 19:38:52 2009
From: massimo at cedoc.mo.it (Massimo Lusetti)
Date: Wed, 29 Apr 2009 11:38:52 +0200
Subject: [flashboot] Kernel panic 4.4-current
In-Reply-To: <20090429091623.GH5825@symphytum.spacehopper.org>
References: <20090423161159.3526b3f1@intanto>
 <20090423143431.GC23883@symphytum.spacehopper.org>
 <20090429093626.5834fc52@intanto>
 <20090429091623.GH5825@symphytum.spacehopper.org>
Message-ID: <20090429113852.02061af5@intanto>

On Wed, 29 Apr 2009 10:16:23 +0100
Stuart Henderson wrote:

>
> it's not that; that would be userland processes only, not kernel.
>

Thanks, i was almost sure but asked anyway... again thanks.

Regards
--
Massimo

From massimo at cedoc.mo.it Wed Apr 29 19:44:30 2009
From: massimo at cedoc.mo.it (Massimo Lusetti)
Date: Wed, 29 Apr 2009 11:44:30 +0200
Subject: [flashboot] Kernel panic 4.4-current
In-Reply-To: <20090429113515.6eca0e96@intanto>
References: <20090423161159.3526b3f1@intanto>
 <20090423143431.GC23883@symphytum.spacehopper.org>
 <20090429113515.6eca0e96@intanto>
Message-ID: <20090429114430.4c3e8f1d@intanto>

On Wed, 29 Apr 2009 11:35:15 +0200
Massimo Lusetti wrote:

> I was able to get panic, trace and ps.

If it was not clear the panic is with 4.4-current, i'm currently
testing 4.5 in a lab.
Regards
--
Massimo

From stu at spacehopper.org Wed Apr 29 20:40:19 2009
From: stu at spacehopper.org (Stuart Henderson)
Date: Wed, 29 Apr 2009 11:40:19 +0100
Subject: [flashboot] Kernel panic 4.4-current
In-Reply-To: <20090429113515.6eca0e96@intanto>
References: <20090423161159.3526b3f1@intanto>
 <20090423143431.GC23883@symphytum.spacehopper.org>
 <20090429113515.6eca0e96@intanto>
Message-ID: <20090429104019.GI5825@symphytum.spacehopper.org>

On 2009/04/29 11:35, Massimo Lusetti wrote:
> On Thu, 23 Apr 2009 15:34:31 +0100
> Stuart Henderson wrote:
>
> > try a newer kernel, I think some pool corruption was fixed since then.
>
> I was able to get panic, trace and ps.
>
> The kernel is GENERIC-RD from flashboot sources with just DMA disabled
> on wd0* (0x0ffc) and pcibios disabled cause on that board (Commell
> LE-565) it would cause watchdog timeout on re(4)
>
> Do you think i could also post to misc@ even if the kernel is not
> GENERIC and the platform not a plain OpenBSD install?

I think you need to reproduce it on GENERIC first.

From stu at spacehopper.org Wed Apr 29 23:54:22 2009
From: stu at spacehopper.org (Stuart Henderson)
Date: Wed, 29 Apr 2009 14:54:22 +0100
Subject: [flashboot] Kernel panic 4.4-current
In-Reply-To: <20090429153413.3575db5b@intanto>
References: <20090423161159.3526b3f1@intanto>
 <20090423143431.GC23883@symphytum.spacehopper.org>
 <20090429113515.6eca0e96@intanto>
 <20090429104019.GI5825@symphytum.spacehopper.org>
 <20090429153413.3575db5b@intanto>
Message-ID: <20090429135422.GL5825@symphytum.spacehopper.org>

On 2009/04/29 15:34, Massimo Lusetti wrote:
> On Wed, 29 Apr 2009 11:40:19 +0100
> Stuart Henderson wrote:
>
> > On 2009/04/29 11:35, Massimo Lusetti wrote:
> > > On Thu, 23 Apr 2009 15:34:31 +0100
> > > Stuart Henderson wrote:
> > >
> > > > try a newer kernel, I think some pool corruption was fixed since
> > > > then.
> > >
> > > I was able to get panic, trace and ps.
> > >
> > > The kernel is GENERIC-RD from flashboot sources with just DMA
> > > disabled on wd0* (0x0ffc) and pcibios disabled cause on that board
> > > (Commell LE-565) it would cause watchdog timeout on re(4)
> > >
> > > Do you think i could also post to misc@ even if the kernel is not
> > > GENERIC and the platform not a plain OpenBSD install?
> >
> > I think you need to reproduce it on GENERIC first.
> >
>
> Unfortunately it's not so easy in my environment...
>
> Regards
> --
> Massimo

You'll need to do more debugging yourself then, because others can't
reproduce your environment.

If you can read code, http://www.benzedrine.cx/crashreport.html and
the crash(7) manual page may help.

From massimo at cedoc.mo.it Wed Apr 29 23:58:28 2009
From: massimo at cedoc.mo.it (Massimo Lusetti)
Date: Wed, 29 Apr 2009 15:58:28 +0200
Subject: [flashboot] Kernel panic 4.4-current
In-Reply-To: <20090429135422.GL5825@symphytum.spacehopper.org>
References: <20090423161159.3526b3f1@intanto>
 <20090423143431.GC23883@symphytum.spacehopper.org>
 <20090429113515.6eca0e96@intanto>
 <20090429104019.GI5825@symphytum.spacehopper.org>
 <20090429153413.3575db5b@intanto>
 <20090429135422.GL5825@symphytum.spacehopper.org>
Message-ID: <20090429155828.7c396b9e@intanto>

On Wed, 29 Apr 2009 14:54:22 +0100
Stuart Henderson wrote:

> You'll need to do more debugging yourself then, because others can't
> reproduce your environment.
>
> If you can read code, http://www.benzedrine.cx/crashreport.html and
> the crash(7) manual page may help.

That's great, great page from Daniel! I wasn't aware of that, thanks
Stuart.

BTW do you mean crash(8), right?

Again thanks for your time.
--
Massimo

From massimo at cedoc.mo.it Wed Apr 29 23:34:13 2009
From: massimo at cedoc.mo.it (Massimo Lusetti)
Date: Wed, 29 Apr 2009 15:34:13 +0200
Subject: [flashboot] Kernel panic 4.4-current
In-Reply-To: <20090429104019.GI5825@symphytum.spacehopper.org>
References: <20090423161159.3526b3f1@intanto>
 <20090423143431.GC23883@symphytum.spacehopper.org>
 <20090429113515.6eca0e96@intanto>
 <20090429104019.GI5825@symphytum.spacehopper.org>
Message-ID: <20090429153413.3575db5b@intanto>

On Wed, 29 Apr 2009 11:40:19 +0100
Stuart Henderson wrote:

> On 2009/04/29 11:35, Massimo Lusetti wrote:
> > On Thu, 23 Apr 2009 15:34:31 +0100
> > Stuart Henderson wrote:
> >
> > > try a newer kernel, I think some pool corruption was fixed since
> > > then.
> >
> > I was able to get panic, trace and ps.
> >
> > The kernel is GENERIC-RD from flashboot sources with just DMA
> > disabled on wd0* (0x0ffc) and pcibios disabled cause on that board
> > (Commell LE-565) it would cause watchdog timeout on re(4)
> >
> > Do you think i could also post to misc@ even if the kernel is not
> > GENERIC and the platform not a plain OpenBSD install?
>
> I think you need to reproduce it on GENERIC first.
>

Unfortunately it's not so easy in my environment...
Regards
--
Massimo

From stu at spacehopper.org Thu Apr 30 00:17:43 2009
From: stu at spacehopper.org (Stuart Henderson)
Date: Wed, 29 Apr 2009 15:17:43 +0100
Subject: [flashboot] Kernel panic 4.4-current
In-Reply-To: <20090429155828.7c396b9e@intanto>
References: <20090423161159.3526b3f1@intanto>
 <20090423143431.GC23883@symphytum.spacehopper.org>
 <20090429113515.6eca0e96@intanto>
 <20090429104019.GI5825@symphytum.spacehopper.org>
 <20090429153413.3575db5b@intanto>
 <20090429135422.GL5825@symphytum.spacehopper.org>
 <20090429155828.7c396b9e@intanto>
Message-ID: <20090429141743.GM5825@symphytum.spacehopper.org>

On 2009/04/29 15:58, Massimo Lusetti wrote:
> On Wed, 29 Apr 2009 14:54:22 +0100
> Stuart Henderson wrote:
>
>
> > You'll need to do more debugging yourself then, because others can't
> > reproduce your environment.
> >
> > If you can read code, http://www.benzedrine.cx/crashreport.html and
> > the crash(7) manual page may help.
>
> That's great, great page from Daniel! I wasn't aware of that, thanks
> Stuart.
>
> BTW do you mean crash(8), right?
>
> Again thanks for your time.

oops, yes crash(8). this will only show you where the crash occurred
of course; the real problem may well have happened in a completely
different place in the kernel, but it's a great starting point.
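
The point in the last reply, that the trace shows where the damage was
noticed rather than where it was done, can be illustrated with a toy
model. This is only a sketch and not OpenBSD's pool code: it assumes,
going by the wording "free list modified ... offset 0x14=0x16", that
free items are filled with a known pattern which is checked again when
the item is next handed out. ToyPool, MAGIC and HEADER_WORDS are
hypothetical names and values chosen for the example.

```python
import struct

MAGIC = 0xDEADBEEF   # hypothetical fill pattern, not taken from kernel source
WORD = 4             # bytes per checked word
HEADER_WORDS = 1     # hypothetical: first word of a free item is left unchecked


class PoolCorruption(Exception):
    """Raised when a free item no longer holds the fill pattern."""


class ToyPool:
    def __init__(self, item_size, nitems):
        self.item_size = item_size
        self.page = bytearray(item_size * nitems)
        self.free = []
        for i in range(nitems):
            self.put(i * item_size)

    def put(self, addr):
        # Fill the freed item with the pattern and put it on the free list.
        for off in range(HEADER_WORDS * WORD, self.item_size, WORD):
            struct.pack_into("<I", self.page, addr + off, MAGIC)
        self.free.append(addr)

    def get(self):
        # Hand out a free item, first checking that nobody wrote to it
        # while it was free.
        addr = self.free.pop()
        for off in range(HEADER_WORDS * WORD, self.item_size, WORD):
            (val,) = struct.unpack_from("<I", self.page, addr + off)
            if val != MAGIC:
                raise PoolCorruption(
                    "free list modified: item addr 0x%x; offset 0x%x=0x%x"
                    % (addr, off, val))
        return addr


pool = ToyPool(item_size=32, nitems=4)
item = pool.get()
pool.put(item)

# The actual bug: some unrelated code writes through a stale reference
# after the item was freed. Nothing is reported at this point.
struct.pack_into("<I", pool.page, item + 0x14, 0x16)

# The report only appears later, in whichever caller allocates next.
try:
    pool.get()
    message = None
except PoolCorruption as exc:
    message = str(exc)
```

In this model the exception is raised inside get(), called by code that
did nothing wrong, which matches the shape of the trace above: the
allocation made on the IPsec output path is where the check fired, and
the stray write could have come from anywhere that held a stale pointer
to the item.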