Re: [Freeipmi-users] Decoding ram errors on supermicro
From: Albert Chu
Subject: Re: [Freeipmi-users] Decoding ram errors on supermicro
Date: Tue, 04 Dec 2018 10:40:22 -0800
On Tue, 2018-12-04 at 11:39 +0100, Tom Hetmer wrote:
> Sure. It seems there's a similar ticket
> already: https://github.com/chu11/freeipmi-mirror/issues/19
Ahh, if you could, update it with info from ipmitool / ipmiutil. I was
reluctant to add support based on reverse engineering. But if other
tools have "official" interpretations from Supermicro, I'm more
confident in the addition.
> Yep, that's the code. ipmitool and a few others decode it too.
>
>
> We have a *lot* of Supermicros so I can help with testing if needed -
> but we don't get that many CRC errors though :)
The one thing I'll need is the product ID numbers (you can get them from
bmc-info) and the names of the products. These go into the documentation
and some of the code.
Thanks,
Al
> So I guess we'd have to wait till one pops up. But I hope the 'ver 2'
> method from ipmiutil works fine.
> We used ipmitool in our monitoring before and it was accurate but
> slow, that's why I rewrote it all to use freeipmi.
>
>
> Thanks!
>
>
> Best,
> Tom Hetmer
>
>
> CDN77 Operations
> address@hidden / +44 (0) 20 3514 2399 / www.cdn77.com
>
> ----- Original message -----
> > From: "Albert Chu" <address@hidden>
> > To: "Tom Hetmer" <address@hidden>, address@hidden
> > .org
> > Date: 12/03/18 21:06
> > Subject: Re: [Freeipmi-users] Decoding ram errors on supermicro
> >
> > Hi Tom,
> >
> > Thanks for the pointer to ipmiutil's code. I assume you found this
> > comment:
> >
> > ---
> > /* ver 2 method: 2A 80 = P1_DIMMB1 */
> >
> > /* SuperMicro says:
> >  * pair: %c (data2 >> 4) + 0x40 + (data3 & 0x3) * 3, (='B')
> >  * dimm: %c (data2 & 0xf) + 0x27,
> >  * cpu: %x (data3 & 0x03) + 1);
> >  */
> > ---
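
The quoted ipmiutil comment can be read as a small decoding routine. The following is a minimal Python sketch of that "ver 2" formula, not freeipmi or ipmiutil code. The function name `decode_ver2` and the `P<cpu>_DIMM<pair><dimm>` output layout (taken from the `P1_DIMMB1` example in the comment) are assumptions made for illustration.

```python
def decode_ver2(data2: int, data3: int) -> str:
    """Decode OEM event data2/data3 per the formula in the quoted comment.

    Hypothetical helper; the output layout mirrors the "P1_DIMMB1" example.
    """
    # pair letter: high nibble of data2, offset from '@' (0x40),
    # plus 3 letters per CPU index taken from the low two bits of data3
    pair = chr((data2 >> 4) + 0x40 + (data3 & 0x03) * 3)
    # dimm digit: low nibble of data2 plus 0x27 (0xA + 0x27 = 0x31 = '1')
    dimm = chr((data2 & 0x0F) + 0x27)
    # cpu number: low two bits of data3, 1-based
    cpu = (data3 & 0x03) + 1
    return f"P{cpu:x}_DIMM{pair}{dimm}"


# The example from the comment: bytes 2A 80
print(decode_ver2(0x2A, 0x80))  # P1_DIMMB1
```

For the event Tom reported (data2 = 3Ah, data3 = 81h) the same expression yields pair F, while the web interface shows DIMMG1, so the per-CPU multiplier of 3 may not hold for every board.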
> >
> > I can definitely add it to my todo list.
> >
> > Would you mind writing up an issue on github here?
> >
> > https://github.com/chu11/freeipmi-mirror
> >
> > Al
> >
> > On Mon, 2018-12-03 at 17:55 +0100, Tom Hetmer wrote:
> > > Hi,
> > >
> > > it'd be good if freeipmi supported decoding the supermicro ECC
> > > errors.
> > >
> > >
> > > Manufacturer: Supermicro
> > > Product Name: X10DRH LN4
> > > eg.
> > > freeipmi
> > > 1,Dec-01-2018,06:37:53,Sensor #0,Memory,Critical,Uncorrectable memory error ; OEM Event Data2 code = 3Ah ; OEM Event Data3 code = 81h
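
To decode such an event outside freeipmi, the two OEM bytes first have to be pulled out of the text. A minimal Python sketch, assuming the line uses exactly the `OEM Event Data2 code = XXh ; OEM Event Data3 code = XXh` wording shown above; `extract_oem_bytes` is a hypothetical helper name, not part of freeipmi.

```python
import re

# Matches the "OEM Event DataN code = XXh" fragments seen in the sample line.
_OEM_RE = re.compile(r"OEM Event Data([23]) code = ([0-9A-Fa-f]{2})h")


def extract_oem_bytes(line: str):
    """Return (data2, data3) as ints, or None if either byte is missing."""
    found = {int(n): int(value, 16) for n, value in _OEM_RE.findall(line)}
    if 2 in found and 3 in found:
        return found[2], found[3]
    return None


sample = ("1,Dec-01-2018,06:37:53,Sensor #0,Memory,Critical,"
          "Uncorrectable memory error ; OEM Event Data2 code = 3Ah ; "
          "OEM Event Data3 code = 81h")
print(extract_oem_bytes(sample))  # (58, 129), i.e. 0x3A and 0x81
```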
> > >
> > >
> > > web interface
> > > 1 | 12/01/2018 | 06:37:53 | Memory | Uncorrectable ECC (@DIMMG1(CPU2)) | Asserted
> > >
> > >
> > > something like this worked for me (stolen from ipmiutil)
> > >
> > >
> > > // $data2 and $data3 are assumed to be two-digit hex strings ("3A", "81")
> > > $cpu = (hexdec($data3) & 0x03) + 1;
> > >
> > > $NPAIRS = 26;
> > > $rgpairs = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
> > >
> > > // combine both bytes into one value, e.g. 0x3A81
> > > $bdata = hexdec($data2.$data3);
> > > $pair = (($bdata & 0xF0) >> 4) - 1;
> > >
> > > // clamp to 1..$NPAIRS so the index below stays inside $rgpairs
> > > if ($pair < 1) $pair = 1;
> > > if ($pair > $NPAIRS) $pair = $NPAIRS;
> > >
> > > $pair = $rgpairs[$pair - 1];
> > >
> > > $dimm = $bdata & 0x0F;
> > >
> > >
> > > $dimm may be incorrect: the original code subtracts 9, but that
> > > gave the wrong result on this board, so I changed it to get the
> > > right one. We'll see if it keeps producing the right values.
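
As a cross-check of the snippet above, here is a Python rendering of that arithmetic applied to the one event from this message (data3 = 81h). It is only a sketch: `decode_from_data3` is a hypothetical name, and agreement with a single web interface entry does not show that the mapping holds in general.

```python
PAIRS = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"


def decode_from_data3(data3: int):
    """Return (cpu, pair_letter, dimm) using only the data3 byte.

    The masks 0xF0 and 0x0F in the snippet above only ever touch the low
    byte of the combined value, which is data3.
    """
    cpu = (data3 & 0x03) + 1
    # high nibble minus 1, then used as a 1-based index into PAIRS
    pair = ((data3 & 0xF0) >> 4) - 1
    pair = max(1, min(pair, len(PAIRS)))  # keep the index in range
    dimm = data3 & 0x0F
    return cpu, PAIRS[pair - 1], dimm


cpu, pair, dimm = decode_from_data3(0x81)
print(f"DIMM{pair}{dimm}(CPU{cpu})")  # DIMMG1(CPU2)
```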
> > >
> > > Best,
> > > Tom Hetmer
> > >
> > >
> > > CDN77 Operations
> > > address@hidden / +44 (0) 20 3514 2399 / www.cdn77.com
> > >
> > > _______________________________________________
> > > Freeipmi-users mailing list
> > > address@hidden
> > > https://lists.gnu.org/mailman/listinfo/freeipmi-users
> >
> > --
> > Albert Chu
> > address@hidden
> > Computer Scientist
> > High Performance Systems Division
> > Lawrence Livermore National Laboratory
>
--
Albert Chu
address@hidden
Computer Scientist
High Performance Systems Division
Lawrence Livermore National Laboratory