
Re: [Freeipmi-users] Disabled temp sensors


From: Al Chu
Subject: Re: [Freeipmi-users] Disabled temp sensors
Date: Wed, 05 May 2010 09:47:45 -0700

Hey Eric,

On Tue, 2010-05-04 at 19:16 -0700, Eric Pooch wrote:
> From:     address@hidden
>         Subject:        Re: [Freeipmi-users] Disabled temp sensors
>         Date:   May 4, 2010 7:15:43 PM PDT
>         To:       address@hidden
> 
> Al,
> see below:
> On May 4, 2010, at 10:48 AM, Al Chu wrote:
> 
> 
> > Hey Eric,
> >
> > Ahhhh.  That would do it.  The slave address is probably wrong in the
> > SDR.  When you run w/ bridging, ipmi-sensors attempts to bridge to an
> > address that is probably non-functional/illegal.
> >
> >
> Yep.
> 
> > LMK what your final patch looks like.  I can work it into a workaround
> > of some sort for ipmi-sensors.  (e.g.
> > --workaround-flags=assumebmcslaveaddr).
> >
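For reference, a minimal sketch of the decision such a workaround would change, based only on the behavior described in this thread. The function name, the `bridging` and `assume_bmc` parameters, and the example address 0x28 are hypothetical and are not FreeIPMI's API; only the BMC slave address value 0x20 comes from the IPMI specification.

```python
# Standard BMC slave address from the IPMI specification.
IPMI_SLAVE_ADDRESS_BMC = 0x20


def plan_sensor_read(sdr_slave_address, bridging=False, assume_bmc=False):
    """Decide how a sensor would be read (hypothetical helper).

    Returns "direct" to ask the BMC itself, "bridge" to forward the
    request to another slave address, or "skip" to report N/A.
    """
    # With the workaround on, trust the BMC even if the SDR says otherwise.
    if assume_bmc or sdr_slave_address == IPMI_SLAVE_ADDRESS_BMC:
        return "direct"
    # The SDR points at a non-BMC owner: only reachable via bridging.
    if bridging:
        return "bridge"
    return "skip"


# 0x28 is a made-up non-BMC address standing in for a wrong SDR entry.
for addr in (0x20, 0x28):
    print(hex(addr), plan_sensor_read(addr, bridging=True, assume_bmc=True))
```

With `assume_bmc` left off, the 0x28 entry would be skipped (or bridged to an address that may not exist), which matches the N/A readings and the bridging error reported further down.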
> 
> I can work on it, but I wanted to make sure that I don't just have
> some errors in the SDR that are causing the problem.  If I issue the
> Clear SDR Repository command, will this cause me to lose information,
> or will the SDR repository get rebuilt fresh on its own?

I don't know of HW that will rebuild an SDR, so I wouldn't recommend
that.  Best bet is a firmware reflash.

> > What does the ipmi-sel output look like?  I'm wondering if SEL events
> > are reporting right/wrong slave addresses and whether sensor related
> > outputs are output correctly or not.
> >
> >
> sudo ipmi-sel
> ipmi_sel_parse: internal IPMI error

Hmmm.  Maybe a similar internal issue.  Can you send --debug output?

> >> My fans still look messed up.
> >>
> >
> > It certainly depends on if the SDR is correct or not.  From the output
> > below, it looks as though the Fans are "transition" fans.  They only
> > report the transition state of the fan instead of an RPM.  If they
> > aren't "transition" fans, then the SDR might be wrong which is leading
> > to this kind of output.
> >
> There is also a valid sensor reading, but it doesn't look like the
> library supports that.
> 
> >
> > BTW, you forgot the debug output from your previous e-mail.
> >
> >
> I did send it as an attachment, but I think it got filtered out.

Maybe it did.  What are you sending as?  If the output is too big, a
gzip should be sufficient.

Al

> Thanks a lot
> --Eric
> 
> > Al
> >
> > On Mon, 2010-05-03 at 21:32 -0700, Eric Pooch wrote:
> >
> >> OK,  I think I found the problem on my computer's implementation
> >> of IPMI
> >> I edited:
> >> /freeipmi-0.8.5/libfreeipmi/src/sensor-read/ipmi-sensor-read.c
> >>
> >>    /* if (slave_address == IPMI_SLAVE_ADDRESS_BMC)*/
> >>    if (slave_address != IPMI_SLAVE_ADDRESS_BMC)
> >> And received what looks like good data:
> >>
> >> 1  | Fan 1            | Fan                      | N/A        | N/A | 'transition to Off Line'
> >> 2  | Fan 2            | Fan                      | N/A        | N/A | 'transition to Running' 'transition to On Line'
> >> 3  | Fan 3            | Fan                      | N/A        | N/A | 'transition to Running' 'transition to On Line'
> >> 4  | Fan 4            | Fan                      | N/A        | N/A | 'transition to Running' 'transition to On Line'
> >> 5  | PCI Fan          | Fan                      | N/A        | N/A | 'transition to Off Line'
> >> 6  | Memory           | Memory                   | N/A        | N/A | 'OK'
> >> 7  | CPU 1            | Processor                | N/A        | N/A | 'Processor Presence detected'
> >> 8  | CPU 2            | Processor                | N/A        | N/A | 'Processor Presence detected'
> >> 9  | VRM              | Voltage                  | N/A        | N/A | 'OK'
> >> 10 | CPU1 Temperature | Temperature              | 35.00      | C   | 'OK'
> >> 11 | CPU2 Temperature | Temperature              | 33.00      | C   | 'OK'
> >> 12 | Thermal Trip     | Temperature              | N/A        | N/A | 'OK'
> >> 13 | Sys Temperature  | Temperature              | 31.00      | C   | 'OK'
> >> 14 | DDR 1.25V        | Voltage                  | 1.25       | V   | 'OK'
> >> 15 | Sys 3.3V         | Voltage                  | 3.25       | V   | 'OK'
> >> 16 | Sys 5V           | Voltage                  | 5.00       | V   | 'OK'
> >> 17 | CIOBE 1.2V       | Voltage                  | 1.21       | V   | 'OK'
> >> 18 | CIOBE 2.5V       | Voltage                  | 2.52       | V   | 'OK'
> >> 19 | BIOS Progress    | System Firmware Progress | N/A        | N/A | N/A
> >> 20 | Watchdog         | Watchdog 2               | N/A        | N/A | N/A
> >>
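For reference, a small sketch of splitting pipe-delimited rows like the listing above into fields, so numeric readings can be told apart from N/A entries. The column names (record id, name, type, reading, units, event) are assumptions, since the output quoted here carries no header row, and the sample rows are copied from the listing rather than captured from a live system.

```python
# Three rows copied from the listing above, one logical row per line.
SAMPLE = """\
1  | Fan 1            | Fan         | N/A   | N/A | 'transition to Off Line'
10 | CPU1 Temperature | Temperature | 35.00 | C   | 'OK'
14 | DDR 1.25V        | Voltage     | 1.25  | V   | 'OK'
"""


def parse_rows(text):
    """Parse six-column, pipe-delimited sensor rows into dictionaries."""
    rows = []
    for line in text.splitlines():
        parts = [field.strip() for field in line.split("|")]
        if len(parts) != 6:
            continue  # ignore blank or wrapped lines
        record_id, name, sensor_type, reading, units, event = parts
        rows.append({
            "id": int(record_id),
            "name": name,
            "type": sensor_type,
            # N/A means no numeric reading was reported for this sensor.
            "reading": None if reading == "N/A" else float(reading),
            "units": None if units == "N/A" else units,
            "event": event,
        })
    return rows


rows = parse_rows(SAMPLE)
for row in rows:
    print(row["name"], row["reading"], row["units"])
```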
> >> This is much better, and I get info for almost all of the sensors
> >> that just showed N/A before. My fans still look messed up.  I will
> >> figure out more details, make it a bit cleaner and send a patch for
> >> users of this flawed IPMI implementation.
> >> Thanks
> >> --Eric
> >>
> >> On May 3, 2010, at 8:42 PM, Eric Pooch wrote:
> >>
> >>
> >>> Ok, I updated to 0.8.5 and attached an archive of the debug log
> >>> from:
> >>> $ sudo ipmi-sensors --debug
> >>> see below:
> >>> On May 3, 2010, at 5:23 PM, Al Chu wrote:
> >>>
> >>>
> >>>> Hey Eric,
> >>>>
> >>>>
> >>>>> Also, bmc-info returns IPMI version 1.0 that is probably not
> >>>>> supported by FreeIPMI, but ipmi-locate, returns "IPMI Version:
> >>>>> 1.5"
> >>>>> for all of the devices.
> >>>>>
> >>>>
> >>>> Doing a quick online search, this machine appears to be pretty
> >>>> old.  It
> >>>> is possible that it does not support IPMI 1.5.  The output from
> >>>> ipmi-locate you're seeing may be the defaults and not actual
> >>>> outputs
> >>>> from the machine (this is confusing many people so I'm changing
> >>>> this
> >>>> output for the next 0.9.1 release).  If it is only IPMI 1.0,
> >>>> there's
> >>>> probably not much I can do to help you, since many of the IPMI
> >>>> commands
> >>>> will just not be supported on your motherboard.
> >>>>
> >>>>
> >>>
> >>> Ok.  I understand
> >>>
> >>>>> First, all my sensor values come back as [NA] even though most
> >>>>> work
> >>>>> properly under ipmitool.
> >>>>>
> >>>>
> >>>> I assume you're using FreeIPMI 0.7.X b/c the newest one (0.8.X
> >>>> line)
> >>>> does not have "[NA]" output.  There have certainly been fixes since
> >>>> then, so you may wish to upgrade.  My initial guess was
> >>>> bridging, but
> >>>> you seem to have tried that.
> >>>>
> >>>> I've noticed on some motherboards that there are issues b/c I find
> >>>> errors/problems in other parts of IPMI that ipmitool doesn't,
> >>>> thus I
> >>>> output errors and they don't.  We need to dig into the core of the
> >>>> errors on your board to figure out what they/I are doing
> >>>> wrong/differently.  Can you provide --debug output?
> >>>>
> >>>>
> >>>>> So, I think maybe there is something that is disabling the Temp
> >>>>> sensor
> >>>>> at another level.  I noticed on the HP lightsout user guide that
> >>>>> they
> >>>>> have a setting "o PEF Control—Enables or disables the sensor. "
> >>>>>
> >>>>
> >>>> Based on some of the error messages you posted from ipmitool
> >>>> (BTW, in
> >>>> the future could you indicate what tools the error messages came
> >>>> from, I
> >>>> thought you were indicating FreeIPMI errors and couldn't find
> >>>> them at
> >>>> first),
> >>>>
> >>> Sorry, I thought I was listing FreeIPMI errors, but I guess I
> >>> posted errors from the wrong log.
> >>>
> >>>> my guess is bridging is not supported on your motherboard and/or
> >>>> there is a firmware issue w/ bridging, so the temp sensors can't be
> >>>> reached.
> >>>>
> >>> I would agree, except that the standard IPMI raw "get sensor
> >>> reading" command works fine.  It is almost like ipmi-sensors and
> >>> ipmitool are finding something they don't like in the sdr and not
> >>> trying to read the sensor at all.
> >>>
> >>> $ sudo ipmi-raw 0 04 2d 0A
> >>> rcvd: 2D 00 23 C0 00 00
> >>>
> >>> 0x23 = 35 degrees Celsius, which seems right for my processor
> >>> temp.  As I mentioned before, it varies proportionally with server
> >>> load, seems like the value I need, and is the correct command as
> >>> far as I can tell from the IPMI v1.5 specs.
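For reference, a minimal sketch of decoding the `rcvd:` line quoted above, assuming the standard IPMI Get Sensor Reading response layout: command echo, completion code, raw reading, then a flags byte in which bit 7 means event messages enabled, bit 6 means sensor scanning enabled, and bit 5 means the reading is unavailable. The function name and dictionary keys are made up for illustration and are not part of FreeIPMI. Treating the raw byte as degrees Celsius is also an assumption that happens to match this board; in general the conversion factors from the SDR are needed.

```python
def parse_get_sensor_reading(line):
    """Decode an ipmi-raw 'rcvd:' line for Get Sensor Reading (cmd 0x2D)."""
    # Keep only the hex bytes after the "rcvd:" label.
    data = [int(byte, 16) for byte in line.partition(":")[2].split()]
    if len(data) < 4:
        raise ValueError("response too short for Get Sensor Reading")
    cmd, completion_code, raw_reading, flags = data[:4]
    if cmd != 0x2D:
        raise ValueError("not a Get Sensor Reading response")
    return {
        "completion_code": completion_code,           # 0x00 means success
        "raw_reading": raw_reading,                   # unconverted sensor byte
        "event_messages_enabled": bool(flags & 0x80),
        "scanning_enabled": bool(flags & 0x40),
        "reading_unavailable": bool(flags & 0x20),
    }


result = parse_get_sensor_reading("rcvd: 2D 00 23 C0 00 00")
print(result)
```

Under that layout, 0xC0 has bits 7 and 6 set and bit 5 clear, so the sensor reports scanning as enabled and the reading as available. That would suggest the sensor itself is not disabled and the problem lies in how it is addressed.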
> >>>
> >>>
> >>>> It's hard to say.  If you can provide me --debug output from
> >>>> ipmi-sensors, I can maybe analyze it deeper.
> >>>>
> >>>
> >>>
> >>> $ sudo ipmi-sensors --debug
> >>> see attachment
> >>>
> >>> $ sudo ipmi-sensors --bridge
> >>> ipmi_sensor_read: internal IPMI error
> >>>
> >>>
> >>>> Does any HP specific software work for you for all these
> >>>> sensors?  If
> >>>> their software does, and ipmitool/FreeIPMI does not, it indicates
> >>>> there
> >>>> is something kooky on your motherboard.
> >>>>
> >>>>
> >>> I don't know, I don't have access to Windows.  If it won't work
> >>> with FreeIPMI, I understand that my motherboard is old, but it just
> >>> seems strange that I can get the sensor reading using ipmi-raw, but
> >>> not ipmi-sensors.
> >>>
> >>> Thanks a lot for your help.
> >>> --Eric
> >>>
> >>>
> >>>> Al
> >>>>
> >>>> On Sun, 2010-05-02 at 10:10 -0700, Eric Pooch wrote:
> >>>>
> >>>>> I am having several problems on my HP ProLiant DL140 G1.
> >>>>>
> >>>>> First, all my sensor values come back as [NA] even though most
> >>>>> work
> >>>>> properly under ipmitool.
> >>>>> I get the debug errors from ipmi-sensors:
> >>>>>
> >>>>> Error reading event status for sensor #09: Invalid command
> >>>>> ...
> >>>>> Error reading event enable for sensor #09: Invalid command
> >>>>>
> >>>>> When I try ipmi-raw to send those commands, I also get the same
> >>>>> error, so I think the commands are not supported on the sensors.
> >>>>> The
> >>>>> sensors are returning the proper information when I send a raw
> >>>>> command to get their readings. (see below)
> >>>>>
> >>>>> However, none of my temp sensors work properly in either
> >>>>> freeipmi or
> >>>>> ipmitool and I get a debug error:
> >>>>> Error reading sensor CPU1 Temperature (#0a): Destination
> >>>>> unavailable
> >>>>>
> >>>>> I get the same "destination unavailable" message from event status
> >>>>> and
> >>>>> event enable. However, when I enter the raw ipmi command to
> >>>>> read the
> >>>>> temp sensor:
> >>>>> sudo ipmi-raw 0 04 2d 0A
> >>>>>
> >>>>> it responds correctly:
> >>>>> rcvd: 2D 00 1B C0 00 00
> >>>>>
> >>>>> The 1B is the correct temperature in Celsius that rises with
> >>>>> processor load.  It is definitely the correct temperature.
> >>>>> I have tried the bridge mode but I get an error also.
> >>>>> It seems like the sensor is responding correctly, but is
> >>>>> disabled as
> >>>>> far as the sdr is concerned?  I can't enable it through a raw
> >>>>> command
> >>>>> because none of the sensors respond to the "event status" or
> >>>>> "event
> >>>>> enable" commands.  So, I think maybe there is something that is
> >>>>> disabling the Temp sensor at another level.  I noticed on the HP
> >>>>> lightsout user guide that they have a setting "o PEF Control—
> >>>>> Enables
> >>>>> or disables the sensor. "
> >>>>> I am not really sure how to make a change that would cause the
> >>>>> sensor
> >>>>> to be enabled.
> >>>>>
> >>>>> Also, bmc-info returns IPMI version 1.0 that is probably not
> >>>>> supported by FreeIPMI, but ipmi-locate, returns "IPMI Version:
> >>>>> 1.5"
> >>>>> for all of the devices.
> >>>>>
> >>>>> Thanks for any help!
> >>>>> _______________________________________________
> >>>>> Freeipmi-users mailing list
> >>>>> address@hidden
> >>>>> http://lists.gnu.org/mailman/listinfo/freeipmi-users
> >>>>>
> >>>>>
> >>>> --
> >>>> Albert Chu
> >>>> address@hidden
> >>>> Computer Scientist
> >>>> High Performance Systems Division
> >>>> Lawrence Livermore National Laboratory
> >>>>
> >>>>
> >>>
> >>>
> >>
> >>
> >>
> >>
> >>
> > --
> > Albert Chu
> > address@hidden
> > Computer Scientist
> > High Performance Systems Division
> > Lawrence Livermore National Laboratory
> >
> >
> 
> 
-- 
Albert Chu
address@hidden
Computer Scientist
High Performance Systems Division
Lawrence Livermore National Laboratory




