Re: [Freeipmi-users] Disable alerting for watchdog timer expiration
From: Al Chu
Subject: Re: [Freeipmi-users] Disable alerting for watchdog timer expiration
Date: Wed, 01 Feb 2012 18:27:14 -0800
Hi Ryan,
Do the options in bmc-watchdog for turning off logging not work? Or
perhaps you're using the ipmi kernel driver bmc watchdog?
Al
On Wed, 2012-02-01 at 16:31 -0800, Ryan Cox wrote:
> Okay... so I figured it out after looking at the IPMI spec.
> ipmi-raw 0 6 0x24 0x80 0x01 0x00 0x00 0x96 0x00
>
> The 0x80 is the trick. The bit that is set is a "don't log" bit. That
> takes care of it properly. The command above uses a 15 second timer,
> don't log, and hard reset.
>
> The fields of the Set Watchdog Timer command are documented at
> ftp://download.intel.com/design/servers/ipmi/IPMIv2_0rev1_0.pdf on page 378.
>
> Ryan
>
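For readers decoding the raw command quoted above, here is a minimal Python sketch of how its data bytes could be assembled. It assumes the Set Watchdog Timer layout as I understand it from the IPMI v2.0 specification: a timer-use byte whose bit 7 is "don't log", a timer-actions byte, a pre-timeout interval, a flags-clear byte, and a 16-bit countdown in 100 ms units, least significant byte first. The function and constant names are hypothetical and are not part of FreeIPMI; only the `ipmi-raw` invocation itself comes from the thread.

```python
# Illustrative helper names; not part of FreeIPMI.
DONT_LOG = 0x80            # bit 7 of the timer-use byte: suppress SEL logging
ACTION_HARD_RESET = 0x01   # timer-actions value used in the thread


def set_watchdog_args(seconds, action=ACTION_HARD_RESET, dont_log=True, timer_use=0x00):
    """Return the six data bytes for a Set Watchdog Timer request (assumed layout)."""
    ticks = round(seconds * 10)  # countdown is expressed in 100 ms units
    if not 0 <= ticks <= 0xFFFF:
        raise ValueError("countdown does not fit in 16 bits")
    timer_use_byte = (DONT_LOG if dont_log else 0x00) | (timer_use & 0x07)
    return [
        timer_use_byte,   # timer use, with the "don't log" bit
        action & 0x07,    # timeout action, no pre-timeout interrupt
        0x00,             # pre-timeout interval
        0x00,             # timer-use expiration flags to clear
        ticks & 0xFF,     # countdown, least significant byte
        ticks >> 8,       # countdown, most significant byte
    ]


def ipmi_raw_command(args, lun=0, netfn=0x06, cmd=0x24):
    """Format an ipmi-raw command line in the style used in the thread."""
    data = " ".join(f"0x{b:02x}" for b in args)
    return f"ipmi-raw {lun} {netfn} 0x{cmd:02x} {data}"


if __name__ == "__main__":
    print(ipmi_raw_command(set_watchdog_args(15)))
```

With a 15 second timeout the sketch reproduces the byte sequence from the message above, so 0x96 (150) is simply 15 seconds counted in 100 ms ticks.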
> On 02/01/2012 03:29 PM, Ryan Cox wrote:
> > Hello all,
> >
> > I would like to change the default behavior for our Dell servers
> > (mostly blades) to stop alerting at all when the watchdog timer
> > expires. Our HP ProLiant BL460c G1 servers don't alert on timer
> > expiration. I was hoping to see if there was a difference between the
> > configs, but the HP servers don't work with ipmi-pef-config ("Unable
> > to get Number of Alert Policy Entries") and have very few entries in
> > ipmi-sensors, none of which are related to the watchdog.
> >
> > What I would like to happen when a watchdog timer expires:
> > 1) The system will reboot
> > 2) *No* SNMP trap sent by the server itself
> > 3) *No* SNMP trap sent by the chassis (if the server is a blade)
> > 4) *No* event inserted in the SEL
> > 5) *No* amber lights on the server or chassis
> >
> > What I have accomplished:
> > 1) The system will reboot
> > 2) *No* SNMP trap sent by the server itself (the following worked:
> > "ipmi-pef-config -c -e Event_Filter_17:Enable_Filter=No")
> >
> > The SEL is populated and an alert sent whether the action is to reboot
> > the server or do nothing.
> >
> > What I have tried:
> > I set everything in "ipmi-sensors-config -S 44_OS_Watch" to be "No":
> > Section 44_OS_Watch
> >     ## Possible values: Yes/No
> >     Enable_All_Event_Messages                 No
> >     ## Possible values: Yes/No
> >     Enable_Scanning_On_This_Sensor            No
> >     ## Possible values: Yes/No
> >     Enable_Assertion_Event_Timer_Expired      No
> >     ## Possible values: Yes/No
> >     Enable_Assertion_Event_Hard_Reset         No
> >     ## Possible values: Yes/No
> >     Enable_Assertion_Event_Power_Down         No
> >     ## Possible values: Yes/No
> >     Enable_Assertion_Event_Power_Cycle        No
> >     ## Possible values: Yes/No
> >     Enable_Deassertion_Event_Timer_Expired    No
> >     ## Possible values: Yes/No
> >     Enable_Deassertion_Event_Hard_Reset       No
> >     ## Possible values: Yes/No
> >     Enable_Deassertion_Event_Power_Down       No
> >     ## Possible values: Yes/No
> >     Enable_Deassertion_Event_Power_Cycle      No
> > EndSection
> >
> > This changes the output of ipmi-sensors for that host to:
> > 44 | OS Watch | Watchdog 2 | N/A | N/A | N/A
> >
> > An unmodified host has this:
> > 44 | OS Watch | Watchdog 2 | N/A | N/A | 'OK'
> >
> > After the timer expires, this shows up in the SEL:
> > ID | Date | Time | Name | Type | Event Direction | Event
> > 1 | Feb-01-2012 | 07:39:18 | SEL | Event Logging Disabled | Assertion Event | Log Area Reset/Cleared
> > 2 | Feb-01-2012 | 07:39:23 | OS Watch | Watchdog 2 | Assertion Event | Timer expired, status only
> > 3 | Feb-01-2012 | 07:39:23 | OS Watch | Watchdog 2 | Assertion Event | Timer expired, status only
> >
> > If I don't disable the SNMP traps from the server for watchdog timer
> > expiration, I get a trap for DELL-ASF-MIB::asfTrapASRTimeout. A blade
> > chassis will always send a trap stating that the blade changed from
> > normal to critical.
> >
> > Any other ideas? Is this something I need to ask Dell about?
> >
> > Thanks,
> > Ryan
> >
> >
> > --
> > Ryan Cox
> > Systems Administrator
> > Fulton Supercomputing Lab
> > Brigham Young University
> >
> > http://tech.ryancox.net
>
--
Albert Chu
address@hidden
Computer Scientist
High Performance Systems Division
Lawrence Livermore National Laboratory