grub-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH] sparc64: boot performance improvements


From: Eric Snowberg
Subject: Re: [PATCH] sparc64: boot performance improvements
Date: Tue, 27 Oct 2015 10:34:02 -0600

> On Oct 25, 2015, at 9:20 AM, Vladimir 'φ-coder/phcoder' Serbinenko 
> <address@hidden> wrote:
> 
> On 11.10.2015 18:49, Eric Snowberg wrote:
>> 
>>> On Oct 10, 2015, at 1:35 AM, Vladimir 'phcoder' Serbinenko <address@hidden> 
>>> wrote:
>>> 
>>> 
>>> Le 10 oct. 2015 3:31 AM, "Eric Snowberg" <address@hidden> a écrit :
>>>> 
>>>> 
>>>>> On Oct 9, 2015, at 12:50 AM, Andrei Borzenkov <address@hidden> wrote:
>>>>> 
>>>>> On Thu, Oct 8, 2015 at 2:20 AM, Eric Snowberg <address@hidden> wrote:
>>>>>> 
>>>>>>> On Oct 7, 2015, at 2:36 AM, Andrei Borzenkov <address@hidden> wrote:
>>>>>>> 
>>>>>>> On Tue, Oct 6, 2015 at 8:39 PM, Eric Snowberg <address@hidden> wrote:
>>>>>>>> Keep of devices open.  This can save 6 - 7 seconds per open call and
>>>>>>>> can decrease boot times from over 10 minutes to 29 seconds on
>>>>>>>> larger SPARC systems.  The open/close calls with some vendors'
>>>>>>>> SAS controllers cause the entire card to be reinitialized after
>>>>>>>> each close.
>>>>>>>> 
>>>>>>> 
>>>>>>> Is it necessary to close these handles before launching kernel?
>>>>>> 
>>>>>> On SPARC hardware it is not necessary.  I would be interested to hear if 
>>>>>> it is necessary on other IEEE1275 platforms.
>>>>>> 
>>>>>>> It sounds like it can accumulate quite a lot of them - are there any
>>>>>>> memory limits/restrictions? Also your patch is rather generic and so
>>>>>>> applies to any IEEE1275 platform, I think some of them may have less
>>>>>>> resources. Just trying to understand what run-time cost is.
>>>>>> 
>>>>>> The most I have seen are two entries in the linked list.  So the 
>>>>>> additional memory requirements are very small.  I have tried this on a 
>>>>>> T2000, T4, and T5.
>>>>> 
>>>>> I thought you were speaking about *larger* SPARC servers :)
>>>>> 
>>>>> Anyway I'd still prefer if this would be explicitly restricted to
>>>>> Oracle servers. Please look at
>>>>> grub-core/kern/ieee1275/cmain.c:grub_ieee1275_find_options() if new
>>>>> quirk can be added to detect your platform(s).
>>>>> 
>>>> 
>>>> That makes sense, I’ll restrict it to all sun4v equipment.
>>>> 
>>> Not good enough. Some of sun4v probably have the bug I told about in the 
>>> other e-mail
>> 
>> I can get access to every sun4v platform.  Could you provide any additional 
>> information on which sun4v systems had this failure.  If so, I’ll try to 
>> submit a fix for that problem as well.  It seems like there could be a 
>> better way to handle this than resetting the SAS or SATA HBA before each 
>> read operation.
>> 
>> 
> See http://permalink.gmane.org/gmane.comp.boot-loaders.grub.devel/10533
> for holding open handle.

The patch I submitted will prevent the same device from being opened multiple 
times concurrently.   Just as the link above describes. The only time this will 
not be the case is when grub references the same device name differently.  For 
example I have seen on one system where it alternates between the SAS WWN and 
the typical OpenBoot alias for the drive name.  I intend on fixing this 
problem.  But so far with this system OpenBoot has been able to handle it.

The problem with the last_ihandle caching within ofdisk.c is that the 
last_devpath pointer changes with each file open call.  So there really isn’t 
much benefit, since the device will always be closed and reopened, since the 
disk->data pointer changes, even though it contains the same string.  So on a 
typical boot, there are around 33 open and close calls taking place.  This 
causes the SAS controller to be restarted 33 times.  With the patch I 
submitted, all the last_ihandle caching, could be removed. 

> Do you readily have a scenario when you need multiple handles open?

The other problem my patch solves is where on some systems, grub will walk the 
entire device tree and open every single device, multiple times. I do not 
understand why it does this on some systems, but when it does, it takes 20+ 
minutes before the grub menu appears.  With the patch I submitted, the menu 
will appear in under 29 seconds on all sun4v systems.

I have made a V2 patch with the changes Andrei recommended, where the code will 
only be used on sun4v platforms.  I have tested it on almost every generation 
of the sun4v platform.  I still have a few platforms to go.   Please let me 
know how you want me to proceed with this V2 patch if this testing looks 
promising.

> Can you try following naive optimisation:
> diff --git a/grub-core/disk/ieee1275/ofdisk.c
> b/grub-core/disk/ieee1275/ofdisk.c
> index 331769b..34237f9 100644
> --- a/grub-core/disk/ieee1275/ofdisk.c
> +++ b/grub-core/disk/ieee1275/ofdisk.c
> @@ -448,13 +448,6 @@ grub_ofdisk_open (const char *name, grub_disk_t disk)
> static void
> grub_ofdisk_close (grub_disk_t disk)
> {
> -  if (disk->data == last_devpath)
> -    {
> -      if (last_ihandle)
> -       grub_ieee1275_close (last_ihandle);
> -      last_ihandle = 0;
> -      last_devpath = NULL;
> -    }
>   disk->data = 0;
> }

This optimization will not work.  It will cause the same device node to be 
opened multiple times concurrently, which can lead to problems on some sun4v 
systems.


> 
>> There are many problems with the upstream grub2 implementation for sun4v.  I 
>> will be submitting additional follow on patches, since it currently will not 
>> even boot to the grub menu.  My intention is to get grub2 working on every 
>> sun4v platform.
>> 
> Could you start with

I think something got cut off here.

>> 
>>>>> 
>>>>>> 
>>>>>> The path name sent into the grub_ieee1275_open function is not 
>>>>>> consistent throughout the code, even though it is referencing the same 
>>>>>> device.  Originally I tried having a global containing the path sent 
>>>>>> into grub_ieee1275_open, but this failed due to the various ways of 
>>>>>> referencing the same device name.  That is why I added the linked list 
>>>>>> just to be safe.  Caching the grub_ieee1275_ihandle_t this way saves a 
>>>>>> significant amount of time, since OF calls are expensive.  We were 
>>>>>> seeing the same device opened 50+ times in the process of displaying the 
>>>>>> grub menu and then launching the kernel.
>>>>>> 
>>>>>>> 
>>>>>>>> Signed-off-by: Eric Snowberg <address@hidden>
>>>>>>>> ---
>>>>>>>> grub-core/kern/ieee1275/ieee1275.c |   39 
>>>>>>>> ++++++++++++++++++++++++++++++++++++
>>>>>>>> 1 files changed, 39 insertions(+), 0 deletions(-)
>>>>>>>> 
>>>>>>>> diff --git a/grub-core/kern/ieee1275/ieee1275.c 
>>>>>>>> b/grub-core/kern/ieee1275/ieee1275.c
>>>>>>>> index 9821702..30f973b 100644
>>>>>>>> --- a/grub-core/kern/ieee1275/ieee1275.c
>>>>>>>> +++ b/grub-core/kern/ieee1275/ieee1275.c
>>>>>>>> @@ -19,11 +19,24 @@
>>>>>>>> 
>>>>>>>> #include <grub/ieee1275/ieee1275.h>
>>>>>>>> #include <grub/types.h>
>>>>>>>> +#include <grub/misc.h>
>>>>>>>> +#include <grub/list.h>
>>>>>>>> +#include <grub/mm.h>
>>>>>>>> 
>>>>>>>> #define IEEE1275_PHANDLE_INVALID  ((grub_ieee1275_cell_t) -1)
>>>>>>>> #define IEEE1275_IHANDLE_INVALID  ((grub_ieee1275_cell_t) 0)
>>>>>>>> #define IEEE1275_CELL_INVALID     ((grub_ieee1275_cell_t) -1)
>>>>>>>> 
>>>>>>>> +struct grub_of_opened_device
>>>>>>>> +{
>>>>>>>> +  struct grub_of_opened_device *next;
>>>>>>>> +  struct grub_of_opened_device **prev;
>>>>>>>> +  grub_ieee1275_ihandle_t ihandle;
>>>>>>>> +  char *path;
>>>>>>>> +};
>>>>>>>> +
>>>>>>>> +static struct grub_of_opened_device *grub_of_opened_devices;
>>>>>>>> +
>>>>>>>> 
>>>>>>>> 
>>>>>>>> int
>>>>>>>> @@ -452,6 +465,18 @@ grub_ieee1275_open (const char *path, 
>>>>>>>> grub_ieee1275_ihandle_t *result)
>>>>>>>> }
>>>>>>>> args;
>>>>>>>> 
>>>>>>>> +  struct grub_of_opened_device *dev;
>>>>>>>> +
>>>>>>>> +  FOR_LIST_ELEMENTS(dev, grub_of_opened_devices)
>>>>>>>> +    if (grub_strcmp(dev->path, path) == 0)
>>>>>>>> +      break;
>>>>>>>> +
>>>>>>>> +  if (dev)
>>>>>>>> +  {
>>>>>>>> +    *result = dev->ihandle;
>>>>>>>> +    return 0;
>>>>>>>> +  }
>>>>>>>> +
>>>>>>>> INIT_IEEE1275_COMMON (&args.common, "open", 1, 1);
>>>>>>>> args.path = (grub_ieee1275_cell_t) path;
>>>>>>>> 
>>>>>>>> @@ -460,6 +485,11 @@ grub_ieee1275_open (const char *path, 
>>>>>>>> grub_ieee1275_ihandle_t *result)
>>>>>>>> *result = args.result;
>>>>>>>> if (args.result == IEEE1275_IHANDLE_INVALID)
>>>>>>>>   return -1;
>>>>>>>> +
>>>>>>>> +  dev = grub_zalloc(sizeof(struct grub_of_opened_device));
>>>>>>> 
>>>>>>> Error check
>>>>>> 
>>>>>> I’ll resubmit V2 with this change
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>>>> +  dev->path = grub_strdup(path);
>>>>>>> 
>>>>>>> Ditto
>>>>>>> 
>>>>>> 
>>>>>> I’ll resubmit V2 with this change
>>>>>> 
>>>>>> 
>>>>>>>> +  dev->ihandle = args.result;
>>>>>>>> +  grub_list_push(GRUB_AS_LIST_P (&grub_of_opened_devices), 
>>>>>>>> GRUB_AS_LIST (dev));
>>>>>>>> return 0;
>>>>>>>> }
>>>>>>>> 
>>>>>>>> @@ -473,6 +503,15 @@ grub_ieee1275_close (grub_ieee1275_ihandle_t 
>>>>>>>> ihandle)
>>>>>>>> }
>>>>>>>> args;
>>>>>>>> 
>>>>>>>> +  struct grub_of_opened_device *dev;
>>>>>>>> +
>>>>>>>> +  FOR_LIST_ELEMENTS(dev, grub_of_opened_devices)
>>>>>>>> +    if (dev->ihandle == ihandle)
>>>>>>>> +      break;
>>>>>>>> +
>>>>>>>> +  if (dev)
>>>>>>>> +    return 0;
>>>>>>>> +
>>>>>>> 
>>>>>>> How can we come here (i.e. can we get open handle without 
>>>>>>> grub_ieee1275_open)?
>>>>>>> 
>>>>>> 
>>>>>> I don’t see it being possible with SPARC and would assume other 
>>>>>> platforms could not get there either. I’m not familiar with the other 
>>>>>> IEEE1275 platforms to know if this would be appropriate, but If there 
>>>>>> isn’t a platform that needs this close function, could the function be 
>>>>>> changed to do nothing but return 0?
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>>>> INIT_IEEE1275_COMMON (&args.common, "close", 1, 0);
>>>>>>>> args.ihandle = ihandle;
>>>>>>>> 
>>>>>>>> --
>>>>>>>> 1.7.1
>>>>>>>> 
>>>>>>>> 
>>>>>>>> _______________________________________________
>>>>>>>> Grub-devel mailing list
>>>>>>>> address@hidden
>>>>>>>> https://lists.gnu.org/mailman/listinfo/grub-devel
>>>>>>> 
>>>>>>> _______________________________________________
>>>>>>> Grub-devel mailing list
>>>>>>> address@hidden
>>>>>>> https://lists.gnu.org/mailman/listinfo/grub-devel
>>>>>> 
>>>>>> 
>>>>>> _______________________________________________
>>>>>> Grub-devel mailing list
>>>>>> address@hidden
>>>>>> https://lists.gnu.org/mailman/listinfo/grub-devel
>>>>> 
>>>>> _______________________________________________
>>>>> Grub-devel mailing list
>>>>> address@hidden
>>>>> https://lists.gnu.org/mailman/listinfo/grub-devel
>>>> 
>>>> 
>>>> _______________________________________________
>>>> Grub-devel mailing list
>>>> address@hidden
>>>> https://lists.gnu.org/mailman/listinfo/grub-devel
>>> _______________________________________________
>>> Grub-devel mailing list
>>> address@hidden
>>> https://lists.gnu.org/mailman/listinfo/grub-devel
>> 
>> 
>> _______________________________________________
>> Grub-devel mailing list
>> address@hidden
>> https://lists.gnu.org/mailman/listinfo/grub-devel
>> 
> 
> 
> _______________________________________________
> Grub-devel mailing list
> address@hidden
> https://lists.gnu.org/mailman/listinfo/grub-devel




reply via email to

[Prev in Thread] Current Thread [Next in Thread]