grub-devel
Re: About firmware facilities


From: Brendan Trotter
Subject: Re: About firmware facilities
Date: Wed, 16 Sep 2009 01:31:46 +0930

Hi,

On Tue, Sep 15, 2009 at 6:29 PM, Vladimir 'phcoder' Serbinenko
<address@hidden> wrote:
> On Tue, Sep 15, 2009 at 1:23 AM, Brendan Trotter <address@hidden> wrote:
>> On Tue, Sep 15, 2009 at 6:19 AM, Vladimir 'phcoder' Serbinenko
>> <address@hidden> wrote:
>>> On Mon, Sep 14, 2009 at 10:12 PM, Brendan Trotter <address@hidden> wrote:
>>>> On Tue, Sep 15, 2009 at 4:41 AM, Pavel Roskin <address@hidden> wrote:
>>>>> On Tue, 2009-09-15 at 04:27 +0930, Brendan Trotter wrote:

>>>>> GRUB can serve as BIOS together with Coreboot.
>>>>
>>>> I know. It'll break my code.
>>>>
>>>> The multi-boot specification says "However, other machine state should
>>>> be left by the boot loader in normal working order, i.e. as
>>>> initialized by the bios (or DOS, if that's what the boot loader runs
>>>> from)."; although I seem to remember it saying words to the effect of
>>>> "the firmware should be left in a usable state".
>>>>
>>> Firmware if present is left in usable state. However it may simply not
>>> be present.
>>
>> So if the firmware is present, GRUB won't alter the state that the
>> firmware left the hardware in (or if it does it'll restore all
>> hardware to the default firmware state before starting the OS);
>> including all devices that GRUB uses via its own device drivers
>> (except for specific things mentioned in the multi-boot specification,
>> like the A20 gate)?
>>
> No. Usable means only that firmware isn't destroyed. Any device may
> be in a different state

Any device (that the firmware assumes is in a certain state) may be
left in a different state (that the firmware no longer knows about)?

For a very simple example, imagine if the BIOS leaves the floppy motor
on, and GRUB's own floppy driver uses the floppy and then turns the
motor off. Then the OS uses the firmware to read from floppy, but the
firmware thinks the floppy motor is still on and attempts to read from
the floppy without turning the floppy motor on.

If GRUB has its own device drivers, and GRUB doesn't restore devices
to the state that the firmware expects the devices to be in, then the
firmware is unusable.
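
A toy sketch of the floppy example, assuming the firmware caches the
motor state instead of re-reading it from the hardware. All class and
function names are made up for illustration; none of this is real BIOS
or GRUB code.

```python
class FloppyHardware:
    """Toy model of the drive: the only state is whether the motor is on."""
    def __init__(self):
        self.motor_on = False


class Firmware:
    """Toy firmware that caches what it believes the motor state is."""
    def __init__(self, hw):
        self.hw = hw
        self.believes_motor_on = False

    def motor_start(self):
        self.hw.motor_on = True
        self.believes_motor_on = True

    def read_sector(self):
        # The firmware only spins the motor up if its cached state says "off".
        if not self.believes_motor_on:
            self.motor_start()
        return "data" if self.hw.motor_on else "timeout"


def boot_loader_native_driver(hw):
    """Hypothetical boot loader driver: uses the drive, then stops the motor."""
    hw.motor_on = True
    hw.motor_on = False


hw = FloppyHardware()
bios = Firmware(hw)
bios.motor_start()             # firmware leaves the motor running
boot_loader_native_driver(hw)  # hardware changes behind the firmware's back
result = bios.read_sector()    # firmware skips the spin-up and the read fails
```

The read only works again if the boot loader puts the motor back into
the state the firmware remembers before handing over to the OS.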

If an OS can't use the firmware, then the OS must rely on GRUB for
everything instead, including strange "OS specific" things that nobody
has seen any other OS do before.

>>>> Due to limitations in the original multi-boot specification my code
>>>> switches back to real mode and uses the BIOS to do memory detection,
>>>> do video mode detection, switch video modes and gather other
>>>> information.
>>>
>>> Have you actually read the multiboot specification? Booter passes info
>>> about memory and video mode in mbi (video for multiboot isn't
>>> implemented yet). If you need firmware for basic bootup you're clearly
>>> doing something wrong and are firmware-dependent. Of course it's your
>>> freedom to make suboptimal software.
>>
>> I've read the multi-boot specification. I've also read the code in
>> GRUB-legacy that does memory detection, and I'm unwilling to allow my
>> code to rely on it for "quality control" reasons. Without going into
>> details, GRUB-legacy tends to do a "minimal" job and then expects the
>> user to fix the problem if/when it goes wrong (but even then it only
>> offers an "uppermem" command without providing a way for the user to
>> specify a complete system memory map).
>>
> What is "minimal job" and "quality control"? We use standard
> E820+(optionally)badram command. I've seen no OS do any more than
> this.

My code tries "int 0x15, eax=0xE820" expecting 24 bytes per area (ACPI
3.0); then it tries "int 0x15, eax=0xE820" expecting 20 bytes per
area. If "int 0x15, eax=0xE820" isn't supported by the BIOS then you
can assume it's an old computer (and old computers are painful).

It tries "int 0x15, ax=0xE801", then "int 0x15, ah=0xC7", then "int
0x15, ah=0x8A", then "int 0x15, ax=0xDA88", then "int 0x15, ah=0x88",
then the CMOS (read via I/O ports 0x70 and 0x71). If all of this
fails (which does happen on some computers) then it does manual
probing.
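
The ordering above amounts to a fallback chain. A minimal sketch,
assuming each method is wrapped in a function that returns None when
the BIOS doesn't support it; the probe functions here are stand-in
stubs for illustration, not real BIOS calls.

```python
def detect_memory(probes):
    """Try each probe in order; return (name, result) for the first success.

    Falls back to ("manual probing", None) if every probe returns None.
    """
    for name, probe in probes:
        result = probe()
        if result is not None:
            return name, result
    return "manual probing", None


def unsupported():
    """Stub: pretend the BIOS does not implement this function."""
    return None


def fake_e801():
    """Stub: pretend E801 reports 15 MiB below 16 MiB and 48 MiB above it."""
    return {"kib_1m_to_16m": 15 * 1024, "blocks_64k_above_16m": 768}


probes = [
    ("int 0x15, eax=0xE820 (24-byte entries)", unsupported),
    ("int 0x15, eax=0xE820 (20-byte entries)", unsupported),
    ("int 0x15, ax=0xE801", fake_e801),
    ("int 0x15, ah=0xC7", unsupported),
    ("int 0x15, ah=0x8A", unsupported),
    ("int 0x15, ax=0xDA88", unsupported),
    ("int 0x15, ah=0x88", unsupported),
    ("CMOS", unsupported),
]

method, info = detect_memory(probes)
last_resort = detect_memory([(name, unsupported) for name, _ in probes])
```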

Some of the old BIOS functions have limited range - for example
they'll return "number of KiB blocks at 0x00100000" as a 16-bit
integer, and can't return more than 64 MiB. In this case my code won't
know if the value returned by the BIOS has been limited to 0xFFFF, so
it'll do manual probing to detect any more RAM above the reported
area. Some computers have an "ISA hole" from 0x00F00000 to 0x00FFFFFF.
Because of this all older BIOS functions that report "amount of RAM at
0x00100000" may return 14 MiB (from 0x00100000 to 0x00EFFFFF) when
there's actually another area of RAM at 0x01000000. In this case my
code won't know if the value returned by the BIOS is right or not, and
it'll do manual probing to detect any more RAM at 0x01000000. If my
code has to do manual probing, then it assumes there's an "ISA hole"
from 0x00F00000 to 0x00FFFFFF (regardless of whether there is or not)
as this hole was used for memory mapped video cards (which would seem
like RAM).
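
The two ambiguous cases can be written down as arithmetic. This is a
simplified sketch of the decision only, assuming the BIOS function
returns a count of KiB starting at 0x00100000; probe_start() is a
hypothetical helper name, and the actual probing is not shown.

```python
KIB = 1024
EXT_BASE = 0x00100000        # extended memory starts at 1 MiB
ISA_HOLE_START = 0x00F00000  # 15 MiB
ISA_HOLE_END = 0x01000000    # 16 MiB (first address above the hole)


def probe_start(reported_kib):
    """Return the address where manual probing should begin, or None.

    reported_kib is the 16-bit "KiB at 0x00100000" value from the BIOS.
    """
    end = EXT_BASE + reported_kib * KIB
    if reported_kib == 0xFFFF:
        # The value may have been limited to 16 bits: probe above it.
        return end
    if end == ISA_HOLE_START:
        # Exactly 14 MiB reported: RAM may continue above the ISA hole.
        return ISA_HOLE_END
    return None


after_hole = probe_start(14 * 1024)
after_limit = probe_start(0xFFFF)
no_probe = probe_start(8 * 1024)
```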

For all BIOS functions used my code avoids all known BIOS bugs (and
there's plenty of them). This includes "sanitizing" the data returned
from "int 0x15, eax=0xE820" - sorting the list and handling any
overlapping areas.
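
A minimal sketch of the sorting and overlap handling, not my actual
code. It assumes that where two areas overlap the more restrictive
type should win, and the PRIORITY table is my own choice for the
sketch. Type numbers follow the usual E820 values (1 = usable, 2 =
reserved, 3 = ACPI reclaimable, 4 = ACPI NVS, 5 = unusable).

```python
# Higher number wins when areas overlap (an assumption for this sketch).
PRIORITY = {1: 0, 3: 1, 4: 2, 2: 3, 5: 4}


def sanitize(entries):
    """Take (base, length, type) tuples in any order, possibly overlapping.

    Return a sorted, non-overlapping list with adjacent same-type areas merged.
    """
    spans = [(b, b + l, t) for b, l, t in entries if l > 0]
    # Every start and end address is a potential boundary in the output.
    points = sorted({p for b, e, _ in spans for p in (b, e)})
    out = []
    for start, end in zip(points, points[1:]):
        types = [t for b, e, t in spans if b <= start and end <= e]
        if not types:
            continue  # nothing reported here: leave a gap in the map
        # Unknown types are treated like "reserved".
        typ = max(types, key=lambda t: PRIORITY.get(t, 3))
        if out and out[-1][2] == typ and out[-1][0] + out[-1][1] == start:
            prev_base, prev_len, _ = out[-1]
            out[-1] = (prev_base, prev_len + (end - start), typ)
        else:
            out.append((start, end - start, typ))
    return out


raw = [
    (0x00100000, 0x00300000, 1),  # usable, 1 MiB to 4 MiB
    (0x00000000, 0x0009FC00, 1),  # usable low memory, listed out of order
    (0x00200000, 0x00100000, 2),  # reserved, overlaps the first entry
]
clean = sanitize(raw)
```

In this example the overlapping reserved area splits the usable area
in two, and the list comes back in address order.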

I've never needed to provide a way for the end-user to override my
memory detection.

From what I remember, Linux uses about 3 of these BIOS functions
(probably the same 3 BIOS functions GRUB uses) and does work around
most of the BIOS bugs. Linux does (need to) provide a way for the
end-user to override its memory detection (and Linux does provide
adequate facilities for this).

I checked the code for GRUB 1.96 and it uses 3 BIOS functions ("int
0x15, eax=0xE820", "int 0x15, ax=0xE801" and "int 0x15, ah=0x88") and
doesn't work around any BIOS bugs at all. The memory detection code in
Linux is better (but the memory detection code in Linux is still not
good enough for my purposes, because in rare cases the end-user may
need to override it).

>> For memory detection, ACPI 3.0 allows the BIOS ("INT 15H, E820H") to
>> return extended attributes - mostly only a volatile/non-volatile flag.
>> This isn't in GRUB's information. ACPI 3.0 also allows the BIOS to
>> return areas of the type "AddressRangeUnusable" (e.g. faulty RAM).
> This is mostly unnecessary. Basically you need only to know if you can
> use a memory range or not. The only useful additional code would be
> ReclaimMemory

To handle standby states correctly the OS may need to know which areas
are volatile and which areas aren't (which can include knowing the
difference between volatile system areas and non-volatile system
areas). Some OSs also want to know if there's any faulty RAM present
in the system or not (and additional information about any area
reserved for "hot-plug" RAM, and NUMA ranges, but that information
comes from ACPI tables not BIOS functions so the OS can get this
information without GRUB).

>> If
>> the BIOS reports that the area from 0x00100000 to 0x001FFFFF is
>> faulty, then how is GRUB planning to handle that? My own code (if
>> using its own boot loaders rather than GRUB) can tolerate RAM
>> failures anywhere in RAM except for about 64 KiB in low memory.
> GRUB can't do this right now because it doesn't receive badram info
> soon enough. And even if it does most kernels expect first MiB to be
> usable.

You're right - all kernels that are designed to use "multi-boot
specification version 1" expect to be loaded at 0x00100000 and that
RAM below the EBDA is usable. I'm not sure what kernels designed for
"multi-boot specification version 2" expect...

>> While I'm on the subject, I also want a list of RAM areas that the
>> boot loader has relied on. That way if a 24/7 server detects that any
>> RAM that the boot loader relies on has become faulty it can warn the
>> user that the computer may not be able to boot. Currently I assume
>> that all multi-boot compliant boot loaders (including GRUB) rely on
>> all RAM below 0x00100000, but it's a dodgy hack I'd rather avoid.
> Such list is a blatant encapsulation breach. If you want such test,
> add it to bootloader, not OS.

When the OS is running and detects a RAM fault, you want the OS to run
a copy of GRUB (maybe inside an emulator or something) so the OS can
tell GRUB about the RAM fault, and GRUB can tell the OS if the RAM
fault might cause problems if the computer is rebooted (and so the OS
can send an email to the network administrators or something *before*
the computer is rebooted)?

You can't assume that the OS that is running is the same OS that
installed GRUB; or that the OS that is running has access to wherever
GRUB is installed; or that GRUB will be able to detect any faulty RAM
during boot.

>> In the memory map, GRUB should mark areas that it has used to pass
>> information to the OS (e.g. the multi-boot information structure) as
>> "multi-boot reclaimable" (in a similar way that ACPI uses the "ACPI
>> reclaimable" type). This would make it easier for the OS to avoid
>> overwriting this data before it attempts to read it.
> This information is available with a simple loop over mbi. I would
> rather avoid overcomplicating the standard because it increases a
> chance of having "half-compliant" OSes and "half-compliant" booters.

I'd rather have "fully compliant" OSes that are easier to write than
"fully compliant" OSes that are a pain in the neck to write because
you have to parse everything in the multi-boot information structure
before you can write to any RAM (except for your own ".bss").

If I can't rely on the firmware (like I currently do) then I have to
rely on GRUB, and have to copy everything from the multi-boot
information structure into my ".bss". So, how much extra space do I
need to allow in my ".bss"? What is the maximum number of drive
structures? What's the maximum number of memory map entries? What's
the maximum length of the command line? The multi-boot specification
doesn't say.
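
To illustrate with the memory map: in the version 1 layout each entry
carries its own 'size' field (which doesn't count itself), so the
number of entries is only known after walking the whole buffer. A
sketch, assuming little-endian entries of 'size', 64-bit base, 64-bit
length and 32-bit type; walk_mmap() is a made-up name and the buffer
is built by hand rather than taken from a real boot.

```python
import struct


def walk_mmap(buf):
    """Yield (base, length, type) for each entry in a memory-map buffer."""
    off = 0
    while off + 4 <= len(buf):
        (size,) = struct.unpack_from("<I", buf, off)
        if size < 20 or off + 4 + size > len(buf):
            break  # malformed or truncated entry: stop rather than guess
        base, length, typ = struct.unpack_from("<QQI", buf, off + 4)
        yield base, length, typ
        off += 4 + size  # 'size' does not include the size field itself


# Hand-built two-entry buffer for the example.
buf = struct.pack("<IQQI", 20, 0x00000000, 0x0009FC00, 1)
buf += struct.pack("<IQQI", 20, 0x00100000, 0x07F00000, 1)
entries = list(walk_mmap(buf))
```

Nothing in the buffer says in advance how many entries follow, which
is why a fixed-size copy in ".bss" has to guess.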

The alternative is for the OS to create its own memory management
structures in its ".bss" that excludes RAM used by the multi-boot
information structure, then use these memory management structures to
allocate "actually free" RAM. In this case, the OS ends up building
the equivalent of a system memory map that includes "multi-boot
reclaimable" areas because the boot loader didn't.
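
A sketch of what the OS ends up doing in that case: carving the
areas used by the boot loader out of usable RAM and tagging them. The
MB_RECLAIMABLE value and the mark_reclaimable() name are hypothetical;
no such type exists in the current specification.

```python
USABLE = 1
MB_RECLAIMABLE = 0x1000  # hypothetical type value, not from any specification


def mark_reclaimable(memory_map, used):
    """Split usable areas around the ranges the boot loader has used.

    memory_map: sorted, non-overlapping (base, length, type) tuples.
    used: (base, length) ranges holding boot information.
    """
    out = []
    for base, length, typ in memory_map:
        if typ != USABLE:
            out.append((base, length, typ))
            continue
        cur, end = base, base + length
        for ubase, ulen in sorted(used):
            ustart, uend = max(ubase, cur), min(ubase + ulen, end)
            if ustart >= uend:
                continue  # this range doesn't touch what is left of the area
            if ustart > cur:
                out.append((cur, ustart - cur, USABLE))
            out.append((ustart, uend - ustart, MB_RECLAIMABLE))
            cur = uend
        if cur < end:
            out.append((cur, end - cur, USABLE))
    return out


memory_map = [(0x00100000, 0x00700000, USABLE)]
used = [(0x00180000, 0x1000), (0x00200000, 0x2000)]
result = mark_reclaimable(memory_map, used)
```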

>> The area types
>> need to be "architecture independent" types too (e.g. GRUB converts
>> ACPI area types, UEFI area types, etc into "standard multi-boot area
>> types").
> It already is.

You mean (from the multi-boot specification) "`type' is the variety of
address range represented, where a value of 1 indicates available ram,
and all other values currently indicated a reserved area."?

No sane OS complies with the specification (because the specification
isn't adequate, and doesn't allow ACPI reclaimable areas to be
reclaimed or anything else). All sane (non-compliant) OSs assume that
GRUB copies data "as is" from "int 0x15, eax=0xE820" directly into the
multi-boot information structure (because this is what "GRUB-legacy"
actually does do), and that 'type' means the same as it does for "int
0x15, eax=0xE820".
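
The difference between the two readings, as a small sketch. The type
names are the usual "int 0x15, eax=0xE820" meanings; the function
names are made up for illustration.

```python
E820_TYPES = {
    1: "usable",
    2: "reserved",
    3: "ACPI reclaimable",
    4: "ACPI NVS",
    5: "unusable",
}


def spec_view(area_type):
    """What the specification's wording allows: 1 is RAM, the rest is reserved."""
    return "available" if area_type == 1 else "reserved"


def e820_view(area_type):
    """What OSs assume in practice: 'type' is the raw E820 value."""
    return E820_TYPES.get(area_type, "reserved")


by_spec = spec_view(3)
in_practice = e820_view(3)
```

Read strictly by the specification, an ACPI reclaimable area is just
"reserved" and can never be reclaimed.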

>> The "Boot device" field in the multi-boot information structure should
>> be improved to handle a variety of conditions; including if the disk
>> was an emulated disk (e.g. "El Torito" emulating a hard drive). The
>> BIOS drive number isn't much use (especially if the firmware is
>> coreboot, UEFI, OpenFirmware, etc), and should be replaced with
>> something that identifies the corresponding drive structure (this
>> includes USB).
> Boot device shouldn't be used at all. It was a mistake. Booter has no
> good way to know how OS will see the device. You should pass this
> parameter via commandline either as device name or UUID. You have
> scripting to automate this

A user who's using GRUB to boot Ubuntu decides to install my code in
another partition, then modify GRUB's configuration (in Ubuntu) so
that GRUB will also boot my code. Now my code needs to rely on the
user to not stuff up GRUB's configuration for my code?

>> The "boot loader name" field is nice, but it needs a "boot loader
>> version" field to go with it.
> it's a part of name. This field is more for displaying anyway and OS
> shouldn't do any checks based on it
>> A "firmware type" field is also needed.
> Can you respond in appropriate thread?

Ok - where is the appropriate thread?

>> The OS image also needs a different magic number to indicate that the
>> OS image is designed for future versions of the multi-boot
>> specification (rather than the old/current version). If the OS image
>> uses the new magic number, then the OS image must also include a
>> "version of the multi-boot specification that this image complies
>> with" field. If the OS image indicates that it's intended for a newer
>> version of the multi-boot specification than the boot loader complies
>> with, then the boot loader refuses to boot and displays a "this boot
>> loader needs to be upgraded" error. If the OS image has the old magic
>> number, and if the firmware is "PC BIOS" then the boot loader should
>> boot the old OS image. If the OS image has the old magic number, and
>> if the firmware is not "PC BIOS" then the boot loader refuses to boot
>> and displays a "this OS requires a PC BIOS" error message.
> Already implemented through feature fields

Is there a private version of the multi-boot specification that I'm
not aware of yet; or does GRUB fail to comply with the current
multi-boot specification?

>> In a similar way, if the OS image has the old magic number and the
>> boot loader is not running on 80x86 then the boot loader refuses to
>> boot and displays an error message. If the OS image has the new magic
>> number then it must also include a field that indicates which
>> architecture the OS is intended for, and the boot loader must check
>> this field and display a "this OS is intended for a different
>> architecture" error message (and refuse to boot) if it's wrong.
>>
> multiboot1 and multiboot2 have different magics

See above.

>>> Read it here: 
>>> http://www.gnu.org/software/grub/manual/multiboot/multiboot.html
>>
>> That's the old version of the multi-boot specification (for
>> GRUB_legacy). I'm looking forward to a new version of the multi-boot
>> specification (for GRUB 2) that is designed to support different
>> architectures, different types of firmware on the same architecture,
>> etc; and hopefully other improvements.
>>
> multiboot1 ISN'T deprecated and grub2 supports multiboot1

See above.


Cheers,

Brendan



