Re: About firmware facilities

From: Brendan Trotter
Subject: Re: About firmware facilities
Date: Sun, 20 Sep 2009 18:34:56 +0930


On Sat, Sep 19, 2009 at 11:36 PM, Vladimir 'phcoder' Serbinenko
<address@hidden> wrote:
> Brendan Trotter wrote:
>>> No. Usable means only that firmware isn't destroyed. Any device may
>>> be in a different state
>> Any device (that the firmware assumes is in a certain state) may be
>> left in a different state (that the firmware no longer knows about)?
>> For a very simple example, imagine if the BIOS leaves the floppy motor
>> on, and GRUB's own floppy driver uses the floppy and then turns the
>> motor off. Then the OS uses the firmware to read from floppy, but the
>> firmware thinks the floppy motor is still on and attempts to read from
>> the floppy without turning the floppy motor on.
>> If GRUB has its own device drivers, and GRUB doesn't restore devices
>> to the state that the firmware expects the devices to be in, then the
>> firmware is unusable.
> Most OSes should use their own drivers to access devices.

Most device manufacturers should provide full documentation so that
programmers can write drivers to access the devices; and manufacturers
should provide hardware samples (and documentation) to these
programmers so that the device driver is ready before the device is
made available to the general public. Unfortunately the real world
just doesn't work the same as "should".

The multi-boot specification says the firmware is left in a usable
state. If GRUB doesn't leave the firmware in a usable state, then
either GRUB is wrong or the multi-boot specification is wrong. You
can't have it both ways.

Of course I'm forgetting that GRUB also supports chainloading (e.g.
the chainloaded OS tries to use the firmware to load more of its
data, and the firmware fails because GRUB left a device in an
unexpected state), so non-compliance with the multi-boot specification
isn't the only issue.

>> If an OS can't use the firmware, then the OS must rely on GRUB for
>> everything instead, including strange "OS specific" things that nobody
>> has seen any other OS do before.
> If nobody uses a particular feature in firmware then you shouldn't use
> it either. Unused firmware features are often buggy. Moreover firmware
> on x86 is useful only for bootstrap and once bootstrap is completed you
> should forget it exists except some firmware-specific tasks as setting
> boot device.

So, can I rely on GRUB to (e.g.) set up video in a way that is
suitable for my code, or do I need to use the firmware myself (and
hope that GRUB hasn't left a device in an unexpected state)?

>>>>>> Due to limitations in the original multi-boot specification my code
>>>>>> switches back to real mode and uses the BIOS to do memory detection,
>>>>>> do video mode detection, switch video modes and gather other
>>>>>> information.
>>>>> Have you actually read the multiboot specification? Booter passes info
>>>>> about memory and video mode in mbi (video for multiboot isn't
>>>>> implemented yet). If you need firmware for basic bootup you're clearly
>>>>> doing something wrong and are firmware-dependent. Of course it's your
>>>>> freedom to make suboptimal software.
>>>> I've read the multi-boot specification. I've also read the code in
>>>> GRUB-legacy that does memory detection, and I'm unwilling to allow my
>>>> code to rely on it for "quality control" reasons. Without going into
>>>> details, GRUB-legacy tends to do a "minimal" job and then expects the
>>>> user to fix the problem if/when it goes wrong (but even then it only
>>>> offers a "uppermem" command without providing a way for the user to
>>>> specify a complete system memory map).
>>> What is "minimal job" and "quality control"? We use standard
>>> E820+(optionally)badram command. I've seen no OS do any more than
>>> this.
>> My code tries "int 0x15, eax=0xE820" expecting 24 bytes per area (ACPI
>> 3.0); then it tries "int 0x15, eax=0xE820" expecting 20 bytes per
>> area. If "int 0x15, eax=0xE820" isn't supported by the BIOS then you
>> can assume it's an old computer (and old computers are painful).
>> It tries "int 0x15, ax=0xE801", then "int 0x15, ah=0xC7", then "int
>> 0x15, ah=0x8A", then "int 0x15, ax=0xDA88", then "int 0x15, ah=0x88",
>> then the CMOS (read through I/O ports 0x70 and 0x71).
> Read the code. GRUB falls back to older methods if newer ones aren't available.

I read the code (for both GRUB 1.96 and GRUB 0.97) and wrote down
exactly which BIOS functions GRUB does use in my last post. You didn't
read the code (and didn't read what I wrote either), and now you're
telling me to read the code?

>>  If all of this fails (which does
>> happen on some computers) then it does manual probing.
>> Some of the old BIOS functions have limited range - for example
>> they'll return "number of KiB blocks at 0x00100000" as a 16-bit
>> integer, and can't return more than 64 MiB. In this case my code won't
>> know if the value returned by the BIOS has been limited to 0xFFFF, so
>> it'll do manual probing to detect any more RAM above the reported
>> area. Some computers have an "ISA hole" from 0x00F00000 to 0x00FFFFFF.
>> Because of this all older BIOS functions that report "amount of RAM at
>> 0x00100000" may return 14 MiB (from 0x00100000 to 0x00EFFFFF) when
>> there's actually another area of RAM at 0x01000000. In this case my
>> code won't know if the value returned by the BIOS is right or not, and
>> it'll do manual probing to detect any more RAM at 0x01000000. If my
>> code has to do manual probing, then it assumes there's an "ISA hole"
>> from 0x00F00000 to 0x00FFFFFF (regardless of whether there is or not)
>> as this hole was used for memory mapped video cards (which would seem
>> like RAM).
> Have you thought that manual probe may detect MMIO as additional "RAM"?
> Video RAM isn't the only case of MMIO.

Yes. For old computers there's plenty of unused area in the physical
address space, and PCI devices are almost always assigned areas
starting from higher addresses and working down (which leaves a
massive "unused" area between the end of RAM and the first memory
mapped PCI device). For older systems (with ISA) the only usable space
is the "ISA hole" just below 0x01000000 (I already explained that my
code does probe this area).

For newer computers (e.g. anything made in the last 15 years) where
this might be a problem, the BIOS functions work and nothing needs to
be probed anyway.
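
To illustrate the two cases where a legacy 16-bit result can't be trusted (a value pinned at 0xFFFF, and a value that stops exactly at the ISA hole), here is a minimal sketch. It is not taken from any real boot code; the helper name and constants are hypothetical, and it assumes the BIOS function reports "KiB of RAM starting at 0x00100000".

```c
#include <stdint.h>

#define EXT_BASE       0x00100000u  /* start of extended memory (1 MiB) */
#define ISA_HOLE_START 0x00F00000u
#define ISA_HOLE_END   0x01000000u  /* first byte above the hole */

/*
 * Hypothetical helper: given the KiB count from a legacy function
 * such as "int 0x15, ah=0x88", return the address where manual
 * probing should start, or 0 if the value can be taken as-is.
 */
static uint32_t probe_start_after_legacy(uint16_t ext_kib)
{
    uint32_t end = EXT_BASE + (uint32_t)ext_kib * 1024u;

    if (ext_kib == 0xFFFFu)
        return end;             /* value may be clamped: look above it */
    if (end == ISA_HOLE_START)
        return ISA_HOLE_END;    /* may have stopped at the hole: skip it */
    return 0;                   /* nothing suspicious */
}
```

The hole itself is never probed in this sketch, which matches the rule of assuming the hole exists regardless of whether it does.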

> I prefer to function correctly on bug-free firmware at cost of quirks on
> buggy ones rather than other way round

I prefer to function correctly on modern BIOSs (where GRUB currently
works), and function correctly on almost all old BIOSs (where GRUB
currently fails) and function correctly on almost all buggy BIOSs
(where GRUB currently fails).

>> For all BIOS functions used my code avoids all known BIOS bugs (and
>> there's plenty of them). This includes "sanitizing" the data returned
>> from "int 0x15, eax=0xE820" - sorting the list and handling any
>> overlapping areas.
> have you ever looked at mmap folder?

There is no mmap folder in the source code for GRUB 1.96 or GRUB 0.97.
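
For illustration only, "sorting the list and handling any overlapping areas" could be sketched as below. This is not the code from either project. The struct layout, the MAX_IN limit, and the rule that the numerically higher E820 type wins an overlap are all assumptions made for the example.

```c
#include <stdint.h>
#include <stdlib.h>

#define MAX_IN 64   /* assumed upper bound on raw entries */

struct range { uint64_t base, end; uint32_t type; };  /* end is exclusive */

static int cmp_u64(const void *a, const void *b)
{
    uint64_t x = *(const uint64_t *)a, y = *(const uint64_t *)b;
    return (x > y) - (x < y);
}

/*
 * Build a sorted, non-overlapping list from raw entries.
 * Every start/end address becomes an "edge"; each span between two
 * neighbouring edges gets the highest type of any entry covering it
 * (assumption: higher type value = more restrictive). Type 0 is
 * treated as "not covered". Returns the number of output entries;
 * output is silently truncated if out_cap is too small.
 */
static size_t sanitize(const struct range *in, size_t n,
                       struct range *out, size_t out_cap)
{
    uint64_t edge[2 * MAX_IN];
    size_t edges = 0, count = 0;

    if (n > MAX_IN)
        return 0;
    for (size_t i = 0; i < n; i++) {
        if (in[i].end <= in[i].base)
            continue;                       /* skip empty entries */
        edge[edges++] = in[i].base;
        edge[edges++] = in[i].end;
    }
    qsort(edge, edges, sizeof edge[0], cmp_u64);

    for (size_t e = 0; e + 1 < edges; e++) {
        uint64_t lo = edge[e], hi = edge[e + 1];
        uint32_t type = 0;

        if (lo == hi)
            continue;                       /* duplicate edge */
        for (size_t i = 0; i < n; i++)
            if (in[i].end > in[i].base && in[i].base <= lo &&
                in[i].end >= hi && in[i].type > type)
                type = in[i].type;
        if (type == 0)
            continue;                       /* gap between entries */

        if (count > 0 && out[count - 1].end == lo &&
            out[count - 1].type == type) {
            out[count - 1].end = hi;        /* merge with previous */
        } else {
            if (count == out_cap)
                return count;
            out[count++] = (struct range){ lo, hi, type };
        }
    }
    return count;
}
```

The edge sweep is quadratic in the number of entries, which is harmless for the few dozen entries a BIOS returns, and it handles a reserved area nested inside a usable one without special cases.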

>> I've never needed to provide a way for the end-user to override my
>> memory detection.
> Neither did we. But test your manual probing at 4GiB system - it's
> likely to detect all MMIO addresses as RAM.

It's extremely unlikely that a computer with 4 GiB of RAM will fail on
all of the previous BIOS functions. If all of the previous BIOS
functions do fail, then you're probably running on an 80486 or older
computer which is unlikely to have more than 128 MiB of RAM.

If you think GRUB's memory detection never needs to be overridden then
you're obviously not testing it on anything that predates "int 0x15,
eax = 0xE820".

>>>> For memory detection, ACPI 3.0 allows the BIOS (" INT 15H, E820H") to
>>>> return extended attributes - mostly only a volatile/non-volatile flag.
>>>> This isn't in GRUB's information. ACPI 3.0 also allows the BIOS to
>>>> return areas of the type "AddressRangeUnusable" (e.g. faulty RAM).
>>> This is mostly unnecessary. Basically you need only to know if you can
>>> use a memory range or not. The only useful additional code would be
>>> ReclaimMemory
>> To handle standby states correctly the OS may need to know which areas
>> are volatile and which areas aren't (which can include knowing the
>> difference between volatile system areas and non-volatile system
>> areas). Some OSs also want to know if there's any faulty RAM present
>> in the system or not (and additional information about any area
>> reserved for "hot-plug" RAM, and NUMA ranges, but that information
>> comes from ACPI tables not BIOS functions so the OS can get this
>> information without GRUB).
> I'm ok with defining additional types in multiboot1. But OS considering
> multiboot type to be BIOS type is buggy

I agree - most OSs that use multi-boot are buggy because they don't
comply with the specification (except mine, because I ignore GRUB's
memory map and get the information directly from the BIOS). The
question is which new types would be needed to ensure that
non-compliance isn't "deemed necessary" by OS developers in the
future, and how GRUB will know if the kernel image will understand the
new types correctly or if the kernel is an older (buggy) kernel that
(incorrectly) assumes ACPI types.

>>> GRUB can't do this right now because it doesn't receive badram info
>>> soon enough. And even if it does most kernels expect first MiB to be
>>> usable.
>> You're right - all kernels that are designed to use "multi-boot
>> specification version 1" expect to be loaded at 0x00100000 and that
>> RAM below the EBDA is usable. I'm not sure what kernels designed for
>> "multi-boot specification version 2" expect...
> Read what I said

In which way do existing kernels (that were designed for
GRUB-legacy) include future kernels (that might be designed to support
features that have been/could be introduced with GRUB2)?

>>> Such list is a blatant encapsulation breach. If you want such test,
>>> add it to bootloader, not OS.
>> When the OS is running and detects a RAM fault, you want the OS to run
>> a copy of GRUB (maybe inside an emulator or something) so the OS can
>> tell GRUB about the RAM fault, and GRUB can tell the OS if the RAM
>> fault might cause problems if the computer is rebooted (and so the OS
>> can send an email to the network administrators or something *before*
>> the computer is rebooted)?
>> You can't assume that the OS that is running is the same OS that
>> installed GRUB; or that the OS that is running has access to wherever
>> GRUB is installed; or that GRUB will be able to detect any faulty RAM
>> during boot.
> You're in circular logic. You assume that booter is using faulty RAM but
> supplying RAM it used correctly.

No. All RAM is OK when the boot loader boots the OS, but then
(possibly several months of running "24 hours per day" later) a RAM
fault occurs and the OS detects it, and the OS tells the user (or
administrator) that rebooting might cause problems due to the RAM
fault (because the OS knows that the faulty RAM will be used by the
boot loader).

>>> This information is available with a simple loop over mbi. I would
>>> rather avoid overcomplicating the standard because it increases a
>>> chance of having "half-compliant" OSes and "half-compliant" booters.
>> I'd rather have "fully compliant" OSes that are easier to write than
>> "fully compliant" OSes that are a pain in the neck to write because
>> you have to parse everything in the multi-boot information structure
>> before you can write to any RAM (except for your own ".bss").
>> If I can't rely on the firmware (like I currently do) then I have to
>> rely on GRUB, and have to copy everything from the multi-boot
>> information structure into my ".bss". So, how much extra space do I
>> need to allow in my ".bss"? What is the maximum number of drive
>> structures? What's the maximum number of memory map entries? What's
>> the maximum length of the command line? The multi-boot specification
>> doesn't say.
> First do a small parse and count how much memory the structures you need
> will take.

For something like a "live" CD; during boot you want the OS to do a
small parse and determine how much memory these structures will take,
then write to a read-only boot CD to change the kernel's ".bss" size?
And you want the OS to do this before GRUB has allocated memory for
the kernel or executed any of the OS's code?
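
For reference, the "small parse" would look roughly like the sketch below, which counts memory map entries using the layout from the multi-boot specification (each entry starts with a 32-bit "size" field that doesn't include itself, so the next entry is at "size + 4" bytes further on). The function name and the bounds checks are assumptions made for the example. The point is that a count like this only exists at run time, after the kernel image (and its ".bss") has already been sized and loaded.

```c
#include <stdint.h>
#include <stddef.h>
#include <string.h>

/*
 * Count the entries in a multi-boot memory map buffer.
 * "mmap" and "mmap_length" correspond to the mmap_addr and
 * mmap_length fields of the multi-boot information structure.
 * An entry needs at least 20 bytes after its size field
 * (64-bit base, 64-bit length, 32-bit type).
 */
static size_t mmap_count(const uint8_t *mmap, uint32_t mmap_length)
{
    size_t count = 0;
    uint32_t off = 0;

    while (mmap_length - off >= 4) {
        uint32_t size;

        memcpy(&size, mmap + off, 4);       /* avoid unaligned access */
        if (size < 20 || size > mmap_length - off - 4)
            break;                          /* malformed or truncated */
        count++;
        off += size + 4;                    /* size excludes itself */
    }
    return count;
}
```

A kernel that wants to copy the entries somewhere safe still has to pick a fixed maximum at build time, because the specification gives no upper bound on what this function may return.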

>>>> The "Boot device" field in the multi-boot information structure should
>>>> be improved to handle a variety of conditions; including if the disk
>>>> was an emulated disk (e.g. "El Torito" emulating a hard drive). The
>>>> BIOS drive number isn't much use (especially if the firmware is
>>>> coreboot, UEFI, OpenFirmware, etc), and should be replaced with
>>>> something that identifies the corresponding drive structure (this
>>>> includes USB).
>>> Boot device shouldn't be used at all. It was a mistake. Booter has no
>>> good way to know how OS will see the device. You should pass this
>>> parameter via commandline either as device name or UUID. You have
>>> scripting to automate this
>> A user who's using GRUB to boot Ubuntu decides to install my code in
>> another partition, then modify GRUB's configuration (in Ubuntu) so
>> that GRUB will also boot my code. Now my code needs to rely on the
>> user to not stuff up GRUB's configuration for my code?
> You can simply tell him to add "source" line

How does my code know if the user has set this "source" line correctly?

If someone is making a bootable CD that's meant to be used on 100
different computers, how should they set the "source" line?

>>>> The OS image also needs a different magic number to indicate that the
>>>> OS image is designed for future versions of the multi-boot
>>>> specification (rather than the old/current version). If the OS image
>>>> uses the new magic number, then the OS image must also include an
>>>> "version of the multi-boot specification that this image complies
>>>> with" field. If the OS image indicates that it's intended for a newer
>>>> version of the multi-boot specification than the boot loader complies
>>>> with, then the boot loader refuses to boot and displays a "this boot
>>>> loader needs to be upgraded" error. If the OS image has the old magic
>>>> number, and if the firmware is "PC BIOS" then the boot loader should
>>>> boot the old OS image. If the OS image has the old magic number, and
>>>> if the firmware is not "PC BIOS" then the boot loader refuses to boot
>>>> and displays a "this OS requires a PC BIOS" error message.
>>> Already implemented through feature fields
>> Is there a private version of the multi-boot specification that I'm
>> not aware of yet; or does GRUB fail to comply with the current
>> multi-boot specification?
> No. Read specification again

The current multi-boot specification (version 0.6.95)? The one at:

For the "flags" field in the kernel's Multiboot header, this version
of the specification says "Naturally, all as-yet-undefined bits in the
`flags' word must be set to zero in OS images." and there are no flags
defined that allow a kernel to indicate that it supports other types
of firmware (or any other feature/s introduced by GRUB2).

Can I assume that one or more of these "as-yet-undefined" bits have
been defined in some private version of the multi-boot specification
that I'm not aware of?
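
To make the point concrete, here is a minimal sketch of a header check using only the flag bits that version 0.6.95 defines (bits 0, 1, 2 and 16). The macro and function names are made up for the example; the magic value and the "magic + flags + checksum must be zero" rule come from the specification.

```c
#include <stdint.h>
#include <stdbool.h>

#define MB_HEADER_MAGIC   0x1BADB002u
#define MB_FLAG_PAGEALIGN 0x00000001u  /* bit 0: align modules on pages */
#define MB_FLAG_MEMINFO   0x00000002u  /* bit 1: memory info required */
#define MB_FLAG_VIDEO     0x00000004u  /* bit 2: video mode info required */
#define MB_FLAG_ADDRS     0x00010000u  /* bit 16: address fields valid */
#define MB_KNOWN_FLAGS    (MB_FLAG_PAGEALIGN | MB_FLAG_MEMINFO | \
                           MB_FLAG_VIDEO | MB_FLAG_ADDRS)

/* Return the flag bits that version 0.6.95 leaves undefined. */
static uint32_t mb_undefined_bits(uint32_t flags)
{
    return flags & ~MB_KNOWN_FLAGS;
}

/*
 * Strict check: valid magic, valid checksum, and no undefined bits.
 * (A boot loader is only required to refuse unknown bits in the low
 * 16; rejecting all of them is a stricter choice for illustration.)
 */
static bool mb_header_ok(uint32_t magic, uint32_t flags, uint32_t checksum)
{
    if (magic != MB_HEADER_MAGIC)
        return false;
    if ((uint32_t)(magic + flags + checksum) != 0)
        return false;
    return mb_undefined_bits(flags) == 0;
}
```

None of the defined bits says anything about which firmware the kernel supports, so any such feature flag would have to use a bit that this check reports as undefined.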


