Re: [PATCH gnumach] smp: Remove hardcoded AP_BOOT_ADDR

From: Jessica Clarke
Subject: Re: [PATCH gnumach] smp: Remove hardcoded AP_BOOT_ADDR
Date: Tue, 30 Jan 2024 16:07:29 +0000

On 30 Jan 2024, at 09:02, Samuel Thibault <samuel.thibault@gnu.org> wrote:
> Jessica Clarke, on Tue 30 Jan 2024 02:32:07 +0000, wrote:
>> On 29 Jan 2024, at 10:20, Samuel Thibault <samuel.thibault@gnu.org> wrote:
>>> Damien Zammit, on Mon 29 Jan 2024 10:07:30 +0000, wrote:
>>>> - ljmp $BOOT_CS, $M(0f)
>>>> + xorl %eax, %eax
>>>> + mov %cs, %ax
>>>> + shll $4, %eax
>>>> + addl $M(0f), %eax
>>>> + movl %eax, M(ljmp_offset32)
>>> This won't work with pipelined processors, which assume a complete
>>> separation between code and data, and will thus have already loaded
>>> the jmp instruction before you modify it.
>> That’s true of most architectures, but not x86. It architecturally
>> guarantees that self-modifying code works,
> ?? It was a very common way to detect pentium processors, back in the
> time.

Ok, so I went and read 12.6 Self-Modifying Code of the Intel SDM Volume
3A (from December 2023), and it has this to say:

> A write to a memory location in a code segment that is currently cached
> in the processor causes the associated cache line (or lines) to be
> invalidated. This check is based on the physical address of the
> instruction. In addition, the P6 family and Pentium processors check
> whether a write to a code segment may modify an instruction that has
> been prefetched for execution. If the write affects a prefetched
> instruction, the prefetch queue is invalidated. This latter check is
> based on the linear address of the instruction. For the Pentium 4 and
> Intel Xeon processors, a write or a snoop of an instruction in a code
> segment, where the target instruction is already decoded and resident
> in the trace cache, invalidates the entire trace cache. The latter
> behavior means that programs that self-modify code can cause severe
> degradation of performance when run on the Pentium 4 and Intel Xeon
> processors.
> In practice, the check on linear addresses should not create
> compatibility problems among IA-32 processors. Applications that
> include self-modifying code use the same linear address for modifying
> and fetching the instruction. Systems software, such as a debugger,
> that might possibly modify an instruction using a different linear
> address than that used to fetch the instruction, will execute a
> serializing operation, such as a CPUID instruction, before the modified
> instruction is executed, which will automatically resynchronize the
> instruction cache and prefetch queue. (See Section 9.1.3, “Handling
> Self- and Cross-Modifying Code,” for more information about the use of
> self-modifying code.)
> For Intel486 processors, a write to an instruction in the cache will
> modify it in both the cache and memory, but if the instruction was
> prefetched before the write, the old version of the instruction could
> be the one executed. To prevent the old instruction from being
> executed, flush the instruction prefetch unit by coding a jump
> instruction immediately after any write that modifies an instruction.

So, for anything above a 486, this code is correct. For a 386 and 486
you need to jump to the next instruction to invalidate the prefetch
unit. I guess that’s what you were getting at? I had interpreted your
comments as meaning that *modern* processors needed it.
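
For reference, the arithmetic the quoted patch does before storing into
ljmp_offset32 is the ordinary real-mode translation: shift the segment
left by 4 and add the offset. A minimal C sketch of that calculation
(the function name is made up for illustration, it is not from gnumach):

```c
#include <stdint.h>

/* Real-mode linear address: (segment << 4) + offset.
 * Hypothetical helper, for illustration only. It mirrors the
 * "mov %cs, %ax; shll $4, %eax; addl $M(0f), %eax" sequence. */
static uint32_t real_mode_linear(uint16_t seg, uint32_t off)
{
    return ((uint32_t)seg << 4) + off;
}
```

With example values, a code segment of 0x0700 and a label at offset
0x40 give 0x7040, which is the value the far jump would need as its
32-bit offset.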

>>> Rather either perform the relocation from the C code,
>> Were your statement true, that wouldn’t fix the problem,
> Isn't an IPI a synchronizing thing?

Oh that’s true, I was being stupid and was thinking the C code would be
running on the AP, but of course that’s nonsense. Patching the code
from the BSP makes sense, and I believe is what FreeBSD does.
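
Roughly, patching from the BSP could look like the sketch below: copy
the trampoline to wherever the AP will start, then store the far-jump
target into the copy as a little-endian 32-bit immediate. Every name
here is hypothetical, and I have not checked it against the actual
gnumach or FreeBSD code; it only illustrates the idea.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Copy the AP trampoline into dst and patch a 32-bit little-endian
 * immediate at patch_off with the jump target.
 * Hypothetical helper: patch_off is assumed to be the offset of the
 * far jump's 32-bit offset field within the trampoline.
 * Returns 0 on success, -1 if the sizes do not fit. */
static int install_trampoline(uint8_t *dst, size_t dst_len,
                              const uint8_t *tramp, size_t tramp_len,
                              size_t patch_off, uint32_t target)
{
    if (tramp_len > dst_len || patch_off > tramp_len ||
        tramp_len - patch_off < 4)
        return -1;

    memcpy(dst, tramp, tramp_len);

    /* Store byte by byte so the result does not depend on host endianness. */
    for (int i = 0; i < 4; i++)
        dst[patch_off + i] = (uint8_t)(target >> (8 * i));

    return 0;
}
```

Since the BSP finishes the write before it sends the startup IPI, the
AP never fetches the bytes until they are final, so nothing on the AP
is self-modifying.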

