From: Konrad Rzeszutek Wilk
Subject: Re: [Xen-devel] [PATCH v2 22/23] x86: make Xen early boot code relocatable
Date: Tue, 11 Aug 2015 12:48:06 -0400
User-agent: Mutt/1.5.23 (2014-03-12)

On Mon, Jul 20, 2015 at 04:29:17PM +0200, Daniel Kiper wrote:
> Every multiboot protocol (regardless of version) compatible image must
> specify its load address (in ELF or multiboot header). Multiboot protocol
> compatible loader have to load image at specified address. However, there
> is no guarantee that the requested memory region (in case of Xen it starts
> at 1 MiB and ends at 17 MiB) where image should be loaded initially is a RAM
> and it is free (legacy BIOS platforms are merciful for Xen but I found at
> least one EFI platform on which Xen load address conflicts with EFI boot
> services; it is Dell PowerEdge R820 with latest firmware). To cope with
> that problem we must make Xen early boot code relocatable. This patch does
> that. However, it does not add multiboot2 protocol interface which is done
> in next patch.
s/next patch/"x86: add multiboot2 protocol support for relocatable image."
>
> This patch changes following things:
> - default load address is changed from 1 MiB to 2 MiB; I did that because
> initial page tables are using 2 MiB huge pages and this way required
> updates for them are quite easy; it means that e.g. we avoid spacial
> cases for beginning and end of required memory region if it live at
> address not aligned to 2 MiB,
> - %ebp register is used as a storage for Xen image base address; this way
> we can get this value very quickly if it is needed; however, %ebp register
> is not used directly to access a given memory region,
> - %fs register is filled with segment descriptor which describes memory region
> with Xen image (it could be relocated or not); it is used to access some of
'memory region with Xen image'? I am not sure I follow.
Perhaps:
segment descriptor which starts (0) at Xen image base (_start).
> Xen data in early boot code; potentially we can use above mentioned segment
> descriptor to access data using %ds:%esi and/or %es:%esi (e.g. movs*); however,
> I think that it could unnecessarily obfuscate code (e.g. we need at least
> to operations to reload a given segment descriptor) and current solution
s/to/two/ ?
> looks quite optimal.
>
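To check my reading of the addressing scheme described above, here is a
minimal C sketch. It assumes __XEN_VIRT_START is 0xffff82d080000000 (that
value is not stated in this patch), and runtime_addr() is a helper I made
up to model "lea sym_offset(sym)(%ebp),%reg". The other two functions
mirror the sym_phys() and sym_offset() macros from the head.S hunk.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: value of __XEN_VIRT_START; it is not part of this patch. */
#define XEN_VIRT_START_ASSUMED 0xffff82d080000000ULL
/* Taken from the patch (Rules.mk and config.h). */
#define XEN_IMG_PHYS_START     0x200000ULL
#define XEN_IMG_OFFSET         0x200000ULL

/* Mirrors sym_offset(): distance of a symbol from __XEN_VIRT_START. */
static uint64_t sym_offset(uint64_t sym_virt)
{
    assert(sym_virt >= XEN_VIRT_START_ASSUMED);
    return sym_virt - XEN_VIRT_START_ASSUMED;
}

/* Mirrors the new sym_phys(): link-time physical address of a symbol. */
static uint64_t sym_phys(uint64_t sym_virt)
{
    return sym_offset(sym_virt) + XEN_IMG_PHYS_START - XEN_IMG_OFFSET;
}

/* Hypothetical helper: address produced by base (%ebp) plus sym_offset(). */
static uint32_t runtime_addr(uint32_t image_base, uint64_t sym_virt)
{
    return image_base + (uint32_t)sym_offset(sym_virt);
}
```

If I read xen.lds.S right, __image_base__ equals __XEN_VIRT_START, so with
these values sym_phys(__image_base__) is 0 and an image that was not
relocated resolves every symbol to its plain offset.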
> Signed-off-by: Daniel Kiper <address@hidden>
> ---
> xen/arch/x86/Makefile | 6 +-
> xen/arch/x86/Rules.mk | 4 +
> xen/arch/x86/boot/head.S | 165 ++++++++++++++++++++++++++++++----------
> xen/arch/x86/boot/trampoline.S | 11 ++-
> xen/arch/x86/boot/wakeup.S | 6 +-
> xen/arch/x86/boot/x86_64.S | 34 ++++-----
> xen/arch/x86/setup.c | 33 ++++----
> xen/arch/x86/x86_64/mm.c | 2 +-
> xen/arch/x86/xen.lds.S | 2 +-
> xen/include/asm-x86/config.h | 3 +
> xen/include/asm-x86/page.h | 2 +-
> 11 files changed, 182 insertions(+), 86 deletions(-)
>
> diff --git a/xen/arch/x86/Makefile b/xen/arch/x86/Makefile
> index 82c5a93..93069a8 100644
> --- a/xen/arch/x86/Makefile
> +++ b/xen/arch/x86/Makefile
> @@ -72,8 +72,10 @@ efi-$(x86_64) := $(shell if [ ! -r $(BASEDIR)/include/xen/compile.h -o \
> echo '$(TARGET).efi'; fi)
>
> $(TARGET): $(TARGET)-syms $(efi-y) boot/mkelf32
> - ./boot/mkelf32 $(TARGET)-syms $(TARGET) 0x100000 \
> - `$(NM) -nr $(TARGET)-syms | head -n 1 | sed -e 's/^\([^ ]*\).*/0x\1/'`
> +# THIS IS UGLY HACK! PLEASE DO NOT COMPLAIN. I WILL FIX IT IN NEXT RELEASE.
OK :-)
> + ./boot/mkelf32 $(TARGET)-syms $(TARGET) $(XEN_IMG_PHYS_START) 0xffff82d081000000
> +# ./boot/mkelf32 $(TARGET)-syms $(TARGET) 0x100000 \
> +# `$(NM) -nr $(TARGET)-syms | head -n 1 | sed -e 's/^\([^ ]*\).*/0x\1/'`
>
>
> ALL_OBJS := $(BASEDIR)/arch/x86/boot/built_in.o $(BASEDIR)/arch/x86/efi/built_in.o $(ALL_OBJS)
> diff --git a/xen/arch/x86/Rules.mk b/xen/arch/x86/Rules.mk
> index 4a04a8a..7ccb8a0 100644
> --- a/xen/arch/x86/Rules.mk
> +++ b/xen/arch/x86/Rules.mk
> @@ -15,6 +15,10 @@ HAS_GDBSX := y
> HAS_PDX := y
> xenoprof := y
>
> +XEN_IMG_PHYS_START = 0x200000
> +
> +CFLAGS += -DXEN_IMG_PHYS_START=$(XEN_IMG_PHYS_START)
> +
> CFLAGS += -I$(BASEDIR)/include
> CFLAGS += -I$(BASEDIR)/include/asm-x86/mach-generic
> CFLAGS += -I$(BASEDIR)/include/asm-x86/mach-default
> diff --git a/xen/arch/x86/boot/head.S b/xen/arch/x86/boot/head.S
> index 3f1054d..d484f68 100644
> --- a/xen/arch/x86/boot/head.S
> +++ b/xen/arch/x86/boot/head.S
> @@ -12,13 +12,15 @@
> .text
> .code32
>
> -#define sym_phys(sym) ((sym) - __XEN_VIRT_START)
> +#define sym_phys(sym) ((sym) - __XEN_VIRT_START + XEN_IMG_PHYS_START - XEN_IMG_OFFSET)
> +#define sym_offset(sym) ((sym) - __XEN_VIRT_START)
>
> #define BOOT_CS32 0x0008
> #define BOOT_CS64 0x0010
> #define BOOT_DS 0x0018
> #define BOOT_PSEUDORM_CS 0x0020
> #define BOOT_PSEUDORM_DS 0x0028
> +#define BOOT_FS 0x0030
>
> #define MB2_HT(name) (MULTIBOOT2_HEADER_TAG_##name)
> #define MB2_TT(name) (MULTIBOOT2_TAG_TYPE_##name)
> @@ -105,12 +107,13 @@ multiboot1_header_end:
>
> .word 0
> gdt_boot_descr:
> - .word 6*8-1
> - .long sym_phys(trampoline_gdt)
> + .word 7*8-1
> +gdt_boot_descr_addr:
> + .long sym_offset(trampoline_gdt)
> .long 0 /* Needed for 64-bit lgdt */
>
> cs32_switch_addr:
> - .long sym_phys(cs32_switch)
> + .long sym_offset(cs32_switch)
> .word BOOT_CS32
>
> .Lbad_cpu_msg: .asciz "ERR: Not a 64-bit CPU!"
> @@ -120,13 +123,13 @@ cs32_switch_addr:
> .section .init.text, "ax", @progbits
>
> bad_cpu:
> - mov $(sym_phys(.Lbad_cpu_msg)),%esi # Error message
> + lea sym_offset(.Lbad_cpu_msg)(%ebp),%esi # Error message
> jmp print_err
> not_multiboot:
> - mov $(sym_phys(.Lbad_ldr_msg)),%esi # Error message
> + lea sym_offset(.Lbad_ldr_msg)(%ebp),%esi # Error message
> jmp print_err
> mb2_too_old:
> - mov $(sym_phys(.Lbad_mb2_ldr)),%esi # Error message
> + lea sym_offset(.Lbad_mb2_ldr)(%ebp),%esi # Error message
> print_err:
> mov $0xB8000,%edi # VGA framebuffer
> 1: mov (%esi),%bl
> @@ -151,6 +154,9 @@ print_err:
> __efi64_start:
> cld
>
> + /* Load default Xen image base address. */
> + mov $sym_phys(__image_base__),%ebp
> +
> /* Check for Multiboot2 bootloader. */
> cmp $MULTIBOOT2_BOOTLOADER_MAGIC,%eax
> je efi_multiboot2_proto
> @@ -235,9 +241,11 @@ x86_32_switch:
> cli
>
> /* Initialise GDT. */
> + add %ebp,gdt_boot_descr_addr(%rip)
> lgdt gdt_boot_descr(%rip)
>
> /* Reload code selector. */
> + add %ebp,cs32_switch_addr(%rip)
> ljmpl *cs32_switch_addr(%rip)
>
> .code32
> @@ -263,12 +271,8 @@ __start:
> cld
> cli
>
> - /* Initialise GDT and basic data segments. */
> - lgdt %cs:sym_phys(gdt_boot_descr)
> - mov $BOOT_DS,%ecx
> - mov %ecx,%ds
> - mov %ecx,%es
> - mov %ecx,%ss
> + /* Load default Xen image base address. */
> + mov $sym_phys(__image_base__),%ebp
>
> /* Bootloaders may set multiboot{1,2}.mem_lower to a nonzero value. */
> xor %edx,%edx
> @@ -319,6 +323,19 @@ multiboot2_proto:
> jmp 0b
>
> trampoline_bios_setup:
> + mov %ebp,%esi
> +
> + /* Initialise GDT and basic data segments. */
> + add %ebp,sym_offset(gdt_boot_descr_addr)(%esi)
> + lgdt sym_offset(gdt_boot_descr)(%esi)
> +
> + mov $BOOT_DS,%ecx
> + mov %ecx,%ds
> + mov %ecx,%es
> + mov %ecx,%fs
> + mov %ecx,%gs
> + mov %ecx,%ss
> +
The non-EFI boot path is now:

start
 \- __start
     \- multiboot2_proto
     |    jmp trampoline_bios_setup
     |
     \- and if not MB2: jmp trampoline_bios_setup.

Here you tweak the GDT and reload %ds, but earlier in this call chain
we already touch %ds via:

<__start+27>: testb $0x1,(%rbx)
<__start+30>: cmovne 0x4(%rbx),%edx

which is OK (as MB1 says that %ds has to cover up to 4GB).

But I wonder why the __start code had the segments reloaded so early?
Was the bootloader not setting the proper segments?

Let me double-check what SYSLINUX's mboot.c32 does. Perhaps
it had done something odd in the past.
> /* Set up trampoline segment 64k below EBDA */
> movzwl 0x40e,%ecx /* EBDA segment */
> cmp $0xa000,%ecx /* sanity check (high) */
> @@ -340,33 +357,58 @@ trampoline_bios_setup:
> cmovb %edx,%ecx /* and use the smaller */
>
> trampoline_setup:
Would it make sense to add:
/* Gets called from EFI (from x86_32_switch) and legacy (see above) boot loaders. */
> + mov %ebp,%esi
> +
> + /* Initialize 0-15 bits of BOOT_FS segment descriptor base address. */
> + mov %ebp,%edx
> + shl $16,%edx
> + or %edx,(sym_offset(trampoline_gdt)+BOOT_FS)(%esi)
> +
> + /* Initialize 16-23 bits of BOOT_FS segment descriptor base address. */
> + mov %ebp,%edx
> + shr $16,%edx
> + and $0x000000ff,%edx
> + or %edx,(sym_offset(trampoline_gdt)+BOOT_FS+4)(%esi)
> +
> + /* Initialize 24-31 bits of BOOT_FS segment descriptor base address. */
> + mov %ebp,%edx
> + and $0xff000000,%edx
> + or %edx,(sym_offset(trampoline_gdt)+BOOT_FS+4)(%esi)
> +
> + /* Initialize %fs and later use it to access Xen data if possible. */
> + mov $BOOT_FS,%edx
> + mov %edx,%fs
> +
We just modified the GDT. Should we reload it (lgdt?)?
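For reference, here is how I read the three "or" instructions above, as a
small C sketch. The function names are mine, and it assumes the standard
x86 segment descriptor layout: base bits 15:0 live in bits 31:16 of the low
dword, base bits 23:16 in bits 7:0 of the high dword, and base bits 31:24
in bits 31:24 of the high dword.

```c
#include <assert.h>
#include <stdint.h>

/* Initial BOOT_FS descriptor from trampoline.S (base address is zero). */
#define BOOT_FS_DESC_TEMPLATE 0x00c0920000001000ULL

/* Hypothetical helper: OR a 32-bit base into a descriptor whose base is 0. */
static uint64_t desc_set_base(uint64_t desc, uint32_t base)
{
    uint32_t lo = (uint32_t)desc;
    uint32_t hi = (uint32_t)(desc >> 32);

    lo |= base << 16;            /* base bits 15:0  (shl $16; or)    */
    hi |= (base >> 16) & 0xff;   /* base bits 23:16 (shr $16; and)   */
    hi |= base & 0xff000000;     /* base bits 31:24 (and $0xff000000) */

    return ((uint64_t)hi << 32) | lo;
}

/* Hypothetical helper: read the base back out of a descriptor. */
static uint32_t desc_get_base(uint64_t desc)
{
    uint32_t lo = (uint32_t)desc;
    uint32_t hi = (uint32_t)(desc >> 32);

    assert((hi & 0x00008000) != 0);   /* present bit is expected to be set */
    return (lo >> 16) | ((hi & 0xff) << 16) | (hi & 0xff000000);
}
```

Note that the sketch, like the assembly, only ORs bits in, so it relies on
the base field being zero beforehand.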
> /* Reserve 64kb for the trampoline. */
> sub $0x1000,%ecx
>
> /* From arch/x86/smpboot.c: start_eip had better be page-aligned! */
> xor %cl, %cl
> shl $4, %ecx
> - mov %ecx,sym_phys(trampoline_phys)
> + mov %ecx,%fs:sym_offset(trampoline_phys)
> +
> + /* Save Xen image base address for later use. */
> + mov %ebp,%fs:sym_offset(xen_img_base_phys_addr)
>
> /* Save the Multiboot info struct (after relocation) for later use. */
> - mov $sym_phys(cpu0_stack)+1024,%esp
> + lea (sym_offset(cpu0_stack)+1024)(%ebp),%esp
> push %eax /* Multiboot magic. */
> push %ebx /* Multiboot information address. */
> push %ecx /* Boot trampoline address. */
> call reloc
> add $12,%esp /* Remove reloc() args from stack. */
> - mov %eax,sym_phys(multiboot_ptr)
> + mov %eax,%fs:sym_offset(multiboot_ptr)
>
> /*
> * Do not zero BSS on EFI platform here.
> * It was initialized earlier.
> */
> - cmpb $1,sym_phys(skip_realmode)
> + cmpb $1,%fs:sym_offset(skip_realmode)
> je 1f
>
> /* Initialize BSS (no nasty surprises!). */
> - mov $sym_phys(__bss_start),%edi
> - mov $sym_phys(__bss_end),%ecx
> + lea sym_offset(__bss_start)(%ebp),%edi
> + lea sym_offset(__bss_end)(%ebp),%ecx
> sub %edi,%ecx
> shr $2,%ecx
> xor %eax,%eax
> @@ -381,8 +423,8 @@ trampoline_setup:
> jbe 1f
> mov $0x80000001,%eax
> cpuid
> -1: mov %edx,sym_phys(cpuid_ext_features)
> - mov %edx,sym_phys(boot_cpu_data)+CPUINFO_FEATURE_OFFSET(X86_FEATURE_LM)
> +1: mov %edx,%fs:sym_offset(cpuid_ext_features)
> + mov %edx,%fs:(sym_offset(boot_cpu_data)+CPUINFO_FEATURE_OFFSET(X86_FEATURE_LM))
>
> /* Check for availability of long mode. */
> bt $X86_FEATURE_LM & 0x1f,%edx
> @@ -390,72 +432,111 @@ trampoline_setup:
>
> /* Stash TSC to calculate a good approximation of time-since-boot */
> rdtsc
> - mov %eax,sym_phys(boot_tsc_stamp)
> - mov %edx,sym_phys(boot_tsc_stamp+4)
> + mov %eax,%fs:sym_offset(boot_tsc_stamp)
> + mov %edx,%fs:sym_offset(boot_tsc_stamp+4)
>
> - /* Initialise L2 boot-map page table entries (16MB). */
> - mov $sym_phys(l2_bootmap),%edx
> - mov $PAGE_HYPERVISOR|_PAGE_PSE,%eax
> - mov $8,%ecx
> + /* Update frame addreses in page tables. */
> + lea sym_offset(__page_tables_start)(%ebp),%edx
> + mov $((__page_tables_end-__page_tables_start)/8),%ecx
> +1: testl $_PAGE_PRESENT,(%edx)
> + jz 2f
> + add %ebp,(%edx)
> +2: add $8,%edx
> + loop 1b
> +
> + /* Initialise L2 boot-map page table entries (14MB). */
> + lea sym_offset(l2_bootmap)(%ebp),%edx
> + lea sym_offset(start)(%ebp),%eax
> + and $~((1<<L2_PAGETABLE_SHIFT)-1),%eax
> + mov %eax,%ebx
> + shr $(L2_PAGETABLE_SHIFT-3),%ebx
> + and $(L2_PAGETABLE_ENTRIES*4*8-1),%ebx
> + add %ebx,%edx
> + add $(PAGE_HYPERVISOR|_PAGE_PSE),%eax
> + mov $7,%ecx
> 1: mov %eax,(%edx)
> add $8,%edx
> add $(1<<L2_PAGETABLE_SHIFT),%eax
> loop 1b
> +
> /* Initialise L3 boot-map page directory entry. */
> - mov $sym_phys(l2_bootmap)+__PAGE_HYPERVISOR,%eax
> - mov %eax,sym_phys(l3_bootmap) + 0*8
> + lea (sym_offset(l2_bootmap)+__PAGE_HYPERVISOR)(%ebp),%eax
> + lea sym_offset(l3_bootmap)(%ebp),%ebx
> + mov $4,%ecx
> +1: mov %eax,(%ebx)
> + add $8,%ebx
> + add $(L2_PAGETABLE_ENTRIES*8),%eax
> + loop 1b
> +
> + /* Initialise L2 direct map page table entries (14MB). */
> + lea sym_offset(l2_identmap)(%ebp),%edx
> + lea sym_offset(start)(%ebp),%eax
> + and $~((1<<L2_PAGETABLE_SHIFT)-1),%eax
> + mov %eax,%ebx
> + shr $(L2_PAGETABLE_SHIFT-3),%ebx
> + and $(L2_PAGETABLE_ENTRIES*4*8-1),%ebx
> + add %ebx,%edx
> + add $(PAGE_HYPERVISOR|_PAGE_PSE),%eax
> + mov $7,%ecx
> +1: mov %eax,(%edx)
> + add $8,%edx
> + add $(1<<L2_PAGETABLE_SHIFT),%eax
> + loop 1b
> +
> /* Hook 4kB mappings of first 2MB of memory into L2. */
> - mov $sym_phys(l1_identmap)+__PAGE_HYPERVISOR,%edi
> - mov %edi,sym_phys(l2_xenmap)
> - mov %edi,sym_phys(l2_bootmap)
> + lea (sym_offset(l1_identmap)+__PAGE_HYPERVISOR)(%ebp),%edi
> + mov %edi,%fs:sym_offset(l2_bootmap)
But not to l2_xenmap?
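While at it, the L2 boot-map and direct-map setup above can be sketched in
C as follows, mostly to double-check the index arithmetic.
l2_slot_byte_offset() and fill_l2() are hypothetical names, the flags value
is passed in instead of using PAGE_HYPERVISOR|_PAGE_PSE, and the bounds
check is my addition (the assembly has none).

```c
#include <assert.h>
#include <stdint.h>

#define L2_PAGETABLE_SHIFT   21    /* 2 MiB superpages */
#define L2_PAGETABLE_ENTRIES 512
#define L2_TABLES            4     /* 4 * L2_PAGETABLE_ENTRIES, as in the patch */

/* Byte offset into the L2 table of the slot mapping start_phys. */
static uint32_t l2_slot_byte_offset(uint32_t start_phys)
{
    uint32_t aligned = start_phys & ~((1u << L2_PAGETABLE_SHIFT) - 1);

    return (aligned >> (L2_PAGETABLE_SHIFT - 3)) &
           (L2_PAGETABLE_ENTRIES * L2_TABLES * 8 - 1);
}

/* Write seven consecutive 2 MiB mappings starting at the aligned frame. */
static void fill_l2(uint64_t *l2, uint32_t start_phys, uint64_t flags)
{
    uint64_t frame = start_phys & ~((1u << L2_PAGETABLE_SHIFT) - 1);
    unsigned int idx = l2_slot_byte_offset(start_phys) / 8;
    unsigned int i;

    assert(idx + 7 <= L2_PAGETABLE_ENTRIES * L2_TABLES);
    for ( i = 0; i < 7; i++ )
        l2[idx + i] = (frame + ((uint64_t)i << L2_PAGETABLE_SHIFT)) | flags;
}
```

With start at 2 MiB this fills slots 1 to 7, i.e. 14 MiB, which matches the
new comment. The mask keeps the offset within 4 * L2_PAGETABLE_ENTRIES
entries.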
>
> /* Apply relocations to bootstrap trampoline. */
> - mov sym_phys(trampoline_phys),%edx
> - mov $sym_phys(__trampoline_rel_start),%edi
> + mov %fs:sym_offset(trampoline_phys),%edx
> + lea sym_offset(__trampoline_rel_start)(%ebp),%edi
> + lea sym_offset(__trampoline_rel_stop)(%ebp),%esi
> 1:
> mov (%edi),%eax
> add %edx,(%edi,%eax)
> add $4,%edi
> - cmp $sym_phys(__trampoline_rel_stop),%edi
> + cmp %esi,%edi
> jb 1b
>
> /* Patch in the trampoline segment. */
> shr $4,%edx
> - mov $sym_phys(__trampoline_seg_start),%edi
> + lea sym_offset(__trampoline_seg_start)(%ebp),%edi
> + lea sym_offset(__trampoline_seg_stop)(%ebp),%esi
> 1:
> mov (%edi),%eax
> mov %dx,(%edi,%eax)
> add $4,%edi
> - cmp $sym_phys(__trampoline_seg_stop),%edi
> + cmp %esi,%edi
> jb 1b
>
> /* Do not parse command line on EFI platform here. */
> - cmpb $1,sym_phys(skip_realmode)
> + cmpb $1,%fs:sym_offset(skip_realmode)
> je 1f
>
> /* Bail if there is no command line to parse. */
> - mov sym_phys(multiboot_ptr),%ebx
> + mov %fs:sym_offset(multiboot_ptr),%ebx
> testl $MBI_CMDLINE,MB_flags(%ebx)
> jz 1f
>
> cmpl $0,MB_cmdline(%ebx)
> jz 1f
>
> - pushl $sym_phys(early_boot_opts)
> + lea sym_offset(early_boot_opts)(%ebp),%eax
> + push %eax
> pushl MB_cmdline(%ebx)
> call cmdline_parse_early
> add $8,%esp /* Remove cmdline_parse_early() args from stack. */
>
> 1:
> /* Switch to low-memory stack. */
> - mov sym_phys(trampoline_phys),%edi
> + mov %fs:sym_offset(trampoline_phys),%edi
> lea 0x10000(%edi),%esp
> lea trampoline_boot_cpu_entry-trampoline_start(%edi),%eax
> pushl $BOOT_CS32
> push %eax
>
> /* Copy bootstrap trampoline to low memory, below 1MB. */
> - mov $sym_phys(trampoline_start),%esi
> + lea sym_offset(trampoline_start)(%ebp),%esi
> mov $trampoline_end - trampoline_start,%ecx
> rep movsb
>
> diff --git a/xen/arch/x86/boot/trampoline.S b/xen/arch/x86/boot/trampoline.S
> index 3c2714d..a8909ce 100644
> --- a/xen/arch/x86/boot/trampoline.S
> +++ b/xen/arch/x86/boot/trampoline.S
> @@ -52,12 +52,20 @@ trampoline_gdt:
> /* 0x0028: real-mode data @ BOOT_TRAMPOLINE */
> .long 0x0000ffff
> .long 0x00009200
> + /*
> + * 0x0030: ring 0 Xen data, 16 MiB size, base
> + * address is initialized during runtime.
s/initialized/computed/
> + */
> + .quad 0x00c0920000001000
>
> .pushsection .trampoline_rel, "a"
> .long trampoline_gdt + BOOT_PSEUDORM_CS + 2 - .
> .long trampoline_gdt + BOOT_PSEUDORM_DS + 2 - .
> .popsection
>
> +GLOBAL(xen_img_base_phys_addr)
> + .long 0
> +
> GLOBAL(cpuid_ext_features)
> .long 0
>
> @@ -82,7 +90,8 @@ trampoline_protmode_entry:
> mov %ecx,%cr4
>
> /* Load pagetable base register. */
> - mov $sym_phys(idle_pg_table),%eax
> + mov bootsym_rel(xen_img_base_phys_addr,4,%eax)
> + lea sym_offset(idle_pg_table)(%eax),%eax
> add bootsym_rel(trampoline_xen_phys_start,4,%eax)
> mov %eax,%cr3
>
> diff --git a/xen/arch/x86/boot/wakeup.S b/xen/arch/x86/boot/wakeup.S
> index 08ea9b2..ff80f7f 100644
> --- a/xen/arch/x86/boot/wakeup.S
> +++ b/xen/arch/x86/boot/wakeup.S
> @@ -119,8 +119,10 @@ wakeup_32:
> mov %eax, %ss
> mov $bootsym_rel(wakeup_stack, 4, %esp)
>
> + mov bootsym_rel(xen_img_base_phys_addr, 4, %ebx)
> +
> # check saved magic again
> - mov $sym_phys(saved_magic), %eax
> + lea sym_offset(saved_magic)(%ebx), %eax
> add bootsym_rel(trampoline_xen_phys_start, 4, %eax)
> mov (%eax), %eax
> cmp $0x9abcdef0, %eax
> @@ -133,7 +135,7 @@ wakeup_32:
> mov %ecx, %cr4
>
> /* Load pagetable base register */
> - mov $sym_phys(idle_pg_table),%eax
> + lea sym_offset(idle_pg_table)(%ebx),%eax
> add bootsym_rel(trampoline_xen_phys_start,4,%eax)
> mov %eax,%cr3
>
> diff --git a/xen/arch/x86/boot/x86_64.S b/xen/arch/x86/boot/x86_64.S
> index c8bf9d0..ae4bebd 100644
> --- a/xen/arch/x86/boot/x86_64.S
> +++ b/xen/arch/x86/boot/x86_64.S
> @@ -81,7 +81,6 @@ GLOBAL(boot_cpu_compat_gdt_table)
> .quad 0x0000910000000000 /* per-CPU entry (limit == cpu) */
> .align PAGE_SIZE, 0
>
> -GLOBAL(__page_tables_start)
> /*
> * Mapping of first 2 megabytes of memory. This is mapped with 4kB mappings
> * to avoid type conflicts with fixed-range MTRRs covering the lowest megabyte
> @@ -101,21 +100,18 @@ GLOBAL(l1_identmap)
> .endr
> .size l1_identmap, . - l1_identmap
>
> -/* Mapping of first 16 megabytes of memory. */
Don't want to just update the comment?
> +GLOBAL(__page_tables_start)
> +
> GLOBAL(l2_identmap)
And perhaps explain how this page is being updated at runtime?
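Something like the following C sketch of the "Update frame addreses in page
tables" loop from head.S could serve as the basis for such a comment.
rebase_present_entries() is a name I made up. _PAGE_PRESENT is bit 0, and
like the 32-bit "add %ebp,(%edx)" the sketch only touches the low dword of
each entry.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_PRESENT_BIT 0x1ULL   /* _PAGE_PRESENT */

/*
 * Add the image base to every present entry between
 * __page_tables_start and __page_tables_end (nr 8-byte entries).
 */
static void rebase_present_entries(uint64_t *tbl, size_t nr, uint32_t image_base)
{
    size_t i;

    assert(tbl != NULL);
    for ( i = 0; i < nr; i++ )
    {
        uint32_t lo;

        if ( !(tbl[i] & PAGE_PRESENT_BIT) )
            continue;                       /* empty slot, nothing to fix up */

        lo = (uint32_t)tbl[i] + image_base; /* 32-bit add, as in the assembly */
        tbl[i] = (tbl[i] & ~0xffffffffULL) | lo;
    }
}
```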
> - .quad sym_phys(l1_identmap) + __PAGE_HYPERVISOR
> - pfn = 0
> - .rept 7
> - pfn = pfn + (1 << PAGETABLE_ORDER)
> - .quad (pfn << PAGE_SHIFT) | PAGE_HYPERVISOR | _PAGE_PSE
> - .endr
> - .fill 4 * L2_PAGETABLE_ENTRIES - 8, 8, 0
> + .quad sym_offset(l1_identmap) + __PAGE_HYPERVISOR
> + .fill 4 * L2_PAGETABLE_ENTRIES - 1, 8, 0
> .size l2_identmap, . - l2_identmap
>
> GLOBAL(l2_xenmap)
> - idx = 0
> - .rept 8
> - .quad sym_phys(__image_base__) + (idx << L2_PAGETABLE_SHIFT) + (PAGE_HYPERVISOR | _PAGE_PSE)
> + .quad 0
> + idx = 1
> + .rept 7
> + .quad sym_offset(__image_base__) + (idx << L2_PAGETABLE_SHIFT) + (PAGE_HYPERVISOR | _PAGE_PSE)
> idx = idx + 1
> .endr
> .fill L2_PAGETABLE_ENTRIES - 8, 8, 0
> @@ -125,7 +121,7 @@ l2_fixmap:
> idx = 0
> .rept L2_PAGETABLE_ENTRIES
> .if idx == l2_table_offset(FIXADDR_TOP - 1)
> - .quad sym_phys(l1_fixmap) + __PAGE_HYPERVISOR
> + .quad sym_offset(l1_fixmap) + __PAGE_HYPERVISOR
> .else
> .quad 0
> .endif
> @@ -136,7 +132,7 @@ l2_fixmap:
> GLOBAL(l3_identmap)
> idx = 0
> .rept 4
> - .quad sym_phys(l2_identmap) + (idx << PAGE_SHIFT) + __PAGE_HYPERVISOR
> + .quad sym_offset(l2_identmap) + (idx << PAGE_SHIFT) + __PAGE_HYPERVISOR
> idx = idx + 1
> .endr
> .fill L3_PAGETABLE_ENTRIES - 4, 8, 0
> @@ -146,9 +142,9 @@ l3_xenmap:
> idx = 0
> .rept L3_PAGETABLE_ENTRIES
> .if idx == l3_table_offset(XEN_VIRT_START)
> - .quad sym_phys(l2_xenmap) + __PAGE_HYPERVISOR
> + .quad sym_offset(l2_xenmap) + __PAGE_HYPERVISOR
> .elseif idx == l3_table_offset(FIXADDR_TOP - 1)
> - .quad sym_phys(l2_fixmap) + __PAGE_HYPERVISOR
> + .quad sym_offset(l2_fixmap) + __PAGE_HYPERVISOR
> .else
> .quad 0
> .endif
> @@ -158,13 +154,13 @@ l3_xenmap:
>
> /* Top-level master (and idle-domain) page directory. */
> GLOBAL(idle_pg_table)
> - .quad sym_phys(l3_bootmap) + __PAGE_HYPERVISOR
> + .quad sym_offset(l3_bootmap) + __PAGE_HYPERVISOR
> idx = 1
> .rept L4_PAGETABLE_ENTRIES - 1
> .if idx == l4_table_offset(DIRECTMAP_VIRT_START)
> - .quad sym_phys(l3_identmap) + __PAGE_HYPERVISOR
> + .quad sym_offset(l3_identmap) + __PAGE_HYPERVISOR
> .elseif idx == l4_table_offset(XEN_VIRT_START)
> - .quad sym_phys(l3_xenmap) + __PAGE_HYPERVISOR
> + .quad sym_offset(l3_xenmap) + __PAGE_HYPERVISOR
> .else
> .quad 0
> .endif
> diff --git a/xen/arch/x86/setup.c b/xen/arch/x86/setup.c
> index 8bec67f..8172520 100644
> --- a/xen/arch/x86/setup.c
> +++ b/xen/arch/x86/setup.c
> @@ -291,9 +291,6 @@ static void *__init bootstrap_map(const module_t *mod)
> if ( start >= end )
> return NULL;
>
> - if ( end <= BOOTSTRAP_MAP_BASE )
> - return (void *)(unsigned long)start;
> -
> ret = (void *)(map_cur + (unsigned long)(start & mask));
> start &= ~mask;
> end = (end + mask) & ~mask;
> @@ -641,6 +638,9 @@ void __init noreturn __start_xen(unsigned long mbi_p)
>
> printk("Command line: %s\n", cmdline);
>
> + printk("Xen image base address: 0x%08lx\n",
> + xen_phys_start ? xen_phys_start : (unsigned long)xen_img_base_phys_addr);
> +
> printk("Video information:\n");
>
> /* Print VGA display mode information. */
> @@ -835,10 +835,8 @@ void __init noreturn __start_xen(unsigned long mbi_p)
> uint64_t s, e, mask = (1UL << L2_PAGETABLE_SHIFT) - 1;
> uint64_t end, limit = ARRAY_SIZE(l2_identmap) << L2_PAGETABLE_SHIFT;
>
> - /* Superpage-aligned chunks from BOOTSTRAP_MAP_BASE. */
> s = (boot_e820.map[i].addr + mask) & ~mask;
> e = (boot_e820.map[i].addr + boot_e820.map[i].size) & ~mask;
> - s = max_t(uint64_t, s, BOOTSTRAP_MAP_BASE);
> if ( (boot_e820.map[i].type != E820_RAM) || (s >= e) )
> continue;
>
> @@ -876,7 +874,7 @@ void __init noreturn __start_xen(unsigned long mbi_p)
> /* Select relocation address. */
> e = end - reloc_size;
> xen_phys_start = e;
> - bootsym(trampoline_xen_phys_start) = e;
> + bootsym(trampoline_xen_phys_start) = e - xen_img_base_phys_addr;
>
> /*
> * Perform relocation to new physical address.
> @@ -886,7 +884,7 @@ void __init noreturn __start_xen(unsigned long mbi_p)
> */
> load_start = (unsigned long)_start - XEN_VIRT_START;
> barrier();
> - move_memory(e + load_start, load_start, _end - _start, 1);
> + move_memory(e + load_start, load_start + xen_img_base_phys_addr, _end - _start, 1);
>
> /* Walk initial pagetables, relocating page directory entries. */
> pl4e = __va(__pa(idle_pg_table));
> @@ -895,27 +893,27 @@ void __init noreturn __start_xen(unsigned long mbi_p)
> if ( !(l4e_get_flags(*pl4e) & _PAGE_PRESENT) )
> continue;
> *pl4e = l4e_from_intpte(l4e_get_intpte(*pl4e) +
> - xen_phys_start);
> + xen_phys_start - xen_img_base_phys_addr);
> pl3e = l4e_to_l3e(*pl4e);
> for ( j = 0; j < L3_PAGETABLE_ENTRIES; j++, pl3e++ )
> {
> /* Not present, 1GB mapping, or already relocated? */
> if ( !(l3e_get_flags(*pl3e) & _PAGE_PRESENT) ||
> (l3e_get_flags(*pl3e) & _PAGE_PSE) ||
> - (l3e_get_pfn(*pl3e) > 0x1000) )
> + (l3e_get_pfn(*pl3e) > PFN_DOWN(xen_phys_start)) )
> continue;
> *pl3e = l3e_from_intpte(l3e_get_intpte(*pl3e) +
> - xen_phys_start);
> + xen_phys_start - xen_img_base_phys_addr);
> pl2e = l3e_to_l2e(*pl3e);
> for ( k = 0; k < L2_PAGETABLE_ENTRIES; k++, pl2e++ )
> {
> /* Not present, PSE, or already relocated? */
> if ( !(l2e_get_flags(*pl2e) & _PAGE_PRESENT) ||
> (l2e_get_flags(*pl2e) & _PAGE_PSE) ||
> - (l2e_get_pfn(*pl2e) > 0x1000) )
> + (l2e_get_pfn(*pl2e) > PFN_DOWN(xen_phys_start)) )
> continue;
> *pl2e = l2e_from_intpte(l2e_get_intpte(*pl2e) +
> - xen_phys_start);
> + xen_phys_start - xen_img_base_phys_addr);
> }
> }
> }
> @@ -926,10 +924,10 @@ void __init noreturn __start_xen(unsigned long mbi_p)
> PAGE_HYPERVISOR_RWX | _PAGE_PSE);
> for ( i = 1; i < L2_PAGETABLE_ENTRIES; i++, pl2e++ )
> {
> - if ( !(l2e_get_flags(*pl2e) & _PAGE_PRESENT) )
> + if ( !(l2e_get_flags(*pl2e) & _PAGE_PRESENT) || (l2e_get_pfn(*pl2e) > PFN_DOWN(xen_phys_start)))
Could this be split in two lines?
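For example, along these lines. This is only a standalone sketch of the
condition: skip_l2e() and the PFN mask are placeholders of mine, not the
real l2e_get_flags()/l2e_get_pfn() helpers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SHIFT       12
#define PAGE_PRESENT     0x1ULL                 /* _PAGE_PRESENT */
#define PFN_MASK_ASSUMED 0x000ffffffffff000ULL  /* assumption: 52-bit physical addresses */
#define PFN_DOWN(x)      ((x) >> PAGE_SHIFT)

/* True if the entry is not present or looks already relocated. */
static bool skip_l2e(uint64_t l2e, uint64_t xen_phys_start)
{
    uint64_t pfn = (l2e & PFN_MASK_ASSUMED) >> PAGE_SHIFT;

    assert(xen_phys_start != 0);   /* relocation target already chosen */
    return !(l2e & PAGE_PRESENT) ||
           (pfn > PFN_DOWN(xen_phys_start));
}
```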
> continue;
> *pl2e = l2e_from_intpte(l2e_get_intpte(*pl2e) +
> - xen_phys_start);
> + xen_phys_start - xen_img_base_phys_addr);
> }
>
> /* Re-sync the stack and then switch to relocated pagetables. */
> @@ -998,6 +996,9 @@ void __init noreturn __start_xen(unsigned long mbi_p)
>
> if ( !xen_phys_start )
> panic("Not enough memory to relocate Xen.");
> +
> + printk("New Xen image base address: 0x%08lx\n", xen_phys_start);
> +
> reserve_e820_ram(&boot_e820, __pa(&_start), __pa(&_end));
>
> /* Late kexec reservation (dynamic start address). */
> @@ -1070,14 +1071,12 @@ void __init noreturn __start_xen(unsigned long mbi_p)
>
> set_pdx_range(s >> PAGE_SHIFT, e >> PAGE_SHIFT);
>
> - /* Need to create mappings above BOOTSTRAP_MAP_BASE. */
> - map_s = max_t(uint64_t, s, BOOTSTRAP_MAP_BASE);
> + map_s = s;
> map_e = min_t(uint64_t, e,
> ARRAY_SIZE(l2_identmap) << L2_PAGETABLE_SHIFT);
>
> /* Pass mapped memory to allocator /before/ creating new mappings. */
> init_boot_pages(s, min(map_s, e));
> - s = map_s;
> if ( s < map_e )
> {
> uint64_t mask = (1UL << L2_PAGETABLE_SHIFT) - 1;
> diff --git a/xen/arch/x86/x86_64/mm.c b/xen/arch/x86/x86_64/mm.c
> index 98310f3..baa6461 100644
> --- a/xen/arch/x86/x86_64/mm.c
> +++ b/xen/arch/x86/x86_64/mm.c
> @@ -44,7 +44,7 @@ unsigned int __read_mostly m2p_compat_vstart = __HYPERVISOR_COMPAT_VIRT_START;
>
> /* Enough page directories to map into the bottom 1GB. */
> l3_pgentry_t __section(".bss.page_aligned") l3_bootmap[L3_PAGETABLE_ENTRIES];
> -l2_pgentry_t __section(".bss.page_aligned") l2_bootmap[L2_PAGETABLE_ENTRIES];
> +l2_pgentry_t __section(".bss.page_aligned") l2_bootmap[4 * L2_PAGETABLE_ENTRIES];
16kB? I am confused.
>
> l2_pgentry_t *compat_idle_pg_table_l2;
>
> diff --git a/xen/arch/x86/xen.lds.S b/xen/arch/x86/xen.lds.S
> index a399615..b666a3f 100644
> --- a/xen/arch/x86/xen.lds.S
> +++ b/xen/arch/x86/xen.lds.S
> @@ -38,7 +38,7 @@ SECTIONS
> . = __XEN_VIRT_START;
> __image_base__ = .;
> #endif
> - . = __XEN_VIRT_START + MB(1);
> + . = __XEN_VIRT_START + XEN_IMG_OFFSET;
> _start = .;
> .text : {
> _stext = .; /* Text and read-only data */
> diff --git a/xen/include/asm-x86/config.h b/xen/include/asm-x86/config.h
> index 3e9be83..6d21cb7 100644
> --- a/xen/include/asm-x86/config.h
> +++ b/xen/include/asm-x86/config.h
> @@ -114,6 +114,7 @@ extern unsigned long trampoline_phys;
> trampoline_phys-__pa(trampoline_start)))
> extern char trampoline_start[], trampoline_end[];
> extern char trampoline_realmode_entry[];
> +extern unsigned int xen_img_base_phys_addr;
> extern unsigned int trampoline_xen_phys_start;
> extern unsigned char trampoline_cpu_started;
> extern char wakeup_start[];
> @@ -280,6 +281,8 @@ extern unsigned char boot_edid_info[128];
> #endif
> #define DIRECTMAP_VIRT_END (DIRECTMAP_VIRT_START + DIRECTMAP_SIZE)
>
> +#define XEN_IMG_OFFSET 0x200000
> +
> #ifndef __ASSEMBLY__
>
> /* This is not a fixed value, just a lower limit. */
> diff --git a/xen/include/asm-x86/page.h b/xen/include/asm-x86/page.h
> index 87b3341..27481ac 100644
> --- a/xen/include/asm-x86/page.h
> +++ b/xen/include/asm-x86/page.h
> @@ -283,7 +283,7 @@ extern root_pgentry_t idle_pg_table[ROOT_PAGETABLE_ENTRIES];
> extern l2_pgentry_t *compat_idle_pg_table_l2;
> extern unsigned int m2p_compat_vstart;
> extern l2_pgentry_t l2_xenmap[L2_PAGETABLE_ENTRIES],
> - l2_bootmap[L2_PAGETABLE_ENTRIES];
> + l2_bootmap[4*L2_PAGETABLE_ENTRIES];
Why do we need to expand this to 16kB?
> extern l3_pgentry_t l3_bootmap[L3_PAGETABLE_ENTRIES];
> extern l2_pgentry_t l2_identmap[4*L2_PAGETABLE_ENTRIES];
> extern l1_pgentry_t l1_identmap[L1_PAGETABLE_ENTRIES],
> --
> 1.7.10.4
>
>
> _______________________________________________
> Xen-devel mailing list
> address@hidden
> http://lists.xen.org/xen-devel