From d873a364ef2182af40110869f9c62813ce6f9386 Mon Sep 17 00:00:00 2001
From: Ammar Faizi
Date: Wed, 30 Aug 2023 08:02:23 +0700
Subject: tools/nolibc: i386: Fix a stack misalign bug on _start
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

The ABI mandates that the %esp register must be a multiple of 16 when
executing a 'call' instruction.

Commit 2ab446336b17 ("tools/nolibc: i386: shrink _start with _start_c")
simplified the _start function, but it didn't take care of the %esp
alignment, causing SIGSEGV on SSE and AVX programs that use aligned move
instructions (e.g., movdqa, movaps, and vmovdqa).

The 'and $-16, %esp' aligns %esp to a multiple of 16. Then 'push %eax'
subtracts 4 from %esp, which breaks the 16-byte alignment. Make sure
%esp is correctly aligned after the push by subtracting 12 before the
push.

Extra:
Add 'add $12, %esp' before the 'and $-16, %esp' to avoid over-estimating
the stack adjustment in particular cases, as suggested by Willy.

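The arithmetic above can be checked with a small host-side model. This
is only a sketch: the helper names (esp_at_call_old, esp_at_call_new,
esp_at_call_no_add, check_esp_alignment) and the sample stack address
are hypothetical and not part of nolibc. The helpers model the effect of
each instruction on a 32-bit stack pointer value; they do not run the
real _start.

```c
#include <assert.h>
#include <stdint.h>

/* Pre-fix sequence: and $-16, %esp; push %eax */
static uint32_t esp_at_call_old(uint32_t esp)
{
	esp &= ~UINT32_C(15);	/* and $-16, %esp */
	esp -= 4;		/* push %eax */
	return esp;
}

/* Fixed sequence: add $12; and $-16; sub $12; push %eax */
static uint32_t esp_at_call_new(uint32_t esp)
{
	esp += 12;		/* add $12, %esp */
	esp &= ~UINT32_C(15);	/* and $-16, %esp */
	esp -= 12;		/* sub $12, %esp */
	esp -= 4;		/* push %eax */
	return esp;
}

/* Same as the fix but without the leading 'add $12, %esp' */
static uint32_t esp_at_call_no_add(uint32_t esp)
{
	esp &= ~UINT32_C(15);	/* and $-16, %esp */
	esp -= 12;		/* sub $12, %esp */
	esp -= 4;		/* push %eax */
	return esp;
}

/* Check all 16 possible low nibbles of an arbitrary sample address. */
void check_esp_alignment(void)
{
	const uint32_t base = UINT32_C(0xffffd000);	/* hypothetical entry %esp */
	uint32_t off;

	for (off = 0; off < 16; off++) {
		uint32_t esp = base + off;

		/* old sequence: %esp is 12 mod 16 at the 'call' */
		assert(esp_at_call_old(esp) % 16 == 12);
		/* fixed sequence: %esp is 16-byte aligned at the 'call' */
		assert(esp_at_call_new(esp) % 16 == 0);
		/* the pushed slot never overlaps data above the entry %esp */
		assert(esp_at_call_new(esp) + 4 <= esp);
		/* dropping the 'add' stays aligned but never uses less stack */
		assert(esp_at_call_no_add(esp) % 16 == 0);
		assert(esp_at_call_no_add(esp) <= esp_at_call_new(esp));
	}
}
```

Calling check_esp_alignment() from any test harness should pass
silently. For a 4-byte-aligned entry %esp, the model shows the fixed
sequence lowering %esp by 4 to 16 bytes, versus 16 to 28 bytes when the
leading 'add $12, %esp' is left out.
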
A test program to validate the %esp alignment on _start can be found
at:

  https://lore.kernel.org/lkml/ZOoindMFj1UKqo+s@biznet-home.integral.gnuweeb.org

[ Thomas: trim Fixes tag commit id ]

Cc: Zhangjin Wu
Fixes: 2ab446336b17 ("tools/nolibc: i386: shrink _start with _start_c")
Reported-by: Nicholas Rosenberg
Acked-by: Thomas Weißschuh
Signed-off-by: Ammar Faizi
Reviewed-by: Alviro Iskandar Setiawan
Signed-off-by: Willy Tarreau
Signed-off-by: Thomas Weißschuh
---
 tools/include/nolibc/arch-i386.h | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

(limited to 'tools')

diff --git a/tools/include/nolibc/arch-i386.h b/tools/include/nolibc/arch-i386.h
index 64415b9fac77..28c26a00a762 100644
--- a/tools/include/nolibc/arch-i386.h
+++ b/tools/include/nolibc/arch-i386.h
@@ -167,7 +167,9 @@ void __attribute__((weak, noreturn, optimize("Os", "omit-frame-pointer"))) __no_
 	__asm__ volatile (
 		"xor %ebp, %ebp\n" /* zero the stack frame */
 		"mov %esp, %eax\n" /* save stack pointer to %eax, as arg1 of _start_c */
-		"and $-16, %esp\n" /* last pushed argument must be 16-byte aligned */
+		"add $12, %esp\n" /* avoid over-estimating after the 'and' & 'sub' below */
+		"and $-16, %esp\n" /* the %esp must be 16-byte aligned on 'call' */
+		"sub $12, %esp\n" /* sub 12 to keep it aligned after the push %eax */
 		"push %eax\n" /* push arg1 on stack to support plain stack modes too */
 		"call _start_c\n" /* transfer to c runtime */
 		"hlt\n" /* ensure it does not return */
-- 
cgit v1.2.3-58-ga151