LegendGaf
Contributor

Passing -Wl,--hash-style=both keeps the .hash section alongside .gnu.hash so fully static glibc builds run on older loaders instead of aborting with SIGFPE on start‑up.

Signed-off-by: Gafaiti Aymane [email protected]
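
For readers who want to try the same linker flag locally, one possible way to pass it through Cargo is sketched below. The file location and the target triple are assumptions; the PR itself may set the flag elsewhere, for example in the CI workflow.

```toml
# .cargo/config.toml (hypothetical placement, not taken from the PR)
[target.x86_64-unknown-linux-gnu]
# Ask the linker to emit both the classic .hash and the newer .gnu.hash section.
rustflags = ["-C", "link-arg=-Wl,--hash-style=both"]
```

After a build, `readelf -S` on the resulting binary should list both sections if the flag took effect.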

@LegendGaf LegendGaf marked this pull request as ready for review July 28, 2025 08:16
@Marlinski
Collaborator

Thanks for the contribution @LegendGaf !
Can you sign your commit before I merge it, please?

@LegendGaf LegendGaf force-pushed the ci_fix-linux-sigfpe branch from b8b9152 to 113be18 Compare July 29, 2025 07:47
@Marlinski
Collaborator

@LegendGaf the problem seems to persist on certain systems. When shai is compiled directly on those systems it works, but the prebuilt binary, with or without your added flag, still seems to crash.

I believe it is related to libssl, as the crash happens when it tries to perform an SSL request. I can't reproduce it myself, but here is a crash strace from someone having the issue. It happened on startup, when the app starts without a config file (and tries to pull the default .shai.config from a GitHub raw file):

read(9, "N CERTIFICATE-----\nMIIFbDCCA1SgA"..., 4096) = 4096
read(9, "fp/imTYpE0RHap1VIDzYm/EDMrraQKFz"..., 4096) = 4096
read(9, "UQE49RDdT/VP68czH5GX6zfZBCK70bwk"..., 4096) = 4096
read(9, "AhZ11\n+/oxgQgiERyUYUdAZ20UEIYSfR"..., 4096) = 4058
read(9, "", 4096)                       = 0
close(9)                                = 0
mmap(NULL, 2101248, PROT_NONE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_STACK, -1, 0) = 0x7004a27f4000
mprotect(0x7004a27f5000, 2097152, PROT_READ|PROT_WRITE) = 0
rt_sigprocmask(SIG_BLOCK, ~[], [], 8)   = 0
clone3({flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID, child_tid=0x7004a29f4990, parent_tid=0x7004a29f4990, exit_signal=0, stack=0x7004a27f4000, stack_size=0x200100, tls=0x7004a29f46c0} => {parent_tid=[112324]}, 88) = 112324
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
futex(0x7004a29f45e8, FUTEX_WAKE_PRIVATE, 1) = 1
futex(0x555574d864e8, FUTEX_WAIT_PRIVATE, 1, NULL) = ?
+++ killed by SIGFPE +++ 
~ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 22.04.5 LTS
Release:        22.04
Codename:       jammy

~ uname -a
Linux laptop 6.8.0-64-generic #67~22.04.1-Ubuntu SMP PREEMPT_DYNAMIC Tue Jun 24 15:19:46 UTC 2 x86_64 x86_64 x86_64 GNU/Linux 

It seems to be a linker problem indeed, any ideas?

@LegendGaf
Contributor Author

I see! I have already encountered a similar problem. I will look into it in more detail and propose a solution.

@LegendGaf LegendGaf changed the title from "Feat (ci) - add --hash-style=both for portable static Linux binary" to "Fix floating point exception error" Jul 29, 2025
On some recent Intel/AMD CPUs that support SHA-NI instructions, binaries statically linked against old OpenSSL versions (such as 1.0.2k) crash with a "floating point exception". See [OpenSSL* SHA Crash Bug Requires Application Update](https://www.intel.com/content/www/us/en/developer/articles/troubleshooting/openssl-sha-crash-bug-requires-application-update.html?utm_source=chatgpt.com)

- Using the `vendored` feature of `openssl` and `native-tls` forces static linking against a modern OpenSSL (>=1.1.1) built from source, which should fix the issue.

Signed-off-by: Gafaiti Aymane [email protected]
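
As a rough illustration of the change described in this commit message, the `vendored` feature can be enabled in `Cargo.toml` along the following lines. The version numbers are illustrative assumptions and the actual diff in this PR may differ.

```toml
[dependencies]
# Versions are assumptions for illustration, not taken from the PR.
# "vendored" builds OpenSSL from source and links it statically.
openssl = { version = "0.10", features = ["vendored"] }
native-tls = { version = "0.2", features = ["vendored"] }
```

Building the vendored OpenSSL requires a C compiler, perl, and make on the build machine, so a CI image may need those tools installed.
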
@LegendGaf LegendGaf force-pushed the ci_fix-linux-sigfpe branch from 6846c76 to 4188e3d Compare July 29, 2025 14:37
@Marlinski
Collaborator

The issue still persists even though it is statically linked. Maybe we should find a way to reproduce the issue first in a controlled environment.

@LegendGaf
Contributor Author

LegendGaf commented Jul 29, 2025

Sad that the fix didn’t help. It did allow me and some colleagues to build and run it locally with this patch, even though the binary is no longer fully statically linked. With this change, we no longer use OpenSSL from the local libraries, which we suspected was the cause of the issue.
Could you please share the hardware specs of the VM used to compile the release binary? Also, if possible, could you share the output of openssl version and lscpu | grep sha from that VM?
Thanks!
To reproduce the issue: run a binary linked to OpenSSL 1.0.2_ (without the vendored feature, i.e. without my patch) on a CPU with SHA-NI and let it perform any HTTPS request. The process will crash with a floating point exception due to OpenSSL's SHA-NI bug.
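
To check whether a machine falls into the affected group before trying the reproduction, the `lscpu | grep sha` check above can be approximated in Rust. This is a minimal sketch: the function name `cpuinfo_has_sha_ni` is hypothetical, and it assumes the Linux `/proc/cpuinfo` format, where SHA-NI support is reported as the `sha_ni` flag.

```rust
/// Returns true if any `flags` line in /proc/cpuinfo-style text lists `sha_ni`.
/// Hypothetical helper for illustration; not part of shai.
fn cpuinfo_has_sha_ni(cpuinfo: &str) -> bool {
    cpuinfo
        .lines()
        // Only the per-CPU "flags : ..." lines carry the feature list.
        .filter(|line| line.starts_with("flags"))
        .any(|line| {
            line.split(':')
                .nth(1)
                // Match whole flag names so that lookalike substrings are ignored.
                .map_or(false, |flags| flags.split_whitespace().any(|f| f == "sha_ni"))
        })
}
```

On Linux, it could be called with the result of `std::fs::read_to_string("/proc/cpuinfo")`. A `true` result only means the CPU advertises SHA-NI; whether the crash occurs still depends on which OpenSSL version the binary is linked against.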

@nicoovh
Collaborator

nicoovh commented Jul 30, 2025

I think the release is built by the GitHub CI.

@LegendGaf
Contributor Author

Before the patch

agafaiti@laptop1576080:~/Work/tools/AI/shai/target/release$ ldd ./shai
	linux-vdso.so.1 (0x00007ffc6b1dd000)
	libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007e068c622000)
	libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007e068b519000)
	libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007e068b200000)
	/lib64/ld-linux-x86-64.so.2 (0x00007e068c657000)

After the use of musl

agafaiti@laptop1576080:~/Work/tools/AI/shai/target/x86_64-unknown-linux-musl/release$ ldd ./shai 
        statically linked

@LegendGaf LegendGaf force-pushed the ci_fix-linux-sigfpe branch from 42a5e51 to b4743bd Compare July 30, 2025 09:13
@LegendGaf
Contributor Author

If this fixes the "floating point exception" error, I will rebase into one commit before the merge.

@nicoovh
Collaborator

nicoovh commented Jul 30, 2025

I added a GitHub Action and built your change here: https://github.com/nicoovh/shai/actions/runs/16619600496/job/47020540925
I will wait for confirmation that this binary is OK, and then I will merge this PR.
Thanks again for your contribution @LegendGaf

@nicoovh nicoovh requested a review from Marlinski July 30, 2025 12:08
@nicoovh nicoovh merged commit 310b4c2 into ovh:main Jul 30, 2025
@Marlinski
Collaborator

Thanks @LegendGaf for this PR ! I'll make a release tonight :)
