Description
Git commit
Operating systems
Linux
GGML backends
HIP
Problem description & steps to reproduce
Compilation of the HIP (hipBLAS) backend for gfx803 appears to have started failing recently. This can be observed in the LocalAI CI: mudler/LocalAI#6235
First Bad Commit
The failure started with #15884.
Compile command
cmake .. -DBUILD_SHARED_LIBS=OFF -DLLAMA_CURL=OFF -DGGML_NATIVE=OFF -DGGML_HIP=ON -DAMDGPU_TARGETS=gfx803,gfx900,gfx906,gfx908,gfx90a,gfx942,gfx1010,gfx1030,gfx1032,gfx1100,gfx1101,gfx1102,gfx1200,gfx1201 -DGGML_AVX=off -DGGML_AVX2=off -DGGML_AVX512=off -DGGML_FMA=off -DGGML_F16C=off
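Note: the configure log below warns that AMDGPU_TARGETS is deprecated in favour of GPU_TARGETS. Purely as a sketch (assuming the build honours GPU_TARGETS as hip-config suggests; this does not by itself address the gfx803 failure), the equivalent invocation would be:

cmake .. -DBUILD_SHARED_LIBS=OFF -DLLAMA_CURL=OFF -DGGML_NATIVE=OFF -DGGML_HIP=ON -DGPU_TARGETS=gfx803,gfx900,gfx906,gfx908,gfx90a,gfx942,gfx1010,gfx1030,gfx1032,gfx1100,gfx1101,gfx1102,gfx1200,gfx1201 -DGGML_AVX=off -DGGML_AVX2=off -DGGML_AVX512=off -DGGML_FMA=off -DGGML_F16C=off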
Relevant log output
cmake .. -DBUILD_SHARED_LIBS=OFF -DLLAMA_CURL=OFF -DGGML_NATIVE=OFF -DGGML_HIP=ON -DAMDGPU_TARGETS=gfx803,gfx900,gfx906,gfx908,gfx90a,gfx942,gfx1010,gfx1030,gfx1032,gfx1100,gfx1101,gfx1102,gfx1200,gfx1201 -DGGML_AVX=off -DGGML_AVX2=off -DGGML_AVX512=off -DGGML_FMA=off -DGGML_F16C=off -DBUILD_SHARED_LIBS=OFF -DLLAMA_CURL=OFF -DGGML_NATIVE=OFF -DGGML_HIP=ON -DAMDGPU_TARGETS=gfx803,gfx900,gfx906,gfx908,gfx90a,gfx942,gfx1010,gfx1030,gfx1032,gfx1100,gfx1101,gfx1102,gfx1200,gfx1201 -DBUILD_SHARED_LIBS=OFF -DLLAMA_CURL=OFF -DGGML_NATIVE=OFF -DGGML_HIP=ON -DAMDGPU_TARGETS=gfx803,gfx900,gfx906,gfx908,gfx90a,gfx942,gfx1010,gfx1030,gfx1032,gfx1100,gfx1101,gfx1102,gfx1200,gfx1201 && cmake --build . --config Release -j 4 --target grpc-server
#22 11.30 -- The C compiler identification is Clang 19.0.0
#22 11.35 -- The CXX compiler identification is Clang 19.0.0
#22 11.36 -- Detecting C compiler ABI info
#22 11.43 -- Detecting C compiler ABI info - done
#22 11.44 -- Check for working C compiler: /opt/rocm/llvm/bin/clang - skipped
#22 11.44 -- Detecting C compile features
#22 11.44 -- Detecting C compile features - done
#22 11.44 -- Detecting CXX compiler ABI info
#22 11.51 -- Detecting CXX compiler ABI info - done
#22 11.52 -- Check for working CXX compiler: /opt/rocm/llvm/bin/clang++ - skipped
#22 11.52 -- Detecting CXX compile features
#22 11.52 -- Detecting CXX compile features - done
#22 11.52 CMAKE_BUILD_TYPE=Release
#22 11.53 -- Found Git: /usr/bin/git (found version "2.34.1")
#22 11.78 -- The ASM compiler identification is Clang with GNU-like command-line
#22 11.78 -- Found assembler: /opt/rocm/llvm/bin/clang
#22 11.79 -- Looking for pthread.h
#22 11.86 -- Looking for pthread.h - found
#22 11.86 -- Performing Test CMAKE_HAVE_LIBC_PTHREAD
#22 11.93 -- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Success
#22 11.93 -- Found Threads: TRUE
#22 11.94 -- ccache found, compilation results will be cached. Disable with GGML_CCACHE=OFF.
#22 11.96 -- CMAKE_SYSTEM_PROCESSOR: x86_64
#22 11.96 -- GGML_SYSTEM_ARCH: x86
#22 11.96 -- Including CPU backend
#22 12.21 -- Found OpenMP_C: -fopenmp=libomp
#22 12.29 -- Found OpenMP_CXX: -fopenmp=libomp
#22 12.29 -- Found OpenMP: TRUE
#22 12.30 -- x86 detected
#22 12.30 -- Adding CPU backend variant ggml-cpu: -msse4.2;-mbmi2 GGML_SSE42;GGML_BMI2
#22 13.60 -- The HIP compiler identification is Clang 19.0.0
#22 13.62 -- Detecting HIP compiler ABI info
#22 14.31 -- Detecting HIP compiler ABI info - done
#22 14.32 -- Check for working HIP compiler: /opt/rocm-6.4.3/lib/llvm/bin/clang++ - skipped
#22 14.32 -- Detecting HIP compile features
#22 14.32 -- Detecting HIP compile features - done
#22 14.34 CMake Warning (dev) at /opt/rocm/lib/cmake/hip/hip-config-amd.cmake:70 (message):
#22 14.34 AMDGPU_TARGETS is deprecated. Please use GPU_TARGETS instead.
#22 14.34 Call Stack (most recent call first):
#22 14.34 /opt/rocm/lib/cmake/hip/hip-config.cmake:149 (include)
#22 14.34 ggml/src/ggml-hip/CMakeLists.txt:39 (find_package)
#22 14.34 This warning is for project developers. Use -Wno-dev to suppress it.
#22 14.34
#22 14.34 -- Performing Test HIP_CLANG_SUPPORTS_PARALLEL_JOBS
#22 14.41 -- Performing Test HIP_CLANG_SUPPORTS_PARALLEL_JOBS - Success
#22 14.48 -- HIP and hipBLAS found
#22 14.48 -- Including HIP backend
#22 14.49 -- ggml version: 0.0.6445
#22 14.49 -- ggml commit: 00681dfc
#22 14.54 -- Using protobuf version 26.1.0 | Protobuf_INCLUDE_DIRS: | CMAKE_CURRENT_BINARY_DIR: /LocalAI/backend/cpp/llama-cpp-fallback-build/llama.cpp/build/tools/grpc-server
#22 14.54 -- Configuring done
#22 14.81 -- Generating done
#22 14.83 -- Build files have been written to: /LocalAI/backend/cpp/llama-cpp-fallback-build/llama.cpp/build
#22 14.85 gmake[3]: Entering directory '/LocalAI/backend/cpp/llama-cpp-fallback-build/llama.cpp/build'
#22 14.87 gmake[4]: Entering directory '/LocalAI/backend/cpp/llama-cpp-fallback-build/llama.cpp/build'
#22 14.89 gmake[5]: Entering directory '/LocalAI/backend/cpp/llama-cpp-fallback-build/llama.cpp/build'
#22 14.89 gmake[6]: Entering directory '/LocalAI/backend/cpp/llama-cpp-fallback-build/llama.cpp/build'
#22 14.89 gmake[6]: Entering directory '/LocalAI/backend/cpp/llama-cpp-fallback-build/llama.cpp/build'
#22 14.89 gmake[6]: Entering directory '/LocalAI/backend/cpp/llama-cpp-fallback-build/llama.cpp/build'
#22 14.90 [ 0%] Generating backend.pb.cc, backend.pb.h, backend.grpc.pb.cc, backend.grpc.pb.h
#22 14.90 gmake[6]: Leaving directory '/LocalAI/backend/cpp/llama-cpp-fallback-build/llama.cpp/build'
#22 14.90 gmake[6]: Leaving directory '/LocalAI/backend/cpp/llama-cpp-fallback-build/llama.cpp/build'
#22 14.90 gmake[6]: Entering directory '/LocalAI/backend/cpp/llama-cpp-fallback-build/llama.cpp/build'
#22 14.90 gmake[6]: Entering directory '/LocalAI/backend/cpp/llama-cpp-fallback-build/llama.cpp/build'
#22 14.91 [ 0%] Building CXX object common/CMakeFiles/build_info.dir/build-info.cpp.o
#22 14.91 [ 1%] Building C object ggml/src/CMakeFiles/ggml-base.dir/ggml.c.o
#22 14.91 [ 1%] Building CXX object ggml/src/CMakeFiles/ggml-base.dir/ggml.cpp.o
#22 14.96 gmake[6]: Leaving directory '/LocalAI/backend/cpp/llama-cpp-fallback-build/llama.cpp/build'
#22 14.97 [ 1%] Built target build_info
#22 14.98 [ 1%] Building C object ggml/src/CMakeFiles/ggml-base.dir/ggml-alloc.c.o
#22 15.46 [ 1%] Building CXX object ggml/src/CMakeFiles/ggml-base.dir/ggml-backend.cpp.o
#22 15.56 [ 3%] Building CXX object ggml/src/CMakeFiles/ggml-base.dir/ggml-opt.cpp.o
#22 16.72 gmake[6]: Leaving directory '/LocalAI/backend/cpp/llama-cpp-fallback-build/llama.cpp/build'
#22 16.72 gmake[6]: Entering directory '/LocalAI/backend/cpp/llama-cpp-fallback-build/llama.cpp/build'
#22 16.73 [ 5%] Building CXX object tools/grpc-server/CMakeFiles/hw_grpc_proto.dir/backend.grpc.pb.cc.o
#22 17.17 [ 5%] Building CXX object tools/grpc-server/CMakeFiles/hw_grpc_proto.dir/backend.pb.cc.o
#22 17.34 [ 5%] Building CXX object ggml/src/CMakeFiles/ggml-base.dir/ggml-threading.cpp.o
#22 17.48 [ 5%] Building C object ggml/src/CMakeFiles/ggml-base.dir/ggml-quants.c.o
#22 17.96 [ 7%] Building CXX object ggml/src/CMakeFiles/ggml-base.dir/gguf.cpp.o
#22 24.91 [ 7%] Linking CXX static library libggml-base.a
#22 24.94 gmake[6]: Leaving directory '/LocalAI/backend/cpp/llama-cpp-fallback-build/llama.cpp/build'
#22 24.95 [ 7%] Built target ggml-base
#22 24.96 gmake[6]: Entering directory '/LocalAI/backend/cpp/llama-cpp-fallback-build/llama.cpp/build'
#22 24.96 gmake[6]: Entering directory '/LocalAI/backend/cpp/llama-cpp-fallback-build/llama.cpp/build'
#22 24.97 gmake[6]: Leaving directory '/LocalAI/backend/cpp/llama-cpp-fallback-build/llama.cpp/build'
#22 24.98 gmake[6]: Entering directory '/LocalAI/backend/cpp/llama-cpp-fallback-build/llama.cpp/build'
#22 24.99 [ 7%] Building C object ggml/src/CMakeFiles/ggml-cpu.dir/ggml-cpu/ggml-cpu.c.o
#22 25.04 Scanning dependencies of target ggml-hip
#22 25.04 gmake[6]: Leaving directory '/LocalAI/backend/cpp/llama-cpp-fallback-build/llama.cpp/build'
#22 25.05 gmake[6]: Entering directory '/LocalAI/backend/cpp/llama-cpp-fallback-build/llama.cpp/build'
#22 25.06 [ 9%] Building HIP object ggml/src/ggml-hip/CMakeFiles/ggml-hip.dir/__/ggml-cuda/acc.cu.o
#22 25.80 /LocalAI/backend/cpp/llama-cpp-fallback-build/llama.cpp/build/tools/grpc-server/backend.pb.cc:1920:56: warning: no previous prototype for function 'descriptor_table_backend_2eproto_getter' [-Wmissing-prototypes]
#22 25.80 1920 | PROTOBUF_ATTRIBUTE_WEAK const ::_pbi::DescriptorTable* descriptor_table_backend_2eproto_getter() {
#22 25.80 | ^
#22 25.80 /LocalAI/backend/cpp/llama-cpp-fallback-build/llama.cpp/build/tools/grpc-server/backend.pb.cc:1920:31: note: declare 'static' if the function is not intended to be used outside of this translation unit
#22 25.80 1920 | PROTOBUF_ATTRIBUTE_WEAK const ::_pbi::DescriptorTable* descriptor_table_backend_2eproto_getter() {
#22 25.80 | ^
#22 25.80 | static
#22 25.80 1 warning generated.
#22 25.82 [ 9%] Building HIP object ggml/src/ggml-hip/CMakeFiles/ggml-hip.dir/__/ggml-cuda/add-id.cu.o
#22 26.42 [ 9%] Linking CXX static library libhw_grpc_proto.a
#22 26.46 gmake[6]: Leaving directory '/LocalAI/backend/cpp/llama-cpp-fallback-build/llama.cpp/build'
#22 26.47 [ 9%] Built target hw_grpc_proto
#22 26.48 [ 9%] Building CXX object ggml/src/CMakeFiles/ggml-cpu.dir/ggml-cpu/ggml-cpu.cpp.o
#22 26.52 In file included from /LocalAI/backend/cpp/llama-cpp-fallback-build/llama.cpp/ggml/src/ggml-cpu/ggml-cpu.c:42:
#22 26.52 /opt/rocm-6.4.3/lib/llvm/bin/../include/omp.h:54:9: warning: ISO C restricts enumerator values to range of 'int' (2147483648 is too large) [-Wpedantic]
#22 26.52 54 | omp_sched_monotonic = 0x80000000
#22 26.52 | ^ ~~~~~~~~~~
#22 26.52 /opt/rocm-6.4.3/lib/llvm/bin/../include/omp.h:434:7: warning: ISO C restricts enumerator values to range of 'int' (18446744073709551615 is too large) [-Wpedantic]
#22 26.52 434 | KMP_ALLOCATOR_MAX_HANDLE = UINTPTR_MAX
#22 26.52 | ^ ~~~~~~~~~~~
#22 26.52 /opt/rocm-6.4.3/lib/llvm/bin/../include/omp.h:450:7: warning: ISO C restricts enumerator values to range of 'int' (18446744073709551615 is too large) [-Wpedantic]
#22 26.52 450 | KMP_MEMSPACE_MAX_HANDLE = UINTPTR_MAX
#22 26.52 | ^ ~~~~~~~~~~~
#22 26.52 /opt/rocm-6.4.3/lib/llvm/bin/../include/omp.h:496:39: warning: ISO C restricts enumerator values to range of 'int' (18446744073709551615 is too large) [-Wpedantic]
#22 26.52 496 | typedef enum omp_event_handle_t { KMP_EVENT_MAX_HANDLE = UINTPTR_MAX } omp_event_handle_t;
#22 26.52 | ^ ~~~~~~~~~~~
#22 26.52 4 warnings generated.
#22 26.54 [ 9%] Building HIP object ggml/src/ggml-hip/CMakeFiles/ggml-hip.dir/__/ggml-cuda/arange.cu.o
#22 28.22 [ 11%] Building CXX object ggml/src/CMakeFiles/ggml-cpu.dir/ggml-cpu/repack.cpp.o
#22 31.88 [ 11%] Building CXX object ggml/src/CMakeFiles/ggml-cpu.dir/ggml-cpu/hbm.cpp.o
#22 31.93 [ 11%] Building C object ggml/src/CMakeFiles/ggml-cpu.dir/ggml-cpu/quants.c.o
#22 34.54 [ 13%] Building CXX object ggml/src/CMakeFiles/ggml-cpu.dir/ggml-cpu/traits.cpp.o
#22 35.96 [ 13%] Building CXX object ggml/src/CMakeFiles/ggml-cpu.dir/ggml-cpu/amx/amx.cpp.o
#22 37.75 [ 13%] Building CXX object ggml/src/CMakeFiles/ggml-cpu.dir/ggml-cpu/amx/mmq.cpp.o
#22 39.58 [ 13%] Building CXX object ggml/src/CMakeFiles/ggml-cpu.dir/ggml-cpu/binary-ops.cpp.o
#22 43.34 [ 15%] Building CXX object ggml/src/CMakeFiles/ggml-cpu.dir/ggml-cpu/unary-ops.cpp.o
#22 47.41 [ 15%] Building CXX object ggml/src/CMakeFiles/ggml-cpu.dir/ggml-cpu/vec.cpp.o
#22 48.97 [ 15%] Building CXX object ggml/src/CMakeFiles/ggml-cpu.dir/ggml-cpu/ops.cpp.o
#22 59.35 [ 15%] Building CXX object ggml/src/CMakeFiles/ggml-cpu.dir/ggml-cpu/llamafile/sgemm.cpp.o
#22 60.78 [ 17%] Building C object ggml/src/CMakeFiles/ggml-cpu.dir/ggml-cpu/arch/x86/quants.c.o
#22 61.81 [ 17%] Building CXX object ggml/src/CMakeFiles/ggml-cpu.dir/ggml-cpu/arch/x86/repack.cpp.o
#22 62.49 [ 17%] Building HIP object ggml/src/ggml-hip/CMakeFiles/ggml-hip.dir/__/ggml-cuda/argmax.cu.o
#22 63.25 [ 17%] Linking CXX static library libggml-cpu.a
#22 63.28 gmake[6]: Leaving directory '/LocalAI/backend/cpp/llama-cpp-fallback-build/llama.cpp/build'
#22 63.30 [ 17%] Built target ggml-cpu
#22 63.31 [ 19%] Building HIP object ggml/src/ggml-hip/CMakeFiles/ggml-hip.dir/__/ggml-cuda/argsort.cu.o
#22 63.54 [ 19%] Building HIP object ggml/src/ggml-hip/CMakeFiles/ggml-hip.dir/__/ggml-cuda/binbcast.cu.o
#22 63.67 [ 19%] Building HIP object ggml/src/ggml-hip/CMakeFiles/ggml-hip.dir/__/ggml-cuda/clamp.cu.o
#22 99.14 [ 21%] Building HIP object ggml/src/ggml-hip/CMakeFiles/ggml-hip.dir/__/ggml-cuda/concat.cu.o
#22 100.6 [ 21%] Building HIP object ggml/src/ggml-hip/CMakeFiles/ggml-hip.dir/__/ggml-cuda/conv-transpose-1d.cu.o
#22 100.9 [ 21%] Building HIP object ggml/src/ggml-hip/CMakeFiles/ggml-hip.dir/__/ggml-cuda/conv2d-dw.cu.o
#22 137.9 [ 21%] Building HIP object ggml/src/ggml-hip/CMakeFiles/ggml-hip.dir/__/ggml-cuda/conv2d-transpose.cu.o
#22 138.4 [ 23%] Building HIP object ggml/src/ggml-hip/CMakeFiles/ggml-hip.dir/__/ggml-cuda/conv2d.cu.o
#22 138.5 [ 23%] Building HIP object ggml/src/ggml-hip/CMakeFiles/ggml-hip.dir/__/ggml-cuda/convert.cu.o
#22 145.0 [ 23%] Building HIP object ggml/src/ggml-hip/CMakeFiles/ggml-hip.dir/__/ggml-cuda/count-equal.cu.o
#22 174.9 [ 25%] Building HIP object ggml/src/ggml-hip/CMakeFiles/ggml-hip.dir/__/ggml-cuda/cpy.cu.o
#22 178.2 [ 25%] Building HIP object ggml/src/ggml-hip/CMakeFiles/ggml-hip.dir/__/ggml-cuda/cross-entropy-loss.cu.o
#22 182.5 [ 25%] Building HIP object ggml/src/ggml-hip/CMakeFiles/ggml-hip.dir/__/ggml-cuda/diagmask.cu.o
#22 194.6 [ 25%] Building HIP object ggml/src/ggml-hip/CMakeFiles/ggml-hip.dir/__/ggml-cuda/fattn-tile.cu.o
#22 215.8 [ 27%] Building HIP object ggml/src/ggml-hip/CMakeFiles/ggml-hip.dir/__/ggml-cuda/fattn-wmma-f16.cu.o
#22 219.3 [ 27%] Building HIP object ggml/src/ggml-hip/CMakeFiles/ggml-hip.dir/__/ggml-cuda/fattn.cu.o
#22 231.1 [ 27%] Building HIP object ggml/src/ggml-hip/CMakeFiles/ggml-hip.dir/__/ggml-cuda/getrows.cu.o
#22 255.3 [ 29%] Building HIP object ggml/src/ggml-hip/CMakeFiles/ggml-hip.dir/__/ggml-cuda/ggml-cuda.cu.o
#22 275.5 [ 29%] Building HIP object ggml/src/ggml-hip/CMakeFiles/ggml-hip.dir/__/ggml-cuda/gla.cu.o
#22 286.3 [ 29%] Building HIP object ggml/src/ggml-hip/CMakeFiles/ggml-hip.dir/__/ggml-cuda/im2col.cu.o
#22 301.8 [ 29%] Building HIP object ggml/src/ggml-hip/CMakeFiles/ggml-hip.dir/__/ggml-cuda/mean.cu.o
#22 325.2 [ 31%] Building HIP object ggml/src/ggml-hip/CMakeFiles/ggml-hip.dir/__/ggml-cuda/mmf.cu.o
#22 328.6 [ 31%] Building HIP object ggml/src/ggml-hip/CMakeFiles/ggml-hip.dir/__/ggml-cuda/mmq.cu.o
#22 332.2 In file included from /LocalAI/backend/cpp/llama-cpp-fallback-build/llama.cpp/ggml/src/ggml-cuda/fattn-tile.cu:1:
#22 332.2 /LocalAI/backend/cpp/llama-cpp-fallback-build/llama.cpp/ggml/src/ggml-cuda/common.cuh:559:18: error: instruction not supported on this GPU
#22 332.2 559 | asm volatile("v_dot2_f32_f16 %0, %1, %2, %0" : "+v"(acc) : "v"(v), "v"(u));
#22 332.2 | ^
#22 332.2 <inline asm>:1:2: note: instantiated into assembly here
#22 332.2 1 | v_dot2_f32_f16 v68, v2, v69, v68
#22 332.2 | ^
#22 332.2 [... the same "instruction not supported on this GPU" error at common.cuh:559 is repeated for each v_dot2_f32_f16 instantiation ...]
#22 332.2 fatal error: too many errors emitted, stopping now [-ferror-limit=]
#22 332.2 20 errors generated when compiling for gfx803.
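
For context: gfx803 (GCN 3rd gen, e.g. Fiji/Polaris) does not implement v_dot2_f32_f16; as far as I can tell that instruction is only available from gfx906 onwards, which is why the inline asm at common.cuh:559 fails to assemble only for this target. Below is a minimal sketch of how such an instruction is typically guarded per architecture; the macro name HAVE_V_DOT2_F32_F16, the target list, and the fallback are assumptions for illustration, not llama.cpp's actual code.

#include <hip/hip_fp16.h>

// Hypothetical feature macro, for illustration only: real code would use
// whatever guard the project already defines. gfx803 takes the plain
// arithmetic path because it lacks v_dot2_f32_f16.
#if defined(__gfx906__) || defined(__gfx908__) || defined(__gfx90a__) || defined(__gfx942__)
#define HAVE_V_DOT2_F32_F16 1
#else
#define HAVE_V_DOT2_F32_F16 0
#endif

__device__ __forceinline__ float dot2_f16(__half2 v, __half2 u, float acc) {
#if HAVE_V_DOT2_F32_F16
    // Fast path: single hardware dot-product instruction (the asm from common.cuh:559).
    asm volatile("v_dot2_f32_f16 %0, %1, %2, %0" : "+v"(acc) : "v"(v), "v"(u));
    return acc;
#else
    // Fallback for targets such as gfx803: widen to f32 and accumulate manually.
    const float2 vf = __half22float2(v);
    const float2 uf = __half22float2(u);
    return acc + vf.x * uf.x + vf.y * uf.y;
#endif
}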