
Compile bug: Failing to compile with hipblas and gfx803 #15936

@mudler

Description

Git commit

00681df

Operating systems

Linux

GGML backends

HIP

Problem description & steps to reproduce

Compilation with hipBLAS for the gfx803 target recently started to fail. This can be observed in the LocalAI CI: mudler/LocalAI#6235

First Bad Commit

Started happening since #15884.

Compile command

cmake .. -DBUILD_SHARED_LIBS=OFF -DLLAMA_CURL=OFF -DGGML_NATIVE=OFF -DGGML_HIP=ON -DAMDGPU_TARGETS=gfx803,gfx900,gfx906,gfx908,gfx90a,gfx942,gfx1010,gfx1030,gfx1032,gfx1100,gfx1101,gfx1102,gfx1200,gfx1201 -DGGML_AVX=off -DGGML_AVX2=off -DGGML_AVX512=off -DGGML_FMA=off -DGGML_F16C=off -DBUILD_SHARED_LIBS=OFF -DLLAMA_CURL=OFF -DGGML_NATIVE=OFF -DGGML_HIP=ON -DAMDGPU_TARGETS=gfx803,gfx900,gfx906,gfx908,gfx90a,gfx942,gfx1010,gfx1030,gfx1032,gfx1100,gfx1101,gfx1102,gfx1200,gfx1201 -DBUILD_SHARED_LIBS=OFF -DLLAMA_CURL=OFF -DGGML_NATIVE=OFF -DGGML_HIP=ON -DAMDGPU_TARGETS=gfx803,gfx900,gfx906,gfx908,gfx90a,gfx942,gfx1010,gfx1030,gfx1032,gfx1100,gfx1101,gfx1102,gfx1200,gfx1201
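
For readability, here is the same configure step with the repeated flag groups collapsed into one (the flags above appear three times, most likely a concatenation artifact in how the CI assembles the command). Note that the hip-config warning in the log below reports that AMDGPU_TARGETS is deprecated in favor of GPU_TARGETS; the variable name is kept here exactly as reported:

cmake .. -DBUILD_SHARED_LIBS=OFF -DLLAMA_CURL=OFF -DGGML_NATIVE=OFF -DGGML_HIP=ON -DAMDGPU_TARGETS=gfx803,gfx900,gfx906,gfx908,gfx90a,gfx942,gfx1010,gfx1030,gfx1032,gfx1100,gfx1101,gfx1102,gfx1200,gfx1201 -DGGML_AVX=off -DGGML_AVX2=off -DGGML_AVX512=off -DGGML_FMA=off -DGGML_F16C=off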

Relevant log output

cmake .. -DBUILD_SHARED_LIBS=OFF -DLLAMA_CURL=OFF -DGGML_NATIVE=OFF -DGGML_HIP=ON -DAMDGPU_TARGETS=gfx803,gfx900,gfx906,gfx908,gfx90a,gfx942,gfx1010,gfx1030,gfx1032,gfx1100,gfx1101,gfx1102,gfx1200,gfx1201 -DGGML_AVX=off -DGGML_AVX2=off -DGGML_AVX512=off -DGGML_FMA=off -DGGML_F16C=off -DBUILD_SHARED_LIBS=OFF -DLLAMA_CURL=OFF -DGGML_NATIVE=OFF -DGGML_HIP=ON -DAMDGPU_TARGETS=gfx803,gfx900,gfx906,gfx908,gfx90a,gfx942,gfx1010,gfx1030,gfx1032,gfx1100,gfx1101,gfx1102,gfx1200,gfx1201 -DBUILD_SHARED_LIBS=OFF -DLLAMA_CURL=OFF -DGGML_NATIVE=OFF -DGGML_HIP=ON -DAMDGPU_TARGETS=gfx803,gfx900,gfx906,gfx908,gfx90a,gfx942,gfx1010,gfx1030,gfx1032,gfx1100,gfx1101,gfx1102,gfx1200,gfx1201 && cmake --build . --config Release -j 4 --target grpc-server
#22 11.30 -- The C compiler identification is Clang 19.0.0
#22 11.35 -- The CXX compiler identification is Clang 19.0.0
#22 11.36 -- Detecting C compiler ABI info
#22 11.43 -- Detecting C compiler ABI info - done
#22 11.44 -- Check for working C compiler: /opt/rocm/llvm/bin/clang - skipped
#22 11.44 -- Detecting C compile features
#22 11.44 -- Detecting C compile features - done
#22 11.44 -- Detecting CXX compiler ABI info
#22 11.51 -- Detecting CXX compiler ABI info - done
#22 11.52 -- Check for working CXX compiler: /opt/rocm/llvm/bin/clang++ - skipped
#22 11.52 -- Detecting CXX compile features
#22 11.52 -- Detecting CXX compile features - done
#22 11.52 CMAKE_BUILD_TYPE=Release
#22 11.53 -- Found Git: /usr/bin/git (found version "2.34.1") 
#22 11.78 -- The ASM compiler identification is Clang with GNU-like command-line
#22 11.78 -- Found assembler: /opt/rocm/llvm/bin/clang
#22 11.79 -- Looking for pthread.h
#22 11.86 -- Looking for pthread.h - found
#22 11.86 -- Performing Test CMAKE_HAVE_LIBC_PTHREAD
#22 11.93 -- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Success
#22 11.93 -- Found Threads: TRUE  
#22 11.94 -- ccache found, compilation results will be cached. Disable with GGML_CCACHE=OFF.
#22 11.96 -- CMAKE_SYSTEM_PROCESSOR: x86_64
#22 11.96 -- GGML_SYSTEM_ARCH: x86
#22 11.96 -- Including CPU backend
#22 12.21 -- Found OpenMP_C: -fopenmp=libomp  
#22 12.29 -- Found OpenMP_CXX: -fopenmp=libomp  
#22 12.29 -- Found OpenMP: TRUE   
#22 12.30 -- x86 detected
#22 12.30 -- Adding CPU backend variant ggml-cpu: -msse4.2;-mbmi2 GGML_SSE42;GGML_BMI2
#22 13.60 -- The HIP compiler identification is Clang 19.0.0
#22 13.62 -- Detecting HIP compiler ABI info
#22 14.31 -- Detecting HIP compiler ABI info - done
#22 14.32 -- Check for working HIP compiler: /opt/rocm-6.4.3/lib/llvm/bin/clang++ - skipped
#22 14.32 -- Detecting HIP compile features
#22 14.32 -- Detecting HIP compile features - done
#22 14.34 CMake Warning (dev) at /opt/rocm/lib/cmake/hip/hip-config-amd.cmake:70 (message):
#22 14.34   AMDGPU_TARGETS is deprecated.  Please use GPU_TARGETS instead.
#22 14.34 Call Stack (most recent call first):
#22 14.34   /opt/rocm/lib/cmake/hip/hip-config.cmake:149 (include)
#22 14.34   ggml/src/ggml-hip/CMakeLists.txt:39 (find_package)
#22 14.34 This warning is for project developers.  Use -Wno-dev to suppress it.
#22 14.34 
#22 14.34 -- Performing Test HIP_CLANG_SUPPORTS_PARALLEL_JOBS
#22 14.41 -- Performing Test HIP_CLANG_SUPPORTS_PARALLEL_JOBS - Success
#22 14.48 -- HIP and hipBLAS found
#22 14.48 -- Including HIP backend
#22 14.49 -- ggml version: 0.0.6445
#22 14.49 -- ggml commit:  00681dfc
#22 14.54 -- Using protobuf version 26.1.0 | Protobuf_INCLUDE_DIRS:  | CMAKE_CURRENT_BINARY_DIR: /LocalAI/backend/cpp/llama-cpp-fallback-build/llama.cpp/build/tools/grpc-server
#22 14.54 -- Configuring done
#22 14.81 -- Generating done
#22 14.83 -- Build files have been written to: /LocalAI/backend/cpp/llama-cpp-fallback-build/llama.cpp/build
#22 14.85 gmake[3]: Entering directory '/LocalAI/backend/cpp/llama-cpp-fallback-build/llama.cpp/build'
#22 14.87 gmake[4]: Entering directory '/LocalAI/backend/cpp/llama-cpp-fallback-build/llama.cpp/build'
#22 14.89 gmake[5]: Entering directory '/LocalAI/backend/cpp/llama-cpp-fallback-build/llama.cpp/build'
#22 14.89 gmake[6]: Entering directory '/LocalAI/backend/cpp/llama-cpp-fallback-build/llama.cpp/build'
#22 14.89 gmake[6]: Entering directory '/LocalAI/backend/cpp/llama-cpp-fallback-build/llama.cpp/build'
#22 14.89 gmake[6]: Entering directory '/LocalAI/backend/cpp/llama-cpp-fallback-build/llama.cpp/build'
#22 14.90 [  0%] Generating backend.pb.cc, backend.pb.h, backend.grpc.pb.cc, backend.grpc.pb.h
#22 14.90 gmake[6]: Leaving directory '/LocalAI/backend/cpp/llama-cpp-fallback-build/llama.cpp/build'
#22 14.90 gmake[6]: Leaving directory '/LocalAI/backend/cpp/llama-cpp-fallback-build/llama.cpp/build'
#22 14.90 gmake[6]: Entering directory '/LocalAI/backend/cpp/llama-cpp-fallback-build/llama.cpp/build'
#22 14.90 gmake[6]: Entering directory '/LocalAI/backend/cpp/llama-cpp-fallback-build/llama.cpp/build'
#22 14.91 [  0%] Building CXX object common/CMakeFiles/build_info.dir/build-info.cpp.o
#22 14.91 [  1%] Building C object ggml/src/CMakeFiles/ggml-base.dir/ggml.c.o
#22 14.91 [  1%] Building CXX object ggml/src/CMakeFiles/ggml-base.dir/ggml.cpp.o
#22 14.96 gmake[6]: Leaving directory '/LocalAI/backend/cpp/llama-cpp-fallback-build/llama.cpp/build'
#22 14.97 [  1%] Built target build_info
#22 14.98 [  1%] Building C object ggml/src/CMakeFiles/ggml-base.dir/ggml-alloc.c.o
#22 15.46 [  1%] Building CXX object ggml/src/CMakeFiles/ggml-base.dir/ggml-backend.cpp.o
#22 15.56 [  3%] Building CXX object ggml/src/CMakeFiles/ggml-base.dir/ggml-opt.cpp.o
#22 16.72 gmake[6]: Leaving directory '/LocalAI/backend/cpp/llama-cpp-fallback-build/llama.cpp/build'
#22 16.72 gmake[6]: Entering directory '/LocalAI/backend/cpp/llama-cpp-fallback-build/llama.cpp/build'
#22 16.73 [  5%] Building CXX object tools/grpc-server/CMakeFiles/hw_grpc_proto.dir/backend.grpc.pb.cc.o
#22 17.17 [  5%] Building CXX object tools/grpc-server/CMakeFiles/hw_grpc_proto.dir/backend.pb.cc.o
#22 17.34 [  5%] Building CXX object ggml/src/CMakeFiles/ggml-base.dir/ggml-threading.cpp.o
#22 17.48 [  5%] Building C object ggml/src/CMakeFiles/ggml-base.dir/ggml-quants.c.o
#22 17.96 [  7%] Building CXX object ggml/src/CMakeFiles/ggml-base.dir/gguf.cpp.o
#22 24.91 [  7%] Linking CXX static library libggml-base.a
#22 24.94 gmake[6]: Leaving directory '/LocalAI/backend/cpp/llama-cpp-fallback-build/llama.cpp/build'
#22 24.95 [  7%] Built target ggml-base
#22 24.96 gmake[6]: Entering directory '/LocalAI/backend/cpp/llama-cpp-fallback-build/llama.cpp/build'
#22 24.96 gmake[6]: Entering directory '/LocalAI/backend/cpp/llama-cpp-fallback-build/llama.cpp/build'
#22 24.97 gmake[6]: Leaving directory '/LocalAI/backend/cpp/llama-cpp-fallback-build/llama.cpp/build'
#22 24.98 gmake[6]: Entering directory '/LocalAI/backend/cpp/llama-cpp-fallback-build/llama.cpp/build'
#22 24.99 [  7%] Building C object ggml/src/CMakeFiles/ggml-cpu.dir/ggml-cpu/ggml-cpu.c.o
#22 25.04 Scanning dependencies of target ggml-hip
#22 25.04 gmake[6]: Leaving directory '/LocalAI/backend/cpp/llama-cpp-fallback-build/llama.cpp/build'
#22 25.05 gmake[6]: Entering directory '/LocalAI/backend/cpp/llama-cpp-fallback-build/llama.cpp/build'
#22 25.06 [  9%] Building HIP object ggml/src/ggml-hip/CMakeFiles/ggml-hip.dir/__/ggml-cuda/acc.cu.o
#22 25.80 /LocalAI/backend/cpp/llama-cpp-fallback-build/llama.cpp/build/tools/grpc-server/backend.pb.cc:1920:56: warning: no previous prototype for function 'descriptor_table_backend_2eproto_getter' [-Wmissing-prototypes]
#22 25.80  1920 | PROTOBUF_ATTRIBUTE_WEAK const ::_pbi::DescriptorTable* descriptor_table_backend_2eproto_getter() {
#22 25.80       |                                                        ^
#22 25.80 /LocalAI/backend/cpp/llama-cpp-fallback-build/llama.cpp/build/tools/grpc-server/backend.pb.cc:1920:31: note: declare 'static' if the function is not intended to be used outside of this translation unit
#22 25.80  1920 | PROTOBUF_ATTRIBUTE_WEAK const ::_pbi::DescriptorTable* descriptor_table_backend_2eproto_getter() {
#22 25.80       |                               ^
#22 25.80       |                               static 
#22 25.80 1 warning generated.
#22 25.82 [  9%] Building HIP object ggml/src/ggml-hip/CMakeFiles/ggml-hip.dir/__/ggml-cuda/add-id.cu.o
#22 26.42 [  9%] Linking CXX static library libhw_grpc_proto.a
#22 26.46 gmake[6]: Leaving directory '/LocalAI/backend/cpp/llama-cpp-fallback-build/llama.cpp/build'
#22 26.47 [  9%] Built target hw_grpc_proto
#22 26.48 [  9%] Building CXX object ggml/src/CMakeFiles/ggml-cpu.dir/ggml-cpu/ggml-cpu.cpp.o
#22 26.52 In file included from /LocalAI/backend/cpp/llama-cpp-fallback-build/llama.cpp/ggml/src/ggml-cpu/ggml-cpu.c:42:
#22 26.52 /opt/rocm-6.4.3/lib/llvm/bin/../include/omp.h:54:9: warning: ISO C restricts enumerator values to range of 'int' (2147483648 is too large) [-Wpedantic]
#22 26.52    54 |         omp_sched_monotonic = 0x80000000
#22 26.52       |         ^                     ~~~~~~~~~~
#22 26.52 /opt/rocm-6.4.3/lib/llvm/bin/../include/omp.h:434:7: warning: ISO C restricts enumerator values to range of 'int' (18446744073709551615 is too large) [-Wpedantic]
#22 26.52   434 |       KMP_ALLOCATOR_MAX_HANDLE = UINTPTR_MAX
#22 26.52       |       ^                          ~~~~~~~~~~~
#22 26.52 /opt/rocm-6.4.3/lib/llvm/bin/../include/omp.h:450:7: warning: ISO C restricts enumerator values to range of 'int' (18446744073709551615 is too large) [-Wpedantic]
#22 26.52   450 |       KMP_MEMSPACE_MAX_HANDLE = UINTPTR_MAX
#22 26.52       |       ^                         ~~~~~~~~~~~
#22 26.52 /opt/rocm-6.4.3/lib/llvm/bin/../include/omp.h:496:39: warning: ISO C restricts enumerator values to range of 'int' (18446744073709551615 is too large) [-Wpedantic]
#22 26.52   496 |     typedef enum omp_event_handle_t { KMP_EVENT_MAX_HANDLE = UINTPTR_MAX } omp_event_handle_t;
#22 26.52       |                                       ^                      ~~~~~~~~~~~
#22 26.52 4 warnings generated.
#22 26.54 [  9%] Building HIP object ggml/src/ggml-hip/CMakeFiles/ggml-hip.dir/__/ggml-cuda/arange.cu.o
#22 28.22 [ 11%] Building CXX object ggml/src/CMakeFiles/ggml-cpu.dir/ggml-cpu/repack.cpp.o
#22 31.88 [ 11%] Building CXX object ggml/src/CMakeFiles/ggml-cpu.dir/ggml-cpu/hbm.cpp.o
#22 31.93 [ 11%] Building C object ggml/src/CMakeFiles/ggml-cpu.dir/ggml-cpu/quants.c.o
#22 34.54 [ 13%] Building CXX object ggml/src/CMakeFiles/ggml-cpu.dir/ggml-cpu/traits.cpp.o
#22 35.96 [ 13%] Building CXX object ggml/src/CMakeFiles/ggml-cpu.dir/ggml-cpu/amx/amx.cpp.o
#22 37.75 [ 13%] Building CXX object ggml/src/CMakeFiles/ggml-cpu.dir/ggml-cpu/amx/mmq.cpp.o
#22 39.58 [ 13%] Building CXX object ggml/src/CMakeFiles/ggml-cpu.dir/ggml-cpu/binary-ops.cpp.o
#22 43.34 [ 15%] Building CXX object ggml/src/CMakeFiles/ggml-cpu.dir/ggml-cpu/unary-ops.cpp.o
#22 47.41 [ 15%] Building CXX object ggml/src/CMakeFiles/ggml-cpu.dir/ggml-cpu/vec.cpp.o
#22 48.97 [ 15%] Building CXX object ggml/src/CMakeFiles/ggml-cpu.dir/ggml-cpu/ops.cpp.o
#22 59.35 [ 15%] Building CXX object ggml/src/CMakeFiles/ggml-cpu.dir/ggml-cpu/llamafile/sgemm.cpp.o
#22 60.78 [ 17%] Building C object ggml/src/CMakeFiles/ggml-cpu.dir/ggml-cpu/arch/x86/quants.c.o
#22 61.81 [ 17%] Building CXX object ggml/src/CMakeFiles/ggml-cpu.dir/ggml-cpu/arch/x86/repack.cpp.o
#22 62.49 [ 17%] Building HIP object ggml/src/ggml-hip/CMakeFiles/ggml-hip.dir/__/ggml-cuda/argmax.cu.o
#22 63.25 [ 17%] Linking CXX static library libggml-cpu.a
#22 63.28 gmake[6]: Leaving directory '/LocalAI/backend/cpp/llama-cpp-fallback-build/llama.cpp/build'
#22 63.30 [ 17%] Built target ggml-cpu
#22 63.31 [ 19%] Building HIP object ggml/src/ggml-hip/CMakeFiles/ggml-hip.dir/__/ggml-cuda/argsort.cu.o
#22 63.54 [ 19%] Building HIP object ggml/src/ggml-hip/CMakeFiles/ggml-hip.dir/__/ggml-cuda/binbcast.cu.o
#22 63.67 [ 19%] Building HIP object ggml/src/ggml-hip/CMakeFiles/ggml-hip.dir/__/ggml-cuda/clamp.cu.o
#22 99.14 [ 21%] Building HIP object ggml/src/ggml-hip/CMakeFiles/ggml-hip.dir/__/ggml-cuda/concat.cu.o
#22 100.6 [ 21%] Building HIP object ggml/src/ggml-hip/CMakeFiles/ggml-hip.dir/__/ggml-cuda/conv-transpose-1d.cu.o
#22 100.9 [ 21%] Building HIP object ggml/src/ggml-hip/CMakeFiles/ggml-hip.dir/__/ggml-cuda/conv2d-dw.cu.o
#22 137.9 [ 21%] Building HIP object ggml/src/ggml-hip/CMakeFiles/ggml-hip.dir/__/ggml-cuda/conv2d-transpose.cu.o
#22 138.4 [ 23%] Building HIP object ggml/src/ggml-hip/CMakeFiles/ggml-hip.dir/__/ggml-cuda/conv2d.cu.o
#22 138.5 [ 23%] Building HIP object ggml/src/ggml-hip/CMakeFiles/ggml-hip.dir/__/ggml-cuda/convert.cu.o
#22 145.0 [ 23%] Building HIP object ggml/src/ggml-hip/CMakeFiles/ggml-hip.dir/__/ggml-cuda/count-equal.cu.o
#22 174.9 [ 25%] Building HIP object ggml/src/ggml-hip/CMakeFiles/ggml-hip.dir/__/ggml-cuda/cpy.cu.o
#22 178.2 [ 25%] Building HIP object ggml/src/ggml-hip/CMakeFiles/ggml-hip.dir/__/ggml-cuda/cross-entropy-loss.cu.o
#22 182.5 [ 25%] Building HIP object ggml/src/ggml-hip/CMakeFiles/ggml-hip.dir/__/ggml-cuda/diagmask.cu.o
#22 194.6 [ 25%] Building HIP object ggml/src/ggml-hip/CMakeFiles/ggml-hip.dir/__/ggml-cuda/fattn-tile.cu.o
#22 215.8 [ 27%] Building HIP object ggml/src/ggml-hip/CMakeFiles/ggml-hip.dir/__/ggml-cuda/fattn-wmma-f16.cu.o
#22 219.3 [ 27%] Building HIP object ggml/src/ggml-hip/CMakeFiles/ggml-hip.dir/__/ggml-cuda/fattn.cu.o
#22 231.1 [ 27%] Building HIP object ggml/src/ggml-hip/CMakeFiles/ggml-hip.dir/__/ggml-cuda/getrows.cu.o
#22 255.3 [ 29%] Building HIP object ggml/src/ggml-hip/CMakeFiles/ggml-hip.dir/__/ggml-cuda/ggml-cuda.cu.o
#22 275.5 [ 29%] Building HIP object ggml/src/ggml-hip/CMakeFiles/ggml-hip.dir/__/ggml-cuda/gla.cu.o
#22 286.3 [ 29%] Building HIP object ggml/src/ggml-hip/CMakeFiles/ggml-hip.dir/__/ggml-cuda/im2col.cu.o
#22 301.8 [ 29%] Building HIP object ggml/src/ggml-hip/CMakeFiles/ggml-hip.dir/__/ggml-cuda/mean.cu.o
#22 325.2 [ 31%] Building HIP object ggml/src/ggml-hip/CMakeFiles/ggml-hip.dir/__/ggml-cuda/mmf.cu.o
#22 328.6 [ 31%] Building HIP object ggml/src/ggml-hip/CMakeFiles/ggml-hip.dir/__/ggml-cuda/mmq.cu.o
#22 332.2 In file included from /LocalAI/backend/cpp/llama-cpp-fallback-build/llama.cpp/ggml/src/ggml-cuda/fattn-tile.cu:1:
#22 332.2 /LocalAI/backend/cpp/llama-cpp-fallback-build/llama.cpp/ggml/src/ggml-cuda/common.cuh:559:18: error: instruction not supported on this GPU
#22 332.2   559 |     asm volatile("v_dot2_f32_f16 %0, %1, %2, %0" : "+v"(acc) : "v"(v), "v"(u));
#22 332.2       |                  ^
#22 332.2 <inline asm>:1:2: note: instantiated into assembly here
#22 332.2     1 |         v_dot2_f32_f16 v68, v2, v69, v68
#22 332.2       |         ^
#22 332.2 In file included from /LocalAI/backend/cpp/llama-cpp-fallback-build/llama.cpp/ggml/src/ggml-cuda/fattn-tile.cu:1:
#22 332.2 /LocalAI/backend/cpp/llama-cpp-fallback-build/llama.cpp/ggml/src/ggml-cuda/common.cuh:559:18: error: instruction not supported on this GPU
#22 332.2   559 |     asm volatile("v_dot2_f32_f16 %0, %1, %2, %0" : "+v"(acc) : "v"(v), "v"(u));
#22 332.2       |                  ^
#22 332.2 <inline asm>:1:2: note: instantiated into assembly here
#22 332.2     1 |         v_dot2_f32_f16 v67, v2, v70, v67
#22 332.2       |         ^
#22 332.2 In file included from /LocalAI/backend/cpp/llama-cpp-fallback-build/llama.cpp/ggml/src/ggml-cuda/fattn-tile.cu:1:
#22 332.2 /LocalAI/backend/cpp/llama-cpp-fallback-build/llama.cpp/ggml/src/ggml-cuda/common.cuh:559:18: error: instruction not supported on this GPU
#22 332.2   559 |     asm volatile("v_dot2_f32_f16 %0, %1, %2, %0" : "+v"(acc) : "v"(v), "v"(u));
#22 332.2       |                  ^
#22 332.2 <inline asm>:1:2: note: instantiated into assembly here
#22 332.2     1 |         v_dot2_f32_f16 v66, v2, v71, v66
#22 332.2       |         ^
#22 332.2 In file included from /LocalAI/backend/cpp/llama-cpp-fallback-build/llama.cpp/ggml/src/ggml-cuda/fattn-tile.cu:1:
#22 332.2 /LocalAI/backend/cpp/llama-cpp-fallback-build/llama.cpp/ggml/src/ggml-cuda/common.cuh:559:18: error: instruction not supported on this GPU
#22 332.2   559 |     asm volatile("v_dot2_f32_f16 %0, %1, %2, %0" : "+v"(acc) : "v"(v), "v"(u));
#22 332.2       |                  ^
#22 332.2 <inline asm>:1:2: note: instantiated into assembly here
#22 332.2     1 |         v_dot2_f32_f16 v9, v2, v72, v9
#22 332.2       |         ^
#22 332.2 In file included from /LocalAI/backend/cpp/llama-cpp-fallback-build/llama.cpp/ggml/src/ggml-cuda/fattn-tile.cu:1:
#22 332.2 /LocalAI/backend/cpp/llama-cpp-fallback-build/llama.cpp/ggml/src/ggml-cuda/common.cuh:559:18: error: instruction not supported on this GPU
#22 332.2   559 |     asm volatile("v_dot2_f32_f16 %0, %1, %2, %0" : "+v"(acc) : "v"(v), "v"(u));
#22 332.2       |                  ^
#22 332.2 <inline asm>:1:2: note: instantiated into assembly here
#22 332.2     1 |         v_dot2_f32_f16 v8, v73, v69, v8
#22 332.2       |         ^
#22 332.2 In file included from /LocalAI/backend/cpp/llama-cpp-fallback-build/llama.cpp/ggml/src/ggml-cuda/fattn-tile.cu:1:
#22 332.2 /LocalAI/backend/cpp/llama-cpp-fallback-build/llama.cpp/ggml/src/ggml-cuda/common.cuh:559:18: error: instruction not supported on this GPU
#22 332.2   559 |     asm volatile("v_dot2_f32_f16 %0, %1, %2, %0" : "+v"(acc) : "v"(v), "v"(u));
#22 332.2       |                  ^
#22 332.2 <inline asm>:1:2: note: instantiated into assembly here
#22 332.2     1 |         v_dot2_f32_f16 v7, v73, v70, v7
#22 332.2       |         ^
#22 332.2 In file included from /LocalAI/backend/cpp/llama-cpp-fallback-build/llama.cpp/ggml/src/ggml-cuda/fattn-tile.cu:1:
#22 332.2 /LocalAI/backend/cpp/llama-cpp-fallback-build/llama.cpp/ggml/src/ggml-cuda/common.cuh:559:18: error: instruction not supported on this GPU
#22 332.2   559 |     asm volatile("v_dot2_f32_f16 %0, %1, %2, %0" : "+v"(acc) : "v"(v), "v"(u));
#22 332.2       |                  ^
#22 332.2 <inline asm>:1:2: note: instantiated into assembly here
#22 332.2     1 |         v_dot2_f32_f16 v6, v73, v71, v6
#22 332.2       |         ^
#22 332.2 In file included from /LocalAI/backend/cpp/llama-cpp-fallback-build/llama.cpp/ggml/src/ggml-cuda/fattn-tile.cu:1:
#22 332.2 /LocalAI/backend/cpp/llama-cpp-fallback-build/llama.cpp/ggml/src/ggml-cuda/common.cuh:559:18: error: instruction not supported on this GPU
#22 332.2   559 |     asm volatile("v_dot2_f32_f16 %0, %1, %2, %0" : "+v"(acc) : "v"(v), "v"(u));
#22 332.2       |                  ^
#22 332.2 <inline asm>:1:2: note: instantiated into assembly here
#22 332.2     1 |         v_dot2_f32_f16 v2, v73, v72, v2
#22 332.2       |         ^
#22 332.2 In file included from /LocalAI/backend/cpp/llama-cpp-fallback-build/llama.cpp/ggml/src/ggml-cuda/fattn-tile.cu:1:
#22 332.2 /LocalAI/backend/cpp/llama-cpp-fallback-build/llama.cpp/ggml/src/ggml-cuda/common.cuh:559:18: error: instruction not supported on this GPU
#22 332.2   559 |     asm volatile("v_dot2_f32_f16 %0, %1, %2, %0" : "+v"(acc) : "v"(v), "v"(u));
#22 332.2       |                  ^
#22 332.2 <inline asm>:1:2: note: instantiated into assembly here
#22 332.2     1 |         v_dot2_f32_f16 v68, v73, v69, v68
#22 332.2       |         ^
#22 332.2 In file included from /LocalAI/backend/cpp/llama-cpp-fallback-build/llama.cpp/ggml/src/ggml-cuda/fattn-tile.cu:1:
#22 332.2 /LocalAI/backend/cpp/llama-cpp-fallback-build/llama.cpp/ggml/src/ggml-cuda/common.cuh:559:18: error: instruction not supported on this GPU
#22 332.2   559 |     asm volatile("v_dot2_f32_f16 %0, %1, %2, %0" : "+v"(acc) : "v"(v), "v"(u));
#22 332.2       |                  ^
#22 332.2 <inline asm>:1:2: note: instantiated into assembly here
#22 332.2     1 |         v_dot2_f32_f16 v67, v73, v70, v67
#22 332.2       |         ^
#22 332.2 In file included from /LocalAI/backend/cpp/llama-cpp-fallback-build/llama.cpp/ggml/src/ggml-cuda/fattn-tile.cu:1:
#22 332.2 /LocalAI/backend/cpp/llama-cpp-fallback-build/llama.cpp/ggml/src/ggml-cuda/common.cuh:559:18: error: instruction not supported on this GPU
#22 332.2   559 |     asm volatile("v_dot2_f32_f16 %0, %1, %2, %0" : "+v"(acc) : "v"(v), "v"(u));
#22 332.2       |                  ^
#22 332.2 <inline asm>:1:2: note: instantiated into assembly here
#22 332.2     1 |         v_dot2_f32_f16 v66, v73, v71, v66
#22 332.2       |         ^
#22 332.2 In file included from /LocalAI/backend/cpp/llama-cpp-fallback-build/llama.cpp/ggml/src/ggml-cuda/fattn-tile.cu:1:
#22 332.2 /LocalAI/backend/cpp/llama-cpp-fallback-build/llama.cpp/ggml/src/ggml-cuda/common.cuh:559:18: error: instruction not supported on this GPU
#22 332.2   559 |     asm volatile("v_dot2_f32_f16 %0, %1, %2, %0" : "+v"(acc) : "v"(v), "v"(u));
#22 332.2       |                  ^
#22 332.2 <inline asm>:1:2: note: instantiated into assembly here
#22 332.2     1 |         v_dot2_f32_f16 v9, v73, v72, v9
#22 332.2       |         ^
#22 332.2 In file included from /LocalAI/backend/cpp/llama-cpp-fallback-build/llama.cpp/ggml/src/ggml-cuda/fattn-tile.cu:1:
#22 332.2 /LocalAI/backend/cpp/llama-cpp-fallback-build/llama.cpp/ggml/src/ggml-cuda/common.cuh:559:18: error: instruction not supported on this GPU
#22 332.2   559 |     asm volatile("v_dot2_f32_f16 %0, %1, %2, %0" : "+v"(acc) : "v"(v), "v"(u));
#22 332.2       |                  ^
#22 332.2 <inline asm>:1:2: note: instantiated into assembly here
#22 332.2     1 |         v_dot2_f32_f16 v8, v74, v69, v8
#22 332.2       |         ^
#22 332.2 In file included from /LocalAI/backend/cpp/llama-cpp-fallback-build/llama.cpp/ggml/src/ggml-cuda/fattn-tile.cu:1:
#22 332.2 /LocalAI/backend/cpp/llama-cpp-fallback-build/llama.cpp/ggml/src/ggml-cuda/common.cuh:559:18: error: instruction not supported on this GPU
#22 332.2   559 |     asm volatile("v_dot2_f32_f16 %0, %1, %2, %0" : "+v"(acc) : "v"(v), "v"(u));
#22 332.2       |                  ^
#22 332.2 <inline asm>:1:2: note: instantiated into assembly here
#22 332.2     1 |         v_dot2_f32_f16 v7, v74, v70, v7
#22 332.2       |         ^
#22 332.2 In file included from /LocalAI/backend/cpp/llama-cpp-fallback-build/llama.cpp/ggml/src/ggml-cuda/fattn-tile.cu:1:
#22 332.2 /LocalAI/backend/cpp/llama-cpp-fallback-build/llama.cpp/ggml/src/ggml-cuda/common.cuh:559:18: error: instruction not supported on this GPU
#22 332.2   559 |     asm volatile("v_dot2_f32_f16 %0, %1, %2, %0" : "+v"(acc) : "v"(v), "v"(u));
#22 332.2       |                  ^
#22 332.2 <inline asm>:1:2: note: instantiated into assembly here
#22 332.2     1 |         v_dot2_f32_f16 v6, v74, v71, v6
#22 332.2       |         ^
#22 332.2 In file included from /LocalAI/backend/cpp/llama-cpp-fallback-build/llama.cpp/ggml/src/ggml-cuda/fattn-tile.cu:1:
#22 332.2 /LocalAI/backend/cpp/llama-cpp-fallback-build/llama.cpp/ggml/src/ggml-cuda/common.cuh:559:18: error: instruction not supported on this GPU
#22 332.2   559 |     asm volatile("v_dot2_f32_f16 %0, %1, %2, %0" : "+v"(acc) : "v"(v), "v"(u));
#22 332.2       |                  ^
#22 332.2 <inline asm>:1:2: note: instantiated into assembly here
#22 332.2     1 |         v_dot2_f32_f16 v2, v74, v72, v2
#22 332.2       |         ^
#22 332.2 In file included from /LocalAI/backend/cpp/llama-cpp-fallback-build/llama.cpp/ggml/src/ggml-cuda/fattn-tile.cu:1:
#22 332.2 /LocalAI/backend/cpp/llama-cpp-fallback-build/llama.cpp/ggml/src/ggml-cuda/common.cuh:559:18: error: instruction not supported on this GPU
#22 332.2   559 |     asm volatile("v_dot2_f32_f16 %0, %1, %2, %0" : "+v"(acc) : "v"(v), "v"(u));
#22 332.2       |                  ^
#22 332.2 <inline asm>:1:2: note: instantiated into assembly here
#22 332.2     1 |         v_dot2_f32_f16 v68, v73, v69, v68
#22 332.2       |         ^
#22 332.2 In file included from /LocalAI/backend/cpp/llama-cpp-fallback-build/llama.cpp/ggml/src/ggml-cuda/fattn-tile.cu:1:
#22 332.2 /LocalAI/backend/cpp/llama-cpp-fallback-build/llama.cpp/ggml/src/ggml-cuda/common.cuh:559:18: error: instruction not supported on this GPU
#22 332.2   559 |     asm volatile("v_dot2_f32_f16 %0, %1, %2, %0" : "+v"(acc) : "v"(v), "v"(u));
#22 332.2       |                  ^
#22 332.2 <inline asm>:1:2: note: instantiated into assembly here
#22 332.2     1 |         v_dot2_f32_f16 v67, v73, v70, v67
#22 332.2       |         ^
#22 332.2 In file included from /LocalAI/backend/cpp/llama-cpp-fallback-build/llama.cpp/ggml/src/ggml-cuda/fattn-tile.cu:1:
#22 332.2 /LocalAI/backend/cpp/llama-cpp-fallback-build/llama.cpp/ggml/src/ggml-cuda/common.cuh:559:18: error: instruction not supported on this GPU
#22 332.2   559 |     asm volatile("v_dot2_f32_f16 %0, %1, %2, %0" : "+v"(acc) : "v"(v), "v"(u));
#22 332.2       |                  ^
#22 332.2 <inline asm>:1:2: note: instantiated into assembly here
#22 332.2     1 |         v_dot2_f32_f16 v66, v73, v71, v66
#22 332.2       |         ^
#22 332.2 fatal error: too many errors emitted, stopping now [-ferror-limit=]
#22 332.2 20 errors generated when compiling for gfx803.
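
For context: the failing line at ggml/src/ggml-cuda/common.cuh:559 emits v_dot2_f32_f16 through inline assembly, and the compiler output above shows that gfx803 does not implement that instruction, so every instantiation compiled for that target fails. A minimal sketch of the kind of guard that would let the file build for gfx803 is shown below. This is purely illustrative and not the actual llama.cpp code or fix; the GGML_HIP_HAS_V_DOT2 feature macro is invented here as a stand-in for whatever guard the project would actually use.

// Illustrative sketch only, assuming HIP device code.
// GGML_HIP_HAS_V_DOT2 is a made-up macro standing in for a real capability check.
#include <hip/hip_runtime.h>
#include <hip/hip_fp16.h>

static __device__ __forceinline__ float dot2_f32_f16(const __half2 v, const __half2 u, float acc) {
#ifdef GGML_HIP_HAS_V_DOT2
    // Fast path: fused 2-way f16 dot product with f32 accumulate, i.e. the
    // instruction the failing build tries to emit for gfx803.
    asm volatile("v_dot2_f32_f16 %0, %1, %2, %0" : "+v"(acc) : "v"(v), "v"(u));
    return acc;
#else
    // Portable fallback for targets without v_dot2_f32_f16 (e.g. gfx803):
    // widen both f16 pairs to float and accumulate explicitly.
    const float2 vf = __half22float2(v);
    const float2 uf = __half22float2(u);
    return acc + vf.x * uf.x + vf.y * uf.y;
#endif
}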
