ddebb5ddf6
Squashes the entire TurboQuant KV-cache feature branch from https://github.com/TheTom/llama-cpp-turboquant (tip 5aeb2fdbe) onto our master.

Includes: TurboQuant KV-cache types (turbo2_0, turbo3_0, turbo4_0, tq3_1s, tq4_1s), GGML_OP_TURBO_WHT op, CUDA + Metal kernels (including TQ-rotated mul_mm path), CPU reference paths, HIP template instances, perplexity tooling, and 18 post-upstream-sync fixes (CVE-2026-21869 server clamp, HIP FA pool retention, n_head_v reshape, sparse-V CUDA gating, etc.).

Conflict-resolution notes (review carefully before depending on these paths):
- common/arg.cpp, common/speculative.cpp: master's refactored speculative API kept (params.speculative.types / ngram_mod struct, per-sinfo n_low/i_last).
- ggml-cuda/fattn.cu: head-size exclusion lists unioned (now exclude both 192 and 640 alongside other sizes).
- ggml-cuda/ggml-cuda.cu: both master's ADD/SUB/MUL/DIV F16 widening AND TurboQuant's GGML_OP_TURBO_WHT support cases kept.
- ggml-metal-device.h/.cpp: master's new get_pipeline_mul_mv_ext signature (const ggml_tensor * op) kept; TurboQuant's get_pipeline_turbo_wht added.
- ggml-metal-ops.cpp: TurboQuant's TQ-rotated mul_mm path preserved; non-TQ else-branch adapted to master's pipeline.nr0/nr1/nsg dispatch API.
- ggml-vulkan.cpp: master's spec-constant-driven flash_attn pipeline iteration taken (over TurboQuant's CREATE_FA-per-type macro approach). TURBO3_0 added to the fa_kv_ok lambda for type validation.
- ggml-vulkan/flash_attn_base.glsl, vulkan-shaders-gen.cpp: master's new spec-constant FA shader generation kept; TurboQuant's DATA_A_TURBO3_0 macro path NOT carried over.

*** Vulkan TURBO3_0 flash-attention paths need re-implementation against the new spec-constant API. *** Vulkan TURBO3_0 inference will likely fail until that work is redone.

Squash base: 7fc1c4ef78 (TheTom's last upstream merge point).
87 lines
22 KiB
JSON
{"status":"success","tg128":68.11,"baseline_tg128":67.94,"delta_pct":"0%","build_time_s":4,"bench_time_s":7,"gpu_temp_c":"53","experiment":1,"timestamp":"2026-04-05T16:12:51Z","kept":true}
{"status":"success","tg128":53.41,"baseline_tg128":68.11,"delta_pct":"-20.0%","build_time_s":8,"bench_time_s":8,"gpu_temp_c":"53","experiment":2,"timestamp":"2026-04-05T16:24:50Z","kept":false}
{"status":"safety_revert","error":"modified non-target files","experiment":3,"timestamp":"2026-04-05T16:31:37Z","kept":false}
{"status":"error","error":"No baseline.json found for track: track-weight","experiment":4,"timestamp":"2026-04-05T16:42:20Z","kept":false}
{"status":"error","error":"No baseline.json found for track: track-weight","experiment":5,"timestamp":"2026-04-05T16:43:17Z","kept":false}
{"status":"error","error":"No baseline.json found for track: track-weight","experiment":6,"timestamp":"2026-04-05T16:44:37Z","kept":false}
{"status":"error","error":"No baseline.json found for track: track-weight","experiment":7,"timestamp":"2026-04-05T16:45:37Z","kept":false}
{"status":"error","error":"No baseline.json found for track: track-weight","experiment":8,"timestamp":"2026-04-05T16:46:50Z","kept":false}
{"status":"error","error":"No baseline.json found for track: track-weight","experiment":9,"timestamp":"2026-04-05T16:47:42Z","kept":false}
{"status":"error","error":"No baseline.json found for track: track-weight","experiment":10,"timestamp":"2026-04-05T16:49:54Z","kept":false}
{"status":"error","error":"No baseline.json found for track: track-weight","experiment":11,"timestamp":"2026-04-05T16:50:58Z","kept":false}
{"status":"error","error":"No baseline.json found for track: track-weight","experiment":12,"timestamp":"2026-04-05T16:51:51Z","kept":false}
{"status":"success","tg128":129.00,"baseline_tg128":69.2,"delta_pct":"80.0%","build_time_s":288,"bench_time_s":4,"gpu_temp_c":"59","experiment":13,"timestamp":"2026-04-05T16:59:41Z","kept":true}
{"status":"success","tg128":150.85,"baseline_tg128":129.00,"delta_pct":"10.0%","build_time_s":8,"bench_time_s":3,"gpu_temp_c":"55","experiment":14,"timestamp":"2026-04-05T17:07:38Z","kept":true}
{"status":"success","tg128":151.41,"baseline_tg128":150.85,"delta_pct":"0%","build_time_s":8,"bench_time_s":3,"gpu_temp_c":"50","experiment":15,"timestamp":"2026-04-05T17:09:05Z","kept":true}
{"status":"success","tg128":151.79,"baseline_tg128":151.41,"delta_pct":"0%","build_time_s":8,"bench_time_s":3,"gpu_temp_c":"52","experiment":16,"timestamp":"2026-04-05T17:11:54Z","kept":true}
{"status":"success","tg128":151.02,"baseline_tg128":151.79,"delta_pct":"0%","build_time_s":8,"bench_time_s":3,"gpu_temp_c":"54","experiment":17,"timestamp":"2026-04-05T17:19:57Z","kept":false}
{"status":"success","tg128":151.43,"baseline_tg128":151.79,"delta_pct":"0%","build_time_s":5,"bench_time_s":3,"gpu_temp_c":"51","experiment":18,"timestamp":"2026-04-05T17:26:15Z","kept":false}
{"status":"success","tg128":219.12,"baseline_tg128":151.79,"delta_pct":"40.0%","build_time_s":4,"bench_time_s":3,"gpu_temp_c":"53","experiment":19,"timestamp":"2026-04-05T17:28:31Z","kept":true}
{"status":"success","tg128":220.78,"baseline_tg128":219.12,"delta_pct":"0%","build_time_s":9,"bench_time_s":3,"gpu_temp_c":"49","ppl":7.5425,"experiment":20,"timestamp":"2026-04-05T17:30:22Z","kept":true}
{"status":"success","tg128":220.46,"baseline_tg128":220.78,"delta_pct":"0%","build_time_s":8,"bench_time_s":2,"gpu_temp_c":"49","experiment":21,"timestamp":"2026-04-05T17:33:02Z","kept":false}
{"status":"success","tg128":212.39,"baseline_tg128":220.78,"delta_pct":"0%","build_time_s":4,"bench_time_s":3,"gpu_temp_c":"52","experiment":22,"timestamp":"2026-04-05T17:35:31Z","kept":false}
{"status":"success","tg128":223.52,"baseline_tg128":220.78,"delta_pct":"0%","build_time_s":5,"bench_time_s":3,"gpu_temp_c":"55","experiment":23,"timestamp":"2026-04-05T17:46:25Z","kept":true}
{"status":"success","tg128":223.51,"baseline_tg128":223.52,"delta_pct":"0%","build_time_s":8,"bench_time_s":3,"gpu_temp_c":"54","experiment":24,"timestamp":"2026-04-05T18:07:17Z","kept":false}
{"status":"success","tg128":223.14,"baseline_tg128":223.52,"delta_pct":"0%","build_time_s":5,"bench_time_s":3,"gpu_temp_c":"53","experiment":25,"timestamp":"2026-04-05T18:20:26Z","kept":false}
{"status":"build_failed","error":"/mnt/ai/projects/llama-cpp-turboquant/ggml/src/ggml-cuda/mmvq-tq.cu(191): error: more than one instance of overloaded function \"__dp4a\" matches the argument list:\n/mnt/ai/projects/llama-cpp-turboquant/ggml/src/ggml-cuda/mmvq-tq.cu(192): error: more than one instance of overloaded function \"__dp4a\" matches the argument list:\n/mnt/ai/projects/llama-cpp-turboquant/ggml/src/ggml-cuda/mmvq-tq.cu(199): error: more than one instance of overloaded function \"__dp4a\" matches the argument list:\n/mnt/ai/projects/llama-cpp-turboquant/ggml/src/ggml-cuda/mmvq-tq.cu(200): error: more than one instance of overloaded function \"__dp4a\" matches the argument list:","build_time_s":1,"experiment":26,"timestamp":"2026-04-05T18:35:52Z","kept":false}
{"status":"success","tg128":223.66,"baseline_tg128":223.52,"delta_pct":"0%","build_time_s":5,"bench_time_s":3,"gpu_temp_c":"56","experiment":27,"timestamp":"2026-04-05T18:37:09Z","kept":true}
{"status":"success","tg128":209.52,"baseline_tg128":223.66,"delta_pct":"0%","build_time_s":8,"bench_time_s":3,"gpu_temp_c":"53","experiment":28,"timestamp":"2026-04-05T18:46:10Z","kept":false}
{"status":"success","tg128":223.32,"baseline_tg128":223.66,"delta_pct":"0%","build_time_s":4,"bench_time_s":3,"gpu_temp_c":"59","experiment":29,"timestamp":"2026-04-05T19:05:19Z","kept":false}
{"status":"success","tg128":223.73,"baseline_tg128":223.66,"delta_pct":"0%","build_time_s":5,"bench_time_s":2,"gpu_temp_c":"56","ppl":7.5425,"experiment":30,"timestamp":"2026-04-05T19:19:40Z","kept":true}
{"status":"success","tg128":216.46,"baseline_tg128":223.73,"delta_pct":"0%","build_time_s":8,"bench_time_s":2,"gpu_temp_c":"53","experiment":31,"timestamp":"2026-04-05T19:27:42Z","kept":false}
{"status":"no_change","experiment":32,"timestamp":"2026-04-05T19:40:52Z","kept":false}
{"status":"success","tg128":223.91,"baseline_tg128":223.73,"delta_pct":"0%","build_time_s":5,"bench_time_s":3,"gpu_temp_c":"56","experiment":33,"timestamp":"2026-04-05T19:47:35Z","kept":true}
{"status":"success","tg128":223.47,"baseline_tg128":223.91,"delta_pct":"0%","build_time_s":8,"bench_time_s":2,"gpu_temp_c":"53","experiment":34,"timestamp":"2026-04-05T19:57:29Z","kept":false}
{"status":"success","tg128":223.86,"baseline_tg128":223.91,"delta_pct":"0%","build_time_s":4,"bench_time_s":3,"gpu_temp_c":"56","experiment":35,"timestamp":"2026-04-05T20:13:18Z","kept":false}
{"status":"success","tg128":224.45,"baseline_tg128":223.91,"delta_pct":"0%","build_time_s":5,"bench_time_s":2,"gpu_temp_c":"56","experiment":36,"timestamp":"2026-04-05T20:41:59Z","kept":true}
{"status":"runtime_crash","error":"/mnt/ai/projects/llama-cpp-turboquant/ggml/src/ggml-cuda/ggml-cuda.cu:100: CUDA error\n/mnt/ai/projects/llama-cpp-turboquant/build-cuda/bin/libggml-base.so.0(ggml_abort+0x15b)[0x760e8084fb1b]\n/mnt/ai/projects/llama-cpp-turboquant/build-cuda/bin/libggml-cuda.so.0(_Z15ggml_cuda_errorPKcS0_S0_iS0_+0xb7)[0x760e795ca737]","build_time_s":8,"bench_time_s":1,"experiment":37,"timestamp":"2026-04-05T20:43:05Z","kept":false}
{"status":"success","tg128":221.19,"baseline_tg128":224.45,"delta_pct":"0%","build_time_s":5,"bench_time_s":2,"gpu_temp_c":"49","experiment":38,"timestamp":"2026-04-05T20:44:35Z","kept":false}
{"status":"runtime_crash","error":"/mnt/ai/projects/llama-cpp-turboquant/ggml/src/ggml-cuda/ggml-cuda.cu:100: CUDA error\n/mnt/ai/projects/llama-cpp-turboquant/build-cuda/bin/libggml-base.so.0(ggml_abort+0x15b)[0x74069c1cdb1b]\n/mnt/ai/projects/llama-cpp-turboquant/build-cuda/bin/libggml-cuda.so.0(_Z15ggml_cuda_errorPKcS0_S0_iS0_+0xb7)[0x740694fca737]","build_time_s":5,"bench_time_s":1,"experiment":39,"timestamp":"2026-04-05T20:45:49Z","kept":false}
{"status":"success","tg128":208.20,"baseline_tg128":224.45,"delta_pct":"0%","build_time_s":4,"bench_time_s":3,"gpu_temp_c":"55","ppl":7.5425,"experiment":40,"timestamp":"2026-04-05T20:52:14Z","kept":false}
{"status":"success","tg128":224.23,"baseline_tg128":224.45,"delta_pct":"0%","build_time_s":4,"bench_time_s":3,"gpu_temp_c":"50","experiment":41,"timestamp":"2026-04-05T20:55:47Z","kept":false}
{"status":"runtime_crash","error":"/mnt/ai/projects/llama-cpp-turboquant/ggml/src/ggml-cuda/ggml-cuda.cu:100: CUDA error\n/mnt/ai/projects/llama-cpp-turboquant/build-cuda/bin/libggml-base.so.0(ggml_abort+0x15b)[0x70aa29337b1b]\n/mnt/ai/projects/llama-cpp-turboquant/build-cuda/bin/libggml-cuda.so.0(_Z15ggml_cuda_errorPKcS0_S0_iS0_+0xb7)[0x70aa225ca737]","build_time_s":5,"bench_time_s":1,"experiment":42,"timestamp":"2026-04-05T20:57:48Z","kept":false}
{"status":"success","tg128":219.29,"baseline_tg128":224.45,"delta_pct":"0%","build_time_s":5,"bench_time_s":3,"gpu_temp_c":"55","experiment":43,"timestamp":"2026-04-05T21:04:11Z","kept":false}
{"status":"runtime_crash","error":"/mnt/ai/projects/llama-cpp-turboquant/ggml/src/ggml-cuda/ggml-cuda.cu:100: CUDA error\n/mnt/ai/projects/llama-cpp-turboquant/build-cuda/bin/libggml-base.so.0(ggml_abort+0x15b)[0x7c3f28670b1b]\n/mnt/ai/projects/llama-cpp-turboquant/build-cuda/bin/libggml-cuda.so.0(_Z15ggml_cuda_errorPKcS0_S0_iS0_+0xb7)[0x7c3f213ca737]","build_time_s":4,"bench_time_s":1,"experiment":44,"timestamp":"2026-04-05T21:05:11Z","kept":false}
{"status":"runtime_crash","error":"/mnt/ai/projects/llama-cpp-turboquant/ggml/src/ggml-cuda/ggml-cuda.cu:100: CUDA error\n/mnt/ai/projects/llama-cpp-turboquant/build-cuda/bin/libggml-base.so.0(ggml_abort+0x15b)[0x73a442c62b1b]\n/mnt/ai/projects/llama-cpp-turboquant/build-cuda/bin/libggml-cuda.so.0(_Z15ggml_cuda_errorPKcS0_S0_iS0_+0xb7)[0x73a43b9ca737]","build_time_s":5,"bench_time_s":1,"experiment":45,"timestamp":"2026-04-05T21:06:35Z","kept":false}
{"status":"success","tg128":218.41,"baseline_tg128":224.45,"delta_pct":"0%","build_time_s":5,"bench_time_s":2,"gpu_temp_c":"53","experiment":46,"timestamp":"2026-04-05T21:08:49Z","kept":false}
{"status":"success","tg128":223.73,"baseline_tg128":224.45,"delta_pct":"0%","build_time_s":5,"bench_time_s":3,"gpu_temp_c":"56","experiment":47,"timestamp":"2026-04-05T21:17:51Z","kept":false}
{"status":"runtime_crash","error":"/mnt/ai/projects/llama-cpp-turboquant/ggml/src/ggml-cuda/ggml-cuda.cu:100: CUDA error\n/mnt/ai/projects/llama-cpp-turboquant/build-cuda/bin/libggml-base.so.0(ggml_abort+0x15b)[0x7bad3bc68b1b]\n/mnt/ai/projects/llama-cpp-turboquant/build-cuda/bin/libggml-cuda.so.0(_Z15ggml_cuda_errorPKcS0_S0_iS0_+0xb7)[0x7bad349ca737]","build_time_s":5,"bench_time_s":1,"experiment":48,"timestamp":"2026-04-05T21:18:38Z","kept":false}
{"status":"runtime_crash","error":"/mnt/ai/projects/llama-cpp-turboquant/ggml/src/ggml-cuda/ggml-cuda.cu:100: CUDA error\n/mnt/ai/projects/llama-cpp-turboquant/build-cuda/bin/libggml-base.so.0(ggml_abort+0x15b)[0x7f0fe0dd2b1b]\n/mnt/ai/projects/llama-cpp-turboquant/build-cuda/bin/libggml-cuda.so.0(_Z15ggml_cuda_errorPKcS0_S0_iS0_+0xb7)[0x7f0fd9bca737]","build_time_s":6,"bench_time_s":0,"experiment":49,"timestamp":"2026-04-05T21:22:46Z","kept":false}
{"status":"runtime_crash","error":"/mnt/ai/projects/llama-cpp-turboquant/ggml/src/ggml-cuda/ggml-cuda.cu:100: CUDA error\n/mnt/ai/projects/llama-cpp-turboquant/build-cuda/bin/libggml-base.so.0(ggml_abort+0x15b)[0x7eb6a0537b1b]\n/mnt/ai/projects/llama-cpp-turboquant/build-cuda/bin/libggml-cuda.so.0(_Z15ggml_cuda_errorPKcS0_S0_iS0_+0xb7)[0x7eb6997ca737]","build_time_s":5,"bench_time_s":1,"experiment":50,"timestamp":"2026-04-05T21:24:04Z","kept":false}
{"status":"success","tg128":223.85,"baseline_tg128":224.45,"delta_pct":"0%","build_time_s":5,"bench_time_s":3,"gpu_temp_c":"55","experiment":51,"timestamp":"2026-04-05T21:29:33Z","kept":false}
{"status":"runtime_crash","error":"/mnt/ai/projects/llama-cpp-turboquant/ggml/src/ggml-cuda/ggml-cuda.cu:100: CUDA error\n/mnt/ai/projects/llama-cpp-turboquant/build-cuda/bin/libggml-base.so.0(ggml_abort+0x15b)[0x7492739b3b1b]\n/mnt/ai/projects/llama-cpp-turboquant/build-cuda/bin/libggml-cuda.so.0(_Z15ggml_cuda_errorPKcS0_S0_iS0_+0xb7)[0x74926c7ca737]","build_time_s":5,"bench_time_s":1,"experiment":52,"timestamp":"2026-04-05T21:30:24Z","kept":false}
{"status":"runtime_crash","error":"/mnt/ai/projects/llama-cpp-turboquant/ggml/src/ggml-cuda/ggml-cuda.cu:100: CUDA error\n/mnt/ai/projects/llama-cpp-turboquant/build-cuda/bin/libggml-base.so.0(ggml_abort+0x15b)[0x759986646b1b]\n/mnt/ai/projects/llama-cpp-turboquant/build-cuda/bin/libggml-cuda.so.0(_Z15ggml_cuda_errorPKcS0_S0_iS0_+0xb7)[0x75997f3ca737]","build_time_s":5,"bench_time_s":0,"experiment":53,"timestamp":"2026-04-05T21:31:34Z","kept":false}
{"status":"success","tg128":223.94,"baseline_tg128":224.45,"delta_pct":"0%","build_time_s":5,"bench_time_s":3,"gpu_temp_c":"50","experiment":54,"timestamp":"2026-04-05T21:33:26Z","kept":false}
{"status":"runtime_crash","error":"/mnt/ai/projects/llama-cpp-turboquant/ggml/src/ggml-cuda/ggml-cuda.cu:100: CUDA error\n/mnt/ai/projects/llama-cpp-turboquant/build-cuda/bin/libggml-base.so.0(ggml_abort+0x15b)[0x73c90cb37b1b]\n/mnt/ai/projects/llama-cpp-turboquant/build-cuda/bin/libggml-cuda.so.0(_Z15ggml_cuda_errorPKcS0_S0_iS0_+0xb7)[0x73c905dca737]","build_time_s":4,"bench_time_s":1,"experiment":55,"timestamp":"2026-04-05T21:34:32Z","kept":false}
{"status":"runtime_crash","error":"/mnt/ai/projects/llama-cpp-turboquant/ggml/src/ggml-cuda/ggml-cuda.cu:100: CUDA error\n/mnt/ai/projects/llama-cpp-turboquant/build-cuda/bin/libggml-base.so.0(ggml_abort+0x15b)[0x7f926c661b1b]\n/mnt/ai/projects/llama-cpp-turboquant/build-cuda/bin/libggml-cuda.so.0(_Z15ggml_cuda_errorPKcS0_S0_iS0_+0xb7)[0x7f92653ca737]","build_time_s":5,"bench_time_s":1,"experiment":56,"timestamp":"2026-04-05T21:41:29Z","kept":false}
{"status":"success","tg128":225.17,"baseline_tg128":224.45,"delta_pct":"0%","build_time_s":6,"bench_time_s":2,"gpu_temp_c":"57","experiment":57,"timestamp":"2026-04-05T21:50:14Z","kept":true}
{"status":"success","tg128":222.27,"baseline_tg128":225.17,"delta_pct":"0%","build_time_s":8,"bench_time_s":3,"gpu_temp_c":"49","experiment":58,"timestamp":"2026-04-05T21:54:06Z","kept":false}
{"status":"runtime_crash","error":"/mnt/ai/projects/llama-cpp-turboquant/ggml/src/ggml-cuda/ggml-cuda.cu:100: CUDA error\n/mnt/ai/projects/llama-cpp-turboquant/build-cuda/bin/libggml-base.so.0(ggml_abort+0x15b)[0x7bb3e0137b1b]\n/mnt/ai/projects/llama-cpp-turboquant/build-cuda/bin/libggml-cuda.so.0(_Z15ggml_cuda_errorPKcS0_S0_iS0_+0xb7)[0x7bb3d93ca737]","build_time_s":5,"bench_time_s":1,"experiment":59,"timestamp":"2026-04-05T21:55:23Z","kept":false}
{"status":"runtime_crash","error":"/mnt/ai/projects/llama-cpp-turboquant/ggml/src/ggml-cuda/ggml-cuda.cu:100: CUDA error\n/mnt/ai/projects/llama-cpp-turboquant/build-cuda/bin/libggml-base.so.0(ggml_abort+0x15b)[0x7b9533f37b1b]\n/mnt/ai/projects/llama-cpp-turboquant/build-cuda/bin/libggml-cuda.so.0(_Z15ggml_cuda_errorPKcS0_S0_iS0_+0xb7)[0x7b952d1ca737]","build_time_s":5,"bench_time_s":1,"experiment":60,"timestamp":"2026-04-05T21:56:34Z","kept":false}
{"status":"runtime_crash","error":"/mnt/ai/projects/llama-cpp-turboquant/ggml/src/ggml-cuda/ggml-cuda.cu:100: CUDA error\n/mnt/ai/projects/llama-cpp-turboquant/build-cuda/bin/libggml-base.so.0(ggml_abort+0x15b)[0x7a274f7b4b1b]\n/mnt/ai/projects/llama-cpp-turboquant/build-cuda/bin/libggml-cuda.so.0(_Z15ggml_cuda_errorPKcS0_S0_iS0_+0xb7)[0x7a27485ca737]","build_time_s":4,"bench_time_s":1,"experiment":61,"timestamp":"2026-04-05T21:57:42Z","kept":false}
{"status":"success","tg128":207.76,"baseline_tg128":225.17,"delta_pct":"0%","build_time_s":4,"bench_time_s":3,"gpu_temp_c":"56","experiment":62,"timestamp":"2026-04-05T22:04:53Z","kept":false}
{"status":"runtime_crash","error":"/mnt/ai/projects/llama-cpp-turboquant/ggml/src/ggml-cuda/ggml-cuda.cu:100: CUDA error\n/mnt/ai/projects/llama-cpp-turboquant/build-cuda/bin/libggml-base.so.0(ggml_abort+0x15b)[0x7dee6b826b1b]\n/mnt/ai/projects/llama-cpp-turboquant/build-cuda/bin/libggml-cuda.so.0(_Z15ggml_cuda_errorPKcS0_S0_iS0_+0xb7)[0x7dee645ca737]","build_time_s":5,"bench_time_s":1,"experiment":63,"timestamp":"2026-04-05T22:05:49Z","kept":false}
{"status":"runtime_crash","error":"/mnt/ai/projects/llama-cpp-turboquant/ggml/src/ggml-cuda/ggml-cuda.cu:100: CUDA error\n/mnt/ai/projects/llama-cpp-turboquant/build-cuda/bin/libggml-base.so.0(ggml_abort+0x15b)[0x7bddc9b37b1b]\n/mnt/ai/projects/llama-cpp-turboquant/build-cuda/bin/libggml-cuda.so.0(_Z15ggml_cuda_errorPKcS0_S0_iS0_+0xb7)[0x7bddc2dca737]","build_time_s":5,"bench_time_s":1,"experiment":64,"timestamp":"2026-04-05T22:07:04Z","kept":false}
{"status":"runtime_crash","error":"/mnt/ai/projects/llama-cpp-turboquant/ggml/src/ggml-cuda/ggml-cuda.cu:100: CUDA error\n/mnt/ai/projects/llama-cpp-turboquant/build-cuda/bin/libggml-base.so.0(ggml_abort+0x15b)[0x71109ac95b1b]\n/mnt/ai/projects/llama-cpp-turboquant/build-cuda/bin/libggml-cuda.so.0(_Z15ggml_cuda_errorPKcS0_S0_iS0_+0xb7)[0x7110939ca737]","build_time_s":4,"bench_time_s":1,"experiment":65,"timestamp":"2026-04-05T22:11:06Z","kept":false}
{"status":"success","tg128":219.26,"baseline_tg128":225.17,"delta_pct":"0%","build_time_s":4,"bench_time_s":3,"gpu_temp_c":"56","experiment":66,"timestamp":"2026-04-05T22:19:51Z","kept":false}
{"status":"runtime_crash","error":"/mnt/ai/projects/llama-cpp-turboquant/ggml/src/ggml-cuda/ggml-cuda.cu:100: CUDA error\n/mnt/ai/projects/llama-cpp-turboquant/build-cuda/bin/libggml-base.so.0(ggml_abort+0x15b)[0x72b8b9337b1b]\n/mnt/ai/projects/llama-cpp-turboquant/build-cuda/bin/libggml-cuda.so.0(_Z15ggml_cuda_errorPKcS0_S0_iS0_+0xb7)[0x72b8b25ca737]","build_time_s":6,"bench_time_s":1,"experiment":67,"timestamp":"2026-04-05T22:20:59Z","kept":false}
{"status":"runtime_crash","error":"/mnt/ai/projects/llama-cpp-turboquant/ggml/src/ggml-cuda/ggml-cuda.cu:100: CUDA error\n/mnt/ai/projects/llama-cpp-turboquant/build-cuda/bin/libggml-base.so.0(ggml_abort+0x15b)[0x7d8176937b1b]\n/mnt/ai/projects/llama-cpp-turboquant/build-cuda/bin/libggml-cuda.so.0(_Z15ggml_cuda_errorPKcS0_S0_iS0_+0xb7)[0x7d816fbca737]","build_time_s":5,"bench_time_s":1,"experiment":68,"timestamp":"2026-04-05T22:25:11Z","kept":false}
{"status":"runtime_crash","error":"/mnt/ai/projects/llama-cpp-turboquant/ggml/src/ggml-cuda/ggml-cuda.cu:100: CUDA error\n/mnt/ai/projects/llama-cpp-turboquant/build-cuda/bin/libggml-base.so.0(ggml_abort+0x15b)[0x784b43537b1b]\n/mnt/ai/projects/llama-cpp-turboquant/build-cuda/bin/libggml-cuda.so.0(_Z15ggml_cuda_errorPKcS0_S0_iS0_+0xb7)[0x784b3c7ca737]","build_time_s":5,"bench_time_s":1,"experiment":69,"timestamp":"2026-04-05T22:27:18Z","kept":false}
{"status":"success","tg128":224.98,"baseline_tg128":225.17,"delta_pct":"0%","build_time_s":5,"bench_time_s":2,"gpu_temp_c":"58","ppl":7.5425,"experiment":70,"timestamp":"2026-04-05T22:33:37Z","kept":false}
{"status":"success","tg128":225.46,"baseline_tg128":225.17,"delta_pct":"0%","build_time_s":5,"bench_time_s":2,"gpu_temp_c":"51","experiment":71,"timestamp":"2026-04-05T22:39:32Z","kept":true}
{"status":"success","tg128":220.92,"baseline_tg128":225.46,"delta_pct":"0%","build_time_s":8,"bench_time_s":2,"gpu_temp_c":"55","experiment":72,"timestamp":"2026-04-05T22:51:55Z","kept":false}
{"status":"success","tg128":220.68,"baseline_tg128":225.46,"delta_pct":"0%","build_time_s":4,"bench_time_s":3,"gpu_temp_c":"52","experiment":73,"timestamp":"2026-04-05T22:57:46Z","kept":false}
{"status":"success","tg128":219.24,"baseline_tg128":225.46,"delta_pct":"0%","build_time_s":4,"bench_time_s":3,"gpu_temp_c":"55","experiment":74,"timestamp":"2026-04-05T23:02:38Z","kept":false}
{"status":"success","tg128":219.54,"baseline_tg128":225.46,"delta_pct":"0%","build_time_s":5,"bench_time_s":3,"gpu_temp_c":"53","experiment":75,"timestamp":"2026-04-05T23:07:50Z","kept":false}
{"status":"success","tg128":222.66,"baseline_tg128":225.46,"delta_pct":"0%","build_time_s":5,"bench_time_s":2,"gpu_temp_c":"56","experiment":76,"timestamp":"2026-04-05T23:17:39Z","kept":false}
{"status":"success","tg128":210.26,"baseline_tg128":225.46,"delta_pct":"0%","build_time_s":4,"bench_time_s":3,"gpu_temp_c":"48","experiment":77,"timestamp":"2026-04-05T23:20:13Z","kept":false}
{"status":"success","tg128":220.71,"baseline_tg128":225.46,"delta_pct":"0%","build_time_s":5,"bench_time_s":3,"gpu_temp_c":"53","experiment":78,"timestamp":"2026-04-05T23:22:33Z","kept":false}
{"status":"success","tg128":219.25,"baseline_tg128":225.46,"delta_pct":"0%","build_time_s":5,"bench_time_s":2,"gpu_temp_c":"58","experiment":79,"timestamp":"2026-04-05T23:31:36Z","kept":false}
{"status":"success","tg128":210.19,"baseline_tg128":225.46,"delta_pct":"0%","build_time_s":4,"bench_time_s":3,"gpu_temp_c":"53","ppl":7.5425,"experiment":80,"timestamp":"2026-04-05T23:38:24Z","kept":false}
{"status":"success","tg128":221.41,"baseline_tg128":225.46,"delta_pct":"0%","build_time_s":4,"bench_time_s":3,"gpu_temp_c":"51","experiment":81,"timestamp":"2026-04-05T23:39:51Z","kept":false}
{"status":"success","tg128":215.12,"baseline_tg128":225.46,"delta_pct":"0%","build_time_s":5,"bench_time_s":2,"gpu_temp_c":"51","experiment":82,"timestamp":"2026-04-05T23:45:46Z","kept":false}
{"status":"success","tg128":210.01,"baseline_tg128":225.46,"delta_pct":"0%","build_time_s":5,"bench_time_s":3,"gpu_temp_c":"55","experiment":83,"timestamp":"2026-04-05T23:52:38Z","kept":false}
{"status":"success","tg128":221.42,"baseline_tg128":225.46,"delta_pct":"0%","build_time_s":5,"bench_time_s":3,"gpu_temp_c":"51","experiment":84,"timestamp":"2026-04-05T23:56:44Z","kept":false}
{"status":"success","tg128":220.24,"baseline_tg128":225.46,"delta_pct":"0%","build_time_s":5,"bench_time_s":2,"gpu_temp_c":"54","experiment":85,"timestamp":"2026-04-06T00:01:16Z","kept":false}
{"status":"success","tg128":225.90,"baseline_tg128":225.46,"delta_pct":"0%","build_time_s":5,"bench_time_s":2,"gpu_temp_c":"56","experiment":86,"timestamp":"2026-04-06T00:20:56Z","kept":true}
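
The records above can be read back and summarized with a few lines of Python. This is only a sketch: the helper names (`parse_log`, `exact_delta_pct`, `summarize`) are hypothetical and are not part of the tooling that wrote this log, and the three sample lines are copied from experiments 2, 3, and 13. The logged `delta_pct` looks coarser than the exact relative change (experiment 2 logs "-20.0%", while 53.41 against a baseline of 68.11 is about -21.6%), so the sketch recomputes it from `tg128` and `baseline_tg128`. It also assumes that any line not starting with `{` is viewer residue and can be skipped.

```python
import json
from collections import Counter


def parse_log(text):
    """Parse experiment records from JSONL text, skipping non-JSON lines."""
    records = []
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("{"):
            continue  # assumption: blank lines and residue such as "|" carry no data
        records.append(json.loads(line))
    return records


def exact_delta_pct(rec):
    """Recompute the relative tg128 change (in percent) from the logged fields."""
    base = rec["baseline_tg128"]
    return (rec["tg128"] - base) / base * 100.0


def summarize(records):
    """Return per-status counts and the kept record with the highest tg128."""
    statuses = Counter(r["status"] for r in records)
    kept = [r for r in records if r.get("kept") and "tg128" in r]
    best = max(kept, key=lambda r: r["tg128"], default=None)
    return statuses, best


# Three records copied verbatim from the log, with one residue line in between.
sample = "\n".join([
    '{"status":"success","tg128":53.41,"baseline_tg128":68.11,"delta_pct":"-20.0%","build_time_s":8,"bench_time_s":8,"gpu_temp_c":"53","experiment":2,"timestamp":"2026-04-05T16:24:50Z","kept":false}',
    '|',
    '{"status":"safety_revert","error":"modified non-target files","experiment":3,"timestamp":"2026-04-05T16:31:37Z","kept":false}',
    '{"status":"success","tg128":129.00,"baseline_tg128":69.2,"delta_pct":"80.0%","build_time_s":288,"bench_time_s":4,"gpu_temp_c":"59","experiment":13,"timestamp":"2026-04-05T16:59:41Z","kept":true}',
])

records = parse_log(sample)
statuses, best = summarize(records)

for rec in records:
    if "tg128" in rec:
        # Show the logged value next to the recomputed one.
        print(rec["experiment"], rec["delta_pct"], f"{exact_delta_pct(rec):+.1f}%")

print(dict(statuses))
print("best kept experiment:", best["experiment"] if best else None)
```

In every success record in this log, `kept` is true exactly when `tg128` exceeds `baseline_tg128`, and the next baseline equals the last kept `tg128`. That rule is inferred from the data rather than documented here, so treat it as an observation about this run and not a guarantee.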