b42c7fa5b8
* port #22358 PR to examples/speculative/speculative.cpp
* use vocab_[tgt,dft] instead of ctx_[tgt,dft] when logging on draft model / target model vocabulary mismatch

Co-authored-by: Petros Sideris <petros.sideris@nokia.com>
llama.cpp/examples/speculative
Demonstration of speculative decoding and tree-based speculative decoding techniques.
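To make the idea concrete, here is a minimal sketch of one round of greedy (non-tree) speculative decoding. It is not the code in speculative.cpp and does not use the llama.cpp API: `Token`, `Model`, `speculative_step`, `toy_target` and `toy_draft` are hypothetical names introduced only for illustration, and each "model" is assumed to be a plain function that returns its greedy next token for a given context. The small draft model proposes a few tokens, the target model checks them in order, and the longest agreeing prefix is kept along with one token chosen by the target.

```cpp
#include <cstddef>
#include <functional>
#include <vector>

using Token = int;
// Hypothetical stand-in for a language model: maps a context to its greedy next token.
using Model = std::function<Token(const std::vector<Token>&)>;

// Runs one speculative round and returns how many drafted tokens were accepted.
// On return, ctx has grown by (accepted + 1) tokens, all of them the target's own choices,
// so the output matches plain greedy decoding with the target model alone.
std::size_t speculative_step(std::vector<Token>& ctx, const Model& target,
                             const Model& draft, std::size_t n_draft) {
    // 1. The draft model proposes n_draft tokens autoregressively.
    std::vector<Token> proposal;
    std::vector<Token> scratch = ctx;
    for (std::size_t i = 0; i < n_draft; ++i) {
        const Token t = draft(scratch);
        proposal.push_back(t);
        scratch.push_back(t);
    }

    // 2. The target model verifies the proposal position by position.
    //    (A real implementation would score all positions in one batched pass;
    //    this sketch calls the target once per position for clarity.)
    std::size_t accepted = 0;
    for (const Token t : proposal) {
        const Token expected = target(ctx);
        ctx.push_back(expected);      // the target's token is always the one kept
        if (expected != t) {
            return accepted;          // first mismatch: discard the rest of the draft
        }
        ++accepted;
    }

    // 3. Every drafted token matched, so the target contributes one extra token.
    ctx.push_back(target(ctx));
    return accepted;
}

// Toy models (assumptions for the example; both expect a non-empty context).
// The target counts upward modulo 10; the draft agrees except right after token 3.
Token toy_target(const std::vector<Token>& ctx) { return (ctx.back() + 1) % 10; }
Token toy_draft(const std::vector<Token>& ctx) {
    return ctx.back() == 3 ? 0 : (ctx.back() + 1) % 10;
}
```

Calling `speculative_step(ctx, toy_target, toy_draft, 4)` with `ctx = {0}` accepts the drafted tokens 1, 2 and 3, rejects the fourth, and appends the target's correction instead. The tree-based variant mentioned above drafts several candidate continuations rather than a single sequence, which this sketch does not cover.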
More info: