Pull and benchmark Qwen2.5 14B Instruct #4

Closed
opened 2026-03-08 10:46:59 +01:00 by shahondin1624 · 1 comment

Description

Pull qwen2.5:14b-instruct via Ollama and run benchmarks to establish baseline performance metrics.

Acceptance Criteria

  • Model pulled successfully
  • Tokens/second measured for generation
  • VRAM usage recorded
  • Prompt evaluation speed measured
  • Results documented in a benchmark table
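The criteria above can be measured with Ollama's CLI (`ollama run <model> --verbose` prints a timing summary after each response, and `ollama ps` shows VRAM usage and the GPU/CPU split). As a minimal sketch, the same rates can also be computed from the JSON returned by Ollama's `/api/generate` endpoint; the field names follow Ollama's API (durations in nanoseconds), and a local server on the default port is assumed:

```python
import json
import urllib.request


def rates_from_response(resp: dict) -> dict:
    """Compute tokens/second from Ollama /api/generate timing fields.

    Ollama reports durations in nanoseconds, so divide by 1e9 to get seconds.
    """
    return {
        "prompt_eval_rate": resp["prompt_eval_count"] / (resp["prompt_eval_duration"] / 1e9),
        "generation_rate": resp["eval_count"] / (resp["eval_duration"] / 1e9),
    }


def benchmark(model: str = "qwen2.5:14b-instruct",
              prompt: str = "Summarize the benefits of unit testing.") -> dict:
    # Non-streaming generate request against a local Ollama server
    # (default port 11434 assumed).
    req = urllib.request.Request(
        "http://localhost:11434/api/generate",
        data=json.dumps({"model": model, "prompt": prompt, "stream": False}).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as r:
        return rates_from_response(json.load(r))
```

Running `benchmark()` a few times with prompts of different lengths gives a more stable baseline than a single request.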

Blocked by
  • #1

shahondin1624 added this to the Phase 0: ROCm / Ollama Verification milestone 2026-03-08 10:46:59 +01:00
shahondin1624 (Author, Owner) commented:

Benchmark Results — qwen2.5:14b-instruct

| Metric | Value |
|---|---|
| VRAM usage | 9.7 GB |
| Processor | 100% GPU |
| Prompt eval rate | 625 tok/s |
| Generation rate | 54 tok/s |
| Context size | 4096 |

Model pulled and benchmarked successfully. Nearly identical performance to qwen2.5-coder:14b — same VRAM footprint, similar generation rate. Good candidate for researcher / summarization role.
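The rates in the table above come from the timing summary that `ollama run --verbose` prints after each response. A small sketch for extracting them programmatically; the sample text below is an assumed reproduction of that summary's line format, not captured output from this run:

```python
import re

# Assumed shape of the `ollama run --verbose` timing summary.
SAMPLE = """\
total duration:       5.4s
prompt eval count:    26 token(s)
prompt eval duration: 41.6ms
prompt eval rate:     625.00 tokens/s
eval count:           282 token(s)
eval duration:        5.2s
eval rate:            54.00 tokens/s
"""


def parse_rates(text: str) -> dict:
    """Pull prompt-eval and generation rates (tokens/s) out of the summary."""
    rates = {}
    for label, key in (("prompt eval rate", "prompt_eval_rate"),
                       ("eval rate", "generation_rate")):
        # ^ anchors each label to line start, so "eval rate" does not
        # accidentally match inside the "prompt eval rate" line.
        m = re.search(rf"^{label}:\s+([\d.]+) tokens/s", text, re.MULTILINE)
        if m:
            rates[key] = float(m.group(1))
    return rates
```

Feeding each model's summary through `parse_rates` makes it easy to build comparison tables like the one above across candidates.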
