Llama.cpp

Benchmarking llama.cpp on SpacemiT K3: RISC-V AI Cores vs Standard RVV (Part 4)

TL;DR SpacemiT's K3 has two core types: X100 (general-purpose, vlen 256) and A100 ("AI cores", vlen 1024). Standard llama.cpp runs 2.3x …

Running a Local LLM on RISC-V: Building llama.cpp on a Banana Pi F3 (Part 1)

TL;DR I built llama.cpp from source on a Banana Pi F3 (SpacemiT K1, riscv64), ran TinyLlama 1.1B, and got an OpenAI-compatible API server running at …

Running a 70B LLM on Pure RISC-V: The MilkV Pioneer Deployment Journey

When the 40 GB download completed and the model loaded into memory, I wondered: would a 70-billion-parameter language model actually run on a RISC-V …