Inference

Benchmarking llama.cpp on SpacemiT K3: RISC-V AI Cores vs Standard RVV (Part 4)

TL;DR SpacemiT's K3 has two core types: X100 (general-purpose, vlen 256) and A100 ("AI cores", vlen 1024). Standard llama.cpp runs 2.3x …

Running a Local LLM on RISC-V: Building llama.cpp on a Banana Pi F3 (Part 1)

TL;DR I built llama.cpp from source on a Banana Pi F3 (SpacemiT K1, riscv64), ran TinyLlama 1.1B, and got an OpenAI-compatible API server running at …

First Words: LLM Inference on RISC-V

Photo by Pixabay on Pexels. This is part three of the RISC-V wheel factory series. Part one: link:{% post_url …