AI

Part 8 · Testing a Local-AI App: 351 Tests, Zero Infrastructure

Tests that need Ollama running, Postgres up, or a real PDF on disk get skipped in CI and rot within a week. Here's how I made CogniVault's test suite run in any environment …

Ndimofor Aretas

• May 25, 2026 • 8 min read

Gemma CogniVault

A fully local, privacy-first AI Study Companion — Gemma 4 + FAISS + BM25 + Strands Agents, running entirely on your machine.

Ndimofor Aretas

• May 25, 2026 • 1 min read

Part 5 · Getting Reliable JSON Out of a Local LLM

format="json" gets you 90% of the way there. The other 10% is trailing commas, shape surprises, and items that almost-but-don't-quite match the schema. Here's the defensive pattern …

Ndimofor Aretas

• May 10, 2026 • 7 min read

Part 3 · Two-Phase Streaming: Showing the Model Think Before It Acts

Most agent UIs hide the model's reasoning until everything finishes. CogniVault streams Gemma 4's chain of thought first, then runs the Strands tool loop — and the UX dividends are …

Ndimofor Aretas

• Apr 30, 2026 • 7 min read

RAG

Part 2 · Hybrid Retrieval in Practice: FAISS + BM25, Fused with RRF

Dense vectors are smart but forgetful. Keyword search is dumb but loyal. Here's how I combined FAISS, BM25, and Reciprocal Rank Fusion in CogniVault — and why pure semantic search …

Ndimofor Aretas

• Apr 25, 2026 • 7 min read

Part 1 · Why I Built a Local-First RAG

Cloud AI assistants are powerful — but for trainers, researchers, and anyone handling sensitive material, they're also a leaky abstraction. Here's why I built a 100% local …

Ndimofor Aretas

• Apr 20, 2026 • 4 min read
