AI Engineering

Part 5 · Getting Reliable JSON Out of a Local LLM

format="json" gets you 90% of the way there. The other 10% is trailing commas, shape surprises, and items that almost-but-don't-quite match the schema. Here's the defensive pattern …

Ndimofor Aretas

Part 4 · Crash-Resumable Ingestion: DBOS, SHA-256, and Surviving a kill -9

Re-embedding a 200-page PDF every time you tweak one paragraph is a tax nobody wants to pay. Here's how CogniVault uses DBOS workflows and content hashing to ingest only what …

Ndimofor Aretas

Part 3 · Two-Phase Streaming: Showing the Model Think Before It Acts

Most agent UIs hide the model's reasoning until everything finishes. CogniVault streams Gemma 4's chain of thought first, then runs the Strands tool loop — and the UX dividends are …

Ndimofor Aretas

Part 2 · Hybrid Retrieval in Practice: FAISS + BM25, Fused with RRF

Dense vectors are smart but forgetful. Keyword search is dumb but loyal. Here's how I combined FAISS, BM25, and Reciprocal Rank Fusion in CogniVault — and why pure semantic search …

Ndimofor Aretas

Part 1 · Why I Built a Local-First RAG

Cloud AI assistants are powerful — but for trainers, researchers, and anyone handling sensitive material, they're also a leaky abstraction. Here's why I built a 100% local …

Ndimofor Aretas