<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Privacy |</title><link>https://aretascodes.dev/tags/privacy/</link><atom:link href="https://aretascodes.dev/tags/privacy/index.xml" rel="self" type="application/rss+xml"/><description>Privacy</description><generator>HugoBlox Kit (https://hugoblox.com)</generator><language>en-us</language><lastBuildDate>Mon, 20 Apr 2026 00:00:00 +0000</lastBuildDate><image><url>https://aretascodes.dev/media/icon_hu_2ab4f4763b27c75b.png</url><title>Privacy</title><link>https://aretascodes.dev/tags/privacy/</link></image><item><title>Part 1 · Why I Built a Local-First RAG</title><link>https://aretascodes.dev/blog/why-local-first-rag/</link><pubDate>Mon, 20 Apr 2026 00:00:00 +0000</pubDate><guid>https://aretascodes.dev/blog/why-local-first-rag/</guid><description>
&lt;blockquote class="border-l-4 border-neutral-300 dark:border-neutral-600 pl-4 italic text-neutral-600 dark:text-neutral-400 my-6"&gt;
&lt;p&gt;All abbreviations are fully explained in the appendix at the bottom of the page.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;I&amp;rsquo;ve spent the last few years in front of virtual classrooms full of career-changers in Germany, walking them through programming basics, web development, and introductory AI courses. Most of the information we deal with is fine to paste into cloud-based AI tools. Some of it really isn&amp;rsquo;t.&lt;/p&gt;
&lt;p&gt;Exam materials under confidentiality. A trainee&amp;rsquo;s portfolio with personal details. Other private documents that should never end up training someone else&amp;rsquo;s model.&lt;/p&gt;
&lt;p&gt;So I built Gemma CogniVault, a fully local AI study and productivity tool. No cloud. No telemetry. No &amp;ldquo;we may use this data to improve our service.&amp;rdquo; Just Gemma 4 running on Ollama, on my laptop, talking to my files.&lt;/p&gt;
&lt;h2 id="the-leaky-abstraction"&gt;The leaky abstraction&lt;/h2&gt;
&lt;p&gt;The pitch for cloud AI is great: a giant model, available instantly, billed by the token. The fine print is where it gets uncomfortable:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Where does the data physically live during inference?&lt;/li&gt;
&lt;li&gt;Whose jurisdiction governs that hardware this afternoon?&lt;/li&gt;
&lt;li&gt;Does the &lt;em&gt;audit trail&lt;/em&gt; stop at the API boundary, or can you actually trace what happened to your bytes?&lt;/li&gt;
&lt;li&gt;When you tick &amp;ldquo;do not train on my data,&amp;rdquo; are you trusting a control, a contract, or both?&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;For most consumer use cases, those questions are fine to wave away. For &lt;strong&gt;education, healthcare, finance, legal, and public administration&lt;/strong&gt;, &amp;ldquo;trust us&amp;rdquo; isn&amp;rsquo;t an answer.&lt;/p&gt;
&lt;h2 id="what-local-first-actually-means-here"&gt;What &amp;ldquo;local-first&amp;rdquo; actually means here&lt;/h2&gt;
&lt;p&gt;Lots of products say &amp;ldquo;private.&amp;rdquo; I wanted three concrete properties:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;The model lives on your machine.&lt;/strong&gt; Gemma 4 (&lt;code&gt;gemma4:e4b&lt;/code&gt;) and &lt;code&gt;embeddinggemma&lt;/code&gt; are pulled via Ollama. Inference is a localhost HTTP call.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Your documents never leave.&lt;/strong&gt; Vectors, chunks, chat history, study sessions, achievements — all on disk on your computer.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;You can &lt;em&gt;verify&lt;/em&gt; it.&lt;/strong&gt; Gemma CogniVault ships a &lt;strong&gt;Privacy Audit Panel&lt;/strong&gt; that shows a live &amp;ldquo;zero external connections&amp;rdquo; indicator alongside document counts and the Ollama host. It&amp;rsquo;s not a promise — it&amp;rsquo;s a status light.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;If a future build of Gemma CogniVault ever made an outbound call, that panel would be the first thing to scream.&lt;/p&gt;
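&lt;p&gt;To make the first and third properties concrete, here is a minimal sketch of a localhost-only call. It assumes Ollama&amp;rsquo;s default address (&lt;code&gt;http://localhost:11434&lt;/code&gt;) and its &lt;code&gt;/api/generate&lt;/code&gt; endpoint. The helper names &lt;code&gt;is_loopback&lt;/code&gt; and &lt;code&gt;build_generate_request&lt;/code&gt; are hypothetical and are not taken from the CogniVault codebase.&lt;/p&gt;

```python
import ipaddress
import json
import urllib.request
from urllib.parse import urlparse

# Assumption: Ollama is listening on its default port on this machine.
OLLAMA_HOST = "http://localhost:11434"


def is_loopback(url):
    """Return True only if the URL's host is localhost or a loopback address."""
    host = urlparse(url).hostname or ""
    if host == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        return False  # any other hostname is treated as external


def build_generate_request(prompt, model="gemma4:e4b", host=OLLAMA_HOST):
    """Build (but do not send) a POST to Ollama's /api/generate endpoint."""
    if not is_loopback(host):
        raise ValueError(f"refusing non-local Ollama host: {host}")
    body = json.dumps({"model": model, "prompt": prompt, "stream": False})
    return urllib.request.Request(
        f"{host}/api/generate",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

&lt;p&gt;Sending the request with &lt;code&gt;urllib.request.urlopen(req)&lt;/code&gt; needs a running Ollama instance with the model already pulled. The loopback guard is only an illustration of the idea behind a &amp;ldquo;zero external connections&amp;rdquo; check, not a description of how the Privacy Audit Panel is implemented.&lt;/p&gt;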
&lt;h2 id="what-you-get-back"&gt;What you get back&lt;/h2&gt;
&lt;p&gt;Going local sounds like a trade-off — surely you lose the magic of the giant frontier models? In practice, with &lt;strong&gt;Gemma 4&lt;/strong&gt; you get more than enough:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Thinking mode&lt;/strong&gt; — Gemma 4&amp;rsquo;s chain-of-thought streams into a collapsible panel before the answer. Watching the model reason about your documents is genuinely useful as a teaching tool.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Tool use&lt;/strong&gt; — through the Strands Agents SDK, the model decides when to search the knowledge base, summarise a document, compare two files, or check the time.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Vision&lt;/strong&gt; — attach images and PDFs straight into a chat turn.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Generation that&amp;rsquo;s actually structured&lt;/strong&gt; — quizzes, multi-lesson workshops, flashcard decks, and interactive mindmaps, generated with &lt;code&gt;format=&amp;quot;json&amp;quot;&lt;/code&gt; so the output parses reliably.&lt;/li&gt;
&lt;/ul&gt;
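&lt;p&gt;Asking for JSON is only half of the structured-generation story: the reply still has to be checked before the app uses it. The sketch below shows one way that check could look. The quiz schema is an assumption made up for illustration, and &lt;code&gt;parse_quiz&lt;/code&gt; is a hypothetical helper, not CogniVault&amp;rsquo;s actual parser.&lt;/p&gt;

```python
import json


def parse_quiz(raw):
    """Parse a model reply into validated quiz questions.

    Assumed (hypothetical) schema:
    {"questions": [{"question": str, "options": [str, ...], "answer": int}]}
    """
    data = json.loads(raw)  # raises json.JSONDecodeError on malformed JSON
    questions = data.get("questions") if isinstance(data, dict) else None
    if not isinstance(questions, list) or not questions:
        raise ValueError("expected a non-empty 'questions' list")
    for q in questions:
        if not isinstance(q, dict) or not isinstance(q.get("question"), str):
            raise ValueError("each item needs a text 'question'")
        options = q.get("options")
        if not isinstance(options, list) or not options:
            raise ValueError("each item needs a non-empty 'options' list")
        answer = q.get("answer")
        if not isinstance(answer, int) or answer not in range(len(options)):
            raise ValueError("'answer' must be an index into 'options'")
    return questions


reply = '{"questions": [{"question": "What is RAG?", "options": ["A", "B"], "answer": 0}]}'
print(parse_quiz(reply)[0]["question"])  # prints: What is RAG?
```

&lt;p&gt;Because &lt;code&gt;json.JSONDecodeError&lt;/code&gt; is a subclass of &lt;code&gt;ValueError&lt;/code&gt;, a caller can catch a single exception type and decide whether to retry the generation.&lt;/p&gt;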
&lt;p&gt;CogniVault doesn&amp;rsquo;t try to be a giant ecosystem. It&amp;rsquo;s a single-purpose tool that does one thing well: use your own documents with a capable local model in a private environment. I must admit that it was inspired to a great extent by
, which I&amp;rsquo;ve found incredibly useful but not private enough for my needs.&lt;/p&gt;
&lt;h2 id="the-shape-of-the-app"&gt;The shape of the app&lt;/h2&gt;
&lt;p&gt;CogniVault is split into four sections that map to how I actually work with information on cloud-based AI tools:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Section&lt;/th&gt;
&lt;th&gt;What it&amp;rsquo;s for&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Chat&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Ask anything about your documents. Cited answers, scope filter, voice in.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Knowledge Base&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Upload, categorise, manage. SHA-256 detects edits on re-upload.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Study Hub&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Quiz · Workshop · Flashcards · Mindmaps — four ways to drill into the source.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Dashboard&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Total study time, streak, 25 badges, GitHub-style 90-day heatmap.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;Everything is reachable from a sidebar that remembers where you left off, on a stack that fits in your &lt;code&gt;~/Documents&lt;/code&gt; folder.&lt;/p&gt;
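&lt;p&gt;The Knowledge Base row mentions SHA-256 edit detection on re-upload. As a rough sketch of that idea: hash the uploaded bytes and compare the digest with the one stored for the same filename. The function names and the in-memory dictionary are assumptions for illustration; the real app presumably keeps its digests in a persistent store.&lt;/p&gt;

```python
import hashlib


def sha256_hex(data):
    """Return the SHA-256 digest of raw file bytes as a hex string."""
    return hashlib.sha256(data).hexdigest()


def classify_upload(name, data, known_hashes):
    """Label an upload as 'new', 'unchanged', or 'edited'.

    known_hashes maps filename to the last stored digest and is
    updated in place (a stand-in for whatever store the app uses).
    """
    digest = sha256_hex(data)
    previous = known_hashes.get(name)
    known_hashes[name] = digest
    if previous is None:
        return "new"
    return "unchanged" if previous == digest else "edited"


store = {}
print(classify_upload("notes.md", b"first draft", store))   # new
print(classify_upload("notes.md", b"first draft", store))   # unchanged
print(classify_upload("notes.md", b"second draft", store))  # edited
```

&lt;p&gt;Comparing digests rather than timestamps means a file that was re-saved without changes can skip re-ingestion, while a single changed byte is enough to trigger it.&lt;/p&gt;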
&lt;h2 id="what-comes-next"&gt;What comes next&lt;/h2&gt;
&lt;p&gt;This is the first in a short series. Over the next few posts I&amp;rsquo;ll dig into the parts I&amp;rsquo;m most proud of — and a few I&amp;rsquo;d build differently next time:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Hybrid retrieval&lt;/strong&gt; — why FAISS &lt;em&gt;and&lt;/em&gt; BM25, fused with Reciprocal Rank Fusion&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Two-phase streaming&lt;/strong&gt; with Gemma 4 and Strands Agents&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Crash-resumable ingestion&lt;/strong&gt; with DBOS, hash-aware re-ingest, OCR fallback&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Getting reliable JSON&lt;/strong&gt; out of a local LLM (and what to do when it fails)&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;The mindmap renderer&lt;/strong&gt; — what hand-rolling SVG taught me, and why v2 uses React Flow&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Gamifying learning&lt;/strong&gt; — 25 badges, idle-gap sessions, 90-day heatmap&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Testing a local-AI app&lt;/strong&gt; with 350+ tests and zero infrastructure&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;If you want to skip ahead, the code is open source at
, and there&amp;rsquo;s a
.&lt;/p&gt;
&lt;p&gt;Your data. Your hardware. Your AI. Your vault.&lt;/p&gt;
&lt;hr&gt;
&lt;h2 id="appendix-abbreviations-in-this-post"&gt;Appendix: Abbreviations in this post&lt;/h2&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Abbreviation&lt;/th&gt;
&lt;th&gt;Full form&lt;/th&gt;
&lt;th&gt;Meaning&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;RAG&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Retrieval-Augmented Generation&lt;/td&gt;
&lt;td&gt;Retrieve relevant passages from your own documents first; let the model answer from them instead of from training memory&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;AI&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Artificial Intelligence&lt;/td&gt;
&lt;td&gt;Software performing tasks that normally need human intelligence&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;LLM&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Large Language Model&lt;/td&gt;
&lt;td&gt;A neural network trained on huge amounts of text that can read and generate language&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;HTTP&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;HyperText Transfer Protocol&lt;/td&gt;
&lt;td&gt;The protocol browsers and APIs use to exchange requests and responses&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;API&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Application Programming Interface&lt;/td&gt;
&lt;td&gt;The boundary where you call someone else&amp;rsquo;s software — and where cloud audit trails stop&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;IHK&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Industrie- und Handelskammer&lt;/td&gt;
&lt;td&gt;The German Chamber of Commerce and Industry, which administers trainer certification&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;AEVO&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Ausbildereignungsverordnung&lt;/td&gt;
&lt;td&gt;The German trainer-aptitude regulation — the exam material that motivated this project&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;FAISS&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Facebook AI Similarity Search&lt;/td&gt;
&lt;td&gt;Meta&amp;rsquo;s vector-search library (covered in the next post)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;BM25&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Best Match 25&lt;/td&gt;
&lt;td&gt;A classic keyword-ranking formula (also next post)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;SDK&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Software Development Kit&lt;/td&gt;
&lt;td&gt;A library of building blocks — here, Strands, which provides the agent loop&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;JSON&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;JavaScript Object Notation&lt;/td&gt;
&lt;td&gt;The universal text format for structured data&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;PDF&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Portable Document Format&lt;/td&gt;
&lt;td&gt;One of the eight-plus file types CogniVault ingests&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;SHA-256&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Secure Hash Algorithm, 256-bit&lt;/td&gt;
&lt;td&gt;A content fingerprint used to detect edited files on re-upload&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;OCR&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Optical Character Recognition&lt;/td&gt;
&lt;td&gt;Turning pictures of text (scans) into machine-readable text&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;DBOS&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Database-Oriented Operating System&lt;/td&gt;
&lt;td&gt;The durable-workflow library behind crash-resumable ingestion&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;SVG&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Scalable Vector Graphics&lt;/td&gt;
&lt;td&gt;The browser&amp;rsquo;s built-in vector drawing format&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;</description></item></channel></rss>