<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Whisper |</title><link>https://aretascodes.dev/tags/whisper/</link><atom:link href="https://aretascodes.dev/tags/whisper/index.xml" rel="self" type="application/rss+xml"/><description>Whisper</description><generator>HugoBlox Kit (https://hugoblox.com)</generator><language>en-us</language><lastBuildDate>Fri, 12 Jun 2026 00:00:00 +0000</lastBuildDate><image><url>https://aretascodes.dev/media/icon_hu_2ab4f4763b27c75b.png</url><title>Whisper</title><link>https://aretascodes.dev/tags/whisper/</link></image><item><title>CogniVault Backend Explained, Part 4 · Study Tools, Progress, and the Privacy Receipts</title><link>https://aretascodes.dev/blog/backend-explained-study-hub-privacy/</link><pubDate>Fri, 12 Jun 2026 00:00:00 +0000</pubDate><guid>https://aretascodes.dev/blog/backend-explained-study-hub-privacy/</guid><description>
&lt;blockquote class="border-l-4 border-neutral-300 dark:border-neutral-600 pl-4 italic text-neutral-600 dark:text-neutral-400 my-6"&gt;
&lt;p&gt;All abbreviations are fully explained in the appendix at the bottom of the page.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;In Part 3
we followed a question through hybrid retrieval and the agent loop to a cited answer. In this final part, the same machinery gets pointed at a different goal: &lt;em&gt;teaching you&lt;/em&gt;. Then we close the series by auditing the project&amp;rsquo;s central promise: nothing leaves your machine.&lt;/p&gt;
&lt;h2 id="one-recipe-four-study-tools"&gt;One recipe, four study tools&lt;/h2&gt;
&lt;p&gt;CogniVault generates quizzes, multi-lesson workshops, flashcard decks, and mindmaps from your documents. Four different outputs — but under the hood, one shared five-step recipe:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Retrieve.&lt;/strong&gt; The same hybrid search from Part 3, but instead of your question, the probe is a broad query like &lt;em&gt;&amp;ldquo;key concepts, definitions, important facts, main ideas&amp;rdquo;&lt;/em&gt;, scoped to the documents you selected. Up to 15 representative chunks come back.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Prompt from a template.&lt;/strong&gt; The instructions sent to Gemma are not buried in Python — they&amp;rsquo;re editable Markdown files in &lt;code&gt;backend/prompts/&lt;/code&gt; (&lt;code&gt;quiz.md&lt;/code&gt;, &lt;code&gt;flashcards.md&lt;/code&gt;, and so on). Drop a modified copy into &lt;code&gt;backend/prompts/custom/&lt;/code&gt; and it overrides the shipped version on the very next request. No restart, no code change. Prompt engineering as configuration.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Constrain the output.&lt;/strong&gt; Asking a small local model to &amp;ldquo;please return JSON&amp;rdquo; works most of the time, and &lt;em&gt;most of the time&lt;/em&gt; is a production bug. CogniVault uses Ollama&amp;rsquo;s grammar-constrained generation (&lt;code&gt;format=&amp;quot;json&amp;quot;&lt;/code&gt;), which makes syntactically invalid JSON impossible rather than unlikely, plus a low temperature for consistency. The constraint guarantees syntax, not the expected fields, which is why the next step still validates every item. Getting reliable structure out of a 4-billion-parameter model was a saga of its own.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Validate defensively.&lt;/strong&gt; Every generated item is checked field by field, and malformed items are &lt;em&gt;dropped&lt;/em&gt; rather than failing the whole batch. Small models occasionally fumble one question out of ten; a product shouldn&amp;rsquo;t collapse because of it.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Persist.&lt;/strong&gt; Everything lands in SQLite, so quizzes are resumable, workshop progress survives restarts, and flashcard statuses are remembered per deck.&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
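&lt;p&gt;The override rule in step 2 can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not CogniVault&amp;rsquo;s actual code: the helper name &lt;code&gt;resolve_prompt&lt;/code&gt; is hypothetical, and only the &lt;code&gt;prompts/&lt;/code&gt; and &lt;code&gt;prompts/custom/&lt;/code&gt; layout comes from the description above.&lt;/p&gt;

```python
from pathlib import Path


def resolve_prompt(prompts_dir: Path, name: str) -> str:
    """Return the template text, preferring a custom override if one exists.

    Hypothetical helper: the file is read on every call, so an edited
    template is picked up by the next request without a restart.
    """
    custom = prompts_dir / "custom" / f"{name}.md"
    shipped = prompts_dir / f"{name}.md"
    path = custom if custom.is_file() else shipped
    return path.read_text(encoding="utf-8")
```

&lt;p&gt;Reading from disk per request trades a small amount of I/O for the no-restart behaviour; a cache keyed on the file modification time would be one way to keep both.&lt;/p&gt;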
&lt;p&gt;Here&amp;rsquo;s the recipe in motion for a quiz:&lt;/p&gt;
&lt;div class="mermaid"&gt;%%{init: {'sequence': {'actorFontSize': 28, 'messageFontSize': 24, 'loopTextFontSize': 22, 'noteFontSize': 22}}}%%
sequenceDiagram
actor U as You
participant F as Study Hub UI
participant B as FastAPI
participant V as VectorDB
participant O as Ollama (gemma4:e4b)
participant S as SQLite
U-&gt;&gt;F: Pick scope, difficulty, question count
F-&gt;&gt;B: POST /api/study/quiz/generate
B-&gt;&gt;V: Hybrid search, scoped to your documents
V--&gt;&gt;B: Up to 15 representative chunks
B-&gt;&gt;B: Render the quiz.md prompt template
B-&gt;&gt;O: chat(format="json", low temperature)
O--&gt;&gt;B: Grammar-constrained JSON
B-&gt;&gt;B: Validate each question, drop bad ones
B-&gt;&gt;S: Save quiz (resumable later)
B--&gt;&gt;F: Typed response
F--&gt;&gt;U: Play, submit, score — and maybe a new badge
&lt;/div&gt;
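&lt;p&gt;The &amp;ldquo;validate each question, drop bad ones&amp;rdquo; step might look something like the sketch below. The field names (&lt;code&gt;questions&lt;/code&gt;, &lt;code&gt;question&lt;/code&gt;, &lt;code&gt;options&lt;/code&gt;, &lt;code&gt;answer_index&lt;/code&gt;) and both function names are assumptions made for illustration; the real schema lives in the project&amp;rsquo;s code.&lt;/p&gt;

```python
import json


def is_valid_question(item: object) -> bool:
    """Check one generated item against an assumed multiple-choice shape."""
    if not isinstance(item, dict):
        return False
    question = item.get("question")
    options = item.get("options")
    answer = item.get("answer_index")
    if not isinstance(question, str) or not question.strip():
        return False
    if not (isinstance(options, list) and len(options) >= 2):
        return False
    if not all(isinstance(opt, str) for opt in options):
        return False
    # bool is a subclass of int, so exclude it explicitly.
    if not isinstance(answer, int) or isinstance(answer, bool):
        return False
    return answer in range(len(options))


def keep_valid_questions(raw_json: str) -> list:
    """Parse the model output and keep only the well-formed questions."""
    data = json.loads(raw_json)
    items = data.get("questions", []) if isinstance(data, dict) else []
    if not isinstance(items, list):
        return []
    return [item for item in items if is_valid_question(item)]
```

&lt;p&gt;One malformed item costs the user one question, not the whole quiz, which is the behaviour the recipe describes.&lt;/p&gt;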
&lt;p&gt;The four tools differ only in their template and their shape: quizzes produce multiple-choice and true/false questions with explanations; workshops produce an outline first and then write each lesson &lt;em&gt;on demand&lt;/em&gt; when you open it; flashcards produce front/back pairs; mindmaps produce a topic tree that the frontend renders as an interactive diagram. (That renderer is its own adventure.)&lt;/p&gt;
&lt;h2 id="sessions-that-track-themselves"&gt;Sessions that track themselves&lt;/h2&gt;
&lt;p&gt;Most study apps make you press a start button, and most people forget. CogniVault takes a different stance: &lt;strong&gt;study sessions are inferred, not declared&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;Every chat message either extends the current session or — after a 15-minute idle gap — quietly starts a new one. Walk away for coffee, come back, keep working: same session. Come back tomorrow: new session. No buttons, no forgetting.&lt;/p&gt;
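&lt;p&gt;The idle-gap rule is simple enough to sketch in a few lines. This is an illustration only: the function name is hypothetical, and treating a gap of exactly 15 minutes as &amp;ldquo;same session&amp;rdquo; is an assumption the description above does not settle.&lt;/p&gt;

```python
from datetime import datetime, timedelta

IDLE_GAP = timedelta(minutes=15)  # the idle gap quoted in the text


def assign_sessions(timestamps: list) -> list:
    """Group message timestamps into inferred sessions.

    Returns one session index per timestamp, in chronological order.
    A message starts a new session when it arrives more than IDLE_GAP
    after the previous message; otherwise it extends the current one.
    """
    session_ids = []
    current = -1
    previous = None
    for ts in sorted(timestamps):
        if previous is None or ts - previous > IDLE_GAP:
            current += 1
        session_ids.append(current)
        previous = ts
    return session_ids
```

&lt;p&gt;Note that the gap is measured from the previous message, not from the session start, so a long working session stays one session as long as no single pause exceeds the limit.&lt;/p&gt;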
&lt;p&gt;Each message also records a tiny event (timestamp, whether you used a scope filter or attachments) into &lt;code&gt;progress.db&lt;/code&gt; — a SQLite database, which is a complete relational database living in a single file. Eleven tables hold everything: sessions, message events, earned badges, quiz attempts and saved quizzes, workshops and lessons, decks and cards, and mindmaps.&lt;/p&gt;
&lt;p&gt;One engineering note worth copying: the tracking call inside the chat endpoint is wrapped so that it can &lt;em&gt;never&lt;/em&gt; block or break the chat. Analytics must be a passenger, never a driver.&lt;/p&gt;
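&lt;p&gt;A minimal sketch of that wrapper, assuming a hypothetical &lt;code&gt;record_event&lt;/code&gt; callback that writes to the progress database. It covers the &amp;ldquo;never break&amp;rdquo; half; to also avoid blocking, the call could be handed to a background task (FastAPI&amp;rsquo;s &lt;code&gt;BackgroundTasks&lt;/code&gt; is one option), which is not shown here.&lt;/p&gt;

```python
import logging

logger = logging.getLogger("tracking")


def track_safely(record_event, event: dict) -> bool:
    """Run a tracking callback and swallow any failure.

    Returns True if the event was recorded, False if tracking failed.
    The caller (for example a chat endpoint) carries on either way.
    """
    try:
        record_event(event)
        return True
    except Exception:
        # Log and continue: analytics must not break the chat.
        logger.exception("progress tracking failed; continuing")
        return False
```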
&lt;h2 id="25-badges-defined-as-data"&gt;25 badges, defined as data&lt;/h2&gt;
&lt;p&gt;The achievements aren&amp;rsquo;t scattered through the code as &lt;code&gt;if&lt;/code&gt; statements. They live in one JSON file — 25 entries, each with a code, a name, an icon, the metric it watches, and a target. After each relevant action, an evaluator checks every definition against the database and persists anything newly earned. Some badges form ladders, each pointing to its next level.&lt;/p&gt;
&lt;p&gt;Declarative beats imperative here for a simple reason: adding badge number 26 means adding a JSON entry, not writing new logic. The design behind the streaks, the idle-gap rule, and the 90-day heatmap got its own post.&lt;/p&gt;
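&lt;p&gt;To make the data-driven idea concrete, here is a rough sketch of an evaluator. The three badge entries, the metric names, and the &lt;code&gt;next&lt;/code&gt; field used for ladders are invented for illustration; the real file has 25 entries and may use different keys.&lt;/p&gt;

```python
import json

# Hypothetical definitions, standing in for the real JSON file.
BADGES_JSON = """
[
  {"code": "quiz_1", "name": "First Quiz",
   "metric": "quizzes_completed", "target": 1, "next": "quiz_10"},
  {"code": "quiz_10", "name": "Quiz Regular",
   "metric": "quizzes_completed", "target": 10, "next": null},
  {"code": "streak_7", "name": "Week Streak",
   "metric": "streak_days", "target": 7, "next": null}
]
"""


def newly_earned(definitions: list, metrics: dict, already_earned: set) -> list:
    """Return codes of badges whose target is met and not yet earned."""
    earned = []
    for badge in definitions:
        value = metrics.get(badge["metric"], 0)
        if value >= badge["target"] and badge["code"] not in already_earned:
            earned.append(badge["code"])
    return earned


definitions = json.loads(BADGES_JSON)
```

&lt;p&gt;The evaluator never names a specific badge, so a new entry in the JSON is evaluated by the same loop with no code change.&lt;/p&gt;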
&lt;h2 id="voice-input-without-a-cloud-microphone"&gt;Voice input, without a cloud microphone&lt;/h2&gt;
&lt;p&gt;The microphone button is powered by &lt;strong&gt;faster-whisper&lt;/strong&gt; — OpenAI&amp;rsquo;s Whisper speech-recognition model re-implemented on a faster inference engine — running on your CPU with int8 quantisation (8-bit numbers instead of 32-bit: smaller, faster, accurate enough). No audio ever leaves the machine.&lt;/p&gt;
&lt;p&gt;The model is lazy-loaded on the first transcription so app startup stays instant, and if faster-whisper isn&amp;rsquo;t installed at all, the frontend simply hides the mic button. Features should degrade, not detonate.&lt;/p&gt;
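&lt;p&gt;Both behaviours (lazy loading and graceful absence) fit in a short sketch. The function names and the &lt;code&gt;base&lt;/code&gt; model size are assumptions; the &lt;code&gt;WhisperModel&lt;/code&gt; constructor arguments follow the faster-whisper package, and the optional &lt;code&gt;factory&lt;/code&gt; parameter exists only so the loader can be exercised without the package installed.&lt;/p&gt;

```python
import importlib.util
import threading

_model = None
_lock = threading.Lock()


def voice_available() -> bool:
    """True if the optional faster-whisper package is importable."""
    return importlib.util.find_spec("faster_whisper") is not None


def get_model(factory=None):
    """Create the model on first use and reuse it afterwards."""
    global _model
    with _lock:
        if _model is None:
            if factory is None:
                # Imported here so the app starts without the package.
                from faster_whisper import WhisperModel

                # "base" is an assumed model size, not taken from the post.
                _model = WhisperModel("base", device="cpu", compute_type="int8")
            else:
                _model = factory()
        return _model
```

&lt;p&gt;A frontend could query an endpoint backed by something like &lt;code&gt;voice_available()&lt;/code&gt; to decide whether to show the mic button at all.&lt;/p&gt;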
&lt;h2 id="the-privacy-receipts"&gt;The privacy receipts&lt;/h2&gt;
&lt;p&gt;The series began with a promise: &lt;em&gt;nothing leaves your machine.&lt;/em&gt; Promises are cheap — here&amp;rsquo;s the audit. Every byte CogniVault stores, and where it lives:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Data&lt;/th&gt;
&lt;th&gt;Location&lt;/th&gt;
&lt;th&gt;Format&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Your uploaded files&lt;/td&gt;
&lt;td&gt;&lt;code&gt;docs/&lt;/code&gt; folder&lt;/td&gt;
&lt;td&gt;The original files&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Search vectors&lt;/td&gt;
&lt;td&gt;&lt;code&gt;vector_store.faiss&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;FAISS binary index&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Chunk text and metadata&lt;/td&gt;
&lt;td&gt;&lt;code&gt;vector_store.json&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;JSON&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;File-to-category map&lt;/td&gt;
&lt;td&gt;&lt;code&gt;categories.json&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;JSON&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Chat sessions&lt;/td&gt;
&lt;td&gt;&lt;code&gt;chat_history.json&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;JSON&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Sessions, badges, quizzes, workshops, decks, mindmaps&lt;/td&gt;
&lt;td&gt;&lt;code&gt;progress.db&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;SQLite&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Ingestion checkpoints&lt;/td&gt;
&lt;td&gt;PostgreSQL (local Docker volume)&lt;/td&gt;
&lt;td&gt;DBOS system tables&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;The AI models themselves&lt;/td&gt;
&lt;td&gt;Ollama&amp;rsquo;s local model store&lt;/td&gt;
&lt;td&gt;Model weights&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;Nothing in that table is on someone else&amp;rsquo;s computer. Inference goes to &lt;code&gt;localhost&lt;/code&gt;. Embeddings go to &lt;code&gt;localhost&lt;/code&gt;. The only outbound request the backend ever makes comes from the URL-import feature, which runs only at your explicit request and refuses to fetch private addresses. The app even surfaces these stats live in its Privacy Vault Audit panel.&lt;/p&gt;
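&lt;p&gt;A private-address guard for a URL importer can be sketched with the standard library alone. This is an illustrative sketch, not the project&amp;rsquo;s implementation: all three function names are hypothetical, and the injectable &lt;code&gt;resolve&lt;/code&gt; parameter is there so the check can be tried without touching the network.&lt;/p&gt;

```python
import ipaddress
import socket
from urllib.parse import urlparse


def is_public_ip(ip_text: str) -> bool:
    """True only for addresses outside private and special-use ranges."""
    ip = ipaddress.ip_address(ip_text)
    return not (
        ip.is_private
        or ip.is_loopback
        or ip.is_link_local
        or ip.is_multicast
        or ip.is_reserved
        or ip.is_unspecified
    )


def resolve_host(host: str) -> list:
    """Resolve a hostname to the list of IP addresses it maps to."""
    return [info[4][0] for info in socket.getaddrinfo(host, None)]


def is_safe_url(url: str, resolve=resolve_host) -> bool:
    """Reject non-HTTP schemes and hosts that resolve to private ranges."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.hostname:
        return False
    try:
        addresses = resolve(parsed.hostname)
        return bool(addresses) and all(is_public_ip(a) for a in addresses)
    except (OSError, ValueError):
        # Resolution failed or an address could not be parsed: refuse.
        return False
```

&lt;p&gt;Checking every resolved address, and refusing when in doubt, is the conservative choice. A production guard would also need to make sure the connection goes to the address that was checked, which this sketch does not attempt.&lt;/p&gt;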
&lt;p&gt;And because trust needs more than a table: the whole backend is covered by a pytest suite you can run yourself, and the testing approach is documented separately.&lt;/p&gt;
&lt;h2 id="series-wrap-up"&gt;Series wrap-up&lt;/h2&gt;
&lt;p&gt;Four parts, one architecture:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Part 1&lt;/strong&gt;: three processes, four layers, and a decoder ring for the jargon&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Part 2&lt;/strong&gt;: a durable, format-aware pipeline that turns any document into searchable vectors&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Part 3&lt;/strong&gt;: two retrievers covering each other&amp;rsquo;s blind spots, fused by rank, driven by an agent&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Part 4&lt;/strong&gt;: the same machinery generating study materials, tracking progress without buttons, and a storage map with no cloud rows in it&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;If there&amp;rsquo;s one theme, it&amp;rsquo;s this: &lt;strong&gt;boring, verifiable choices in service of privacy&lt;/strong&gt;. Exact search instead of approximate. SQLite files instead of hosted databases. Grammar-constrained JSON instead of hopeful parsing. Soft deletes instead of clever index surgery. Every piece is something you can open, read, and check — which is exactly the point.&lt;/p&gt;
&lt;hr&gt;
&lt;h2 id="appendix-abbreviations-in-this-post"&gt;Appendix: Abbreviations in this post&lt;/h2&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Abbreviation&lt;/th&gt;
&lt;th&gt;Full form&lt;/th&gt;
&lt;th&gt;Meaning&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;JSON&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;JavaScript Object Notation&lt;/td&gt;
&lt;td&gt;The structured format the generators force the model to produce&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;SQLite / SQL&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;(SQL = Structured Query Language)&lt;/td&gt;
&lt;td&gt;A complete relational database living in one file, &lt;code&gt;progress.db&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;MCQ&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Multiple-Choice Question&lt;/td&gt;
&lt;td&gt;One of the two quiz question types (the other is true/false)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;CPU&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Central Processing Unit&lt;/td&gt;
&lt;td&gt;Where Whisper runs — no graphics card required&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;int8&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;8-bit integer (quantisation)&lt;/td&gt;
&lt;td&gt;Storing model weights as small integers: smaller, faster, accurate enough&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;AI&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Artificial Intelligence&lt;/td&gt;
&lt;td&gt;Software performing tasks that normally need human intelligence&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;API&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Application Programming Interface&lt;/td&gt;
&lt;td&gt;The endpoints the Study Hub and dashboard call&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;FAISS&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Facebook AI Similarity Search&lt;/td&gt;
&lt;td&gt;The vector index in the privacy-receipts table&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;DBOS&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Database-Oriented Operating System&lt;/td&gt;
&lt;td&gt;The durable-workflow library whose checkpoints live in PostgreSQL&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;SSRF&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Server-Side Request Forgery&lt;/td&gt;
&lt;td&gt;The attack class the URL importer guards against&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;PNG / PDF&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Portable Network Graphics / Portable Document Format&lt;/td&gt;
&lt;td&gt;Two of the mindmap export formats (plus Markdown)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;SVG&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Scalable Vector Graphics&lt;/td&gt;
&lt;td&gt;The browser drawing format behind the interactive mindmap rendering&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;hr&gt;
&lt;p&gt;&lt;strong&gt;Next steps:&lt;/strong&gt; clone the repository
and read along: the README maps the full architecture, and every claim in this series can be checked directly against the code in &lt;code&gt;backend/&lt;/code&gt;. And if you want the deep-dive versions of these topics, the dedicated posts
pick up where this tour ends.&lt;/p&gt;</description></item></channel></rss>