Phase 3: processing to the vault
Phase 3 converts approved items into structured Obsidian notes. All generation runs locally via Qwen3.5:9b. Add --hd to any command to use Claude Sonnet 4.6 instead (after explicit confirmation).
Papers
Papers reach _inbox via the Zotero browser extension, the iOS app, or automatically via the feedreader (after calibration). After a Go decision in Phase 2:
verwerk recente papers
Claude Code:
- Retrieves metadata from Zotero MCP (title, authors, year, journal, citation key, tags) — no full text
- Calls the local subagent process_item.py with only the item key and metadata:
  - process_item.py fetches the full text locally, generates a structured note via Qwen3.5:9b, builds the YAML frontmatter, and writes the .md file to literature/
  - Claude Code receives only {"status": "ok", "path": "literature/..."} (no source content)
- Adds [[internal links]] to related notes in the vault
- Removes the item from Zotero _inbox
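The hand-off between Claude Code and the subagent can be sketched as below. This is a minimal illustration, assuming process_item.py prints a single JSON object on stdout; the helper name parse_status and the strict key check are hypothetical and not taken from the project's code.

```python
import json

# Assumption: "status" and "path" are the only fields the subagent may return.
ALLOWED_KEYS = {"status", "path"}


def parse_status(stdout: str) -> dict:
    """Parse the subagent's JSON reply and reject anything beyond status/path."""
    reply = json.loads(stdout)
    if not isinstance(reply, dict):
        raise ValueError("expected a JSON object")
    extra = set(reply) - ALLOWED_KEYS
    if extra:
        # Refuse replies that could carry source text into the orchestrator.
        raise ValueError(f"unexpected keys in reply: {sorted(extra)}")
    if reply.get("status") != "ok":
        raise RuntimeError(f"processing failed: {reply.get('status')!r}")
    return reply
```

Rejecting unknown keys, rather than ignoring them, is one way to make the privacy boundary fail loudly if the subagent ever returns more than a status and a path.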
The generated note contains:
- YAML frontmatter (title, authors, year, journal, citation key, tags, status)
- Core question and main argument
- Key findings (3–5 points)
- Methodological notes
- Relevant quotes (original language)
- Links to related notes
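As an illustration of the frontmatter step, here is a minimal sketch that renders a metadata dict as a YAML block using only the standard library. The function name build_frontmatter and the key names (such as citekey) are assumptions; the real process_item.py may use a YAML library and different field names.

```python
import json


def build_frontmatter(meta: dict) -> str:
    """Render metadata as a YAML frontmatter block.

    Values are serialised with json.dumps: JSON strings, numbers, and
    lists are also valid YAML flow-style values, so no YAML library is needed.
    """
    lines = ["---"]
    for key, value in meta.items():
        lines.append(f"{key}: {json.dumps(value, ensure_ascii=False)}")
    lines.append("---")
    return "\n".join(lines) + "\n"


# Hypothetical example values, for illustration only.
example = build_frontmatter({
    "title": "An example paper: with a colon",
    "authors": ["A. Author", "B. Author"],
    "year": 2021,
    "citekey": "author2021example",
    "tags": ["example"],
    "status": "unread",
})
```

Quoting every string avoids the common failure where a colon in a title breaks the frontmatter.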
Notes are saved to literature/[author-year-keyword1-keyword2].md — Qwen selects 2–4 nouns from the title and TLDR.
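The naming scheme can be sketched as follows. This assumes the keywords have already been chosen (in the pipeline that is Qwen's job); the helper names slugify and note_filename are hypothetical, and the ASCII-folding rule is an assumption about how non-ASCII author names are handled.

```python
import re
import unicodedata


def slugify(text: str) -> str:
    """Lowercase, strip accents, and replace non-alphanumeric runs with '-'."""
    ascii_text = (
        unicodedata.normalize("NFKD", text)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    return re.sub(r"[^a-z0-9]+", "-", ascii_text.lower()).strip("-")


def note_filename(author: str, year: int, keywords: list) -> str:
    """Build literature/[author-year-keyword1-keyword2].md from 2-4 keywords."""
    if not 2 <= len(keywords) <= 4:
        raise ValueError("expected 2-4 keywords")
    parts = [slugify(author), str(year)] + [slugify(k) for k in keywords]
    return "literature/" + "-".join(parts) + ".md"
```

For example, note_filename("Müller", 2021, ["memory", "sleep"]) yields literature/muller-2021-memory-sleep.md.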
Privacy: no paper content ever appears in Claude Code's context.
process_item.py is a self-contained local subagent: it fetches, generates, and writes without returning any source text to the orchestration layer.
YouTube videos
YouTube items follow an eager transcript pipeline: when you mark a video ✅ in the feedreader, attach-transcript.py runs automatically and stores a cleaned transcript as an attachment in the Zotero item — mirroring how a PDF accompanies a paper. This makes Go/No-go decisions content-based.
If the transcript attachment is missing (e.g. for manually added items), run the script explicitly:
~/.local/share/uv/tools/zotero-mcp-server/bin/python3 .claude/attach-transcript.py \
--item-key ITEMKEY --url "https://www.youtube.com/watch?v=..."
This script:
- Fetches the transcript via YouTubeTranscriptApi (or from .claude/transcript_cache/)
- Generates an abstract via Qwen3.5:9b
- Uploads the transcript as a .txt attachment to Zotero; sets abstractNote
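The cache-first lookup in the first step might look like the sketch below. The actual fetcher is injected as a callable, so no specific YouTubeTranscriptApi call is assumed. The helper names and the one-file-per-video-id cache layout are hypothetical, not taken from attach-transcript.py.

```python
from pathlib import Path
from typing import Callable
from urllib.parse import parse_qs, urlparse


def video_id(url: str) -> str:
    """Extract the video id from a watch?v=... or youtu.be/... URL."""
    parsed = urlparse(url)
    if parsed.hostname == "youtu.be":
        return parsed.path.lstrip("/")
    ids = parse_qs(parsed.query).get("v")
    if not ids:
        raise ValueError(f"no video id in {url!r}")
    return ids[0]


def get_transcript(url: str, cache_dir: Path, fetch: Callable[[str], str]) -> str:
    """Return the cached transcript if present; otherwise fetch and cache it."""
    vid = video_id(url)
    cached = cache_dir / f"{vid}.txt"  # assumption: one .txt file per video id
    if cached.exists():
        return cached.read_text(encoding="utf-8")
    text = fetch(vid)  # e.g. a thin wrapper around the transcript library
    cache_dir.mkdir(parents=True, exist_ok=True)
    cached.write_text(text, encoding="utf-8")
    return text
```

Keying the cache on the video id rather than the full URL means the same video is fetched only once, whichever URL form was used.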
After a Go decision, generate the literature note the same way as papers:
verwerk recente papers
Claude Code calls process_item.py, which reads the transcript attachment from Zotero locally via fetch-fulltext.py. No transcript content reaches Claude Code.
The generated note contains:
- YAML frontmatter (title, authors, year, tags, status, Zotero deep link)
- TLDR
- Key findings (3–5 points)
- Methodological notes
- (No "Relevant quotes" section — timestamps are unreliable without a verifiable source)
- Links to related notes
- Flashcards (max 3)
Notes are saved to literature/[author-year-keyword1-keyword2].md — Qwen selects 2–4 nouns from the title and TLDR.
Podcasts
Podcast transcripts are created manually via attach-transcript.py. Transcription with whisper.cpp requires downloading the audio and takes minutes per episode, so it cannot run in the batch pipeline.
~/.local/share/uv/tools/zotero-mcp-server/bin/python3 .claude/attach-transcript.py \
--item-key ITEMKEY --url "https://podcast-episode-page-url"
This script:
- Downloads audio using the direct MP3 URL cached from the RSS <enclosure> tag (via feedreader-score.py) or falls back to yt-dlp
- Detects language automatically from cached show notes (Dutch show notes → --language nl); override with --language if needed
- Transcribes locally via whisper-cli (model: large-v3-turbo, Metal GPU, ~2–3 min per 30 min audio on M4)
- If abstractNote is already filled (show notes set by enrich-inbox.py): moves it to a child note titled "Shownotes"
- Generates an abstract via Qwen3.5:9b; sets abstractNote; stores transcript as .txt linked-file attachment; adds tag _enriched-transcript
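The download and transcription steps above can be sketched roughly as follows. The cache structure (a dict keyed by episode URL), the model path, and both helper names are assumptions for illustration. The whisper-cli flags used (-m, -f, -otxt, --language) are those of the whisper.cpp command-line tool.

```python
from typing import Optional


def pick_audio_source(episode_url: str, enclosure_cache: dict) -> tuple:
    """Prefer the cached RSS enclosure URL; otherwise signal a yt-dlp fallback."""
    direct = enclosure_cache.get(episode_url)
    if direct:
        return ("direct", direct)
    return ("yt-dlp", episode_url)


def whisper_command(
    audio_path: str,
    model_path: str,
    language: Optional[str] = None,
) -> list:
    """Build the argv list for a local whisper-cli run that writes a .txt file."""
    cmd = ["whisper-cli", "-m", model_path, "-f", audio_path, "-otxt"]
    if language:
        # e.g. "nl" when the show notes were detected as Dutch
        cmd += ["--language", language]
    return cmd


# Hypothetical paths, for illustration only.
cmd = whisper_command("episode.wav", "models/ggml-large-v3-turbo.bin", "nl")
```

Building the command as a list (rather than a shell string) lets it be passed to subprocess.run without quoting problems in file names.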
After a Go decision, generate the literature note via process_item.py — same as papers.
If yt-dlp fails with "Unsupported URL": add the feed to feedreader-list.txt. After the next feedreader-score.py run, the direct audio URL is cached and used automatically.
Notes follow the same structure as YouTube notes: no "Relevant quotes" section (timestamps are unreliable); all other sections are the same as for papers.
RSS web articles
Non-academic articles from RSS feeds that you forward to _inbox can be processed in two ways:
Via Zotero (recommended for articles worth citing):
verwerk recente papers
The item is already in _inbox with metadata from the Zotero Connector. Processed as a standard literature note.
Notes get #web or #beleid as appropriate.
After processing
After each session, check whether:
- New notes are linked to related existing notes ([[double brackets]])
- Relevant syntheses in syntheses/ need updating
- Flashcards should be created for the new notes:
maak flashcards voor literature/[note].md
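The first check in the list above can be partly automated. The sketch below flags notes that contain no [[wikilink]] at all. It takes note texts as a dict so it stays independent of the vault layout; the function name unlinked_notes is hypothetical.

```python
import re

# Matches [[target]] and [[target|alias]]; nested brackets are not matched.
WIKILINK = re.compile(r"\[\[[^\[\]]+\]\]")


def unlinked_notes(notes: dict) -> list:
    """Return the names of notes whose text contains no [[internal link]]."""
    return sorted(
        name for name, text in notes.items() if not WIKILINK.search(text)
    )
```

In practice the dict could be filled by reading every .md file under literature/; a note in the result is only a candidate for review, since a missing link is not necessarily an error.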
If new papers were added to Zotero, update the semantic search database:
update-zotero