Step 14: Set up filter layer per source

This is the core of the 3-phase model: each source has its own capture route (phase 1) and filter moment (phase 2). Below is a per-source overview of all phases and how you indicate what may enter the vault.

Papers (Zotero)

| Phase | What |
|---|---|
| Dump layer | Zotero _inbox collection: central collection bucket for all sources |
| Filter moment | Run index-score.py to rank items by relevance; read the abstract in Zotero, or have Claude Code summarize via Qwen3.5:9b (locally) |
| Go | Move item to the relevant collection in your library |
| No-go | Delete item from _inbox; no note is created |

Tag-based filter logic

Claude Code adjusts its evaluation based on the Zotero tag of the item:

| Tag | Treatment |
|---|---|
|  | Previously approved: skip Go/No-go, go directly to processing |
| 📖 | Marked as interesting: only ask the Go/No-go question, no summary |
| /unread, no tag, or another tag | Generate summary + relevance indication, ask Go or No-go |

Items with unknown tags (e.g. your own project tags or type indicators) are therefore treated as /unread: Claude Code generates a summary and asks for a Go/No-go decision.
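
The tag rules above can be sketched as a small lookup. This is only an illustration, not the logic Claude Code actually runs: the names `Treatment` and `treatment_for` are hypothetical, the "previously approved" tag is passed in as a parameter because its value is not assumed here, and the precedence when an item carries both tags is an assumption.

```python
from enum import Enum


class Treatment(Enum):
    PROCESS_DIRECTLY = "skip Go/No-go, go directly to processing"
    ASK_ONLY = "ask Go/No-go, no summary"
    SUMMARIZE_AND_ASK = "summary + relevance indication, then ask Go/No-go"


INTERESTING_TAG = "📖"


def treatment_for(tags, approved_tag):
    """Pick a treatment from an item's Zotero tags.

    `approved_tag` is whatever tag marks previously approved items in your
    library; this sketch does not assume a value for it.
    """
    # Assumption: "previously approved" wins if both tags are present.
    if approved_tag in tags:
        return Treatment.PROCESS_DIRECTLY
    if INTERESTING_TAG in tags:
        return Treatment.ASK_ONLY
    # "/unread", no tag, and unknown tags all get the default treatment.
    return Treatment.SUMMARIZE_AND_ASK
```

With this shape, a project tag such as `project-x` falls through to the default branch, which matches the rule that unknown tags are treated as /unread.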

No-go is always final: a rejected item is deleted from _inbox and receives no note in the vault. Claude Code always asks for confirmation before deletion.

Reading status in Obsidian

Every literature note gets a status field in its YAML frontmatter:

  • status: unread (default for all new notes)
  • status: read (set automatically when the Zotero item carried the "previously approved" tag, meaning you had already read it before approving)

After reading a note in Obsidian, change status: unread to status: read manually.

To see all unread notes at a glance, create a note with this Dataview query:

TABLE authors, year, journal, tags
FROM "literature"
WHERE status = "unread"
SORT year DESC, file.name ASC

Note: frontmatter tags must be written without # (e.g. tags: [beleid, zorg]). Obsidian adds the # in the UI automatically. Using # inside a YAML array breaks frontmatter parsing.
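
As a rough sketch of the status and tag conventions above, the helper below renders a minimal frontmatter block. The function name `literature_frontmatter` and the `already_read` flag are hypothetical, and real literature notes will carry more fields (authors, year, journal) than shown here.

```python
def literature_frontmatter(tags, already_read=False):
    """Render minimal YAML frontmatter for a literature note."""
    # Per the note above, '#' inside the YAML array breaks frontmatter
    # parsing, so any leading '#' is stripped before writing the tags.
    clean_tags = [tag.lstrip("#") for tag in tags]
    status = "read" if already_read else "unread"  # "unread" is the default
    lines = [
        "---",
        f"tags: [{', '.join(clean_tags)}]",
        f"status: {status}",
        "---",
    ]
    return "\n".join(lines)


print(literature_frontmatter(["#beleid", "zorg"]))
```

Notes written this way are picked up by the Dataview query as long as `status` stays `unread`.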

Relevance scoring with index-score.py

Before starting the Go/No-go review, you can run index-score.py to get a ranked list of inbox items sorted by semantic similarity to your existing library:

~/.local/share/uv/tools/zotero-mcp-server/bin/python3 .claude/index-score.py

The script uses ChromaDB embeddings (all-MiniLM-L6-v2, same model as zotero-mcp) to compute a relevance score (0–100) per item. Items with PDF annotations in Zotero weigh more heavily in the preference profile. Output labels: 🟢 strong match (≥70) · 🟡 possibly relevant (40–69) · 🔴 weak match (<40).
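
The label thresholds can be expressed as a short function. This is a sketch of the mapping only; `relevance_label` is a hypothetical name and says nothing about how index-score.py computes the score itself.

```python
def relevance_label(score):
    """Map a 0-100 relevance score to the output label described above."""
    if score >= 70:
        return "🟢 strong match"
    if score >= 40:
        return "🟡 possibly relevant"
    return "🔴 weak match"
```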

Summary requests

Claude Code can help you with the evaluation — ask for a summary of items in _inbox:

Give me an overview of the items in my Zotero _inbox collection with a 2–3 sentence
summary per item and a relevance assessment for my research on [topic].

Claude Code retrieves the metadata and abstract via Zotero MCP and gives a recommendation per item. You then decide which items deserve the next step.


YouTube videos

| Phase | What |
|---|---|
| Dump layer | Zotero _inbox via iOS share sheet from the YouTube app |
| Filter moment | Watch the first 5–10 minutes, or have Claude Code summarize based on metadata |
| Go | transcript [URL] in Claude Code |
| No-go | Delete item from _inbox |

Podcasts

| Phase | What |
|---|---|
| Dump layer | Zotero _inbox via iOS share sheet from Overcast (overcast.fm URL) |
| Filter moment | Listen to the first 5–10 minutes |
| Go | podcast [URL] in Claude Code (download + transcription + processing) |
| No-go | Delete item from _inbox |

For podcasts with rich show notes (≥ 200 characters), clicking the headline in the HTML reader opens a generated article at /article/podcast/{episode_id} — the same layout as YouTube articles, with tag buttons and abstract injection via COinS. For episodes with thin show notes the headline links directly to the source. You can also ask Claude Code to fetch show notes manually:

Fetch the show notes from [URL] and give a 3-sentence summary.
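
The show-notes threshold that decides where a podcast headline links can be sketched as follows. The function name and arguments are hypothetical, and counting raw characters without trimming whitespace is an assumption about how the reader measures length.

```python
MIN_SHOW_NOTES_CHARS = 200  # threshold for "rich" show notes


def podcast_headline_target(episode_id, show_notes, source_url):
    """Return the link a podcast headline should open in the HTML reader."""
    # Assumption: length is measured on the raw show-notes text.
    if len(show_notes or "") >= MIN_SHOW_NOTES_CHARS:
        return f"/article/podcast/{episode_id}"
    # Thin or missing show notes: link straight to the source.
    return source_url
```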

RSS feeds

| Phase | What |
|---|---|
| Feedreader (phase 1, source 1) | feedreader-score.py scores all feed items daily. YouTube items are scored using transcript text fetched via youtube_transcript_api; podcast items with show notes ≥ 200 chars have their show notes cached. The script produces a filtered Atom feed plus an HTML reader sorted by relevance at http://localhost:8765/filtered.html. Clicking a YouTube headline opens a generated article (/article/{video_id}) with Zotero tag buttons; clicking a podcast headline (with sufficient show notes) opens a similar article (/article/podcast/{episode_id}). Both article types inject the full text into the Zotero Abstract field via rft.description in COinS |
| Phase 1: Dump layer | Browse the filtered feed in the HTML reader or NetNewsWire; forward interesting items to Zotero _inbox via browser extension or iOS app |
| Phase 2: Filter moment | Scan headline and intro of items in _inbox |
| Go (academic) | Item already in Zotero _inbox → run index-score.py and process via the skill |
| Go (non-academic) | Save via Zotero Connector, or pass inbox [URL] to Claude Code |
| No-go | Mark item as read or delete it in NetNewsWire; remove it from _inbox if already saved |
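
To illustrate the COinS injection mentioned for generated articles, here is a minimal sketch of a COinS span that carries the full text in rft.description. COinS embeds a URL-encoded OpenURL ContextObject in the title attribute of a span with class Z3988. The function name `coins_span` and the choice of accompanying fields (rft.title, rft.identifier) are assumptions; the feedreader's generated markup may differ.

```python
from html import escape
from urllib.parse import urlencode


def coins_span(title, full_text, url):
    """Build a COinS <span> whose rft.description carries the full text."""
    fields = [
        ("ctx_ver", "Z39.88-2004"),
        ("rft_val_fmt", "info:ofi/fmt:kev:mtx:dc"),  # Dublin Core format
        ("rft.title", title),
        ("rft.identifier", url),
        ("rft.description", full_text),
    ]
    # URL-encode the key/value pairs, then HTML-escape the result so the
    # '&' separators are safe inside an attribute value.
    context_object = urlencode(fields)
    return f'<span class="Z3988" title="{escape(context_object, quote=True)}"></span>'


print(coins_span("A & B", "Full text", "https://example.org/item"))
```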

Feedback signals in the HTML reader

The HTML reader captures five distinct behaviour types that feed into the learning loop (feedreader-learn.py):

| # | Behaviour | Signal | Recorded as |
|---|---|---|---|
| 1 | Headline clicked + added to Zotero | Strong positive | added_to_zotero: true |
| 2 | Headline clicked, not added to Zotero | Weak negative (seen, not interesting enough) | added_to_zotero: false after 3 days |
| 3 | Not clicked, no 👎 | Ambiguous: not seen, or implicitly ignored | added_to_zotero: false after 3 days (indistinguishable from type 2) |
| 4 | 👎 pressed without clicking | Strong explicit negative (headline was enough to reject) | skipped: true immediately |
| 5 | Headline clicked, then 👎 pressed | Strongest negative signal (read and rejected) | skipped: true + added_to_zotero: false |

Only types 4 and 5 are unambiguous rejections. Type 3 remains ambiguous even with the 👎 button. See Step 12c for details on how these signals are used to calibrate scoring thresholds.
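
The five behaviour types can be sketched as a classifier over three observed flags. The `Interaction` class and both function names are hypothetical and this is not the code in feedreader-learn.py. Two assumptions are marked in the comments: an item added to Zotero counts as type 1, and "after 3 days" means three or more days elapsed.

```python
from dataclasses import dataclass


@dataclass
class Interaction:
    clicked: bool
    added_to_zotero: bool
    thumbs_down: bool


def behaviour_type(i):
    """Classify an interaction into behaviour types 1-5 from the table."""
    if i.added_to_zotero:
        return 1  # assumption: adding to Zotero implies the headline was clicked
    if i.thumbs_down:
        return 5 if i.clicked else 4
    return 2 if i.clicked else 3


def recorded_fields(i, days_elapsed):
    """Return the fields that would be recorded for this interaction."""
    kind = behaviour_type(i)
    if kind == 1:
        return {"added_to_zotero": True}
    if kind == 4:
        return {"skipped": True}
    if kind == 5:
        return {"skipped": True, "added_to_zotero": False}
    # Types 2 and 3 are recorded identically, and only once 3 days have passed.
    return {"added_to_zotero": False} if days_elapsed >= 3 else {}
```

Because types 2 and 3 produce the same record, the recorded data alone cannot tell "seen and passed over" apart from "never seen".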