MatheusWinkler/knowledge-pipeline
🧠 Transform voice notes and text into structured knowledge with a private, local-first AI pipeline using offline Whisper and Open WebUI.
Code Analysis
14 files read · 5 rounds
An automated knowledge ingestion pipeline that watches input folders for audio/text files, enriches them with LLM-generated metadata (titles, summaries, tags) via Open WebUI, and syncs the resulting Markdown notes to a local Obsidian vault.
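The final step of that flow, turning LLM-generated metadata into a Markdown note, might look roughly like the sketch below. The function name `render_note`, the front matter layout, and the field names are assumptions for illustration; the repository's actual note format was not shown in the analysis.

```python
import json
from datetime import date


def render_note(title, summary, tags, body, created=None):
    """Render a Markdown note with a YAML-style front matter block.

    Hypothetical helper: the field names and layout are illustrative only.
    """
    created = created or date.today()
    lines = [
        "---",
        # json.dumps yields a double-quoted string, which keeps titles
        # containing colons or quotes valid inside the front matter.
        f"title: {json.dumps(title, ensure_ascii=False)}",
        f"created: {created.isoformat()}",
        "tags:",
        *[f"  - {tag}" for tag in tags],
        "---",
        "",
        f"# {title}",
        "",
        f"> {summary}",
        "",
        body.strip(),
        "",
    ]
    return "\n".join(lines)
```

Writing the returned string to a `.md` file inside the vault folder would be enough for Obsidian to pick it up, since a vault is a plain directory of Markdown files.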
Strengths
Configuration, text-processing logic, and API clients are cleanly separated. Combining `watchdog` file monitoring with a debounced job queue is a robust pattern for real-time ingestion. The regex-based preprocessing (tag stripping, date extraction) cleans the input and extracts dates before the LLM call.
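The two patterns named here can be sketched with the standard library alone. Everything below is an assumption for illustration: the class `DebouncedQueue`, the function `preprocess`, the reading of "tag stripping" as removing inline `#hashtags`, and the ISO `YYYY-MM-DD` date format are not taken from the repository.

```python
import re
import time


class DebouncedQueue:
    """Hold file paths until they have been quiet for `quiet_seconds`.

    Editors and recorders often emit several events per file; waiting for
    a quiet period avoids processing a file that is still being written.
    """

    def __init__(self, quiet_seconds=2.0):
        self.quiet_seconds = quiet_seconds
        self._last_seen = {}  # path -> timestamp of the latest event

    def touch(self, path, now=None):
        """Record an event for `path`, restarting its quiet period."""
        self._last_seen[path] = time.monotonic() if now is None else now

    def pop_ready(self, now=None):
        """Return and forget every path whose quiet period has elapsed."""
        now = time.monotonic() if now is None else now
        ready = [
            path
            for path, seen in self._last_seen.items()
            if now - seen >= self.quiet_seconds
        ]
        for path in ready:
            del self._last_seen[path]
        return ready


# Assumed conventions: inline tags look like "#work/q1", dates are ISO 8601.
TAG_RE = re.compile(r"(?<!\S)#([A-Za-z][\w/-]*)")
DATE_RE = re.compile(r"\b\d{4}-\d{2}-\d{2}\b")


def preprocess(text):
    """Strip inline tags and find the first date before the LLM call."""
    tags = TAG_RE.findall(text)
    cleaned = TAG_RE.sub("", text)
    cleaned = re.sub(r"[ \t]{2,}", " ", cleaned).strip()
    match = DATE_RE.search(cleaned)
    return cleaned, tags, match.group(0) if match else None
```

In a `watchdog` setup, an event handler's `on_created` and `on_modified` methods could call `touch(event.src_path)`, while a worker thread polls `pop_ready()` and processes whatever it returns. Passing `now` explicitly keeps the debounce logic testable without sleeping.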
Weaknesses
There are no unit tests, so reliability depends entirely on manual verification. The worker threads catch broad exceptions, which could mask specific API failures. The `configure.py` GUI logic was truncated during analysis and likely contains complex state management that was not fully visible.
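One way to address the first two points is to catch specific exception types in the worker and inject the API call, so each failure mode can be unit-tested without a running server. This is a minimal sketch that assumes a `urllib`-based client; the repository's actual HTTP library is not stated, and `process_job` and the status strings are hypothetical.

```python
import json
import logging
import urllib.error

log = logging.getLogger("pipeline.worker")


def process_job(path, enrich):
    """Run one enrichment job and return a (status, detail) pair.

    `enrich` is any callable taking a path and returning metadata; it is
    injected so tests can substitute a stub for the real API client.
    """
    try:
        metadata = enrich(path)
    # HTTPError subclasses URLError, so it must be caught first.
    except urllib.error.HTTPError as exc:
        log.error("API returned HTTP %s for %s", exc.code, path)
        return "http_error", exc.code
    except urllib.error.URLError as exc:
        log.error("API unreachable for %s: %s", path, exc.reason)
        return "unreachable", str(exc.reason)
    except json.JSONDecodeError as exc:
        log.error("Malformed API response for %s: %s", path, exc.msg)
        return "bad_response", exc.msg
    return "ok", metadata
```

Any exception not listed here still propagates, so unexpected bugs surface instead of being silently logged as API failures.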
Evidence
Commits: 6
Contributors: 2
Files: 17
Active weeks: 3
Repository
Language: Python
Stars: 1
Forks: 0
License: MIT