Complete agent pipeline from YouTube URL to published summary with inline figures and quality evaluations.
Nine stages transform a YouTube video into an evaluated transcript summary:
Stages 1-2: Audio acquisition (yt-dlp) and transcription (Lightning-Whisper-MLX at 20x real-time)
Stages 3-4: AI correction (GPT-4o) fixes terminology errors; enrichment adds context via web search
Stage 5: Summarization (GPT-4o) creates structured analytical summary
Stage 6: Figure generation (Gemini Imagen 3) creates inline visualizations
Stages 7-8: Quality evaluation (Claude Sonnet 4.5) assesses summaries and figures
Stage 9: Publication to GitHub Pages with evaluation badges and links
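
The stage list above can be read as a sequential pipeline in which each stage takes a shared context and returns an updated one. The following is a minimal sketch of that orchestration only, not the project's actual code. The names Stage, make_placeholder, STAGES, and run_pipeline are hypothetical, every stage body is a stand-in that makes no real call to yt-dlp, Whisper, GPT-4o, Imagen, or Claude, and the split of stages 7 and 8 into summary evaluation and figure evaluation is an assumption.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List

Context = Dict[str, Any]


@dataclass(frozen=True)
class Stage:
    """One pipeline step: a stage number, a label, and a function over the context."""
    number: int
    name: str
    run: Callable[[Context], Context]


def make_placeholder(name: str) -> Callable[[Context], Context]:
    """Build a stand-in stage that only records that it ran (hypothetical helper)."""
    def _run(ctx: Context) -> Context:
        # Return a new context rather than mutating the one passed in.
        return {**ctx, "completed": ctx["completed"] + [name]}
    return _run


# Stage names are illustrative labels for the nine stages listed above.
# Assumption: stage 7 evaluates the summary and stage 8 evaluates the figures.
STAGE_NAMES = [
    "acquire_audio",      # 1: yt-dlp
    "transcribe",         # 2: Lightning-Whisper-MLX
    "correct",            # 3: GPT-4o terminology correction
    "enrich",             # 4: context via web search
    "summarize",          # 5: GPT-4o structured summary
    "generate_figures",   # 6: Gemini Imagen 3
    "evaluate_summary",   # 7: Claude Sonnet 4.5
    "evaluate_figures",   # 8: Claude Sonnet 4.5
    "publish",            # 9: GitHub Pages
]

STAGES: List[Stage] = [
    Stage(number=i, name=name, run=make_placeholder(name))
    for i, name in enumerate(STAGE_NAMES, start=1)
]


def run_pipeline(url: str, stages: List[Stage] = STAGES) -> Context:
    """Run every stage in numeric order, threading the context through each one."""
    ctx: Context = {"url": url, "completed": []}
    for stage in sorted(stages, key=lambda s: s.number):
        ctx = stage.run(ctx)
    return ctx


if __name__ == "__main__":
    result = run_pipeline("https://www.youtube.com/watch?v=VIDEO_ID")
    print(result["completed"])
```

Keeping each stage as a plain function over a context dictionary means a real implementation could swap a placeholder for an actual call without changing the runner, and a failed run could be resumed from the last name recorded in "completed".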