
Bivouac AI Pipeline Architecture

Complete agent pipeline from YouTube URL to published summary with inline figures and quality evaluations.

[Figure: Bivouac Agent Pipeline Architecture Diagram]

Pipeline Overview

Nine stages transform a YouTube video into an evaluated transcript summary with inline figures:

Stages 1-2: Audio acquisition (yt-dlp) and transcription (Lightning-Whisper-MLX at 20x real-time)

Stages 3-4: AI correction (GPT-4o) fixes terminology errors; enrichment adds context via web search

Stage 5: Summarization (GPT-4o) creates a structured analytical summary

Stage 6: Figure generation (Gemini Imagen 3) creates inline visualizations

Stages 7-8: Quality evaluation (Claude Sonnet 4.5) assesses summaries and figures

Stage 9: Publication to GitHub Pages with evaluation badges and links
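
The stage order above can be sketched as a simple sequential runner. This is only an illustration under stated assumptions, not Bivouac's actual code: the names `Stage`, `placeholder`, `STAGES`, and `run_pipeline` are hypothetical, each stage body is a stand-in that makes no real call to yt-dlp, a model API, or GitHub, and the split of stages 7 and 8 into summary evaluation followed by figure evaluation is assumed from the order in which they are listed.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List

Context = Dict[str, Any]


@dataclass(frozen=True)
class Stage:
    number: int
    name: str
    tool: str  # tool named in the overview for this stage
    run: Callable[[Context], Context]


def placeholder(key: str) -> Callable[[Context], Context]:
    """Return a stand-in stage function (hypothetical; calls no real tool)."""
    def _run(ctx: Context) -> Context:
        # A real stage would invoke its tool here; this only records a marker.
        return {**ctx, key: f"<{key} output>"}
    return _run


# Stage list mirroring the overview. The 7/8 split is an assumption.
STAGES: List[Stage] = [
    Stage(1, "audio acquisition", "yt-dlp", placeholder("audio")),
    Stage(2, "transcription", "Lightning-Whisper-MLX", placeholder("transcript")),
    Stage(3, "correction", "GPT-4o", placeholder("corrected")),
    Stage(4, "enrichment", "web search", placeholder("enriched")),
    Stage(5, "summarization", "GPT-4o", placeholder("summary")),
    Stage(6, "figure generation", "Gemini Imagen 3", placeholder("figures")),
    Stage(7, "summary evaluation", "Claude Sonnet 4.5", placeholder("summary_eval")),
    Stage(8, "figure evaluation", "Claude Sonnet 4.5", placeholder("figure_eval")),
    Stage(9, "publication", "GitHub Pages", placeholder("published")),
]


def run_pipeline(url: str, stages: List[Stage] = STAGES) -> Context:
    """Thread a context dict through each stage in order."""
    ctx: Context = {"url": url, "completed": []}
    for stage in stages:
        ctx = stage.run(ctx)
        # Track which stage numbers have finished, in order.
        ctx["completed"] = [*ctx["completed"], stage.number]
    return ctx


if __name__ == "__main__":
    result = run_pipeline("https://www.youtube.com/watch?v=EXAMPLE")
    print(result["completed"])
```

Passing a single context dict from stage to stage keeps each step independent of the others, so a stand-in can be swapped for a real implementation one stage at a time.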