1. Initial Architecture — Tech Stack and Repository Structure¶
Date: 2026-03-30
Status¶
Accepted
Context¶
Movie Finder needed a fullstack AI application capable of:
- Natural language movie search (semantic, not keyword)
- Live metadata enrichment from IMDb
- Real-time streamed conversation (SSE)
- Stateful multi-turn sessions with JWT authentication
- Independent team ownership of each subsystem
- Cloud deployment with secrets management
Decisions¶
1. Multi-repo submodule structure¶
Decision: Each subsystem (backend app, LangGraph chain, IMDb client, RAG ingestion, frontend) is a separate Git repository, integrated via Git submodules in an orchestrator root.
Rationale:
- Each team can own, release, and deploy their subsystem independently
- Submodule pointers give the integration root explicit version control over each dependency
- Independent CI pipelines per repo (CONTRIBUTION / INTEGRATION / RELEASE modes)
Consequence: Submodule workflow is more complex than a flat monorepo. Contributors must understand submodule pointer commits. Mitigated by documentation (ONBOARDING.md, CONTRIBUTING.md).
2. FastAPI + Python 3.13¶
Decision: Python 3.13 with FastAPI for the backend API layer.
Rationale:
- LangGraph and LangChain have first-class Python support; no bridging layer required
- FastAPI's async def + StreamingResponse maps directly to SSE streaming
- Pydantic models serve as both validation and documentation (OpenAPI auto-generated)
- asyncpg provides high-performance async PostgreSQL access
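As a rough illustration of the SSE mapping mentioned in the rationale above, the sketch below formats Server-Sent Events frames from an async generator using only the standard library. The names `format_sse`, `stream_reply`, and `collect`, and the event names `token` and `done`, are hypothetical and not taken from the Movie Finder codebase.

```python
import asyncio
import json
from typing import AsyncIterator, List


def format_sse(event: str, data: dict) -> str:
    """Serialize one SSE frame: an event line, a JSON data line, and a blank line."""
    return f"event: {event}\ndata: {json.dumps(data)}\n\n"


async def stream_reply(tokens: List[str]) -> AsyncIterator[str]:
    """Yield one SSE frame per token, then a final 'done' frame (hypothetical event names)."""
    for token in tokens:
        yield format_sse("token", {"text": token})
        await asyncio.sleep(0)  # hand control back to the event loop between frames
    yield format_sse("done", {})


async def collect(tokens: List[str]) -> List[str]:
    """Helper for trying the generator outside a web framework."""
    return [frame async for frame in stream_reply(tokens)]


if __name__ == "__main__":
    for frame in asyncio.run(collect(["Blade", " Runner"])):
        print(frame, end="")
```

In a FastAPI route, a generator of this shape would typically be returned as `StreamingResponse(stream_reply(tokens), media_type="text/event-stream")`; how the real backend wires this up is not stated in this record.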
3. LangGraph multi-agent pipeline¶
Decision: Compile the AI pipeline as a LangGraph Pregel graph with named nodes.
Rationale:
- State machine is explicit and auditable — no implicit chain-of-thought looping
- Refinement cycles are bounded (max 3) — prevents infinite loops
- Each node is independently testable
- LangGraph supports streaming events per node, enabling real-time SSE output
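The bounded refinement cycle can be pictured as a routing function that chooses the next node name from the current state. The sketch below is plain Python rather than LangGraph API calls; the state fields, the node names `refine` and `respond`, and the `run` driver are assumptions made for illustration, with only the limit of 3 taken from the rationale above.

```python
from typing import TypedDict

MAX_REFINEMENTS = 3  # the bound stated in the rationale


class PipelineState(TypedDict):
    query: str
    refinements: int    # how many refinement passes have run so far
    good_enough: bool   # hypothetical flag set by a review step


def route_after_review(state: PipelineState) -> str:
    """Return the next node name: stop once results are good or the budget is spent."""
    if state["good_enough"] or state["refinements"] >= MAX_REFINEMENTS:
        return "respond"
    return "refine"


def run(state: PipelineState) -> PipelineState:
    """Drive the loop by hand; a graph framework would follow the returned edge instead."""
    while route_after_review(state) == "refine":
        state = {**state, "refinements": state["refinements"] + 1}
    return state
```

Because the router is a pure function of the state, it can be unit-tested without running any model, which is the kind of per-node testability the rationale refers to.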
4. Qdrant Cloud for vector search¶
Decision: Use Qdrant Cloud (managed) as the vector store. No local Qdrant container in any environment.
Rationale:
- Managed service eliminates operational overhead (backups, scaling, upgrades)
- Single production cluster shared across all environments keeps embeddings consistent
- text-embedding-3-large (OpenAI, 3072 dim) chosen for quality; Qdrant's cosine similarity performs well at this dimension
Consequence: All environments (dev, staging, prod) share the same Qdrant cluster. A bad ingestion run affects all environments simultaneously. Mitigated by treating Qdrant as read-only outside the RAG ingestion pipeline.
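For readers unfamiliar with the metric, cosine similarity between two embeddings is their dot product divided by the product of their norms. The sketch below is a minimal pure-Python version with a dimension check against the 3072-dim model named above; it is illustrative only and is not how Qdrant computes scores internally.

```python
import math
from typing import Sequence

EMBEDDING_DIM = 3072  # dimension of text-embedding-3-large, per the rationale


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two embeddings of the expected dimension."""
    if len(a) != EMBEDDING_DIM or len(b) != EMBEDDING_DIM:
        raise ValueError(f"expected {EMBEDDING_DIM}-dimensional vectors")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)
```

The score depends only on direction, not magnitude, so two vectors that differ by a constant factor score as identical.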
5. PostgreSQL over SQLite¶
Decision: PostgreSQL 16 for the relational store (users, sessions, messages). SQLite was used during early development and migrated via scripts/migrate_sqlite_to_postgres.py.
Rationale:
- PostgreSQL Flexible Server on Azure accepts concurrent connections, so multiple Container App replicas can share a single DB and the backend can scale horizontally
- SQLite does not support concurrent writes from multiple application replicas
- asyncpg provides connection pooling required for high-concurrency FastAPI workloads
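When several replicas share one database, the per-replica pool size has to fit within the server's connection limit. The arithmetic can be sketched as below; the function name and the example figures (100 connections, 10 reserved, 3 replicas) are assumptions for illustration, not values from the Movie Finder deployment.

```python
def pool_size_per_replica(db_max_connections: int, reserved: int, max_replicas: int) -> int:
    """Largest pool each replica can hold without exceeding the server's connection limit."""
    if max_replicas < 1:
        raise ValueError("max_replicas must be at least 1")
    usable = db_max_connections - reserved  # keep some connections for admin and migrations
    if usable < max_replicas:
        raise ValueError("not enough connections to give every replica at least one")
    return usable // max_replicas


if __name__ == "__main__":
    # Hypothetical figures: 100 server connections, 10 reserved, up to 3 replicas.
    print(pool_size_per_replica(100, 10, 3))
```

A result like this could be passed as the `max_size` argument of `asyncpg.create_pool`, so that the worst case of all replicas running with full pools stays under the limit.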
6. JWT authentication (stateless)¶
Decision: Stateless JWT tokens (HS256). Access token: 30 minutes. Refresh token: 7 days.
Rationale:
- Stateless — no server-side session table required (sessions table stores chat sessions, not auth sessions)
- Short access token TTL limits exposure window if a token is intercepted
- Refresh token enables seamless UX without frequent re-login
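To make the HS256 scheme and the two TTLs concrete, here is a minimal standard-library sketch of issuing and verifying a token. It is for illustration only: the function names, the claim set (`sub`, `iat`, `exp`), and the error messages are assumptions, this record does not say which JWT library the backend uses, and a maintained library would normally be preferred over hand-rolled code.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional

ACCESS_TTL = 30 * 60            # 30 minutes, per the decision
REFRESH_TTL = 7 * 24 * 60 * 60  # 7 days, per the decision


def _b64url(raw: bytes) -> str:
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _sign(signing_input: str, secret: bytes) -> str:
    digest = hmac.new(secret, signing_input.encode("utf-8"), hashlib.sha256).digest()
    return _b64url(digest)


def issue_token(subject: str, ttl: int, secret: bytes, now: Optional[int] = None) -> str:
    """Build header.payload.signature with an expiry ttl seconds after issue time."""
    issued = int(time.time()) if now is None else now
    header = _b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode("utf-8"))
    claims = {"sub": subject, "iat": issued, "exp": issued + ttl}
    payload = _b64url(json.dumps(claims).encode("utf-8"))
    signing_input = f"{header}.{payload}"
    return f"{signing_input}.{_sign(signing_input, secret)}"


def verify_token(token: str, secret: bytes, now: Optional[int] = None) -> dict:
    """Check the signature first, then the algorithm and expiry; return the claims."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("malformed token")
    header, payload, signature = parts
    expected = _sign(f"{header}.{payload}", secret)
    if not hmac.compare_digest(signature.encode("utf-8"), expected.encode("utf-8")):
        raise ValueError("bad signature")
    if json.loads(_b64url_decode(header)).get("alg") != "HS256":
        raise ValueError("unexpected algorithm")
    claims = json.loads(_b64url_decode(payload))
    current = int(time.time()) if now is None else now
    if current >= claims["exp"]:
        raise ValueError("token expired")
    return claims
```

Verification needs only the shared secret and the clock, which is why no server-side auth session table is required.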
7. Angular 21 SPA¶
Decision: Angular 21 with TypeScript 5.9 for the frontend.
Rationale:
- Strong typing prevents runtime errors in the SSE streaming parsing logic
- Angular's dependency injection and service layer cleanly separates API logic from UI
- Vitest provides fast unit tests with Angular TestBed compatibility
Consequence: Angular has a steeper learning curve than React for new contributors. Mitigated by frontend/CONTRIBUTING.md documenting Angular conventions used in this project.
8. Azure Container Apps¶
Decision: Azure Container Apps (ACA) for hosting, with Azure Database for PostgreSQL Flexible Server and Azure Container Registry.
Rationale:
- ACA is serverless — no VMs or Kubernetes clusters to manage
- Integrated managed identity for Key Vault secrets eliminates credential injection in pipelines
- Flexible Server supports maxReplicas > 1 for the backend (unlike the original SQLite constraint)
9. Jenkins CI/CD over GitHub Actions¶
Decision: Self-hosted Jenkins (Ubuntu + ngrok) rather than GitHub Actions.
Rationale:
- Existing team familiarity with Jenkins
- Docker socket access on the build agent enables BuildKit cache mounts without cloud runner fees
- Jenkins credentials store is pre-existing infrastructure
Consequence: Jenkins requires on-premises maintenance (OS updates, plugin updates, ngrok URL rotation on the free plan). GitHub Actions would reduce operational overhead and is the recommended path for future migration.
Consequences — overall¶
Positive:
- Each team has clear ownership and release cadence
- The LangGraph graph is auditable — every node and edge is visible in code
- Azure managed services reduce operational burden (no DB or registry maintenance)
Negative:
- Multi-repo submodule workflow is complex for new contributors
- Shared Qdrant cluster means no environment isolation for vector data
- Jenkins requires active maintenance; a paid ngrok plan is recommended for production use
Future considerations¶
- ~~Migrate CI from Jenkins to GitHub Actions~~ → GitHub Actions now mirrors Jenkins 1:1 (0005)
- Add a Qdrant staging collection to isolate dev/staging from production vector data
- ~~Consider Terraform/Bicep in infrastructure/~~ → Terraform IaC implemented (0006)