Matt Pocock (AIhero) – Build DeepSearch in TypeScript
Gain the key skills for building LLM agents.
Building AI applications that are genuinely useful involves more than just hitting an LLM API and getting back stock chat responses.
The difference between a proof-of-concept and a production application lies in the details.
Generic chat responses might work for demos, but production applications need outputs tailored to specific requirements.
In a professional environment, code is (ideally) tested, metrics are collected, and analytics are surfaced in dashboards.
AI development can follow these established patterns.
You will hit roadblocks when trying to:
- Implement essential backend infrastructure (databases, caching, auth) specifically for AI-driven applications.
- Debug and understand the “black box” of AI agent decisions, especially when multiple tools are involved.
- Ensure chat persistence, reliable routing, and real-time UI updates for a seamless user experience.
- Objectively measure AI performance, moving beyond subjective “vibe checks” when judging improvements.
- Manage complex agent logic without creating brittle, monolithic prompts that are hard to maintain and optimize.
In this course you will build a “DeepSearch” AI application from the ground up, learning to understand and implement these patterns so you end up with a production-ready product.
Days 00-02: Getting Started
You’ll start with a project that is already scaffolded using Next.js and TypeScript (of course), PostgreSQL via the Drizzle ORM, and Redis for caching.
Over the first couple of days you will implement fundamental AI app features by building out a naive agent. You’ll start by hooking up an LLM of your choice to the Next.js app using the AI SDK, then implement a search tool the model can use to supplement its knowledge when conversing with users.
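As a rough sketch, the naive agent’s chat route might look something like this with the AI SDK; the model choice, the `searchWeb` tool name, and the Serper call details are illustrative assumptions, not the course’s exact code:

```ts
// app/api/chat/route.ts: a minimal sketch of a naive agent with one search tool.
import { streamText, tool } from "ai";
import { openai } from "@ai-sdk/openai"; // model choice is an assumption
import { z } from "zod";

export async function POST(req: Request) {
  const { messages } = await req.json();

  const result = streamText({
    model: openai("gpt-4o-mini"),
    messages,
    maxSteps: 5, // allow the model to call the tool, then answer
    tools: {
      searchWeb: tool({
        description: "Search the web for up-to-date information",
        parameters: z.object({ query: z.string() }),
        execute: async ({ query }) => {
          // Hypothetical Serper call; the JSON response is handed back to the model.
          const res = await fetch("https://google.serper.dev/search", {
            method: "POST",
            headers: {
              "X-API-KEY": process.env.SERPER_API_KEY!,
              "Content-Type": "application/json",
            },
            body: JSON.stringify({ q: query }),
          });
          return res.json();
        },
      }),
    },
  });

  return result.toDataStreamResponse();
}
```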
Chats with an LLM don’t save themselves, so you will also persist conversations to the database.
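The chat tables for that persistence layer might be modeled in Drizzle along these lines (the exact columns are an assumption):

```ts
// db/schema.ts: a sketch of chat persistence tables; column choices are illustrative.
import { integer, json, pgTable, serial, text, timestamp } from "drizzle-orm/pg-core";

export const chats = pgTable("chats", {
  id: serial("id").primaryKey(),
  userId: text("user_id").notNull(),
  title: text("title"),
  createdAt: timestamp("created_at").defaultNow().notNull(),
});

export const messages = pgTable("messages", {
  id: serial("id").primaryKey(),
  chatId: integer("chat_id")
    .references(() => chats.id)
    .notNull(),
  role: text("role").notNull(), // "user" | "assistant"
  parts: json("parts").notNull(), // message content, including any tool calls
  createdAt: timestamp("created_at").defaultNow().notNull(),
});
```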
Days 03-05: Improve Your Agent through Observability and Evals
The first real differentiator between a vibe-coded side project and a production-ready product you can feel confident putting in front of customers is observability and evals.
You need to know what is going on inside your LLM calls, and you need an objective means of judging the output your LLMs produce.
This is exactly what the next few days are about. You’ll hook your app up to Langfuse and get familiar with reading the traces the application produces.
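For reference, Langfuse’s documented Next.js integration with the AI SDK registers an OpenTelemetry exporter in `instrumentation.ts`; the service name below is a placeholder:

```ts
// instrumentation.ts: send AI SDK traces to Langfuse via OpenTelemetry.
// Assumes LANGFUSE_SECRET_KEY, LANGFUSE_PUBLIC_KEY and LANGFUSE_BASEURL are set.
import { registerOTel } from "@vercel/otel";
import { LangfuseExporter } from "langfuse-vercel";

export function register() {
  registerOTel({
    serviceName: "deepsearch", // placeholder service name
    traceExporter: new LangfuseExporter(),
  });
}
```

Each `streamText` or `generateText` call then opts in with `experimental_telemetry: { isEnabled: true }`.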
Once you can see what your LLM is doing, it’s time to test your agent’s inputs and outputs using evals. Evals are the unit tests of the AI application world, and we’ll start by wiring up Evalite, an open-source, Vitest-based runner. You’ll learn what makes good success criteria and build out your evals, including LLM-as-a-Judge and custom datasets specific to your product. We’ll also discuss how you can capture negative feedback into traces and feed it back into your app to make it better.
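To give a flavor of Evalite’s Vitest-style API, a small deterministic eval might look like this; `askDeepSearch` and the link-checking scorer are hypothetical stand-ins for pieces you’ll build:

```ts
// answers.eval.ts: Evalite picks up files matching *.eval.ts.
import { createScorer, evalite } from "evalite";
import { askDeepSearch } from "./agent"; // hypothetical wrapper around our agent

// Deterministic scorer: does the answer cite at least one markdown link?
const includesMarkdownLink = createScorer<string, string>({
  name: "Includes Markdown Link",
  description: "Checks that the answer cites a source as a markdown link.",
  scorer: ({ output }) => (/\[.+\]\(.+\)/.test(output) ? 1 : 0),
});

evalite("DeepSearch cites its sources", {
  data: async () => [
    { input: "What is the latest version of TypeScript?" },
    { input: "Who maintains the Drizzle ORM?" },
  ],
  task: async (input) => askDeepSearch(input),
  scorers: [includesMarkdownLink],
});
```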
Days 06-07: Agent Architecture
Up until this point, your app is driven by an increasingly large prompt that will become unwieldy, and impossible to test and iterate on, as your app’s complexity grows.
We’ll take these next two days to revisit the overall architecture of our application and refactor it to better handle complex multi-step AI processes. The primary idea behind this refactor is called Task Decomposition: you let a smart LLM determine the next step to take based on the current conversation, while leaving room to hand the actual work off to focused or cheaper models.
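A next-action picker built on the AI SDK’s `generateObject` might look roughly like this; the action schema and model choice are assumptions:

```ts
// A sketch of task decomposition: a capable model picks the next step,
// and cheaper, focused code (or models) carries it out.
import { generateObject } from "ai";
import { openai } from "@ai-sdk/openai";
import { z } from "zod";

const nextActionSchema = z.discriminatedUnion("type", [
  z.object({ type: z.literal("search"), query: z.string() }),
  z.object({ type: z.literal("scrape"), urls: z.array(z.string()) }),
  z.object({ type: z.literal("answer") }),
]);

export async function pickNextAction(conversationHistory: string) {
  const { object } = await generateObject({
    model: openai("gpt-4o"), // the "smart" router model
    schema: nextActionSchema,
    prompt: `Given the conversation so far, choose the next step:\n\n${conversationHistory}`,
  });
  return object; // e.g. { type: "search", query: "..." }
}
```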
Days 08-09: Advanced Patterns
In the last two days we will evaluate what our DeepSearch agent really is and how we can further optimize its output. You’ll learn the differences between an “Agent” and a “Workflow”, and see why, for this use case, we’ll lean harder into workflow patterns to build a more reliable product.
In AI-land, this pattern is called the evaluator-optimizer loop: if the agent has enough information, it answers the question presented; if it doesn’t, it searches for more. With this pattern defined, we’ll embrace its design and optimize around it.
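Boiled down to TypeScript, the evaluator-optimizer loop is little more than this; every helper signature here is a hypothetical stand-in for pieces built earlier in the course:

```ts
type Verdict = { canAnswer: boolean; missingInfoQuery: string };

// Hypothetical dependencies, standing in for earlier course pieces.
interface DeepSearchDeps {
  evaluate: (question: string, context: string[]) => Promise<Verdict>;
  searchAndScrape: (query: string) => Promise<string[]>;
  answer: (question: string, context: string[]) => Promise<string>;
}

const MAX_STEPS = 10; // guardrail against endless searching

export async function runDeepSearch(
  question: string,
  deps: DeepSearchDeps,
): Promise<string> {
  const context: string[] = [];
  for (let step = 0; step < MAX_STEPS; step++) {
    // Evaluator: do we have enough information to answer?
    const verdict = await deps.evaluate(question, context);
    if (verdict.canAnswer) break;
    // Optimizer: gather more information and loop again.
    context.push(...(await deps.searchAndScrape(verdict.missingInfoQuery)));
  }
  return deps.answer(question, context);
}
```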
By the end of this cohort you will be confident building AI applications that are reliable and that improve through iteration and user feedback. LLMs and the whole AI field are changing rapidly, so understanding these fundamentals will give you a foundation for building applications for years to come.
Contents
Before We Start Building DeepSearch In TypeScript
- What Are We Building?
- Installation Instructions (Don’t Skip This!)
- Explore The Repo
- Setting Up Postgres
- Using Drizzle & Drizzle Studio
- Setting Up Redis
- FAQs
Day 1: Build A Naive Agent
- Introduction
- Choose An LLM
- Our First Model Call
- Set Up Discord Authentication
- Create A Naive Agent With Serper
- Showing Tool Calls In The Frontend
- Search Grounding (optional)
- Rate Limiting (optional)
- Rate Limiting Anonymous Users (optional)
- Connecting Our App To MCP Servers (optional)
Day 2: Persistence
- Create Database Resources For Persisting Messages
- Persist Chats To The Database
- Creating New Chats In The Frontend
- Showing The Saved Chats In The Frontend
- Fixing The ‘New Chat’ Button (optional)
- Adding ‘use-scroll-to-bottom’ (optional)
Day 3: Debug and Improve your Agent through Observability
- Choosing An Observability Platform
- Integrating Langfuse
- Passing Extra Metadata To Langfuse
- Adding A Scraper
- Making The LLM Date-Aware (optional)
- Improving Our Crawler (optional)
- Reporting DB Calls To Langfuse (optional)
Day 4: Vibe-check your AI App Through Evals with Evalite
- Initializing Evalite
- Choosing Our Success Criteria
- Making Our System Testable
- Our First Deterministic Eval
- Adding A Global Rate Limiter (optional)
- Optimizing Our Prompt (optional)
Day 5: Expand your Evals with LLM-as-a-Judge and Datasets
- The Data Flywheel
- Our First LLM-As-A-Judge Eval
- Create A Simple Dataset
- Organizing Our Dataset Into Dev, CI and Regression (optional)
- Assessing Answer Relevancy (optional)
- Extracting The Parameters Of Our System (optional)
Day 6: Agent Architecture through Task Decomposition
- What’s Wrong With Our Current Approach?
- Designing Our New System Prompt
- Creating a Next Action Picker
- Implementing The Loop
- Connecting Our Loop To The Frontend
- Optimize Our Answering System Prompt With Exemplars (optional)
- Smoothing Our Streaming (optional)
Day 7: Improved App UX and Persistence with Agent Task Decomposition
- Showing The Steps Taken in The Frontend
- Fixing Telemetry
- Passing The Message History
- Persisting Our New Setup To The Backend
- Generating Chat Titles (optional)
- Adding Geolocation Info To The System Prompt (optional)
Day 8: Agents vs Workflows
- Agents vs Workflows
- Collapse Search And Crawl Into One Tool
- Search, Scrape, Summarize
- Making A Query Rewriter
- Use A Combined Search/Scrape API Instead (optional)
- Resumable Streams (optional)
Day 9: Advanced Patterns
- Building An Evaluator
- Showing Sources In The Frontend
- Implementing Guardrails (optional)
- Implement An Ask Clarifying Questions Step (optional)
- Showing Usage In The Frontend (optional)
- Migrating to AI SDK v5 (optional)