Evaluating Agents in Real Time

· 5 min read
Nicholas Goh
AI Full Stack Engineer

Demo

This blog is built on top of the FastAPI MCP LangGraph Template.

📈 GitHub Stars Over Time

Star History Chart

Real Time Evaluation

  • Math Agent does not adhere to the math topic on the 1st human query (topic_adherence=0)
    • The query about Taylor Swift is not math related
  • Math Agent adheres to the math topic on the 2nd human query (topic_adherence=0.5)
    • The query "what is 1+1" is math related
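
To make the two scores above concrete, here is a minimal sketch of how a running score like this could be computed. The post does not say how topic_adherence is calculated, so this assumes it is simply the fraction of human queries so far that are on topic, which happens to match the 0 and 0.5 shown above. The `MATH_KEYWORDS` list and the `is_on_topic` helper are hypothetical stand-ins for whatever classifier (most likely an LLM judge) the real evaluation uses, and the Taylor Swift query text is made up for illustration.

```python
# Hypothetical keyword list standing in for a real (likely LLM-based) topic classifier.
MATH_KEYWORDS = ("+", "*", "=", "sum", "integral", "equation", "solve")


def is_on_topic(query: str) -> bool:
    """Return True if the query looks math related (naive keyword match)."""
    lowered = query.lower()
    return any(keyword in lowered for keyword in MATH_KEYWORDS)


def topic_adherence(queries: list[str]) -> float:
    """Assumed definition: fraction of human queries so far that are on topic."""
    if not queries:
        return 0.0
    on_topic = sum(is_on_topic(query) for query in queries)
    return on_topic / len(queries)


# Re-score the conversation after every human query, as a real-time evaluator would.
history: list[str] = []
scores: list[float] = []
for query in ["Tell me about Taylor Swift", "What is 1+1?"]:
    history.append(query)
    scores.append(topic_adherence(history))

print(scores)
```

Scoring the whole history after each turn, rather than each query in isolation, is what lets an earlier off-topic query keep pulling the second score down.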

ETL Automation

· 8 min read
Nicholas Goh
AI Full Stack Engineer

Demo

Check out the demo before I dive into the blog!

Introduction

In this blog, I explore how automation simplifies problem-solving by testing an AI agent's ability to break a task into subproblems. A single agent was enough for this use case, but more complex problems require multi-agent orchestration, which I cover in the next blog.

Agentic RAG

· 16 min read
Nicholas Goh
AI Full Stack Engineer

Demo

Feel free to check out the demo here before I dive into the blog!

Introduction

It has been a while since I last shared my thoughts on designing and building an Agentic RAG system in my previous Medium blog post. In this post, I first discuss the inspiration behind building the system, then delve into the challenges I encountered and the hurdles I expect in the future.