How to Build an AI Agent – Part 2: Building an MVP

Part 2 of How to Build an AI Agent series with GMI Cloud

In Part 1, we laid the groundwork for our AI agent by defining the vision, use case, and success criteria. Now, it’s time to bring that vision to life. In this second installment, we’ll walk through the step-by-step process of building a minimum viable product (MVP) AI agent that helps users decide which industry conventions are worth attending based on relevance, cost, and potential ROI.

Let’s dive straight into the build.

1. How Would It Work?

Before building any functionality, we need a solid foundation for development. This includes selecting the right tools, configuring our environment, and setting up a workflow that enables rapid iteration.

Now, what about design? Here's the overall workflow from input to expected output:

Workflow diagram

We break each step down into individual modules so we can rearrange things later if needed. This modular framework gives us flexibility: different parts of the system—like content retrieval or return on investment (ROI) scoring—can evolve independently without breaking the rest of the system.

Or so we hope!

What you'll need:

  • A Large Language Model (LLM) endpoint
  • Dify.ai account (Don't worry, they're free!) for managing the AI workflow
  • Firecrawl — for scraping websites into LLM-ready data. We'll use their free plan.

Get your endpoint

For our purposes, we'll use a GMI-hosted LLM endpoint with OpenAI API compatibility, but any endpoint with a similar configuration will work. In Dify, simply go to Settings → Model Provider → add OpenAI-API-compatible.
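If you want to sanity-check the endpoint outside Dify before wiring it in, a minimal sketch looks like this. The base URL, API key, and model name below are placeholders, not real values—substitute your own:

```python
import requests

# Placeholders -- substitute your endpoint's base URL, API key, and model name
API_BASE = "https://your-gmi-endpoint.example.com/v1"
API_KEY = "YOUR_API_KEY"
MODEL = "your-model-name"

def build_chat_request(prompt: str) -> dict:
    """Build an OpenAI-compatible chat-completions payload."""
    return {
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.2,
    }

def chat(prompt: str) -> str:
    """Send the prompt to the endpoint and return the model's reply."""
    resp = requests.post(
        f"{API_BASE}/chat/completions",
        headers={"Authorization": f"Bearer {API_KEY}"},
        json=build_chat_request(prompt),
        timeout=60,
    )
    resp.raise_for_status()
    return resp.json()["choices"][0]["message"]["content"]
```

If this returns a sensible reply, the same credentials will work when you register the provider in Dify.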

Create a Dify project with Chatflow

Go to https://cloud.dify.ai/apps, make sure you're logged in, and you can begin to create a project. For our purposes, we'll use Chatflow.

2. Data Input

We have our flow; now we assemble each individual module. First, let's look at what will be treated as the input, because as we know: garbage in = garbage out!

Our AI agent needs to figure out which industry events are worth sending our teams to, so we're going to need to give it a way to search, extract, and aggregate that information in a format that's ready for the underlying LLM to process. 

It's time to build. 

Add your input parameters:

Search through the API

We'll add an iterator over our search results to gather source context. This step matters because search engines return many results sorted by relevance; event websites and aggregators tend to rank highest, and the iterator's job is to extract the detailed information about these events.

This is where Firecrawl comes in for scraping.
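Inside Dify this is a Firecrawl tool node, but the underlying call is a simple REST request. A rough sketch of what each iteration does, based on Firecrawl's v1 scrape API (check their docs for the current request shape; the API key is a placeholder):

```python
import requests

FIRECRAWL_KEY = "YOUR_FIRECRAWL_API_KEY"  # placeholder

def build_scrape_request(url: str) -> dict:
    """Request body asking Firecrawl for LLM-ready Markdown."""
    return {"url": url, "formats": ["markdown"]}

def scrape_markdown(url: str) -> str:
    """Scrape a single event page into Markdown via Firecrawl's API."""
    resp = requests.post(
        "https://api.firecrawl.dev/v1/scrape",
        headers={"Authorization": f"Bearer {FIRECRAWL_KEY}"},
        json=build_scrape_request(url),
        timeout=120,
    )
    resp.raise_for_status()
    return resp.json()["data"]["markdown"]
```

The Markdown string returned here is what we hand to the LLM in the next step.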

Add GMI LLM

User prompt: 

**Instructions:**
You are provided with Markdown content containing information about one or more events.  Your task is to extract the event details and return them in a list of JSON objects. Each JSON object should represent a single event and conform to the following schema:

```json
{
 "title": "string",
 "start_date": "string",
 "location": "string",
 "topic/focus": "string",
 "audience": "string",
 "summary": "string"
}
```

Important Considerations:

  • **Strict Adherence to Schema:** Ensure that the output is a valid JSON array containing JSON objects that strictly adhere to the provided schema. Do not include any extra text or explanations outside the JSON array.
  • **Handle Missing Information:** If the Markdown content does not provide information for a particular field in the JSON schema, set the value of that field to "N/A".
  • **Multiple Events:** If the Markdown describes multiple events, return a JSON array containing one JSON object for each event.
  • **Markdown Variations:** Be prepared to handle variations in how event information might be presented in the Markdown. Look for keywords like "Date:", "Time:", "Location:", "Topic:", "Audience:", "Summary:", etc., but also be able to infer information from surrounding text.
  • **Data Extraction:** Extract the most relevant information for each field. For example, for the "start_date" field, extract the start date of the event in the format YYYY-MM-DD. For the "summary" field, provide a concise summary of the event.
  • **JSON Output Only:** The only output you should provide is the JSON array. Do not include any introductory or concluding remarks.
Markdown Content:
<Scrape text>

At this point, you need to parse the LLM output into JSON. Converting the output into a structured format lets the downstream steps process the results easily and accurately.

Here's the code we used:

```python
import json
import re
from typing import List

def main(arg1: str) -> dict:
    # Find every JSON-array block ("[ { ... } ]") in the LLM output
    json_blocks = re.findall(r'(\[\s*\{.*?\}\s*\])', arg1, re.DOTALL)
    all_events: List[dict] = []
    for block in json_blocks:
        try:
            # Unescape sequences the LLM may have emitted (e.g. \" or \n)
            parsed_str = block.encode('utf-8').decode('unicode_escape')
            parsed_json = json.loads(parsed_str)

            # If the parsed JSON is a list, extend the result
            if isinstance(parsed_json, list):
                all_events.extend(parsed_json)
            # If it's a single object, append it
            elif isinstance(parsed_json, dict):
                all_events.append(parsed_json)
        except (json.JSONDecodeError, UnicodeDecodeError):
            continue

    return {
        "events": all_events
    }
```

What next? Well, we now have input data, but it needs to be cleaned and sorted before the AI agent begins processing. So it's time to aggregate the structured events from different search results, deduplicate, and sort by date.

Here's the code we used:

```python
from datetime import datetime

def main(arg1: list) -> dict:
    # Deduplicate by title (the last occurrence wins)
    unique_events = {event["title"]: event for event in arg1}.values()

    def sort_key(event):
        try:
            return datetime.strptime(event["start_date"], "%Y-%m-%d")
        except ValueError:
            return datetime.max  # Push invalid or "N/A" dates to the end

    sorted_events = sorted(unique_events, key=sort_key)
    return {
        "events": sorted_events,
    }
```

And that's the data input step! So far, we have created a workflow pattern:

  • Collected data using Firecrawl
  • Added context to said data
  • Parsed the data into a usable format
  • Aggregated, deduplicated, and sorted the final parsed data

Next, it's time to teach our agent to search through this data for the information we need.

3. Search and Use Data

We have now built a simple pattern (compose web query → web search → parse output with LLM), which we can leverage to search for two key factors when determining ROI: ticket price and hotel price.

Note: for a more accurate or advanced search (e.g., ticket price with discounts), we can extend this step to query a separate agent built specifically for that purpose.
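As an illustration, the query-composition step for these two cost factors might look like the sketch below. The field names follow the event schema from the data-input step; the exact query wording is just a starting point, not a tuned formula:

```python
def build_cost_queries(event: dict) -> dict:
    """Compose one web-search query per cost factor for an event."""
    title = event.get("title", "N/A")
    location = event.get("location", "N/A")
    return {
        "ticket_price": f"{title} registration ticket price",
        "hotel_price": f"hotel nightly rate near {location}",
    }

# Example: feed each query into the same web-search → LLM-parse pattern
queries = build_cost_queries({"title": "AI Expo 2025", "location": "Austin, TX"})
```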

Now, we aggregate the results and output.

4. Testing & Iteration

No MVP is complete without validation. We ran multiple rounds of testing to ensure the agent delivered value.

Validating Data Accuracy

We checked:

  • Are the correct events being pulled?
  • Is cost data accurate and reasonable?
  • Are irrelevant events being filtered out?

User Feedback & Refinement

We shared the MVP with internal stakeholders to gather early feedback. Based on this, we:

  • Improved the clarity of summaries
  • Tweaked the ROI scoring formula
  • Enhanced filtering for niche topics

Performance Optimizations

We also:

  • Reduced scraping latency by caching results
  • Improved LLM parsing reliability with better prompts
  • Added thresholds to minimize false positives
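The caching point is worth a sketch. A simple in-memory memo keyed by URL avoids re-scraping pages the agent has already seen within a run; a production version would add expiry and persistence:

```python
# In-memory cache keyed by URL; cleared when the process restarts
_scrape_cache: dict = {}

def cached_scrape(url: str, scrape_fn) -> str:
    """Return cached scrape output for a URL, calling scrape_fn only on a miss."""
    if url not in _scrape_cache:
        _scrape_cache[url] = scrape_fn(url)
    return _scrape_cache[url]
```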

5. Deployment & Next Steps

The final stage was getting the MVP into a usable state for broader testing.

Hosting Considerations

We tested two deployment strategies:

  • Local Docker deployment for fast iteration and isolated testing
  • Cloud deployment on GMI Cloud infrastructure for scalability and remote access

Scaling the MVP

With the core working, future iterations can:

  • Add more event data sources (RSS feeds, curated newsletters, LinkedIn events)
  • Improve the ROI model using fine-tuned ML models or embeddings
  • Expand cost estimation by integrating real-time flight and hotel APIs

Future Enhancements

Looking ahead, potential next features include:

  • Personalized event scoring based on individual goals or preferences
  • Team-based recommendations (e.g., “Is this worth sending our BD team?”)
  • Expansion to other domains like product launches or networking meetups

Final Thoughts

This AI agent started as a simple idea—helping busy professionals decide which events are worth attending. Through methodical development, iterative testing, and the right tooling, we brought that idea to life with a lean, flexible MVP.

In Part 3, we’ll explore how to evolve this MVP into a production-ready system, optimize for performance, and integrate it more deeply into business decision-making workflows.

Until then—build fast, stay curious, and let the AI do the heavy lifting.

By the way, have you seen our Building an AI-Powered Voice Translator guide? It uses only open-source tooling!

Build AI Without Limits
GMI Cloud helps you architect, deploy, optimize, and scale your AI strategies.
Get Started Now