Part 2 of How to Build an AI Agent series with GMI Cloud
In Part 1, we laid the groundwork for our AI agent by defining the vision, use case, and success criteria. Now, it’s time to bring that vision to life. In this second installment, we’ll walk through the step-by-step process of building a minimum viable product (MVP) AI agent that helps users decide which industry conventions are worth attending based on relevance, cost, and potential ROI.
Let’s dive straight into the build.
Before building any functionality, we need a solid foundation for development. This includes selecting the right tools, configuring our environment, and setting up a workflow that enables rapid iteration.
Now, what about design? Here's the overall workflow from input to expected output:
We break each step into its own module so we can rearrange things later if needed. This modular framework gives us flexibility: different parts of the system, like content retrieval or return on investment (ROI) scoring, can evolve independently without breaking the rest.
Or so we hope!
For our purposes, we'll use a GMI-hosted LLM endpoint with OpenAI API compatibility, but any endpoint with a similar configuration will work. In our case, simply go to Settings → "Model Provider" → add OpenAI-API-compatible.
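To sanity-check the provider configuration, you can hit the endpoint directly before wiring it into Dify. This is a minimal sketch using only Python's standard library; the base URL, API key, and model name are placeholders you'd replace with your own GMI Cloud values:

```python
import json
from urllib import request

# Placeholder values -- substitute your own GMI Cloud endpoint details
BASE_URL = "https://your-gmi-endpoint.example.com/v1"
API_KEY = "YOUR_API_KEY"
MODEL = "your-model-name"


def build_chat_request(base_url: str, model: str, user_msg: str):
    """Assemble an OpenAI-compatible chat completion request."""
    url = f"{base_url}/chat/completions"
    headers = {
        "Authorization": f"Bearer {API_KEY}",
        "Content-Type": "application/json",
    }
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": user_msg}],
    }
    return url, headers, payload


url, headers, payload = build_chat_request(BASE_URL, MODEL, "Say hello")
# Uncomment to actually send the request:
# req = request.Request(url, data=json.dumps(payload).encode(),
#                       headers=headers, method="POST")
# print(request.urlopen(req).read().decode())
```

If the endpoint responds with a normal chat completion here, the same credentials and base URL should work when entered into Dify's model provider settings.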
Go to https://cloud.dify.ai/apps, make sure you're logged in, and create a new project. For our purposes, we'll use Chatflow.
We have our flow; now we assemble each individual module. First, let's look at what will be treated as the input, because as we know: garbage in = garbage out!
Our AI agent needs to figure out which industry events are worth sending our teams to, so we're going to need to give it a way to search, extract, and aggregate that information in a format that's ready for the underlying LLM to process.
It's time to build.
Add your input parameters:
Search through the API
We'll add an iterator over our search results to understand source context. This step matters because search engines return many results sorted by relevance; event websites and aggregators tend to rank near the top, and the iterator's job is to visit each result and extract the detailed information about these events.
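As a rough sketch of what the iterator is doing conceptually, here's a small filter over ranked results. The result shape (`title`/`url` dicts) and the skip-list of domains are hypothetical; adapt them to whatever schema your search tool actually returns:

```python
def top_event_urls(results: list, limit: int = 5) -> list:
    """Pick the top-ranked result URLs, skipping obvious non-event pages.

    `results` is assumed to be a relevance-sorted list of
    {"title": ..., "url": ...} dicts -- a hypothetical shape.
    """
    skip = ("youtube.com", "reddit.com")  # domains unlikely to hold event details
    urls = []
    for r in results:
        if any(s in r["url"] for s in skip):
            continue
        urls.append(r["url"])
        if len(urls) == limit:
            break
    return urls


sample = [
    {"title": "AI Expo 2025", "url": "https://aiexpo.example.com"},
    {"title": "Conference clip", "url": "https://youtube.com/watch?v=x"},
    {"title": "Tech events list", "url": "https://events.example.org/tech"},
]
urls = top_event_urls(sample)  # the YouTube link is filtered out
```

Each surviving URL then gets scraped in the next step.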
This is where Firecrawl comes in for scraping.
Add GMI LLM
User prompt:
**Instructions:**
You are provided with Markdown content containing information about one or more events. Your task is to extract the event details and return them in a list of JSON objects. Each JSON object should represent a single event and conform to the following schema:
```json
{
"title": "string",
"start_date": "string",
"location": "string",
"topic/focus": "string",
"audience": "string",
"summary": "string"
}
```
Important Considerations:
Strict Adherence to Schema: Ensure that the output is a valid JSON array containing JSON objects that strictly adhere to the provided schema. Do not include any extra text or explanations outside the JSON array.
Handle Missing Information: If the Markdown content does not provide information for a particular field in the JSON schema, set the value of that field to "N/A".
Multiple Events: If the Markdown describes multiple events, return a JSON array containing one JSON object for each event.
Markdown Variations: Be prepared to handle variations in how event information might be presented in the Markdown. Look for keywords like "Date:", "Time:", "Location:", "Topic:", "Audience:", "Summary:", etc., but also be able to infer information from surrounding text.
Data Extraction: Extract the most relevant information for each field. For example, for the "start_date" field, extract the start date of the event in YYYY-MM-DD format. For the "summary" field, provide a concise summary of the event.
JSON Output Only: The only output you should provide is the JSON array. Do not include any introductory or concluding remarks.
Markdown Content:
<Scrape text>
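For illustration, given scraped Markdown describing a single conference, the model's entire response should be a JSON array like the one below (the event details here are made up):

```json
[
  {
    "title": "AI Infrastructure Summit",
    "start_date": "2025-09-14",
    "location": "San Francisco, CA",
    "topic/focus": "AI infrastructure and GPU cloud",
    "audience": "ML engineers and platform teams",
    "summary": "A two-day summit on scaling AI workloads."
  }
]
```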
At this point, you need to parse the LLM output into JSON. Converting to a structured format lets the downstream steps process the results easily and accurately.
Here's the code we used:
```python
import json
import re
from typing import List


def main(arg1: str) -> dict:
    # Find every JSON array of objects embedded in the LLM output
    json_blocks = re.findall(r'(\[\s*\{.*?\}\s*\])', arg1, re.DOTALL)
    all_events: List[dict] = []
    for block in json_blocks:
        try:
            # Unescape sequences like \" or \n that the LLM may emit literally
            parsed_str = block.encode('utf-8').decode('unicode_escape')
            parsed_json = json.loads(parsed_str)
            # If the parsed JSON is a list, extend the result
            if isinstance(parsed_json, list):
                all_events.extend(parsed_json)
            # If it's a single object, append it
            elif isinstance(parsed_json, dict):
                all_events.append(parsed_json)
        except json.JSONDecodeError:
            continue
    return {
        "events": all_events
    }
```
What next? Well, we now have input data, but it needs to be cleaned and sorted before the AI agent begins processing. So it's time to aggregate the structured events from different search results, deduplicate, and sort by date.
Here's the code we used:
```python
from datetime import datetime


def main(arg1: list) -> dict:
    # Deduplicate by title: later entries overwrite earlier duplicates
    unique_events = {event["title"]: event for event in arg1}.values()

    def sort_key(event):
        try:
            return datetime.strptime(event["start_date"], "%Y-%m-%d")
        except ValueError:
            # Push events with missing or malformed dates ("N/A") to the end
            return datetime.max

    sorted_events = sorted(unique_events, key=sort_key)
    return {
        "events": sorted_events,
    }
```
And that's the data input step! So far, we have created a workflow pattern:
Next, it's time to teach our agent to search this data for the signals we care about.
We have now built a simple pattern (compose web query → web search → parse output with LLM), which we can reuse to search for two key factors when determining ROI: ticket price and hotel price.
Note: for a more accurate or advanced search (e.g., ticket price with a discount), we can extend this step to query a separate agent built specifically for that purpose.
Now, we aggregate the results and output.
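As a sketch of how the aggregated fields might be combined into a single comparable number, here's an entirely illustrative scoring heuristic. The field names (`ticket_price`, `hotel_price`), the relevance input, and the budget cap are all assumptions, not part of the Dify workflow itself:

```python
def roi_score(event: dict, relevance: float, budget: float = 3000.0) -> float:
    """Hypothetical ROI heuristic: relevance (0-1) scaled against total cost.

    `ticket_price` and `hotel_price` are assumed to have been filled in
    by the price-search steps; missing prices fall back to 0.
    """
    ticket = float(event.get("ticket_price") or 0)
    hotel = float(event.get("hotel_price") or 0)
    cost = ticket + hotel
    if cost > budget:
        return 0.0  # over budget: not worth attending
    # Higher relevance and lower cost both raise the score
    return relevance * (1 - cost / budget)


event = {"title": "AI Expo", "ticket_price": 500, "hotel_price": 900}
score = roi_score(event, relevance=0.8)
```

In practice you'd let the LLM weigh these factors with business context rather than a fixed formula, but a numeric score like this is useful for sorting the final output.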
No MVP is complete without validation. We ran multiple rounds of testing to ensure the agent delivered value.
We checked:
We shared the MVP with internal stakeholders to gather early feedback. Based on this, we:
We also:
The final stage was getting the MVP into a usable state for broader testing.
We tested two deployment strategies:
With the core working, future iterations can:
Looking ahead, potential next features include:
This AI agent started as a simple idea—helping busy professionals decide which events are worth attending. Through methodical development, iterative testing, and the right tooling, we brought that idea to life with a lean, flexible MVP.
In Part 3, we’ll explore how to evolve this MVP into a production-ready system, optimize for performance, and integrate it more deeply into business decision-making workflows.
Until then—build fast, stay curious, and let the AI do the heavy lifting.
By the way, have you seen our Building an AI-Powered Voice Translator guide? It uses only open-source tooling!
Build AI Without Limits!
Give GMI Cloud a try and see for yourself if it's a good fit for your AI needs.