In Part 1, we laid the groundwork for our AI agent by defining the vision, use case, and success criteria. Now, it’s time to bring that vision to life. In this second installment, we’ll walk through the step-by-step process of building a minimum viable product (MVP) AI agent that helps users decide which industry conventions are worth attending based on relevance, cost, and potential ROI.
Let’s dive straight into the build.
1. How Would It Work?
Before building any functionality, we need a solid foundation for development. This includes selecting the right tools, configuring our environment, and setting up a workflow that enables rapid iteration.
Now, what about design? Here's the overall workflow from input to expected output:

We break each step into individual modules so we can rearrange things later if needed. This modular approach keeps the system flexible: parts like content retrieval or return on investment (ROI) scoring can evolve independently without breaking the rest of the system.
Or so we hope!
What you'll need:
- A Large Language Model (LLM) endpoint
- Dify.ai account (don't worry, it's free!) for managing the AI workflow
- Firecrawl — for scraping websites into LLM-ready data. We'll use their free plan.
Get your endpoint
For our purposes, we'll use a GMI-hosted LLM endpoint with OpenAI API compatibility, but you can do this with any endpoint that exposes a similar configuration. In Dify, simply go to Settings → Model Provider → add OpenAI-API-compatible.
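Under the hood, any OpenAI-API-compatible provider expects the standard /chat/completions request shape. Here's a minimal sketch of building that request body in Python; the model name below is a placeholder, not a real GMI value:

```python
def build_chat_request(prompt: str, model: str = "your-model-name") -> dict:
    """Build an OpenAI-compatible /chat/completions request body."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        # A low temperature keeps structured-extraction output stable
        "temperature": 0.2,
    }

body = build_chat_request("List three AI industry conventions happening in 2025.")
```

POST this body to your endpoint's /chat/completions route with your API key in the Authorization header, exactly as you would with OpenAI's API.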

Create a Dify project with Chatflow
Go to https://cloud.dify.ai/apps, make sure you're logged in, and you can begin to create a project. For our purposes, we'll use Chatflow.

2. Data Input
We have our flow, now we assemble each individual module. First, let's look at what's going to be treated as the input because as we know: garbage in = garbage out!
Our AI agent needs to figure out which industry events are worth sending our teams to, so we're going to need to give it a way to search, extract, and aggregate that information in a format that's ready for the underlying LLM to process.
It's time to build.
Add your input parameters:

Search through the API


We'll add an iterator over our search results to capture source context. This step matters because search engines return many results sorted by relevance: event websites and aggregators tend to rank near the top, and the iterator's job is to extract the detailed event information from those pages.
This is where Firecrawl comes in for scraping.
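To make the triage step concrete, here's a rough sketch of how the iterator might rank results before scraping. The result fields and keyword list are assumptions for illustration, not Dify's actual search output format:

```python
# Keywords that suggest a result is an event page or aggregator (assumption)
EVENT_HINTS = ("conference", "summit", "expo", "events", "eventbrite", "meetup")

def rank_results(results):
    """Sort search results so event-looking URLs and titles come first."""
    def score(result):
        text = (result.get("url", "") + " " + result.get("title", "")).lower()
        return sum(hint in text for hint in EVENT_HINTS)
    return sorted(results, key=score, reverse=True)

results = [
    {"url": "https://example.com/blog", "title": "Our company blog"},
    {"url": "https://example.com/ai-summit", "title": "AI Summit 2025 Expo"},
]
top = rank_results(results)[0]  # the summit page ranks first
```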

Add GMI LLM
User prompt:
**Instructions:**
You are provided with Markdown content containing information about one or more events. Your task is to extract the event details and return them in a list of JSON objects. Each JSON object should represent a single event and conform to the following schema:
```json
{
"title": "string",
"start_date": "string",
"location": "string",
"topic/focus": "string",
"audience": "string",
"summary": "string"
}
```
Important Considerations:
Strict Adherence to Schema: Ensure that the output is a valid JSON array containing JSON objects that strictly adhere to the provided schema. Do not include any extra text or explanations outside the JSON array.
Handle Missing Information: If the Markdown content does not provide information for a particular field in the JSON schema, set the value of that field to "N/A".
Multiple Events: If the Markdown describes multiple events, return a JSON array containing one JSON object for each event.
Markdown Variations: Be prepared to handle variations in how event information might be presented in the Markdown. Look for keywords like "Date:", "Time:", "Location:", "Topic:", "Audience:", "Summary:", etc., but also be able to infer information from surrounding text.
Data Extraction: Extract the most relevant information for each field. For example, for the "start_date" field, extract the start date of the event in YYYY-MM-DD format. For the "summary" field, provide a concise summary of the event.
JSON Output Only: The only output you should provide is the JSON array. Do not include any introductory or concluding remarks.
Markdown Content:
<Scrape text>
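Even with these instructions, models occasionally drop a field. A small normalizer downstream can enforce the schema's "N/A" default; this is an optional safety net we're sketching here, not part of the prompt itself:

```python
# Field names taken from the JSON schema above
REQUIRED_FIELDS = ("title", "start_date", "location", "topic/focus", "audience", "summary")

def normalize_event(event: dict) -> dict:
    """Return a copy of the event with every schema field present, defaulting to 'N/A'."""
    return {field: event.get(field, "N/A") for field in REQUIRED_FIELDS}

event = normalize_event({"title": "AI Summit 2025"})  # missing fields become "N/A"
```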

At this point, you need to parse the LLM output into JSON. Converting to a structured format lets the downstream steps process the results easily and accurately.

Here's the code we used:
```python
import json
import re
from typing import List

def main(arg1: str) -> dict:
    # Find every JSON array block embedded in the LLM output
    json_blocks = re.findall(r'(\[\s*\{.*?\}\s*\])', arg1, re.DOTALL)
    all_events: List[dict] = []
    for block in json_blocks:
        try:
            # Unescape any doubly-escaped characters before parsing
            parsed_str = block.encode('utf-8').decode('unicode_escape')
            parsed_json = json.loads(parsed_str)
            # If the parsed JSON is a list, extend the result
            if isinstance(parsed_json, list):
                all_events.extend(parsed_json)
            # If it's a single object, append it
            elif isinstance(parsed_json, dict):
                all_events.append(parsed_json)
        except json.JSONDecodeError:
            continue
    return {
        "events": all_events
    }
```
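Before wiring this into the flow, it's worth sanity-checking the regex-plus-json.loads approach on a sample response (the sample text below is made up):

```python
import json
import re

sample = 'Sure! Here are the events:\n[{"title": "AI Expo", "start_date": "2025-06-01"}]'
blocks = re.findall(r'(\[\s*\{.*?\}\s*\])', sample, re.DOTALL)
events = json.loads(blocks[0])  # a list with one event dict
```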
What next? Well, we now have input data, but it needs to be cleaned and sorted before the AI agent begins processing. So it's time to aggregate the structured events from different search results, deduplicate, and sort by date.

Here's the code we used:
```python
from datetime import datetime

def main(arg1: list) -> dict:
    # Deduplicate by title (later entries overwrite earlier ones)
    unique_events = {event["title"]: event for event in arg1}.values()

    def sort_key(event):
        try:
            return datetime.strptime(event["start_date"], "%Y-%m-%d")
        except ValueError:
            return datetime.max  # Push invalid dates (e.g. "N/A") to the end
    sorted_events = sorted(unique_events, key=sort_key)
    return {
        "events": sorted_events,
    }
```
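A quick check of the dedupe-and-sort behavior on made-up events shows duplicates collapsing by title and valid dates sorting ahead of invalid ones:

```python
from datetime import datetime

events = [
    {"title": "AI Expo", "start_date": "2025-06-01"},
    {"title": "AI Expo", "start_date": "2025-06-01"},    # duplicate, collapsed by title
    {"title": "Dev Summit", "start_date": "2025-03-15"},
    {"title": "Mystery Meetup", "start_date": "N/A"},    # invalid date, pushed to the end
]
unique = list({e["title"]: e for e in events}.values())

def sort_key(e):
    try:
        return datetime.strptime(e["start_date"], "%Y-%m-%d")
    except ValueError:
        return datetime.max

ordered = [e["title"] for e in sorted(unique, key=sort_key)]
# ordered == ['Dev Summit', 'AI Expo', 'Mystery Meetup']
```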
And that's the data input step! So far, we have created a workflow pattern:
- Collected data using Firecrawl
- Added context to said data
- Parsed the data into a usable format
- Aggregated, deduplicated, and sorted the final parsed data
Next, it's time to teach our agent to search this data for the factors we care about.
3. Search and Use Data
We have now built a simple pattern (compose web query → web search → parse output with LLM), which we can leverage to search for two key factors when determining ROI: ticket price and hotel price.
Note: for a more accurate or advanced search (e.g., ticket price with discounts applied), we can extend this step to query a separate agent built specifically for that purpose.
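As one illustration of how ticket and hotel prices could feed a score, here's a hypothetical relevance-per-dollar formula; the shape and weights are invented for this sketch, not the formula we shipped:

```python
def roi_score(relevance: float, ticket_price: float, hotel_per_night: float, nights: int = 2) -> float:
    """Hypothetical score: relevance (0-1) per $1,000 of estimated total cost."""
    total_cost = ticket_price + hotel_per_night * nights
    return round(relevance / (total_cost / 1000), 2)

score = roi_score(relevance=0.9, ticket_price=500, hotel_per_night=250)  # 0.9
```

A higher score means more relevance per dollar spent, which gives the agent a single number to rank events by.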

Now, we aggregate the results and output.

4. Testing & Iteration
No MVP is complete without validation. We ran multiple rounds of testing to ensure the agent delivered value.
Validating Data Accuracy
We checked:
- Are the correct events being pulled?
- Is cost data accurate and reasonable?
- Are irrelevant events being filtered out?
User Feedback & Refinement
We shared the MVP with internal stakeholders to gather early feedback. Based on this, we:
- Improved the clarity of summaries
- Tweaked the ROI scoring formula
- Enhanced filtering for niche topics
Performance Optimizations
We also:
- Reduced scraping latency by caching results
- Improved LLM parsing reliability with better prompts
- Added thresholds to minimize false positives
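The caching mentioned above can be as simple as memoizing the scrape call per URL. In this sketch, `fetch_markdown` is a stand-in for the real Firecrawl request:

```python
from functools import lru_cache

@lru_cache(maxsize=128)
def fetch_markdown(url: str) -> str:
    # Stand-in for the real Firecrawl scrape; the network hit happens once per URL
    return f"markdown for {url}"

fetch_markdown("https://example.com/events")
fetch_markdown("https://example.com/events")  # second call is served from cache
hits = fetch_markdown.cache_info().hits  # 1
```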
5. Deployment & Next Steps
The final stage was getting the MVP into a usable state for broader testing.
Hosting Considerations
We tested two deployment strategies:
- Local Docker deployment for fast iteration and isolated testing
- Cloud deployment on GMI Cloud infrastructure for scalability and remote access
Scaling the MVP
With the core working, future iterations can:
- Add more event data sources (RSS feeds, curated newsletters, LinkedIn events)
- Improve the ROI model using fine-tuned ML models or embeddings
- Expand cost estimation by integrating real-time flight and hotel APIs
Future Enhancements
Looking ahead, potential next features include:
- Personalized event scoring based on individual goals or preferences
- Team-based recommendations (e.g., “Is this worth sending our BD team?”)
- Expansion to other domains like product launches or networking meetups
Final Thoughts
This AI agent started as a simple idea—helping busy professionals decide which events are worth attending. Through methodical development, iterative testing, and the right tooling, we brought that idea to life with a lean, flexible MVP.
In Part 3, we’ll explore how to evolve this MVP into a production-ready system, optimize for performance, and integrate it more deeply into business decision-making workflows.
Until then—build fast, stay curious, and let the AI do the heavy lifting.
By the way, have you seen our Building an AI-Powered Voice Translator guide? It uses only open-source tooling!
Build AI Without Limits!


