AI-Built SaaS DiaryMay 29, 2026· 5 min read

What the AI Agents Got Right — and What They Messed Up

An honest scorecard on building a SaaS with AI coding agents: where they genuinely sped me up, where they quietly got it wrong, and the bugs I only caught by reading every line.

People keep asking me the same thing: did the AI agents actually build Metavulus, or is that just marketing? Honest answer — they wrote a lot of the code, and they also produced some of the worst bugs I shipped. Here's the scorecard, with specifics, so you can calibrate your own expectations.

What they got right

Scaffolding and boilerplate, fast. New Next.js route, a Prisma model and its CRUD, the first pass of a component with sensible Tailwind classes that already matched my dark theme — this is where agents shine. The stuff that's tedious but well-trodden gets done in minutes, and it's usually about 80% right. For a solo founder, that 80% is the difference between shipping a feature this week and shipping it next month.

Holding a pattern across files. Once I established a convention — bilingual copy objects with identical en/id keys, data-no-i18n on the root, server components reading the locale — the agents replicated it consistently across new surfaces. That consistency is genuinely hard to maintain by hand when you're tired at 1am, and it's where a lot of solo-founder codebases rot. The agents kept it tidy.

Explaining unfamiliar territory. When I touched something I didn't fully understand — caching behavior, a Prisma relation, an edge case in the App Router — the agent could explain the trade-offs in plain language and show me two options. It turned "I don't know what I don't know" into "here are the three things to decide." That's a real accelerant for someone learning on the job.

First-draft bilingual copy. The agents are good at producing a solid first pass of parallel English/Indonesian copy. Not final — I always edit the Bahasa so it reads like a person wrote it — but a strong starting point that saves me staring at a blank file.

What they messed up

Now the uncomfortable half. None of this is "AI bad" — it's "AI is a fast junior who is confidently wrong in specific, predictable ways."

Confident wrong assumptions that pass review. The locale bug from the last entry is the cleanest example. The agent leaned on the cookie to decide language, which reads perfectly fine in a diff. It only fell apart with real first-visit traffic where the cookie wasn't set yet. The lesson: agents optimize for code that looks correct, not code that survives production. They don't have a mental model of a real user's first second on the site. You do. That's your job.

Hallucinated APIs on the road less travelled. The moment I reached for anything slightly off the mainstream, the agent would invent a method or a config option that simply doesn't exist, stated with total confidence. On common Next.js and Prisma paths it's reliable; off them, it bluffs. This is a strong argument for the boring stack — you keep the agent on terrain where it's actually trustworthy.

Over-engineering when under-engineering was the answer. Left unchecked, agents love to add abstraction: a helper, a wrapper, a config, a "flexible" generic for a thing used in exactly one place. More than once I deleted an agent's clever layer and replaced it with five obvious lines. They reach for "robust" when "simple and readable" was the actual requirement.

Subtle bilingual misses. The agents would occasionally translate a trading term that should stay English — turning funded or drawdown into a literal Indonesian phrase no trader uses. To the agent it's "be consistent, translate everything." To a real Indonesian trader it reads as someone who doesn't know the domain. These are the misses that only a human in the audience catches.

The actual workflow that works

After all this, here's the loop I trust:

I decide what to build and why. The agent never picks the feature. That requires knowing the audience, and that's not in the training data.
The agent drafts. Scaffolding, first-pass component, CRUD, draft copy.
I read every line. Not skim — read. This is non-negotiable. The bugs that reach users are the ones I waved through.
I test like a real, hostile first-time user. First visit, wrong locale, logged out, slow connection. That's where agent assumptions die.
I edit the Bahasa by hand. Always.

The headline isn't "AI builds your startup." It's "AI raises your floor and your speed, and you stay fully responsible for the ceiling." A fast junior who never gets tired is incredibly valuable — and would also burn the place down if you stopped reviewing.

Where this leaves Metavulus

Faster than I could have built alone, with a few scars that taught me where the human has to stay in the loop. The product is bilingual, live, and being shaped by what real traders actually use — clarity over volume, as the last entry covered.

The usual honest note: Metavulus is a research tool, not financial advice, and trading carries real risk. I'm using AI to build faster, not to promise anyone certainty — in code or in markets.

That's three entries in. I'll keep logging the real version of this — what ships, what breaks, what people pay for, and where the agents help versus hurt. Thanks for reading the messy, honest one.

Written & reviewed by

Aries Yuangga

Founder of Metavulus · Licensed Futures Advisor

Aries Yuangga is the founder of Metavulus and a BAPPEBTI-licensed Futures Advisor (Wakil Penasihat Berjangka). He writes about trading with a focus on structure, risk management, and Indonesia-specific regulation — not hype.

BAPPEBTI Futures Advisor permit 0015/UPTP/SI-WPA/8/2024Bank Indonesia authorized derivatives advisor (PUVA)

The author's license is a personal credential. This article is educational and is not licensed financial advice.

About Metavulus →

Back to all articles

What they got right

What they messed up

Now the uncomfortable half. None of this is "AI bad" — it's "AI is a fast junior who is confidently wrong in specific, predictable ways."

The actual workflow that works

After all this, here's the loop I trust:

I decide what to build and why. The agent never picks the feature. That requires knowing the audience, and that's not in the training data.

The agent drafts. Scaffolding, first-pass component, CRUD, draft copy.

I read every line. Not skim — read. This is non-negotiable. The bugs that reach users are the ones I waved through.

I test like a real, hostile first-time user. First visit, wrong locale, logged out, slow connection. That's where agent assumptions die.

I edit the Bahasa by hand. Always.

Where this leaves Metavulus

The usual honest note: Metavulus is a research tool, not financial advice, and trading carries real risk. I'm using AI to build faster, not to promise anyone certainty — in code or in markets.

That's three entries in. I'll keep logging the real version of this — what ships, what breaks, what people pay for, and where the agents help versus hurt. Thanks for reading the messy, honest one.