Felipe Antolinez on agentic-ai

6 posts tagged "agentic-ai"

Wednesday, 15.4.2026

I think that everyone working in AI should build their own agent from scratch, at least once. Not because it's hard, but because it's surprisingly easy, which is precisely the point.

Exactly one year ago today, I read Thorsten Ball's How to Build an Agent, or: The Emperor Has No Clothes. In this blog post, which is the single piece of text that most influenced me last year, he shows how to build a fully functional coding agent from scratch in under 400 lines of code.

Shortly thereafter, we did a one-day hackathon at Ren Systems, and by the end of the day, we had our own working agent running in the terminal. We didn't even use Anthropic's Claude Code back then, so we actually typed the complete agent code manually.

Only after building our own agent did I really start to understand what agents are. I had been reading about agents for months, but something fundamentally different clicked in my brain when I saw our own agent interact with the user, interpret the intent, and iteratively choose the right tools to achieve the goal. This emergent behavior is hard to appreciate without experiencing it so directly.

Agents have become common by now, but I still recommend everyone working in this space to build one from scratch. It sounds almost esoteric, but you have to touch it yourself to really feel what's going on. The moment your agent does something you didn't explicitly program it to do is when the line between deterministic code execution and something that's conscious and thinking starts to blur. You know it's an illusion, but it's a remarkably convincing one.

# 11:15 am on LinkedIn / coding-agents, ai, agentic-ai, ren

Sunday, 12.4.2026

Minions: Stripe’s One-Shot, End-to-End Coding Agents (Stripe Engineering Blog). A fascinating two-part blog post (Part 1, Part 2) from Stripe's engineering team on how they built their internal coding agents, which they call minions. What first stood out to me is how remarkably well-written these posts are. At a time when many engineering blog posts read as if they were mostly AI-generated, a piece with this much clarity is a strong signal of Stripe's commitment to quality in everything they do.

Stripe's minions are fully unattended agents built for one-shot coding tasks. An engineer can kick off a minion from Slack, and it produces a pull request that passes CI and is ready for review, with no human interaction in between. Over a thousand PRs merged per week at Stripe are entirely minion-produced.

As someone working at a startup, I find it fascinating to see this level of investment in what I've been calling "engineering the machine that writes the code". What makes this particularly notable is that Stripe is operating in a very high-stakes environment with high demands on reliability and robustness.

Stripe's system is complex, far beyond what a startup with limited resources could build internally. But what makes it interesting is that minions were built on top of infrastructure Stripe had already developed for human engineers:

We built out devboxes for the needs of human engineers, long before LLM coding agents existed. As it turns out, parallelism, predictability, and isolation were also very desirable properties as well for Stripe engineers to be able to work most effectively. What's good for humans is good for agents, and building on this infrastructural primitive paid dividends as a natural home for LLM agents.

The most interesting technical concept in the post is what they call "blueprints." Anthropic's blog post on building effective agents distinguishes between workflows (fixed execution graphs of LLM calls) and agents (loops with tools). Blueprints are a hybrid: a state machine that interleaves agentic nodes (LLMs or agents can work non-deterministically) with deterministic nodes (e.g., linters, git operations, test runners) that don't invoke an LLM at all. The idea is to put the LLMs in a contained box for each subtask, constraining its tools and context as needed, and guarantee that certain steps always happen correctly.

A few other things stuck with me. Stripe built a centralized internal MCP server, called Toolshed, which hosts nearly 500 tools spanning internal systems and SaaS platforms, and to which all of Stripe's agents can connect. Stripe's engineers also make extensive use of agent rule files that are conditionally applied based on which subdirectory or code files the agent is working in. These rules dynamically provide their coding agents with the necessary context, rather than loading a massive global ruleset, e.g., from a CLAUDE.md file, that would bloat the context window. Notably, all coding at Stripe, whether by humans or agents, happens in sandboxed cloud developer environments called devboxes, which can be spun up in about 10 seconds with all necessary dependencies preloaded.

Our backend engineer, Jan Giacomelli, was inspired by this blog post and just last week built our own internal version: a sandboxed coding agent that one-shots tasks and creates pull requests, which we're calling a "renion." I'm very curious to try it and see where this goes. I'm a strong believer that professional engineering organizations need to engineer their own internal AI systems to some extent, because each company's development environment and requirements are different enough that general tools can't provide maximum value on their own. I'm also curious about how we can bring the "blueprint" pattern of wrapping agents in deterministic workflows to other parts of the AI-powered business logic in our backend.

# 8:10 am / coding-agents, ai, agentic-ai, ren, software-engineering

Monday, 30.3.2026

From skeptic to true believer: How OpenClaw changed my life (Lenny’s Podcast). This is the podcast on OpenClaw I listened to this weekend after the Karpathy episode. I think I understood the appeal of a proactive system that works independently from the start, but I haven't bought into the hype so far. However, I feel that these two podcasts together have started changing my mind—not because of a single capability, but because of the apparent emergent behavior that arises once a Claw has context about you and access to real tools. Agents, as we typically think of them, are reactive: you give them a task, and they execute what they are asked to do. But I now fully realize that Claws are persistent and have personalities of their own. They run in the background, build up memory over time, check in on a schedule, and start acting on your behalf without being prompted.

Claire Vo, who was apparently a big OpenClaw skeptic when it launched, now manages nine agents across multiple Mac Minis for both personal and professional use.

The first thing that stood out to me in this conversation is how well the onboarding is apparently done. Instead of structured forms and settings pages, your Claw just asks you who it is and who you are, and you figure it out together through conversation, as if you hired a new employee. The second thing I learned is how well-crafted the default behavior of the Claw appears to be. The Claw's behavior emerges from some simple markdown files ("soul document"), but the defaults are apparently surprisingly thoughtful and lead to a really pleasant behavior. It sounds like this is something anyone working in product right now should experience firsthand.

I'm now genuinely intrigued to try it myself. To really get the full experience, you clearly need to run it on a separate machine, both for security and because you don't want to think about whether your laptop is online. I should really try setting one up on my Raspberry Pi, or just buy a Mac Mini for it. The other thing I don't really have yet is a clear use case for a Claw. I wonder whether I should try to come up with one before getting started, or whether this is something you just have to go for, because the onboarding seems good enough that the use case will emerge during the setup process.

# 7:18 am / ai, agentic-ai, product, podcast, claws

Sunday, 29.3.2026

Andrej Karpathy on Code Agents, AutoResearch, and the Loopy Era of AI (No Priors Podcast). Andrej Karpathy is always worth listening to because he has the time to experiment and tinker with the latest developments in a way that most people working at companies don't. He effectively lives a few months in the future compared to the rest of us.

Two things stuck with me from this conversation. First, Karpathy frames Claws (from OpenClaw) as another layer of the AI stack: LLMs → Agents → Claws. I have never actually set up a Claw yet, but the persistent memory architecture and how "your Claw" gets to know you over time are things I want to experiment with, as this is directly relevant to what we're working on at Ren as the product becomes more agentic.

Second, his work on AutoResearch. We've discussed the concept internally at Ren multiple times over the past few months, but never found the time to actually try it. We have a concrete problem that would lend itself well to this approach: building a more efficient multi-label classifier. We currently use a relatively heavy model for it, we have abundant training data, and the objective is clear (maximize precision/recall/F1 for a given latency budget). We could just let an AutoResearch system loose on this task. What I'm missing is knowing how to set up a sandbox that's safe enough but has sufficient permissions for the agent to carry out the research on its own. The meta task would then be similar to Claws: build a system in a few markdown files that defines how the agent approaches and documents its research.

# 7:50 am / coding-agents, ai, agentic-ai, ren, podcast, claws

Saturday, 14.2.2026

An Interview with Ben Thompson by John Collison on the Cheeky Pint Podcast. John Collison and Ben Thompson sketch out four levels of agentic commerce in this interview. I like that they start from the bottom up instead of jumping to the far end state.

Reduce friction. Agents that fill out web forms on your behalf. You paste a product URL into ChatGPT and say "buy this for me."
Contextual search. Natural language queries with real context: "I need a jacket for -10°C in the Alps" instead of guessing keywords.
Persistent preference profile. A profile the agent builds over time from your pins, browsing history, or style boards.
Proactive recommendations. Don't wait for the user to search: anticipate what they need and surface it at the right time. Thompson's point is that this already exists at scale. Zuckerberg called Meta's ad platform the most successful agent in the world.

Interesting to think about how these four levels apply to Ren and other AI products. At Ren, we started from the hardest end, level 4, with proactive recommendations.

One could arguably add a level 5: full autonomy. The agent doesn't just recommend, it acts. OpenClaw is the most visible example right now: a local AI agent that browses, buys, books, and executes on your behalf without waiting for approval at each step.

# 6:36 pm / stratechery, ai, agentic-ai, e-commerce, claws

Tuesday, 3.2.2026

OpenClaw took the internet by storm last week. It's obviously a security nightmare, but its viral success confirms something I've believed for a while: most AI products today are fundamentally limited by requiring too much agency from their users.

There's real demand for AI that works for you, not just with you. A system that monitors, anticipates, and acts proactively instead of waiting for your next prompt.

At Ren Systems, we've been building exactly this kind of product: AI that works on behalf of the user and surfaces what matters before they ask for it. There's something almost magical when an AI system works this way.

Proactive AI is where the next value unlock lies. But building it in a way that's secure and compliant isn't something you can prompt out of Claude Code overnight.

# 2 pm on LinkedIn / ai, agentic-ai, ren, claws