Felipe Antolinez's Weblog: product

From skeptic to true believer: How OpenClaw changed my life (Lenny's Podcast)

2026-03-30T07:18:50.437529+00:00

This is the podcast on OpenClaw I listened to this weekend after the Karpathy episode. I think I understood the appeal of a proactive system that works independently from the start, but I haven't bought into the hype so far. However, I feel that these two podcasts together have started changing my mind—not because of a single capability, but because of the apparent emergent behavior that arises once a Claw has context about you and access to real tools. Agents, as we typically think of them, are reactive: you give them a task, and they execute what they are asked to do. But I now fully realize that Claws are persistent and have personalities of their own. They run in the background, build up memory over time, check in on a schedule, and start acting on your behalf without being prompted.

Claire Vo, who was apparently a big OpenClaw skeptic when it launched, now manages nine agents across multiple Mac Minis for both personal and professional use.

The first thing that stood out to me in this conversation is how well the onboarding is apparently done. Instead of structured forms and settings pages, your Claw just asks you who it is and who you are, and you figure it out together through conversation, as if you hired a new employee. The second thing I learned is how well-crafted the default behavior of the Claw appears to be. The Claw's behavior emerges from some simple markdown files ("soul document"), but the defaults are apparently surprisingly thoughtful and lead to a really pleasant behavior. It sounds like this is something anyone working in product right now should experience firsthand.

I'm now genuinely intrigued to try it myself. To really get the full experience, you clearly need to run it on a separate machine, both for security and because you don't want to think about whether your laptop is online. I should really try setting one up on my Raspberry Pi, or just buy a Mac Mini for it. The other thing I don't really have yet is a clear use case for a Claw. I wonder whether I should try to come up with one before getting started, or whether this is something you just have to go for, because the onboarding seems good enough that the use case will emerge during the setup process.

Tags: ai, agentic-ai, product, podcast, claws

Context Windows Are Limited by Atoms, Not Bits

2026-03-01T11:45:00+00:00

There is a popular narrative in tech right now: AI progress is exponential, context windows will grow to infinity, and all vertical AI products will soon be replaced by general-purpose AI that can use all the context of your entire business. This implies that the big players like Anthropic, OpenAI, and Google, with their general-purpose agents like Claude Cowork, ChatGPT, or Gemini, will subsume all software.

I don’t think this will happen. While advertised context windows have grown to 1M or even 10M tokens, there’s a widening gap between advertised capacity and what models can reliably use. Effective context window sizes have been saturating over the past 6–12 months and remain at 200K–1M tokens for most tasks.

The reason is physics. Most people talk only about model capability, but there are actually three things to AI: atoms (the hardware), bits (the logic), and power (the energy required to move electrons through hardware to make computations). Recent breakthroughs have been almost entirely in bits, which means that AI progress in general and context window size specifically will be constrained by atoms and power.

Why Attention Is Hard to Scale

Attention in transformer models has been the basis of all AI progress in recent years.¹ However, the computational cost of attention scales quadratically with the context window size, with perhaps surprising implications for memory requirements.
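To make the quadratic scaling concrete, here is a minimal sketch that counts the pairwise attention scores for full self-attention, per head and per layer. The function name is purely illustrative, and the count ignores causal masking, which roughly halves the work but leaves the scaling quadratic.

```python
def attention_score_entries(context_tokens: int) -> int:
    """Pairwise scores for full self-attention, per head and per layer."""
    # Every token attends to every other token, so the count is n * n.
    return context_tokens * context_tokens


# Growing the window 5x (200K -> 1M tokens) multiplies the score count by 25x.
small = attention_score_entries(200_000)
large = attention_score_entries(1_000_000)
print(f"200K tokens: {small:.1e} scores per head per layer")
print(f"1M tokens:   {large:.1e} scores per head per layer")
print(f"ratio: {large // small}x")
```

Optimized kernels avoid storing the full score matrix at once, but the amount of work still grows with the square of the context length.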

A 1M-token context window corresponds to roughly 5MB of plain text, which isn’t much for many tasks. However, each token doesn’t require only storing a single number. Depending on the embedding size, each token requires storing thousands of numbers across many different layers of the model. Therefore, the key–value cache for a frontier transformer model to run inference on a 1M-token context window easily requires tens or even hundreds of GB of working memory, which is many orders of magnitude more than the raw text.
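A back-of-the-envelope calculation shows where these numbers come from. The configuration below (80 layers, 8 key–value heads, head dimension 128, 16-bit values) is a hypothetical example chosen for illustration, not the published architecture of any particular frontier model.

```python
def kv_cache_bytes(tokens: int, layers: int, kv_heads: int,
                   head_dim: int, bytes_per_value: int = 2) -> int:
    """Approximate key-value cache size for a single sequence."""
    # Factor 2: one key vector and one value vector per token,
    # for every layer and every key-value head.
    return 2 * tokens * layers * kv_heads * head_dim * bytes_per_value


# Hypothetical model configuration (assumption, for illustration only).
LAYERS, KV_HEADS, HEAD_DIM = 80, 8, 128

per_token = kv_cache_bytes(1, LAYERS, KV_HEADS, HEAD_DIM)
total = kv_cache_bytes(1_000_000, LAYERS, KV_HEADS, HEAD_DIM)
raw_text_bytes = 5 * 10**6  # roughly 5MB of plain text for 1M tokens

print(f"per token: {per_token / 1024:.0f} KiB")
print(f"1M tokens: {total / 1e9:.1f} GB")
print(f"ratio to raw text: {total / raw_text_bytes:,.0f}x")
```

Under these assumptions, each token costs about 320 KiB of cache, and a 1M-token context needs roughly 328 GB, tens of thousands of times the size of the raw text. Models with more layers, more key–value heads, or higher precision need proportionally more.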

Many tricks to extend context windows do so by avoiding “true” attention, where each token attends to every other token, which comes with substantial performance costs. Alternative architectures like state-space models promise sub-quadratic scaling, but none have matched current transformer-based frontier models so far. The accuracy tradeoff is likely fundamental, and there is no free lunch.

This means that, in practice, effective context length rarely exceeds half of the advertised maximum. Models exhibit a U-shaped performance curve, performing best on information at the beginning and end while degrading on context in the middle.² And even when models retrieve information perfectly, longer inputs still hurt reasoning.³ Without major breakthroughs in memory technology and power infrastructure, usable context windows are unlikely to grow substantially in the coming years.

Some numbers may help to make this concrete. A frontier model like Opus 4.6 or GPT-5.3 already requires hundreds of gigabytes just to store the weights. NVIDIA’s next-generation GPU, the Rubin R100, which should start shipping in the second half of this year, will have 288 GB of high-bandwidth memory—the same amount as the Blackwell B300 GPU, which started shipping in the second half of 2025. A single long-context session consumes most of the available memory. Therefore, production context windows have expanded on paper, but the effective ceiling at which models reason reliably has barely moved.

High-bandwidth memory and power, not compute, are currently the hard constraints. Memory is expensive, physically difficult to manufacture, and likely to remain supply-constrained in the coming years.⁴ On the power side, data center electrical capacity in the US is nearly maxed out, with utility connection wait times of 3–5 years or more.⁵ Increasing AI demand will make both constraints even tighter.

Products as Context Operating Systems

Andrej Karpathy put it well: you can think of the LLM as a CPU and the context window as RAM, which means you need something like an operating system to select and manage context.⁶ ⁷ Therefore, context engineering, the practice of selecting the right data for the model’s context window depending on the task, will remain essential.
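As a rough illustration of what "selecting the right data" can mean in practice, here is a minimal sketch of a greedy selector that fills a fixed token budget with the most relevant snippets. The `Snippet` type, its `relevance` score, and `select_context` are all hypothetical names; a real product would get relevance from retrieval or ranking and would likely use a more careful packing strategy.

```python
from dataclasses import dataclass


@dataclass
class Snippet:
    name: str
    relevance: float  # hypothetical score, e.g. from a retrieval step
    tokens: int       # token count of the snippet


def select_context(snippets: list[Snippet], budget_tokens: int) -> list[Snippet]:
    """Greedily pick the most relevant snippets that fit the token budget."""
    chosen: list[Snippet] = []
    used = 0
    for snippet in sorted(snippets, key=lambda s: s.relevance, reverse=True):
        # Skip anything that would overflow the budget; smaller,
        # less relevant snippets may still fit afterwards.
        if used + snippet.tokens <= budget_tokens:
            chosen.append(snippet)
            used += snippet.tokens
    return chosen


candidates = [
    Snippet("contract", relevance=0.9, tokens=600),
    Snippet("email thread", relevance=0.8, tokens=500),
    Snippet("meeting notes", relevance=0.5, tokens=300),
]
selected = select_context(candidates, budget_tokens=1000)
print([s.name for s in selected])
```

The point of the sketch is that the budget, not the total amount of available data, decides what the model gets to see.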

This doesn’t contradict the bitter lesson.⁸ Sutton’s argument is that general methods that leverage more computation eventually outperform all hand-crafted solutions and clever engineering. This is true for algorithms and training, where scaling has consistently won, but the bitter lesson is about what’s theoretically optimal, not what’s deployable given physical constraints. Power grids and memory fabs don’t follow exponential curves. The bitter lesson describes where we’ll end up eventually, but it doesn’t tell us how long it will take to get there.

There is a large gap between what current LLMs can do and the value delivered by current products. This product overhang is real and will remain an opportunity in the coming years. Not everything being built now will be obsolete when better models arrive, because physical constraints will limit how quickly they can be developed and deployed.

Context windows saturating might be where general AI progress stalls this time, with breakthroughs likely taking years. In the meantime, products that serve as effective context operating systems for their users will become tremendously valuable.

Tags: ai, llms, bitter-lesson, product, saas

Boris Cherny (creator of Claude Code) on Lenny's Podcast

2026-02-21T19:35:27+00:00

I hadn't come across the term "latent demand" before this podcast, and Boris Cherny calls it the single most important principle in product. The idea of latent demand is to watch how users misuse or hack your product to solve their own use cases, and then build specifically for that. Cherny also extends this to AI. With AI products, you should observe what the model or agent is trying to do (e.g., which data it wants to access, which tools are missing, or which steps it has to chain together that could be implemented as a single use-case-specific tool call), and make that easier.

Cherny also had an interesting comment on innovation: you can't force it. Instead, you have to give people the space and psychological safety to fail, while still cutting ideas that aren't working. Claude Code itself wasn't explicitly on the roadmap, and it wasn't an obvious hit at launch.

He also shared an interesting observation on how roles in and around product are changing with AI. Everyone on the Claude Code team—engineers, PMs, designers, etc.—codes, but with a different angle. He thinks the term "software engineer" might disappear by the end of the year and be replaced by something broader, like "builder".

Tags: coding-agents, ai, product, podcast

Quoting Ben Thompson

2025-12-19T14:45:00+00:00

This new screen aims to solve a rather obvious problem with all of the AI apps: what do you use them for? All of the options on this screen are achievable through a chat interface, but you need to know what to ask for, which is actually step 2 of the process: first you have to know what is possible, and most people don't. This screen aims to solve that: there are obvious filters you can use, and ideas for images you might want to create, like a Christmas card. Again, all of these are doable from a text interface, but there is a reason why purely text interfaces are the domain of so-called graybeards: it's not the typing that is the problem, or even knowing what to type: it's knowing what you could type.

[...]

To summarize, one role of product is to show you what you can do; another role is to inspire you to come up with more of your own ideas.

— Ben Thompson, Stratechery: ChatGPT Image 1.5; Apple v. Epic, Continued; Holiday Schedule

Tags: stratechery, ai, product

Strategy Letter IV: Bloatware and the 80/20 Myth

2025-10-14T10:00:00+00:00

A great insight from Joel Spolsky in 2001 that still holds true today: 80% of users use only 20% of your product's features. The problem is that it's never the same 20%; everyone uses a different 20% of features.

Tags: product

Quoting Chris Dixon

2025-10-09T10:00:00+00:00

A popular strategy for bootstrapping networks is what I like to call "come for the tool, stay for the network." The idea is to initially attract users with a single-player tool and then, over time, get them to participate in a network. The tool helps get to initial critical mass. The network creates the long term value for users, and defensibility for the company.

— Chris Dixon, Cited by Ben Thompson in his piece on Sora and Meta's disruption potential.

Tags: stratechery, startups, product