
LLMs are a resource, not a tool

Large language models — ChatGPT-like text-generating AIs — are unique in a way that makes it difficult to conceptualize what they’re doing. Strictly speaking, they’re predictive text generators, and nothing more. Their ability to recite facts is a side-effect of their having been trained on vast amounts of text that contains, among other things, those facts. They were not designed as “fact-retrieval engines.”

Nor were they designed as “software development engines,” despite their increasing use in this domain. Every tool or piece of software that uses LLMs for code generation feeds instructions to the LLM — invisibly to the user — that begin with the phrase: “You are an expert software engineer” (or something nearly identical to this).
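For the curious, here is roughly what that looks like in code. This is a made-up sketch using the OpenAI Node SDK; the hidden instructions, model name, and function are placeholders for illustration, not any particular product's actual configuration.

```ts
import OpenAI from "openai";

// Hypothetical sketch: a code-generation tool silently prepends a
// "you are an expert" system prompt before forwarding the user's request.
const HIDDEN_INSTRUCTIONS =
  "You are an expert software engineer. Write clean, idiomatic code.";

async function generateCode(userRequest: string): Promise<string | null> {
  const client = new OpenAI(); // reads OPENAI_API_KEY from the environment
  const completion = await client.chat.completions.create({
    model: "gpt-4o", // placeholder model name
    messages: [
      { role: "system", content: HIDDEN_INSTRUCTIONS }, // the user never sees this
      { role: "user", content: userRequest },
    ],
  });
  return completion.choices[0].message.content;
}

generateCode("Add pagination to the users table").then(console.log);
```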

With little new text left to consume, and with their insatiable appetite for energy growing harder to satisfy, LLMs are seeing the bulk of their improvements these days come from the tools we're building around them. This includes "prompt engineering" (knowing the secret grammar that will get them to do what you want), multi-agent workflows (LLMs using other LLMs, ad infinitum), and so on.

LLMs’ proficiency with facts — yes, they screw up sometimes, and as such we shouldn’t rely on them for critical matters, but their fact-retrieval is astounding — has persuaded us that behind them is a general artificial intelligence, understanding our questions and rifling through its database of facts to give us answers.

As a result, first-time users of an LLM for code generation will often be perplexed at how poorly it understands their existing code, learning shortly thereafter that providing even a few bits of documentation dramatically improves the results — demonstrating just how helpless these things are for most tasks without our assistance.

Recently, Vercel, a cloud computing company, released an AGENTS.md file (a text file standard for coaching LLMs, written in plain English, with intermittent code examples) for writing code in React, a popular JavaScript framework for building complex web applications. This file is nearly 10,000 words, with an estimated reading time of half an hour.
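To give a sense of the format, here is a short, invented excerpt of the kind of guidance such a file contains. This is illustrative only, not quoted from Vercel's actual document.

```markdown
# React guidelines for agents

## Data fetching
- Prefer fetching on the server; pass the result to client components as props.
- Do not mirror props into state inside an effect.

## Error handling
- Wrap route-level components in an error boundary.
- Surface failed mutations to the user; never swallow rejected promises.
```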

If you take nothing else away from this blog post, remember this: for all the hype around AI, the state of the art in 2026 includes writing an entire book about React, and telling LLMs to read it before every interaction.


This has had me thinking about what an LLM is, exactly. What do we call something like this?

It may be easiest and most common to refer to it as a tool, but this isn’t quite right. A tool is designed, sometimes exquisitely, to achieve a specific purpose. LLMs aren’t that. Just about all their benefits to us are accidental.

Instead, a better metaphor might be to think of LLMs as a resource that we harness rather than a tool that we use. A resource in the sense that electricity, for instance, or the wind is a resource.

The wind is powerful and useful, but aimless and indifferent. We use the wind to power turbines, to fly kites, to sail. Sailing evolved and improved not because the wind got better at propelling our boats — it wasn’t designed to do that — but because we built better sails and developed better sailing techniques. And this is what we’re doing with LLMs today — building better sails, developing better techniques.

I’m reminded of Steve Jobs’ quip about Dropbox: that it’s a feature, not a product. I see echoes of this in Apple’s reluctance to turn Siri into a chat UI, perhaps because they’re seeing it in the same way.


LLMs make good analysts, bad oracles

I’m not an AI apologist by any means, but I’m frustrated by the muddled way LLMs have been marketed, portrayed, and used. I want to focus on the utility of them here, rather than the moral or legal implications of using copyrighted content to feed their corpora.

One of the first things we started doing when ChatGPT became public was, naturally, asking it questions. And for the most part, it gave us some pretty good answers.

But what has been widely demonstrated recently — as Google, Meta, and others have begun grafting “AI” onto their search results — is just how wrong it can get things, and it has us asking: Is this really ready for widespread adoption? Asking for arbitrary bits of information drawn from an LLM’s entire corpus of text — as Google’s and Bing’s smart summaries do — is demonstrably, hilariously, and sometimes dangerously flawed.

Over the last couple of years, I haven’t really heard much from OpenAI themselves about what we are supposed to be using ChatGPT for. They seem more interested in creating the technology — which no one could seriously doubt is impressive — than in finding real-world applications for it. The announcement for ChatGPT didn’t tell us what to do with it (though it did emphasize that the tool can be expected to produce false information).

I think the misconception about what ChatGPT is purported to be good at can be attributed to the name and the UI. A chat-based interface to something called “ChatGPT” implies that the product is something it isn’t. It’s technically impressive, of course, and makes for a good demo. But chat doesn’t play to its strengths.

The reason any given LLM is even able to hazard a guess at a general knowledge question is the corpus of text it’s been trained on. But producing answers to general knowledge questions is a side-effect of this training, not its purpose. It isn’t being fed an “encyclopedia module” that it classifies as facts about the world, followed by a “cookbook module” that it classifies as ways to prepare food. It was designed to produce believable language, not accurate language.

Where it does excel, however, is at coming to conclusions about narrow inputs. Things like Amazon’s review summaries, YouTube’s new grouping of comments by “topic,” or WordPress’s AI Feedback — these features take specific streams of text, are tasked with returning feedback or summaries about them, and they seem to work pretty well and have real utility.
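To make that “analyst” pattern concrete, here is a rough sketch using the OpenAI Node SDK; the reviews and model name are invented placeholders. The point is that the model is asked to summarize only the text it is handed, not to retrieve facts from its training corpus.

```ts
import OpenAI from "openai";

// A bounded, "narrow" input: a handful of product reviews to summarize.
const reviews = [
  "Battery lasts two days, but the strap feels cheap.",
  "Great screen. The strap broke after a month.",
  "Returned it — the clasp on the strap never stayed shut.",
];

async function summarizeReviews(): Promise<string | null> {
  const client = new OpenAI(); // reads OPENAI_API_KEY from the environment
  const completion = await client.chat.completions.create({
    model: "gpt-4o-mini", // placeholder model name
    messages: [
      {
        role: "system",
        content:
          "Summarize the customer reviews provided by the user. " +
          "Base the summary only on those reviews.",
      },
      { role: "user", content: reviews.join("\n") },
    ],
  });
  return completion.choices[0].message.content;
}

summarizeReviews().then(console.log);
```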

These examples demonstrate two similar but distinct facets of LLMs: their role as general-knowledge engines, or “oracles,” and as input-processing engines, or “analysts.” When we ask ChatGPT (or Google, or Meta) how many rocks we should eat per day, we are expecting it to behave as an oracle. When we ask it to summarize the plot of a short story or give us advice for improving our resume, we are expecting it to behave as an analyst.

Signs point to Apple using LLMs primarily as analysts in the features to be announced at today’s WWDC, processing finite chunks of data into something else, rather than snatching arbitrary knowledge out of the LLM ether.

The allure of ChatGPT as an oracle is of course hard to resist. But I think if we recognize these two functions as separate, and focus on LLMs’ capabilities as analysts, we can wring some value out of them. (Their environmental impact notwithstanding.)
