The Decision Layer Nobody Built
The largest study of chatbot use reveals that the most valuable AI conversations in enterprises are decision support. None of them are being captured.
In September, OpenAI published the largest study ever conducted of how people actually use a chatbot [1]. Seven researchers, including David Deming at Harvard, classified 1.1 million ChatGPT conversations through a privacy-preserving pipeline. The headline most coverage focused on was the size: 700 million weekly users, around 10% of the world’s adult population, sending 18 billion messages a week.
The finding I cannot stop thinking about is buried deeper in the paper.
The researchers introduced a taxonomy they call Asking, Doing, and Expressing. Asking is when the user wants information or guidance to make a better decision. Doing is when the user wants the model to produce something: write the email, draft the code, generate the report. Expressing is everything else: chitchat, emotional venting, social interaction.
Across the entire sample, 49% of messages are Asking. 40% are Doing. 11% are Expressing.
That ratio gets sharper when you look at who is doing the Asking. Users with graduate degrees, working in management, business, science, and engineering occupations, send a disproportionate share of Asking messages. The more senior the role, the more decision-intensive the work, the more the human is using the chatbot to think, not to produce.
And Asking is growing faster than Doing. OpenAI published an interactive breakdown [2] alongside the paper that makes the trend visible month by month: Practical Guidance is rising, Writing is falling, and the crossover is accelerating.
Anthropic’s own data tells the same story from a different angle. In a study of 81,000 Claude users [3] across 159 countries conducted earlier this year, the single largest category of value people report is not productivity in the narrow sense. It is what they call “cognitive partnership”: using the model as a thinking collaborator. This is not one vendor’s finding. It is the pattern showing up everywhere the data is honest enough to look.
This is the part nobody is talking about. The dominant enterprise use case for generative AI is not automation. It is not productivity. It is decision support. It is people in high-stakes roles pulling out their phones, opening a chat window, and asking the model to help them think through something that matters.
Which means the most valuable AI conversations happening inside your organisation are happening in the worst possible conditions.
The conversations are not what they seem
I want to take that finding and put it next to a fact about how chatbots actually work, a fact that most people, including most executives I speak with, do not understand.
When you ask a chatbot a question, the answer you receive is generated. It is not derived. There is no underlying calculation, no canonical reference, no deterministic process happening behind the curtain. The model produces what is statistically the most plausible next sequence of words given the conversation so far.
Ask the same question tomorrow. You will get a different answer. Not wildly different, usually. The shape will be similar. But the specific numbers, the framing, the assumptions surfaced, the tradeoffs emphasised will shift. And there is no setting in any chat window that controls this. No slider for “give me the same answer you gave me last week.” The reproducibility most enterprise systems take for granted is simply not how these models work.
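You can see this at the API level in a dozen lines. A minimal sketch, assuming the OpenAI Python SDK and an API key in the environment; the model name and prompt are illustrative choices of mine, not anything from the paper:

```python
# A minimal demonstration of non-reproducibility, assuming the OpenAI
# Python SDK and an OPENAI_API_KEY in the environment. The model name
# and prompt are illustrative, not from the paper.
from openai import OpenAI

client = OpenAI()
prompt = "We are weighing a build-vs-buy decision for our data platform. What should we consider?"

answers = []
for _ in range(3):
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # any chat model will do
        messages=[{"role": "user", "content": prompt}],
        temperature=1.0,      # roughly the sampling regime of a consumer chat window
    )
    answers.append(resp.choices[0].message.content)

# Same prompt, three runs: the shape of the answers will rhyme, but the
# specifics, framing, and emphasised tradeoffs will differ. Even
# temperature=0 reduces, rather than eliminates, the variation.
for i, answer in enumerate(answers, 1):
    print(f"--- run {i} ---\n{answer}\n")
```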
So what is actually happening when a senior executive uses ChatGPT to think through a capital allocation decision?
A plausible analysis is being generated, in real time, from a model that does not remember what it told the executive last week, cannot reproduce its own reasoning, and has no record of what assumptions it made along the way. The executive reads it, integrates it with their own judgement, makes a decision. The chat window closes. The reasoning vanishes. The decision survives, but the path to it does not.
This is not a problem if the chatbot is being used for trivial queries. If you are asking how to format a date in Python, the non-reproducibility is irrelevant. If you are asking which cities to visit in Portugal, the variation is part of the value.
But the OpenAI paper is telling us that this is not what high-end users are doing with these tools. They are asking decision-support questions. The thing that is supposed to be the firmest part of an enterprise, the reasoning behind important commitments, is being delegated to an interface that is structurally non-reproducible and structurally amnesiac.
The capture trap
There is an obvious response to this. If the conversations matter, capture them. Log everything. Build an enterprise surveillance layer over employee chatbot use. Microsoft Copilot, Google Workspace AI, the entire vendor ecosystem is competing on exactly this promise: bring the AI use inside the enterprise perimeter so the company can see what is happening.
I have watched this play out in three large pharmaceutical companies over the past eighteen months. The pattern is identical and depressing.
The IT function rolls out the enterprise-sanctioned chatbot. Employees are told to use it instead of their personal accounts. Compliance is monitored. Usage is logged. And then, very quickly, a strange thing happens. The serious conversations stop appearing in the logs.
People are not stupid. The moment they understand that their chats are visible to their employer, they stop using the enterprise tool for anything sensitive. The questions they want to ask are the ones about a difficult colleague, a struggling project, a decision they are not yet ready to commit to publicly. Those questions migrate to a personal phone, a personal laptop, a personal account.
The enterprise tool gets the safe questions. The personal account gets the real ones.
And even the personal account is compromised. Every few weeks another story circulates about company data surfacing in model outputs, about conversations being used to train the next version. Most of these stories trace back to previously public data, not to chat inputs. It does not matter. The perception is load-bearing. The people whose decisions would benefit most from AI support are the ones least willing to type the real question into any box, enterprise or personal.
I should be honest: most of the Asking in that 49% is probably low-stakes. Policy questions, how-to queries, terminology lookups. The high-stakes decision conversations are likely a small fraction of the total. But that is exactly the problem. They are a small fraction because the conditions for having them do not exist.
This is the trap, and it has two layers. The enterprise tool fails because surveillance kills candour. The personal tool fails because training-data fear kills trust. The conversations that would have been most valuable are not fleeing to a different platform. They are never happening at all.
Microsoft has not solved this. Google has not solved this. The vendor solution to “we cannot see what our employees are doing with AI” is not working, because the underlying problem was never about visibility. It was about whether the employee believed the system was on their side.
The gap is between chat and institutional memory
So we have three things at once.
The most important enterprise use case for AI is decision support. The OpenAI data is telling us this clearly, and the people using it that way are exactly the people whose decisions matter most.
The interface providing that decision support is non-reproducible by design. The same question gets different answers. The reasoning evaporates the moment the window closes. There is no audit trail because there is nothing to audit.
The standard enterprise response, logging everything, fails because it triggers self-censorship and training-data anxiety, pushing the most valuable conversations out of reach entirely.
These three things together describe a gap that no current product category fills. It is not an AI tool. It is not a chatbot. It is not a logging system or a compliance layer. It is not a knowledge management product, because knowledge management was designed for documents, not for the live reasoning that produces decisions.
What needs to exist is a layer that sits between the chatbot interface and the enterprise’s institutional memory. A layer that captures the structure of a decision: what was being weighed, what assumptions were active, what alternatives were considered, what was finally chosen. Without surveilling the conversation itself. A layer that distinguishes between the chat (private, ephemeral, generative) and the decision (structured, durable, instrumented).
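What would the durable object actually contain? A hypothetical sketch in Python, with field names of my own invention rather than any existing product’s schema:

```python
# A hypothetical decision record: the structure the layer would persist,
# deliberately excluding the chat transcript itself.
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class DecisionRecord:
    question: str              # what was being weighed
    assumptions: list[str]     # beliefs about the world that were active
    alternatives: list[str]    # options actually considered
    chosen: str                # what was finally decided
    risks_accepted: list[str]  # known downsides taken on
    owner: str                 # who is accountable
    decided_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )
    # Note what is absent: no transcript, no prompts, no model output.
```

The design choice that matters is what the record leaves out: nothing in it would make candour in the chat itself expensive.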
A small but growing group of investors and practitioners are starting to converge on this gap. Foundation Capital calls it context graphs [4]. Others call it decision traces. I have been calling it the Judgement Graph in my own work. The gap is becoming visible now.
If I were a CIO tomorrow
If I were a CIO reading this and took the OpenAI data to be true for my own organisation, I would do five things.
- Identify the 10 to 20 decisions a year where we most regret not having a record of the reasoning: capital allocation, major product bets, regulatory posture, key appointments.
- Define what a decision record means for us: the options we actually considered, the assumptions we were making about the world, the risks we accepted, the owner and timestamp.
- Make it cheap to create that record without changing how executives use chat: keep their conversations private and ephemeral, but give them a way to promote a decision into a structured, durable object (a sketch follows this list).
- Commit, in writing, that those decision records are for learning and accountability, not performance theatre: no surprise gotchas, no retroactive fishing expeditions.
- Run a quiet pilot around one real decision cycle and ask a simple question at the end: “Would you have made a better, worse, or identical call without this layer?”
This is roughly the shape of the infrastructure I am trying to prototype at ChainAlign: a Judgement Graph that remembers decisions, not chats.
This is not a category your CIO budget anticipated. It is not in any analyst’s market map. It is not on the shopping list when an enterprise sets its AI strategy for the year.
It is, however, the category that the OpenAI data just made unavoidable.
If 49% of the AI use in your enterprise is Asking, if the highest-leverage slice of that Asking is decision support, and none of it is being captured, and the standard capture mechanisms drive the signal away, then the question is not whether you need decision infrastructure. The question is how long you can pretend you do not.
Everyone is asking how to get more out of AI.
Almost no one is asking what happens to the reasoning when the chat window closes.
Sources
1. Chatterji, Cunningham, Deming, Hitzig, Ong, Shan & Wadman — “How People Use ChatGPT” (2025)
2. OpenAI — ChatGPT usage data, interactive breakdown
3. Huang et al. — “What 81,000 People Want from AI” (2026)
4. Foundation Capital — “Context Graphs: AI’s Trillion-Dollar Opportunity”