Product thinking
AI Features Are Not the Same as AI Leverage
AI features can make existing work smoother: find an answer, draft a message, update a checklist. AI leverage starts when the product helps a team see the judgment call that changes what happens next.
Imagine a product launch with real pressure on every side. Marketing may need the date to stick because campaigns, customer expectations, and competitive windows are already in motion. Leaders may want the timeline to hold because the market can read a delay as lost momentum. Engineers and product managers want people to use what they built, and shipping is often the fastest way to learn. At the same time, people responsible for quality, support, or customer trust may be looking at the risk: known flaws, support burden, and the chance that the launch will teach the organization the wrong lesson.
That creates a real tension. Time-to-market matters, and quality concerns are not abstract caution. The right call rarely comes from choosing speed or quality as a slogan. It comes from making the judgment call visible enough that the team can decide which risks are acceptable now, which need mitigation, and which should change the launch plan.
A helpful AI feature can summarize the bug list, draft the launch email, update the checklist, and file follow-up tickets. All of that saves time. But the higher-leverage question is different: how can the team make the tradeoff visible, concrete, and easier to reason about?
I am trying to separate two kinds of help. Some AI features make the existing work faster. The higher-leverage kind helps the team see the tradeoff clearly, balance speed against trust, and turn a judgment call into a decision the company can actually carry through.
Convenience is useful. It does not always create leverage.
A lot of AI product work today creates real convenience. That matters. Shortening a search, improving a draft, summarizing a thread, or turning notes into a checklist can make a day noticeably better.
But convenience and leverage solve different problems. Convenience makes an existing loop smoother. Leverage changes the decision point inside the loop.
I use three levels to separate useful shortcuts from real leverage. The boundaries are not perfect, but they help explain why some AI features feel helpful without changing how the work actually runs.
The first level is answers: lower the cost of knowing. The second is task help: lower the cost of small actions. The third is leverage: expose the judgment call that changes the outcome.
Answer tools reduce the cost of knowing
An answer tool is often the first place AI becomes useful in workplace software. It answers questions like: "What did this person say about this topic?" "When is my next meeting with this team?" "What is this policy?" "Where is the latest doc?" "What did we decide last time?" "Who owns this?"
These questions are not trivial. Knowing is expensive in modern work. Context is fragmented across messages, meetings, docs, tickets, comments, approvals, and people's memories. Much of work is really reconstruction work: finding the latest state, understanding how we got here, remembering who said what, and locating the source of truth.
Reducing the cost of knowing is valuable because it lowers the cost of orientation. An answer tool can help someone join a project, catch up after time away, understand an escalation, or find the document they need without asking three people for help.
But answers usually do not create large leverage by themselves. They help the user understand the current state. They do not necessarily change the state. The old information system becomes more searchable, which is useful, but the work underneath can stay just as tangled.
The risk is that we confuse better retrieval with better work. A person can find the answer faster and still be stuck in the same decision loop, the same approval loop, the same handoff problem, or the same ambiguous ownership boundary.
Task assistants reduce the cost of small actions
The second level is the task assistant. It does things like triage a queue, draft a reply, batch-approve low-risk expenses, make a message sound better, fix grammar, summarize a thread, or create tickets from meeting notes.
These assistants can be genuinely useful. There is a lot of repetitive work in knowledge work. A system that removes small irritations can create real relief. If it drafts the message, cleans up the note, creates the ticket, and reminds the right person, that is a better experience than making the user do every mechanical step.
Still, many of these assistants mostly make existing workflows smoother. The message remains a message. The ticket remains a ticket. The approval remains an approval. The doc remains a doc. The human still lives inside the same loop, only with better drafting and faster handoffs.
That can be worth building. But it may not be a fundamentally new work system. The system is faster, but the shape of the work is mostly unchanged. The AI helps the user push the old work items around more efficiently.
This is where the word "agent" can hide the real question. If an agent performs more small actions inside an old workflow, it may be useful without being high leverage. The question is not only "Can the agent act?" It is "Is this the action that matters?"
Not more motion. Better outcomes.
A goal I do not trust is "largest amount of work." It is easy to imagine a great AI system as one that does the most work for us: more emails sent, more docs created, more meetings recorded, more tickets closed, more code shipped, more approvals processed, and more people unblocked.
But more work can simply mean more motion. And more motion can be dangerous.
A system that optimizes for motion can look productive while making the organization less effective. It can unblock people in the wrong direction. It can generate documents that create false alignment. It can close tickets without resolving the underlying issue. It can produce status updates that make coordination feel complete when the real decision is still unresolved.
One question I would rather ask is:
Where is the smallest human judgment that changes the outcome for the better?
"For the better" has to include more than activity. It includes customer trust, quality, speed, morale, risk, opportunity cost, future maintenance burden, precedent, coordination cost, and whether the organization is moving in the right direction.
For me, that changes the product shape. I would not start by trying to automate the most visible activity. I would look for the moment where a small amount of human judgment, applied with the right context, changes the trajectory of the work.
The product around the model changes the work
A model can make certain cognitive operations much cheaper: reading, drafting, classifying, comparing, and planning. But the product around the model decides what that cheaper cognition is pointed at. Does it draft another status update, or does it ask why three teams need the same status update every week?
The practical product question becomes: who decides, what context is visible, which tradeoffs are considered, what the system is allowed to do, how decisions get carried through, and how outcomes are judged later?
If the product around the model is just a prompt box bolted onto old software, the model's capability gets expressed through the old units of work. It answers questions, drafts messages, edits docs, files tickets, and moves records. Those are useful. But they may not be the highest-leverage use of the model.
A product built for leverage could look for judgment points instead. Where is the decision that changes the work? Where is ownership ambiguous? Where does repeated work point to a missing policy? Where are two teams using the same words but meaning different things? Where does one human call prevent a lot of downstream waste?
The product may need the work context
A prompt box is a powerful interface because it is open-ended. But open-endedness is not the same as context. If the system does not understand how the work fits together, the model can sound smart while remaining blind to consequence.
By work context, I mean the relationships among people, goals, decisions, blockers, dependencies, commitments, documents, permissions, risks, policies, past outcomes, confidence levels, scope of impact, and feedback loops. Some of this can be represented explicitly. Some of it may need to be inferred. Some of it is better left under human control.
Without that context, the model can summarize a thread without knowing which decision matters. It can draft an update without knowing what tradeoff is unresolved. It can close a ticket without knowing that the same issue will reappear next week. It can approve work without seeing the precedent being created.
My current read is that this is why many AI features feel impressive in the moment but do not change the system. The model sees the text around the task, but not enough of the obligations, risks, dependencies, and feedback that give the task meaning.
Scenario mapping is not prediction
By scenario mapping, I do not mean prediction. A model cannot know the future. Work is too social and too dependent on incentives, trust, timing, and incomplete information to forecast reliably.
The point is more modest: the model can make possible consequences explicit enough for better human judgment.
That kind of scenario mapping is structured reasoning. If we choose one path, who benefits? Who is blocked? What risk increases? What precedent do we set? What future work do we create? What trust do we build or damage? What will this make easier or harder six months from now?
Immediate consequences are the visible results. The launch ships. The customer receives the exception. The expense gets approved. The ticket closes. The meeting is scheduled.
Downstream consequences show up later, as shifts in behavior, precedent, risk, trust, coordination cost, future maintenance burden, or strategic drift. The launch increases support load. The exception becomes an expectation. The approval pattern teaches people the policy does not matter. The closed ticket hides a recurring product issue.
I would not want the product around the model to pretend it knows which path is correct. I would want it to help the human see more of the possible consequences before making the call.
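As a rough sketch of what a scenario map could look like as a data structure: each path carries its immediate consequences, its downstream consequences, and the questions still open. The names here (Path, render_scenario_map, and every field) are hypothetical and chosen only for illustration. The sketch deliberately contains no scoring or ranking, because the point is to lay the paths side by side and leave the call to a person.

```python
from dataclasses import dataclass, field


@dataclass
class Path:
    """One option the team could choose. All names here are illustrative."""
    name: str
    immediate: list[str] = field(default_factory=list)       # visible results
    downstream: list[str] = field(default_factory=list)      # precedent, risk, trust, cost
    open_questions: list[str] = field(default_factory=list)  # what nobody knows yet


def render_scenario_map(decision: str, paths: list[Path]) -> str:
    """Lay the options side by side as plain text.

    There is intentionally no score and no recommended path:
    the map supports a human decision, it does not make one.
    """
    lines = [f"Decision: {decision}"]
    for path in paths:
        lines.append(f"Option: {path.name}")
        sections = (
            ("Immediate", path.immediate),
            ("Downstream", path.downstream),
            ("Open questions", path.open_questions),
        )
        for label, items in sections:
            for item in items:
                lines.append(f"  {label}: {item}")
    return "\n".join(lines)


if __name__ == "__main__":
    ship = Path(
        name="Ship on the planned date",
        immediate=["The launch ships."],
        downstream=["The launch increases support load."],
    )
    print(render_scenario_map("Launch timing", [ship]))
```

In a real product the consequences would come from the work context and from people, not from hand-written strings. The structure only shows where they would sit.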
Customer escalation: the response is not the whole decision
Now imagine a customer escalation. A customer wants an exception.
An answer tool can summarize the customer history, agreement terms, past support issues, and previous commitments. A task assistant can draft a response, schedule a follow-up, and create an escalation ticket.
The harder judgment is whether to grant the exception, refuse it, or offer a temporary workaround. Granting the exception may preserve trust but create precedent. Refusing may protect the product boundary but damage the relationship. A workaround may buy time but add support burden or fragment the roadmap.
A leverage-oriented system could compare those paths, show the downstream consequences, and ask the human to make the decision with the tradeoff in view. It would then carry the decision into the customer message, internal notes, follow-up owner, support guidance, and any policy or roadmap implication. Later, it would track whether the decision helped or created new cost.
I would not want the system to hide the value judgment. Whose trust matters? Which precedent is acceptable? How much support burden is worth carrying? Those are human questions.
Expense approvals: repeated work may point to a design problem
Expense approvals are a small example, but a useful one. Batch-approving low-risk expenses saves time. An answer tool can explain the expense policy. A task assistant can approve obvious cases, reject incomplete ones, and ask for missing receipts.
But the high-leverage move may be different. The repeated approvals may exist because the policy is unclear. People may be asking for approval because they do not trust their interpretation. Managers may be approving the same category every week, which means the organization has accidentally turned policy ambiguity into recurring coordination labor.
A better system might notice the pattern and ask a human whether the policy needs to change. If approved, it could help update the policy, communicate the change, adjust approval rules, and track whether future approval load decreases.
This shifts the work from processing repeated approvals to removing the cause of repeated approvals. That is closer to leverage.
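A minimal sketch of the "notice the pattern" step, assuming approval history is available as simple records. The Approval type, the recurring_categories function, and both thresholds are hypothetical and would need tuning against real data. The sketch only flags categories that are approved almost every time across many distinct weeks. Deciding whether the policy should change stays with a person.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Approval:
    """One approval request. Field names are illustrative assumptions."""
    week: int       # week index in which the request was handled
    category: str   # expense category, e.g. "team lunch"
    approved: bool  # whether the manager approved it


def recurring_categories(history, min_weeks=4, min_approval_rate=0.9):
    """Return categories that look like policy ambiguity rather than real review.

    A category is flagged when it appears in at least `min_weeks` distinct
    weeks and is approved at least `min_approval_rate` of the time.
    Both thresholds are placeholder assumptions.
    """
    weeks_seen = {}
    total = Counter()
    approved = Counter()
    for item in history:
        weeks_seen.setdefault(item.category, set()).add(item.week)
        total[item.category] += 1
        if item.approved:
            approved[item.category] += 1

    flagged = []
    for category, count in total.items():
        rate = approved[category] / count
        if len(weeks_seen[category]) >= min_weeks and rate >= min_approval_rate:
            flagged.append(category)
    return sorted(flagged)


if __name__ == "__main__":
    history = [Approval(week, "team lunch", True) for week in range(1, 7)]
    history += [
        Approval(1, "conference travel", True),
        Approval(3, "conference travel", False),
    ]
    # "team lunch" is approved every week for six weeks, so it is flagged
    # as a candidate for a policy question to put to a human.
    print(recurring_categories(history))
```

The output of a step like this would be a question for a human ("should this category need approval at all?"), not an automatic rule change.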
Cross-team misalignment: same word, different assumptions
Many coordination failures are language failures. Two teams use the same word but mean different things. Or they agree on a plan while carrying different assumptions about ownership, timing, API behavior, launch scope, or customer impact.
An answer tool can summarize what each team said. A task assistant can draft an alignment email. Both help.
A higher-leverage system could try to detect the mismatch itself. It might notice that one team is using "ready" to mean feature-complete, while another means support-trained and customer-safe. It might show likely consequences: duplicate work, incompatible APIs, launch delay, customer confusion, or a late change in ownership.
Then it would ask for a clear decision. Which definition are we using? Who owns the boundary? What needs to change in the doc, roadmap, tickets, and team follow-ups so the decision becomes real?
The useful shift is catching the hidden ambiguity before the organization spends weeks executing different plans under the same label.
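A small sketch of the comparison step, under a large assumption: that each team's working meaning of a term has already been captured, by a person or by some upstream extraction step that is not shown here. TermUsage and find_mismatches are hypothetical names. The hard part, inferring what a team actually means, is outside this sketch. It only shows how a shared label with diverging meanings could be surfaced for a decision.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TermUsage:
    """How one team uses one term. Supplied by an assumed upstream step."""
    team: str
    term: str
    meaning: str


def find_mismatches(usages):
    """Return terms that more than one meaning is attached to.

    The result maps each conflicting term (lowercased) to a dict of
    team -> meaning, so a human can decide which definition applies.
    """
    by_term = {}
    for usage in usages:
        by_term.setdefault(usage.term.lower(), {})[usage.team] = usage.meaning
    return {
        term: meanings
        for term, meanings in by_term.items()
        if len(set(meanings.values())) > 1
    }


if __name__ == "__main__":
    usages = [
        TermUsage("engineering", "ready", "feature-complete"),
        TermUsage("support", "Ready", "support-trained and customer-safe"),
        TermUsage("engineering", "launch scope", "core flow only"),
        TermUsage("support", "launch scope", "core flow only"),
    ]
    # Only "ready" is reported: both teams agree on "launch scope".
    print(find_mismatches(usages))
```

Exact string comparison is a simplification. Two teams can phrase the same meaning differently, so a real system would need a softer comparison and a human to confirm the conflict.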
The human moves from labor to judgment
A good version of such a product may not remove humans from the loop. It may move them to the highest-leverage point in the loop.
Current systems ask humans to do a lot of coordination labor: read every thread, remember context, connect dots, rewrite updates, repeat decisions, chase owners, reconcile conflicting docs, notice that two teams are misaligned, remember what happened last time, and translate judgment into messages, docs, tickets, and plans.
I would want an AI system to ask humans for different things: judgment, values, tradeoffs, taste, accountability, final calls, and exceptions. The system can prepare the decision. The human still owns the calls that matter.
This is a more modest view of automation than "the AI does the work." It says the AI can reduce the cost of reaching the right judgment point, not simply increase the amount of activity around the old one.
Failure modes
This idea has plenty of ways to go wrong.
The first failure mode is optimizing for measurable motion instead of net outcome. Emails sent, tickets closed, documents generated, and approvals processed are visible. Trust, quality, morale, future maintenance burden, and strategic drift are harder to measure. A bad system will chase the visible thing.
The second is over-trusting the scenario map. A model can make a path sound coherent without being right. I would keep the scenario map in the role of decision support, not prophecy.
The third is bad or incomplete work context. If the system does not understand the real dependencies, permissions, owners, risks, and history, it may surface the wrong judgment point with great confidence.
The fourth is permission boundaries collapsing. A system that acts across tools can accidentally flatten boundaries that exist for good reasons. Access, approval, confidentiality, and scope of impact are product constraints, not administrative details.
The fifth is hidden value judgment. "Better" depends on whose outcome matters. A decision can be good for speed and bad for quality, good for one team and costly for another, good this quarter and harmful later. I would want the product to make those values visible instead of baking them into the system quietly.
The sixth is hidden tension. Surfacing leverage points can reveal bottlenecks that people have learned to work around. The issue may be unclear ownership. A team may be overloaded. The roadmap may have too many priorities. A policy may be incoherent. A useful system can make that tension visible, and not every organization will be ready for that.
The shape is still unresolved
The final product shape is still open. It might not be a dashboard, inbox, canvas, graph, or chat. It may be some shape we have not invented yet.
Messages, docs, tickets, meetings, approvals, permissions, and audit trails do not disappear. In many cases they are exactly what make work safe enough to coordinate.
I am more interested in AI products that do more than place a model inside every old workflow. Those products ask what the model makes newly possible, then design around judgment, consequence, follow-through, and learning.
The question changes from "How do we add AI to existing software?" to "What should the product around the model help people decide and carry through?"