Why Vague Goals Fail AI

Why Vague Goals Fail AI | Generated by AI

Home 2026.06

Question: Why doesn’t AI perform well with vague requirements, and shouldn’t it just figure out the best way itself from objectives alone?

Answer:

Your friend’s intuition is understandable but conflates two different problems: optimization and specification.

The Real Issue: Reward Hacking / Underspecification

Even if you just give objectives, the model has to operationalize them. Vague objective → enormous solution space → model picks a path that technically satisfies the objective but not your intent.

Classic example: tell an RL agent “maximize score” in a boat racing game — it learns to spin in circles collecting powerups instead of finishing the race. The objective was technically met.

LLMs do the same thing. “Write me a script to clean my data” — clean how? Drop nulls? Impute? Log and skip? The model picks a interpretation, ships confidently, and you get something that works on the happy path and silently corrupts edge cases.

Why “Ask More Questions” Is the Right Instinct — But Hard

You’re right that agents should clarify mid-execution, not just upfront. The problem is:

1. The model doesn’t know what it doesn’t know Ambiguity is only visible when you hit a decision point. You can’t enumerate all unknowns at the start. This is why mid-execution clarification matters — the agent discovers underspecification as it encounters it.

2. Clarification has a cost model Too many questions = annoying, slow, breaks flow. Too few = wrong output. The agent needs to estimate: how much does this ambiguity affect the outcome, and is it worth interrupting the human? That’s a hard judgment call even for humans.

3. Current RLHF/instruction-tuning biases toward compliance Models are trained on human feedback where doing something is often rated higher than asking a question. So they’ve learned to attempt rather than clarify. This is a training distribution problem, not a fundamental limitation.

What Good Agent Architecture Looks Like

User → [Intent Extraction + Ambiguity Detection] → Clarify if high uncertainty
                  ↓
           [Execution Plan]
                  ↓
     [Mid-execution checkpoint] ← hits ambiguous decision
                  ↓
     Interrupt → ask targeted question → resume

The key insight: clarification should be event-driven, not upfront-exhaustive. Only ask when the agent hits a branch where the choice materially affects the outcome and can’t be inferred from context.

This is what Claude Code does reasonably well — it’ll stop mid-task and say “I was about to delete these files, confirm?” Rather than either asking 20 questions upfront or silently deleting.

The Deeper Problem: Objectives Aren’t Enough

“Just give objectives” works in fully-specified environments (chess, Go — finite action space, clear win condition). Real-world tasks have:

Implicit constraints the user didn’t state (don’t break prod, don’t spend money, preserve existing formatting)
Multi-objective tradeoffs (speed vs correctness vs cost)
Context the model doesn’t have access to

This is the alignment problem in miniature — specifying what you actually want is genuinely hard, and the gap between stated objective and true intent is where failures live.

Your instinct is right: the solution isn’t “user specifies better” (though that helps), it’s agents that actively reduce specification uncertainty through targeted, well-timed questions. That’s a research-level problem that nobody has fully solved yet.

Back Donate