Agentic Abstention: Do Agents Know When to Stop Instead of Act?
A benchmark and analysis of when tool-using LLM agents should stop and abstain rather than continue acting.
Reinforcement learning for agentic VQA that balances asking for clarification against answering when context is underspecified.
Benchmarking dark pattern susceptibility of computer-use agents in realistic UI environments.