Agentic Systems

Agentic Abstention: Do Agents Know When to Stop Instead of Act?

A benchmark and analysis of when tool-using LLM agents should stop and abstain rather than continue acting.

May 1, 2026 • 1 min read

Clarify or Answer: Reinforcement Learning for Agentic VQA with Context Under-specification

Reinforcement learning for agentic VQA that balances asking for clarification against answering when the context is underspecified.

Jan 20, 2026 • 1 min read

SusBench: An Online Benchmark for Evaluating Dark Pattern Susceptibility of Computer-Use Agents

Benchmarking dark pattern susceptibility of computer-use agents in realistic UI environments.

Jan 1, 2026 • 1 min read