agent-computer-use is a command-line interface (CLI) tool that lets AI agents interact with and control any desktop application. Using the accessibility tree, agents can click buttons, type text into fields, and read screen content directly from the terminal.
Desktop App Control: Interact with any application as if you were using a mouse and keyboard, but through a simple CLI.
Agent Integration: Built specifically for AI agents, enabling them to autonomously perform tasks by understanding and acting upon the interface.
Accessibility Tree Utilization: Works by reading the accessibility tree, so it can identify and interact with elements such as buttons, text fields, and menus, much as screen readers do.
Snapshot and Act: Capture the current state of the application's UI elements with snapshots, assign references, and then use these references to perform actions like clicking or typing.
Cross-Platform Compatibility: Currently supports macOS, Windows, and Linux.
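
The snapshot-and-act flow described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the tool's actual implementation or command syntax: the `Node` class, the `e1`-style reference format, and the `snapshot`, `type_text`, and `click` helpers are all hypothetical names, and the tree here is a hand-built stand-in for a real accessibility tree.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Node:
    """One element in a toy accessibility tree (hypothetical structure)."""
    role: str
    name: str
    value: str = ""
    children: List["Node"] = field(default_factory=list)


def snapshot(root: Node) -> Dict[str, Node]:
    """Walk the tree depth-first and assign a short reference to each node."""
    refs: Dict[str, Node] = {}

    def visit(node: Node) -> None:
        refs[f"e{len(refs) + 1}"] = node  # assumed reference format: e1, e2, ...
        for child in node.children:
            visit(child)

    visit(root)
    return refs


def type_text(refs: Dict[str, Node], ref: str, text: str) -> None:
    """Simulate typing into the text field behind a reference."""
    node = refs[ref]
    if node.role != "textfield":
        raise ValueError(f"{ref} is a {node.role}, not a textfield")
    node.value = text


def click(refs: Dict[str, Node], ref: str) -> str:
    """Simulate clicking the button behind a reference."""
    node = refs[ref]
    if node.role != "button":
        raise ValueError(f"{ref} is a {node.role}, not a button")
    return f"clicked {node.name}"


# A stand-in for what an accessibility API might expose for a login window.
window = Node("window", "Login", children=[
    Node("textfield", "Username"),
    Node("button", "Sign in"),
])

refs = snapshot(window)            # 1. capture the UI and assign references
for ref, node in refs.items():
    print(ref, node.role, node.name)

type_text(refs, "e2", "alice")     # 2. act on elements through their references
result = click(refs, "e3")
print(result)
```

The point of the pattern is that an agent never needs pixel coordinates: it reads the list of references from the snapshot, picks the element by role and name, and issues an action against that reference.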
Built with