Anthropic's Claude 3.5 Sonnet can now simulate keystrokes, mouse clicks, and cursor movement across the screen.
Anthropic has released a "computer use" capability in its API, now available in open beta, as reported by TechCrunch. With it, the model can interact with a computer "like a human": it can "see" the screen, move the cursor, press keys, and click the mouse. Claude 3.5 Sonnet can work with any application and data available on the machine. To have the model perform a task, you simply give it a command, such as asking it to fill out a form using files stored on the computer. All of its actions are visible in a dedicated window.
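For developers, wiring this up looks roughly like the sketch below, which uses the Anthropic Python SDK. The model ID, tool type, and beta flag are taken from Anthropic's beta documentation at the time of writing and may change; the file name in the prompt is purely illustrative.

```python
# Minimal sketch of the computer-use beta via the Anthropic Python SDK.
# Identifiers (model, tool type, beta flag) are assumptions based on the
# public beta docs and may differ in later releases.
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

response = client.beta.messages.create(
    model="claude-3-5-sonnet-20241022",
    max_tokens=1024,
    tools=[
        {
            "type": "computer_20241022",   # screen / keyboard / mouse tool
            "name": "computer",
            "display_width_px": 1280,
            "display_height_px": 800,
        }
    ],
    messages=[
        {
            "role": "user",
            # Hypothetical task mirroring the form-filling example above.
            "content": "Fill out the web form using the data in report.csv on the desktop.",
        }
    ],
    betas=["computer-use-2024-10-22"],
)

# The model responds with tool_use blocks (e.g. "take a screenshot" or
# "click at x, y"); the developer's own harness must execute each action
# on a real or virtual display, return the result, and loop until done.
for block in response.content:
    print(block)
```

Note that the API only decides which actions to take; actually moving the cursor or pressing keys is left to the developer's environment, which is also where the safeguards described below come in.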
TechCrunch notes that the model still struggles with complex tasks: when asked to change a ticket reservation, it succeeded in less than 50% of cases, and when asked to cancel a reservation, it failed about a third of the time. Anthropic acknowledges other shortcomings as well: the model handles scrolling and zooming poorly and sometimes skips tasks. The company warns developers that the model is slow and error-prone, and recommends starting with low-risk tasks for testing.