Coin World News Report:
Reddit Users
The first to discover it
–Claude suddenly
Sharper,
More capable
Now we know why: Anthropic has made significant upgrades to its AI models, including an enhanced Claude 3.5 Sonnet and much-needed upgrades to its lightweight Haiku model.
Latest News: These artificial intelligences can now physically control computers like humans, move cursors, scroll pages, and even click buttons.
In a video demonstration, Anthropic researcher Sam Ringer showed how Claude was able to fill out forms on external websites by scrolling through spreadsheets, searching for company information after analyzing the company’s CRM, and then understanding and filling out fields in the forms.
Anthropic stated in an article, “With Claude now available on API, developers can instruct Claude to use computers just like people look at screens, move cursors, click buttons, and input text. Claude 3.5 Sonnet is the first cutting-edge AI model that provides computer usage.”
The official announcement was made earlier today. “We released computer usage in advance for developers to provide feedback and expect this capability to rapidly improve over time.”
Anthropic (or possibly one of their AI artificially intelligent button-pressers? Jk.) seems to have released the model before they made the announcement. For the past few hours, the Claude and Anthropic subreddits have been crowded with people trying to figure out what happened because their AI is performing so well: users report that it’s faster, more accurate, and surprisingly, it no longer apologizes.
“Claude is back, so much better. It just responds to you like it really understands intent, instead of being flat and lifeless.”
NextGenA user said in a post on Reddit. “I was stuck on a coding problem for hours with o1 Mini and o1 Preview, gradually getting worse responses. Submitted the same prompt to Claude and it had no problem right away.”
Roth_Skyfire said in another comment.
They’re right. According to Anthropic’s report, in SWE bench verification tests, the coding capability of Claude 3.5 Sonnet jumped from 33.4% to 49%, surpassing competitors like OpenAI’s o1 Preview. This is not just a minor improvement. Every benchmark reported by Anthropic indicates that the new Claude 3.5 Sonnet is much better than the original model.
Image: Anthropic
But here’s where things get really interesting. The upgraded Sonnet is not only smarter; it can now control your computer. Anthropic refers to this new feature as “computer usage,” which is currently in beta. It works by allowing Claude to access your desktop and perform a task. Then, the AI will start using your computer remotely just like a human would, moving the cursor, clicking buttons, typing commands, filling out forms and text fields, just like a human.
However, this feature is only available through the API, so end-users won’t be able to enjoy it in the short term.
Anthropic has trained Claude to interpret what’s happening on your screen visually. Developers can instruct it to perform tasks such as filling out forms, browsing websites, or even using software applications. It’s like having your AI sit in front of your computer and work for you, except it doesn’t get tired and (hopefully) doesn’t make as many mistakes as us humans.
The feature is in the testing phase, as it still encounters some basic issues – scrolling and zooming give it trouble. That’s why Anthropic keeps a close eye on things, storing screenshots for at least 30 days and conducting security checks to detect any suspicious behavior.
The company’s paranoia is justified. A few months ago, Microsoft launched a feature called “Recall,” which allowed Copilot+ to take screenshots of users’ computers, making their AI more useful and relevant. The noise was so loud that Microsoft had to postpone the plan after the Copilot+Recall feature was labeled as “spyware” and authorities started investigating it.
But Anthropic is made up of good guys, and they promise to be different. The research team stated, “We find that the updated Claude 3.5 Sonnet, including its new computer usage capability, remains at AI safety level 2 – meaning it doesn’t require higher security and safety measures than what we currently have in place.”
Companies like Replit have already integrated Claude’s computer usage feature to help automate app evaluation, and The Browser Company is testing its ability to streamline web-based workflows. These early adopters are exploring ways to have Claude handle tasks that typically require dozens or even hundreds of manual steps.
Furthermore, Anthropic’s budget-friendly model, Claude 3.5 Haiku, is now as powerful as the previous flagship model, Claude 3 Opus. However, this model runs at only a fraction of the cost and has much lower latency, making it more accessible without sacrificing too much performance.
Claude 3.5 Haiku will be available in November.
Subscribe to Updates
Get the latest creative news from FooBar about art, design and business.
Anthropic releases new Claude 35 sonnet An intelligent model to take over your computer
Add A Comment