Updated on October 28, 2024
In IT support, it’s common to have a support agent remotely access your computer to provide support. With the new Claude 3.5 Sonnet, an AI model can do that, too!
Anthropic has included a new Computer control mode to their AI model that can do a lot of tasks, including ordering food and booking hotel rooms. Additionally, the new Claude scores highly on several AI benchmarks and even defeats OpenAI o1 in coding tasks.
In this article, we’ll cover the new features of Claude and predict what second-order effects these capabilities might have in the future. We’ll cover:
1. Computer Use
2. Claude 3.5 Sonnet v/s GPT 4-o v/s Google Gemini
3. How will Claude 3.5 Sonnet enhance Customer Support?
Computer Use by the New Claude 3.5 Sonnet
If the headline innovation in OpenAI o1 was “reasoning,” then the one standout feature for Claude 3.5 Sonnet is its capability for computer use. In essence, researchers from Anthropic have taught Claude how to control a personal computer and do small tasks.
How Was it Trained
Claude 3.5 Sonnet was trained to react to live screenshots of a user’s computer. They trained Claude to:
1. Count pixels on a screenshot to understand how far it has to move the cursor.
2. Use xdtools and shell commands to use a computer.
It’s a versatile tool since shell commands can automate multiple workflows on a desktop. Similarly, Claude 3.5 Sonnet has a lot of uses, too.
Practical Use
According to the new documentation, the PC use feature can be used to:
1. Do basic tasks on the computer like editing files, deleting files, or replacing them.
2. Write and edit code on offline IDEs.
3. Access the internet to browse videos and do specific tasks
And much more.
While the current model has mainly demonstrated efficiency in coding and different tasks, it could be further trained to provide IT support. The model can give capable IT support using the user’s computer with some fine-tuning.
It should also be able to solve problems with web tools and demonstrate basic workflows.
Claude 3.5 Sonnet v/s GPT 4-o v/s Google Gemini
The new version of Claude 3.5 Sonnet improves over all other publicly available state-of-the-art (SOTA) models, including Open AI o1.
Here are the results.
Claude 3.5 Sonnet (New) | Claude 3.5 Sonnet | GPT 4-o | Gemini 1.5 | |
---|---|---|---|---|
Graduate-level reasoning (GPQA- Diamond) | 65% | 59.4% | 53.6% | 59.1% |
Undergraduate-level Knowledge(MMLU Pro) | 78% | 75.1% | – | 75.8% |
Agentic Coding (SWE-bench verified) | 49% | 33.4% | – | – |
Code (HumanEval) | 93.7% | 92% | 90.2% | – |
Math Problem-Solving(MATH) | 78.3% | 71.1% | 76.6% | 86.5% |
Multilingual Math (MGSM) | 92.5% | 91.6% | 90.5% | – |
Reasoning over Text (DROP, F1 score) | 88.3% | 87.1% | 83.4% | – |
Agentic tool use (TAU-bench) | 69.2% (Retail) 46% (Airline) | 62.6% (Retail) 36% (Airline) | – | – |
Chart Q&A(Relaxed accuracy test) | 90.8% | 90.8% | 85.7% | – |
Document Q&A (ANLS test) | 94.2% | 95.2% | 92.8% | – |
As you can see, the new Claude 3.5 Sonnet significantly outperforms most previous SOTA models. Most importantly, it also shows improved performance in the TAU bench, a specific benchmark that tests the capability of an AI model in terms of tool use in particular business contexts.
The new model can use tools more intuitively in the retail and airline industries.
This model can also:
1. Solve mathematical problems more accurately
2. Accurately assess documents and images
3. Solve Google-proof PhD-student-level questions
4. Defeat Open AI o1 in a coding benchmark.
5. Solve more generic undergraduate-level questions with efficiency.
This provides the new Claude 3.5 Sonnet with advanced-level capabilities that it can use to understand your documents and answer repetitive questions.
This also qualifies explicitly for IT support. Let’s discuss that next.
How the New Claude 3.5 Sonnet Enhance Customer Support?
You can already see the general capabilities of generative AI for customer support through our guide. However, the new model has some specific advantages over older methods:
1. Computer Use – The model can perform different actions on a Windows desktop, helping in IT support.
2. Empathetic Answers – The model has a warm and compassionate tone that can answer questions in a more human-like way.
3. Knowledge Base Understanding – The model outperforms others in understanding long documents and knowledge bases and giving contextual answers.
4. Data Extraction – The model can understand charts, graphs, and other visual data to extract insights from them.
5. Robotic Process Automation – Alongside repetitive questions, the model can also perform repetitive tasks that a support agent needs to perform.
These capabilities should make the new Claude 3.5 Sonnet the new standard for AI customer support chatbots.
Conclusion
Though the name new Claude 3.5 Sonnet might sound confusing because of the previous model, the latest release from Anthropic makes vast improvements over the last iteration. With its capabilities around reasoning and automated PC use, it can help solve many customer support problems at scale.
This should solve a lot of critical problems in customer support. In terms of technology, this is the first step towards level 3 of AI progress.
This new release from Anthropic positions it firmly at the top of the AI race. It comes at the heels of our new feature that lets you integrate the latest Claude models with your customer support chatbots.
If you’re looking for a customer support chatbot powered by Claude. Talk to us.
A Content Marketing Manager at Kommunicate, Uttiya brings in 11+ years of experience across journalism, D2C and B2B tech. He’s excited by the evolution of AI technologies and is interested in how it influences the future of existing industries.