Updated on May 2, 2024

Battle of LLMs: GPT-4 Turbo vs. Claude 3 Opus vs. Google Gemini 1.5 Pro

ChatGPT has already entered the product manager’s toolkit for designing and implementing unique user experiences. Its impact is profound: it has even changed job descriptions and titles!

“I believe that all product managers will be AI product managers. This is because we see all products needing to have a personalized experience, or ‘A recommender system’ that is actually good.”

—  Marily Nika, computer scientist and an AI Product Leader at Meta

From GPT–powered customer support, language translation, and shopping assistants to feedback, we have covered many such use cases in our 15 ways to use ChatGPT for product engagement.

But as of 2024, Google and Anthropic have developed language models that are equally competitive, or better, and may outperform GPT in certain use cases.

To make upgrading your product toolkit easier, we compare the latest premium language models as of May 2024.

We call them the AI Titans: GPT–4 Turbo, Claude 3 Opus, and Google Gemini 1.5 Pro. We have compared them based on:

  • AI benchmarks: technical performance parameters like underlying technology, context windows, speed, etc.
  • Use cases: how you can incorporate each model into your product toolkit
  • Response test: prompt tests for vision, text generation, following instructions, maths problems, and reasoning
  • Pricing: comparing paid versions for each
  • Upcoming updates: funding raised, technology upgrades, and latest news

First, let’s understand what’s new with the top three LLMs: GPT, Claude, and Gemini

Here’s how the latest versions of models compare to their predecessors:

What’s new with GPT–4 Turbo?

GPT–4 Turbo is the latest-generation model developed by OpenAI. It is OpenAI’s most capable model and solves complex problems more accurately than its predecessors. It has been available on the ChatGPT Plus plan since April 2024.

Compared to its previous versions, GPT–4 Turbo promises:

  • Updated knowledge cut-off of April 2023
  • Cheaper: 3X cost savings for input tokens and 2X cost savings for output tokens
  • Larger context windows: GPT–4 Turbo provides 128k tokens compared to 16k tokens for GPT–3.5 Turbo
  • Multimodality: you can now input images and text-to-speech to receive a response

Here’s how GPT–4 Turbo compares with GPT–4 and GPT–3.5 to help you gauge the improvement made:

| Criteria/Model | GPT–3.5 Turbo | GPT–4 | GPT–4 Turbo |
| --- | --- | --- | --- |
| Knowledge cut-off | January 2022 | April 2023 | April 2023 |
| Accessibility | Free | Plus | Plus (since April 2024) |
| Input prompt | Text | Text and Images | Text, Images, Text-to-Speech |
| Context window | 16,385 tokens (GPT–3.5 turbo–1125); 4,096 tokens (GPT–3.5 turbo instruct) | Two versions: 8,192 and 32,000 tokens | 128,000 tokens |
| Price (input) | $0.50 / 1M tokens | $30.00 / 1M tokens | $10.00 / 1M tokens |
| Price (output) | $1.50 / 1M tokens | $60.00 / 1M tokens | $30.00 / 1M tokens |

What’s new with Claude 3 Opus?

Claude 3 is the latest AI chatbot assistant developed by Anthropic. You can choose between three model options – Opus, Sonnet, and Haiku, each having varied use cases and performance levels.

Claude 3 models can handle complex tasks and demonstrate human-like fluency in their responses. Here are their key features:

  • Multilingual: all Claude 3 models can provide output in non-English languages (like Japanese or Spanish), thus suitable for translation use cases.
  • Multimodal: all Claude 3 models can process and analyze images and extract document data.
  • More intelligent: Claude 3 models are superior in intelligence compared to the previous Claude 2.0, Claude 2.1, and Claude 1.2 models.

Each model has a specific use case – here’s a comparison of the Claude 3 family models:

| Criteria/Model | Claude 3 Opus | Claude 3 Sonnet | Claude 3 Haiku |
| --- | --- | --- | --- |
| Speed | Similar speeds to Claude 2 and 2.1 | 2x faster than Claude 2 and Claude 2.1 | Fastest model |
| Ideal for | Complex tasks | Enterprise workloads that require balance in intelligence and speed | Instant, accurate, and targeted responses |
| Context window | 200K | 200K | 200K |
| Knowledge cut-off | Aug 2023 | Aug 2023 | Aug 2023 |
| Price (input) | $15 / 1M tokens | $3 / 1M tokens | $0.25 / 1M tokens |
| Price (output) | $75 / 1M tokens | $15 / 1M tokens | $1.25 / 1M tokens |
Claude 3 Models

What’s new with Google Gemini 1.5 Pro?

Gemini is a family of multi-modal LLMs developed by Google DeepMind. They are the successors to LaMDA and PaLM 2 and are positioned as direct competitors to OpenAI’s ChatGPT. Gemini 1.5 Pro, released in February 2024, is the latest model and promises long-context understanding while using less compute.

Key Gemini 1.5 Pro features compared to its previous version include:

  • Larger context window: with 1 million tokens, Gemini 1.5 Pro can handle 11 hours of audio, 1 hour of video, and more than 30,000 lines of code!
  • In-context learning: better responses from longer prompts without fine-tuning

Here’s how Gemini 1.5 Pro stacks against its predecessors:

| Criteria/Model | Gemini 1.0 | Gemini 1.5 Pro |
| --- | --- | --- |
| Speed | Comparatively slower response time | Faster |
| Context window | 32,000 tokens | 128,000 tokens (soon 1 million tokens) |
| Use case | Cannot handle longer code blocks | Can handle longer code blocks |
| Pricing (input) | $0.50 / 1 million tokens | $7 / 1 million tokens |
| Pricing (output) | $1.50 / 1 million tokens | $21 / 1 million tokens |
Gemini 1.5 Pro stacks against its predecessors

Benchmarking the AI Titans: a data-driven showdown

Now that we understand the progressive updates of each model, let’s stack their latest versions together to compare their underlying technologies and performance. We will examine key metrics such as coding, reasoning, and text generation so that product managers can understand the strengths and weaknesses of these AI models.

Note that these comparisons only hold on paper, since a model can be fine-tuned to score well on a specific benchmark.

Here are benchmark ranks sourced from Papers with Code:

| Benchmark/Model | GPT–4 | Claude 3 Opus | Gemini 1.5 Pro | Source |
| --- | --- | --- | --- | --- |
| Code generation | #1 | #9 | #20 | HumanEval |
| Sentence completion | #4 | NA | NA | HellaSwag |
| Common sense reasoning | #1 | Claude 2 at #3 | NA | ARC (Challenge) |
| Arithmetic reasoning | #1 | #9 | Gemini Ultra at #10 | GSM8K |
| GenAI IQ Tests | 85 (for GPT–4 Turbo) | 101 | 76 | Maximum Truth |
Benchmark ranks

Conclusion:

Gemini 1.5 Pro and GPT–4 Turbo were released a few weeks back, hence comprehensive benchmarking isn’t available yet. As of May 2024, GPT–4 is ahead in the AI benchmarking tests across papers. Before adopting a model, compare its benchmark performance ranks for your use case.
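
One way to compare ranks for a single use case is to keep the benchmark table in a small lookup and sort it per benchmark. The sketch below is only an illustration: the ranks are copied from the table above, while the names `BENCHMARK_RANKS` and `ranked_models` are made up for this example. Cells that the table marks as NA, or that report a different model (Claude 2, Gemini Ultra), are treated as missing.

```python
# Ranks copied from the benchmark table above. None marks "NA" or a rank the
# table reports for a different model (for example Claude 2 or Gemini Ultra).
BENCHMARK_RANKS = {
    "HumanEval": {"GPT-4": 1, "Claude 3 Opus": 9, "Gemini 1.5 Pro": 20},
    "HellaSwag": {"GPT-4": 4, "Claude 3 Opus": None, "Gemini 1.5 Pro": None},
    "ARC (Challenge)": {"GPT-4": 1, "Claude 3 Opus": None, "Gemini 1.5 Pro": None},
    "GSM8K": {"GPT-4": 1, "Claude 3 Opus": 9, "Gemini 1.5 Pro": None},
}


def ranked_models(benchmark):
    """Return (model, rank) pairs for one benchmark, best rank first.

    Models without a published rank are left out rather than guessed.
    """
    ranks = BENCHMARK_RANKS[benchmark]
    available = [(model, rank) for model, rank in ranks.items() if rank is not None]
    return sorted(available, key=lambda pair: pair[1])


for benchmark in BENCHMARK_RANKS:
    print(benchmark, ranked_models(benchmark))
```

If code generation is your main use case, you would read the `HumanEval` row; a benchmark where only one model has a rank tells you little about the other two.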

When to use each AI Titan for your product? – a use case comparison of GPT–4 Turbo, Claude 3 Opus, and Gemini 1.5 Pro

Benchmark performance matters less than your use case, so first settle on what you are adopting the AI model for. Here’s where each of the top three LLMs excels:

What is GPT–4 Turbo good for?

Use case of AI Models, ChatGPT, Claude 3 and Gemini 1.5

Image Source: OpenAI – Introducing GPTs

GPT–4’s superior understanding of language makes it suitable for content creation. You can build custom GPTs tailored to your requirements to write blog posts, copy, social media captions, and more. It is also good at natural conversations, making it ideal for use cases like customer support, negotiations, coaching, etc.

You can explore real custom GPTs for potential use cases in the custom GPT marketplace by OpenAI.

What is Claude 3 Opus good for?

Anthropic positions Claude 3 Opus for applications that require powerful computing. It can integrate into large enterprise workflows and applications to deliver top-level performance. If you want to save costs and want balanced performance, Claude 3 Sonnet is a better choice. 

Typical use cases for the Claude 3 family include data analysis, extracting information from long documents, summarizing responses, contract drafting, and more. 

For example, with its 200k context window, Claude 3 Opus saves time in reading, interpreting, summarizing, and creating legal documents.
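
Before committing to a model for long documents, it can help to check roughly whether a document fits in each context window. The sketch below uses the window sizes quoted in this article and assumes about four characters per token, which is only a rule of thumb for English text; real counts depend on each vendor’s tokenizer. The function names and the output reserve are hypothetical.

```python
import math

# Context windows in tokens, as quoted in this article. Gemini 1.5 Pro is
# listed with the 1 million figure; the comparison table above shows
# 128,000 tokens with 1 million "soon".
CONTEXT_WINDOWS = {
    "GPT-4 Turbo": 128_000,
    "Claude 3 Opus": 200_000,
    "Gemini 1.5 Pro": 1_000_000,
}

# Assumption: roughly 4 characters per token for English text.
CHARS_PER_TOKEN = 4


def estimate_tokens(text):
    """Rough token estimate; use the vendor's tokenizer for real budgeting."""
    return math.ceil(len(text) / CHARS_PER_TOKEN)


def models_that_fit(text, reserved_for_output=4_000):
    """List models whose window holds the text plus room for the response."""
    needed = estimate_tokens(text) + reserved_for_output
    return [model for model, window in CONTEXT_WINDOWS.items() if needed <= window]


# A hypothetical 600,000-character contract bundle (about 150,000 tokens).
contract_bundle = "x" * 600_000
print(models_that_fit(contract_bundle))
```

Under these assumptions the bundle is too large for a 128,000-token window but fits the 200K and 1 million token windows, which is the kind of gap that matters for legal or research documents.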

Here’s a review by YouTuber Andy Stapleton who shares the ‘Academic research’ use case:


What is Gemini 1.5 Pro good for?

Gemini 1.5 Pro presents a good use case for multi-modal applications. It combines natural language processing with computer vision and other sensory inputs. If you combine this with other Google products — you can power immersive user experiences, enhance product visualization, and unlock new dimensions of customer engagement.

Here’s what it looks like —

Gemini 1.5 Pro is also good for handling longer code blocks thanks to its 1 million tokens context window.

Here’s a video by Google where the model reasons across a 402-page manuscript:


Decoding model architectures of GPT–4, Claude 3 Opus, and Gemini 1.5 Pro

Understanding the underlying model architectures for top LLM models helps product managers anticipate which AI model aligns best with their product’s specific needs. 

GPT–4 Turbo: The Transformer-based powerhouse

GPT–4 Turbo builds on the foundations of GPT–3 and retains its transformer-based architecture. The improvements include:

  • Better response time
  • Refined attention mechanisms, which lead to a more nuanced and contextual understanding of the input text
  • Expanded training data: GPT–4 Turbo is multi-modal, has new safeguards for ethical and desirable behavior, and has a more recent knowledge cut-off

Claude 3 Opus: Anthropic’s versatile approach

The Claude 3 family focuses on functional specializations. Claude 3 Opus, its flagship model, has an architecture that blends elements of transformer-based models and other neural network architectures. It also incorporates Anthropic’s proprietary “Constitutional AI” principles, which aim to instill ethical and safety-conscious behaviors in the model.

Google Gemini 1.5 Pro: The multimodal powerhouse

Sundar Pichai talking about Gemini 1.5 Pro

— Sundar Pichai, CEO of Google for Gemini 1.5 Pro release on The Keyword

Gemini 1.5 Pro is a mid-size multimodal model that adopts the Mixture-of-Experts (MoE) architecture. MoE enables the model’s parameter count to grow without increasing the number of activated parameters per input. This ensures its efficiency in serving while accommodating expansion, thus facilitating longer context understanding.
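
To make the MoE idea concrete, here is a toy sketch of top-k expert routing. It is not Gemini’s implementation: the class `ToyMoELayer`, the scalar "experts", and the softmax router are simplifications invented for illustration. It only shows the property described above, that adding experts grows the total parameter count while the number of parameters activated per input stays fixed.

```python
import math
import random


def softmax(scores):
    """Turn raw router scores into positive weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


class ToyMoELayer:
    """Toy Mixture-of-Experts layer over one scalar input (illustration only)."""

    def __init__(self, num_experts, top_k=2, seed=0):
        rng = random.Random(seed)
        self.top_k = top_k
        # Each hypothetical expert is a tiny linear function: weight * x + bias.
        self.experts = [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(num_experts)]
        # The router holds one scoring weight per expert.
        self.router = [rng.uniform(-1, 1) for _ in range(num_experts)]

    def total_expert_parameters(self):
        return 2 * len(self.experts)

    def active_expert_parameters(self):
        # Only the top_k selected experts run for any single input.
        return 2 * self.top_k

    def forward(self, x):
        gate = softmax([w * x for w in self.router])
        # Keep only the top_k highest-scoring experts for this input.
        chosen = sorted(range(len(gate)), key=lambda i: gate[i], reverse=True)[: self.top_k]
        norm = sum(gate[i] for i in chosen)
        # Mix the chosen experts' outputs using their renormalized gate weights.
        output = sum((gate[i] / norm) * (self.experts[i][0] * x + self.experts[i][1]) for i in chosen)
        return output, chosen


small = ToyMoELayer(num_experts=4)
large = ToyMoELayer(num_experts=64)
for name, layer in (("4 experts", small), ("64 experts", large)):
    _, chosen = layer.forward(0.5)
    print(name, layer.total_expert_parameters(), layer.active_expert_parameters(), len(chosen))
```

Both layers activate the same two experts’ worth of parameters per input even though the larger one holds sixteen times as many expert parameters, which is the serving-efficiency argument in miniature.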

It is also extensively trained on large-scale, multimodal datasets, which enables the model to develop a deep understanding of the relationships between different types of information.

Comparing brains of the AI Titans – Training and Learning Capabilities of GPT–4, Claude 3 Opus, and Gemini 1.5 Pro

Here’s a summary of each model’s training and learning capabilities to help you design internal training methods for your product’s needs:

| Capability/Model | GPT–4 Turbo | Claude 3 Opus | Gemini 1.5 Pro |
| --- | --- | --- | --- |
| Training process | Leverages unsupervised learning techniques over vast datasets. | Combines Transformer-based models with specialized training. | Rigorous training on multimodal datasets focused on understanding varied data types and modalities. |
| Adaptation and improvement to new data over time | Adapts through fine-tuning and transfer learning. | Adapts through continual learning and exposure to varied inputs. | Leverages multimodal architecture to process and reason over different types of information, contexts, and tasks. |
Training and Learning Capabilities of GPT–4, Claude 3 Opus, and Gemini 1.5 Pro

Comparing how user-centric the interfaces of GPT-4 Turbo, Claude 3 Opus, and Google Gemini 1.5 Pro are

Let’s compare the UIs of all three AI Titans based on ease of integration, customization options, and user feedback.

GPT–4 Turbo user interface

You can easily access GPT–4 Turbo via ChatGPT Plus’s default interface. It offers Team pricing options with a dedicated workspace where you can build and share GPTs. It also allows you to switch between personal and team accounts for easy storage and segregation of work.

ChatGPT–4 Turbo user interface

Image Source: Maginative

Gemini 1.5 Pro user interface

Image Source: Beebom

The Gemini 1.5 Pro interface looks similar to Google’s existing Gemini platform, which runs on its predecessor, Gemini 1.0. It has a sleek, modern, and easy-to-use UI that follows Google’s Material Design principles.

It shows the number of tokens used when you enter your prompt and offers advanced options like setting the temperature or multi-modal prompting. You can also switch between Gemini models, add stop sequences, integrate with Google Workspace, and much more.

Claude 3 Opus user interface


Image Source: TechCrunch

Once you subscribe to Claude Pro, you get access to the model selector shown in the screenshot above, where you can choose Opus or any other model. The interface for entering prompts and checking responses is similar to the free version of Claude AI: simple and clutter-free, with a choice of light or dark mode. You can also attach files to enrich your prompt.


Comparing cost and accessibility for GPT-4 Turbo, Claude 3 Opus, and Google Gemini 1.5 Pro

GPT–4 Turbo is the more affordable of the two at $10 per million input tokens and $30 per million output tokens. In contrast, Claude 3 Opus comes with a heftier price tag of $15 per million input tokens and $75 per million output tokens, which is 1.5 times GPT–4 Turbo’s input price and 2.5 times its output price.

Google, however, has taken a different approach with Gemini 1.5 Pro, offering preview pricing of $7 per million input tokens and $21 per million output tokens. This makes it the lowest-priced option per token among the three AI Titans.

| Per million tokens / Model | GPT–4 Turbo | Claude 3 Opus | Gemini 1.5 Pro |
| --- | --- | --- | --- |
| Input | $10.00 | $15.00 | $7.00 |
| Output | $30.00 | $75.00 | $21.00 |
Cost Competition
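
To see what these per-token prices mean for a real budget, you can multiply them by your expected traffic. The sketch below uses the prices from the table above; the monthly volumes (50 million input tokens and 10 million output tokens) are hypothetical figures chosen only to illustrate the arithmetic, and the function name is made up for this example.

```python
# Per-million-token prices in USD, copied from the pricing table above.
PRICES_PER_MILLION = {
    "GPT-4 Turbo": {"input": 10.00, "output": 30.00},
    "Claude 3 Opus": {"input": 15.00, "output": 75.00},
    "Gemini 1.5 Pro": {"input": 7.00, "output": 21.00},
}


def estimate_cost(model, input_tokens, output_tokens):
    """Estimate the USD cost of a workload from its token counts."""
    price = PRICES_PER_MILLION[model]
    cost = (input_tokens * price["input"] + output_tokens * price["output"]) / 1_000_000
    return round(cost, 2)


# Hypothetical monthly workload; replace with your own traffic estimates.
MONTHLY_INPUT_TOKENS = 50_000_000
MONTHLY_OUTPUT_TOKENS = 10_000_000

for model in PRICES_PER_MILLION:
    monthly = estimate_cost(model, MONTHLY_INPUT_TOKENS, MONTHLY_OUTPUT_TOKENS)
    print(f"{model}: ${monthly:,.2f} per month")
```

For this assumed workload the estimate comes to $800 for GPT–4 Turbo, $1,500 for Claude 3 Opus, and $560 for Gemini 1.5 Pro. The gap widens as the share of output tokens grows, because output pricing differs more than input pricing.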

When it comes to accessibility, GPT–4 Turbo is more widely accessible via ChatGPT Plus.

Twitter post of Sam Altman on ChatGPT- 4 Turbo

Image Source: Twitter

Claude 3 Opus and Gemini 1.5 Pro, by contrast, are primarily available through their respective API platforms.

We can help you make the most of the top LLM models

Using the AI benchmark scores above, you can further weigh the tradeoffs and decide how to integrate these AI models into your product roadmap.

Here are some quick takeaways to help product managers choose a suitable model among the top three LLMs:

  • GPT–4 Turbo: lets you build custom GPTs for tasks such as language generation, question answering, and code generation.
  • Claude 3 Opus: excels in a variety of applications, from customer support and task assistance to content generation and creative problem-solving.
  • Gemini 1.5 Pro: its multimodal approach makes it valuable for delivering immersive, contextually aware experiences to your users.


Devashish Mamgain

I hope you enjoyed reading this blog post.

If you want the Kommunicate team to help you automate your customer support, just book a demo.
