Understanding Language Models

Context Window

The maximum amount of text (measured in tokens) that the model can process in a single interaction. Larger context windows allow the model to understand and respond to longer conversations or documents.

Tokens

Units of text that the model processes. A token can be as short as a single character or as long as a word (roughly 4 characters = 1 token in English).
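The 4-characters-per-token rule of thumb can be turned into a quick size check before sending a request. The following is a minimal sketch, not an exact tokenizer: the helper names `estimate_tokens` and `fits_context` are hypothetical, and it assumes that the reserved output tokens count against the same context window as the prompt, which is common but worth confirming for a given model.

```python
import math

# Rough English average taken from the definition above; real tokenizers vary.
CHARS_PER_TOKEN = 4


def estimate_tokens(text: str) -> int:
    """Approximate token count: about one token per four characters."""
    return math.ceil(len(text) / CHARS_PER_TOKEN)


def fits_context(prompt: str, max_output_tokens: int, context_window: int) -> bool:
    """True if the estimated prompt plus the reserved output fits in the window.

    Assumption: output tokens share the context window with the prompt.
    """
    return estimate_tokens(prompt) + max_output_tokens <= context_window


prompt = "Summarize the following report in three bullet points."
print(estimate_tokens(prompt))
print(fits_context(prompt, max_output_tokens=4096, context_window=128_000))
```

For billing or hard limits, the provider's own tokenizer gives the real count; this estimate is only useful for a fast sanity check.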

Input/Output Pricing

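Models are priced separately for processing your input (prompts) and for generating output (completions). Prices are quoted per 1 million (1M) tokens, and for every paid model listed below, output tokens cost more than input tokens.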
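As a worked example of how input and output pricing combine, the sketch below computes the cost of a single request. It assumes the listed rates are quoted per 1 million tokens, which matches the providers' published pricing; the function name `request_cost` is hypothetical, and the example rates are the GPT-4o figures from the list on this page.

```python
# Assumption: listed prices are per 1 million tokens.
TOKENS_PER_PRICE_UNIT = 1_000_000


def request_cost(
    prompt_tokens: int,
    completion_tokens: int,
    prompt_price: float,
    completion_price: float,
) -> float:
    """Cost of one request, with input and output billed at separate rates."""
    total = prompt_tokens * prompt_price + completion_tokens * completion_price
    return total / TOKENS_PER_PRICE_UNIT


# GPT-4o rates from the list: $2.5 prompt, $10 completion.
cost = request_cost(10_000, 1_000, prompt_price=2.5, completion_price=10.0)
print(f"${cost:.4f}")  # prints $0.0350
```

Here the 10,000 prompt tokens cost $0.025 and the 1,000 completion tokens cost $0.010, so a short answer to a long prompt is still dominated by the input side.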

Cache

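Some models offer prompt caching, which stores previously processed input so that repeated content does not have to be processed again at the full price, potentially reducing costs and improving response times. Where it is available, cached input is billed at a separate Cache Read rate, and some models also charge a Cache Write rate for storing it.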
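To see how caching changes the bill, the sketch below compares the input cost of a repeated prompt prefix with and without a cache. It is an illustration under stated assumptions: prices are per 1 million tokens, the whole prefix is either written to or read from the cache, and the function name `input_cost` is hypothetical. The rates are the Claude 3.5 Sonnet figures from the list on this page.

```python
# Assumption: listed prices are per 1 million tokens.
TOKENS_PER_PRICE_UNIT = 1_000_000


def input_cost(tokens: int, price: float) -> float:
    """Cost of a block of input tokens at the given per-1M rate."""
    return tokens * price / TOKENS_PER_PRICE_UNIT


prefix_tokens = 50_000  # hypothetical shared prefix, e.g. a long document

# Claude 3.5 Sonnet rates from the list.
uncached = input_cost(prefix_tokens, 3.0)      # normal Prompt Tokens rate
first_call = input_cost(prefix_tokens, 3.75)   # Cache Write rate
later_call = input_cost(prefix_tokens, 0.3)    # Cache Read rate

print(f"no cache, per call:    ${uncached:.4f}")
print(f"first call (write):    ${first_call:.4f}")
print(f"later calls (read):    ${later_call:.4f}")
```

Under these assumptions the first call costs slightly more than an uncached one, and each later call that hits the cache pays a tenth of the normal input rate, so caching pays off from the second request onward.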

Available Models

Choose from our selection of state-of-the-art language models.

GPT-4o mini

OpenAI

Context Window: 128,000 tokens
Max Output: 4,096 tokens
Pricing (per 1M tokens)
Prompt Tokens: $0.15
Cache Read: $0.075
Completion Tokens: $0.6

GPT-4o

OpenAI

Context Window: 128,000 tokens
Max Output: 4,096 tokens
Pricing (per 1M tokens)
Prompt Tokens: $2.5
Cache Read: $1.25
Completion Tokens: $10

Claude 3.5 Haiku

Anthropic

Context Window: 200,000 tokens
Max Output: 8,192 tokens
Pricing (per 1M tokens)
Prompt Tokens: $1
Cache Read: $0.1
Cache Write: $1.25
Completion Tokens: $5

Claude 3.5 Sonnet

Anthropic

Context Window: 200,000 tokens
Max Output: 8,192 tokens
Pricing (per 1M tokens)
Prompt Tokens: $3
Cache Read: $0.3
Cache Write: $3.75
Completion Tokens: $15

Claude 3 Opus

Anthropic

Context Window: 200,000 tokens
Max Output: 4,096 tokens
Pricing (per 1M tokens)
Prompt Tokens: $15
Cache Read: $1.5
Cache Write: $18.75
Completion Tokens: $75

Gemini 1.5 Flash 8B

Google

Context Window: 128,000 tokens
Max Output: 8,192 tokens
Pricing (per 1M tokens)
Prompt Tokens: $0.0375
Completion Tokens: $0.15

Gemini 1.5 Flash 002

Google

Context Window: 128,000 tokens
Max Output: 8,192 tokens
Pricing (per 1M tokens)
Prompt Tokens: $0.075
Completion Tokens: $0.3

Gemini 1.5 Pro 002

Google

Context Window: 128,000 tokens
Max Output: 8,192 tokens
Pricing (per 1M tokens)
Prompt Tokens: $1.25
Completion Tokens: $5

chatgpt-4o-latest

OpenAI

Context Window: 128,000 tokens
Max Output: 4,096 tokens
Pricing (per 1M tokens)
Prompt Tokens: $5
Completion Tokens: $15

Gemini Experimental 1114

Google

Context Window: 128,000 tokens
Max Output: 8,192 tokens
Pricing (per 1M tokens)
Prompt Tokens: $0
Completion Tokens: $0

Claude 3.5 Haiku (2024-10-22)

Anthropic

Context Window: 200,000 tokens
Max Output: 8,192 tokens
Pricing (per 1M tokens)
Prompt Tokens: $1
Cache Read: $0.1
Cache Write: $1.25
Completion Tokens: $5

Claude 3.5 Sonnet (2024-10-22)

Anthropic

Context Window: 200,000 tokens
Max Output: 8,192 tokens
Pricing (per 1M tokens)
Prompt Tokens: $3
Cache Read: $0.3
Cache Write: $3.75
Completion Tokens: $15

Gemini 1.5 Flash 8B Experimental (2024-09-24)

Google

Context Window: 128,000 tokens
Max Output: 8,192 tokens
Pricing (per 1M tokens)
Prompt Tokens: $0
Completion Tokens: $0

Gemini 1.5 Pro Experimental (2024-08-27)

Google

Context Window: 128,000 tokens
Max Output: 8,192 tokens
Pricing (per 1M tokens)
Prompt Tokens: $0
Completion Tokens: $0

Gemini 1.5 Flash Experimental (2024-08-27)

Google

Context Window: 128,000 tokens
Max Output: 8,192 tokens
Pricing (per 1M tokens)
Prompt Tokens: $0
Completion Tokens: $0

GPT-4o (2024-08-06)

OpenAI

Context Window: 128,000 tokens
Max Output: 4,096 tokens
Pricing (per 1M tokens)
Prompt Tokens: $2.5
Cache Read: $1.25
Completion Tokens: $10

GPT-4o mini (2024-07-18)

OpenAI

Context Window: 128,000 tokens
Max Output: 4,096 tokens
Pricing (per 1M tokens)
Prompt Tokens: $0.15
Cache Read: $0.075
Completion Tokens: $0.6

Claude 3.5 Sonnet (2024-06-20)

Anthropic

Context Window: 200,000 tokens
Max Output: 8,192 tokens
Pricing (per 1M tokens)
Prompt Tokens: $3
Cache Read: $0.3
Cache Write: $3.75
Completion Tokens: $15

Gemini 1.5 Flash 001 (2024-05-24)

Google

Context Window: 128,000 tokens
Max Output: 8,192 tokens
Pricing (per 1M tokens)
Prompt Tokens: $0.075
Completion Tokens: $0.3

Gemini 1.5 Pro 001 (2024-05-24)

Google

Context Window: 128,000 tokens
Max Output: 8,192 tokens
Pricing (per 1M tokens)
Prompt Tokens: $1.25
Completion Tokens: $5

GPT-4o (2024-05-13)

OpenAI

Context Window: 128,000 tokens
Max Output: 4,096 tokens
Pricing (per 1M tokens)
Prompt Tokens: $5
Completion Tokens: $15

Claude 3 Haiku (2024-03-27)

Anthropic

Context Window: 200,000 tokens
Max Output: 4,096 tokens
Pricing (per 1M tokens)
Prompt Tokens: $0.25
Cache Read: $0.03
Cache Write: $0.3
Completion Tokens: $1.25

Claude 3 Opus (2024-02-29)

Anthropic

Context Window: 200,000 tokens
Max Output: 4,096 tokens
Pricing (per 1M tokens)
Prompt Tokens: $15
Cache Read: $1.5
Cache Write: $18.75
Completion Tokens: $75

Hermes 3 405B Instruct (free)

Nous

Context Window: 8,192 tokens
Max Output: 4,096 tokens
Pricing (per 1M tokens)
Prompt Tokens: $0
Completion Tokens: $0

Llama 3.1 405B Instruct (free)

Meta

Context Window: 8,000 tokens
Max Output: 4,000 tokens
Pricing (per 1M tokens)
Prompt Tokens: $0
Completion Tokens: $0

Llama 3.1 70B Instruct (free)

Meta

Context Window: 8,192 tokens
Max Output: 4,096 tokens
Pricing (per 1M tokens)
Prompt Tokens: $0
Completion Tokens: $0