On-demand Pricing for Tokens-as-a-Service

Groq powers leading openly available AI models.

Get started for free and upgrade as your needs grow. View the pricing of our core models below. Other models, including fine-tuned models, are available on request. Send us your inquiries here.

Large Language Models (LLMs)

| AI Model | Current Speed (Tokens per Second) | Input Token Price (Per Million Tokens) | Output Token Price (Per Million Tokens) |
| --- | --- | --- | --- |
| Llama 4 Scout (17Bx16E) 128k | 594 | $0.11 (9.09M / $1)* | $0.34 (2.94M / $1)* |
| Llama 4 Maverick (17Bx128E) 128k | 562 | $0.20 (5M / $1)* | $0.60 (1.6M / $1)* |
| Llama Guard 4 12B 128k | 325 | $0.20 (5M / $1)* | $0.20 (5M / $1)* |
| DeepSeek R1 Distill Llama 70B 128k | 400 | $0.75 (1.33M / $1)* | $0.99 (1.01M / $1)* |
| Qwen QwQ 32B (Preview) 128k | 400 | $0.29 (3.44M / $1)* | $0.39 (2.56M / $1)* |
| Mistral Saba 24B 32k | 330 | $0.79 (1.27M / $1)* | $0.79 (1.27M / $1)* |
| Llama 3.3 70B Versatile 128k | 394 | $0.59 (1.69M / $1)* | $0.79 (1.27M / $1)* |
| Llama 3.1 8B Instant 128k | 840 | $0.05 (20M / $1)* | $0.08 (12.5M / $1)* |
| Llama 3 70B 8k | 330 | $0.59 (1.69M / $1)* | $0.79 (1.27M / $1)* |
| Llama 3 8B 8k | 1345 | $0.05 (20M / $1)* | $0.08 (12.5M / $1)* |
| Gemma 2 9B 8k | 500 | $0.20 (5M / $1)* | $0.20 (5M / $1)* |
| Llama Guard 3 8B 8k | 765 | $0.20 (5M / $1)* | $0.20 (5M / $1)* |

* Approximate number of tokens per $1.
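To make the per-million-token prices concrete, here is a minimal Python sketch that estimates the cost of a single request, the tokens-per-dollar figure shown in parentheses, and a rough generation time. The prices and speeds are copied from the table above. The dictionary keys and function names are hypothetical labels chosen for illustration, not official API model IDs, and the listed speeds are "current" figures that may vary in practice.

```python
# Prices are USD per million tokens; speed is tokens per second (from the table).
# Keys are illustrative labels only, not official model identifiers.
PRICES = {
    "llama-4-scout": {"input": 0.11, "output": 0.34, "speed": 594},
    "llama-3.3-70b-versatile": {"input": 0.59, "output": 0.79, "speed": 394},
    "llama-3.1-8b-instant": {"input": 0.05, "output": 0.08, "speed": 840},
}


def request_cost(model, input_tokens, output_tokens):
    """Estimated USD cost of one request: each side is billed per million tokens."""
    p = PRICES[model]
    return (input_tokens * p["input"] + output_tokens * p["output"]) / 1_000_000


def tokens_per_dollar(price_per_million):
    """How many tokens one dollar buys at a given per-million-token price."""
    return 1_000_000 / price_per_million


def generation_seconds(model, output_tokens):
    """Rough time to produce the output at the listed tokens-per-second speed."""
    return output_tokens / PRICES[model]["speed"]


if __name__ == "__main__":
    model = "llama-3.3-70b-versatile"
    # 2,000 input tokens and 500 output tokens
    print(f"${request_cost(model, 2000, 500):.6f}")        # $0.001575
    print(f"{tokens_per_dollar(0.11) / 1e6:.2f}M per $1")  # 9.09M per $1
    print(f"{generation_seconds(model, 500):.2f} s")
```

The time estimate ignores network latency and prompt processing, so treat it as a lower bound rather than a measured figure.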

Text-to-Speech (TTS) Models

| AI Model | Characters per Second | Price (Per Million Characters) |
| --- | --- | --- |
| PlayAI Dialog v1.0 | 140 | $50.00 |
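As a worked example of the per-character pricing, the following Python sketch estimates the cost and synthesis time for a piece of text. It assumes that every character (including spaces and punctuation) is billed and that Python's `len()` matches how characters are counted for billing; neither is confirmed by this page. The function names are hypothetical.

```python
TTS_PRICE_PER_MILLION_CHARS = 50.00  # USD, PlayAI Dialog v1.0 (from the table)
TTS_CHARS_PER_SECOND = 140           # listed speed (from the table)


def tts_cost(text):
    """Estimated USD cost, assuming every character in `text` is billed."""
    return len(text) * TTS_PRICE_PER_MILLION_CHARS / 1_000_000


def tts_seconds(text):
    """Rough synthesis time at the listed characters-per-second speed."""
    return len(text) / TTS_CHARS_PER_SECOND


if __name__ == "__main__":
    script = "x" * 2000  # stand-in for a 2,000-character script
    print(f"${tts_cost(script):.2f}")        # $0.10
    print(f"{tts_seconds(script):.1f} s")    # 14.3 s
```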

Automatic Speech Recognition (ASR) Models

| AI Model | Speed Factor | Price (Per Hour Transcribed) |
| --- | --- | --- |
| Whisper V3 Large | 217x | $0.111* |
| Whisper Large v3 Turbo | 228x | $0.04* |
| Distil-Whisper | 250x | $0.02* |
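The per-hour prices can be turned into a per-file estimate with a short Python sketch like the one below. It assumes that billing is proportional to audio duration and that the speed factor means audio is processed that many times faster than real time; both are interpretations, not statements from this page. The asterisks on the prices point to a footnote that is not reproduced in this section, so additional billing conditions may apply. Function names are hypothetical.

```python
# Price is USD per hour of audio transcribed; speed factor is taken from the table.
ASR_MODELS = {
    "Whisper V3 Large": {"price_per_hour": 0.111, "speed_factor": 217},
    "Whisper Large v3 Turbo": {"price_per_hour": 0.04, "speed_factor": 228},
    "Distil-Whisper": {"price_per_hour": 0.02, "speed_factor": 250},
}


def transcription_cost(model, audio_seconds):
    """Estimated USD cost, assuming billing scales linearly with audio length."""
    return audio_seconds / 3600 * ASR_MODELS[model]["price_per_hour"]


def processing_seconds(model, audio_seconds):
    """Rough processing time, assuming the speed factor is relative to real time."""
    return audio_seconds / ASR_MODELS[model]["speed_factor"]


if __name__ == "__main__":
    ninety_minutes = 90 * 60  # 5,400 seconds of audio
    print(f"${transcription_cost('Whisper V3 Large', ninety_minutes):.4f}")  # $0.1665
    print(f"{processing_seconds('Distil-Whisper', 3600):.1f} s")             # 14.4 s
```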

Batch API

The Batch API is now available for Dev Tier customers and is currently offered at a 25% discount. Batch processing lets you run thousands of API requests at scale by submitting your workload as a batch to Groq and letting us process it with a 24-hour turnaround.

Learn more about Batch pricing and how to get started.
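To illustrate the 25% Batch discount, here is a small Python sketch that derives a discounted price from an on-demand price. It assumes the discount applies uniformly to the listed on-demand prices (input and output alike), which this page does not spell out; the function names are hypothetical.

```python
BATCH_DISCOUNT = 0.25  # 25% discount stated for the Batch API


def batch_price(on_demand_price):
    """Discounted price, assuming the 25% applies directly to the listed price."""
    return on_demand_price * (1 - BATCH_DISCOUNT)


def batch_savings(on_demand_total):
    """USD saved on a workload that would cost `on_demand_total` on demand."""
    return on_demand_total * BATCH_DISCOUNT


if __name__ == "__main__":
    # Llama 3.3 70B Versatile on-demand prices per million tokens: $0.59 in, $0.79 out
    print(f"${batch_price(0.59):.4f}")     # $0.4425
    print(f"${batch_price(0.79):.4f}")     # $0.5925
    print(f"${batch_savings(100.0):.2f}")  # $25.00
```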

For enterprise API solutions or on-prem deployments, please fill out the form on our Enterprise Access Page.