On-demand Pricing for Tokens-as-a-Service

Groq powers leading openly available AI models.

Get started for free and upgrade as your needs grow. Pricing for our core models is listed below. Other models, including fine-tuned models, are available on request; send us your inquiries here.

Large Language Models (LLMs)

*Approximate number of tokens per $
AI Model	Current Speed (Tokens per Second)	Input Token Price (Per Million Tokens)	Output Token Price (Per Million Tokens)
Llama 4 Scout (17Bx16E) 128k	594	$0.11 (9.09M / $1)*	$0.34 (2.94M / $1)*
Llama 4 Maverick (17Bx128E) 128k	562	$0.20 (5M / $1)*	$0.60 (1.67M / $1)*
Llama Guard 4 12B 128k	325	$0.20 (5M / $1)*	$0.20 (5M / $1)*
DeepSeek R1 Distill Llama 70B 128k	400	$0.75 (1.33M / $1)*	$0.99 (1.01M / $1)*
Qwen QwQ 32B (Preview) 128k	400	$0.29 (3.44M / $1)*	$0.39 (2.56M / $1)*
Mistral Saba 24B 32k	330	$0.79 (1.27M / $1)*	$0.79 (1.27M / $1)*
Llama 3.3 70B Versatile 128k	394	$0.59 (1.69M / $1)*	$0.79 (1.27M / $1)*
Llama 3.1 8B Instant 128k	840	$0.05 (20M / $1)*	$0.08 (12.5M / $1)*
Llama 3 70B 8k	330	$0.59 (1.69M / $1)*	$0.79 (1.27M / $1)*
Llama 3 8B 8k	1345	$0.05 (20M / $1)*	$0.08 (12.5M / $1)*
Gemma 2 9B 8k	500	$0.20 (5M / $1)*	$0.20 (5M / $1)*
Llama Guard 3 8B 8k	765	$0.20 (5M / $1)*	$0.20 (5M / $1)*
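
To see how the per-million-token prices translate into the cost of a single request, here is a minimal Python sketch. It assumes billing is a simple linear function of input and output token counts, with no minimums or rounding. The dictionary keys are illustrative labels chosen for this example, not official API model identifiers, and the function names are hypothetical.

```python
from decimal import Decimal

# USD prices per million tokens (input, output), copied from the table above.
# The keys are illustrative labels, not official API model IDs.
PRICES_PER_MILLION = {
    "llama-4-scout": (Decimal("0.11"), Decimal("0.34")),
    "llama-3.3-70b-versatile": (Decimal("0.59"), Decimal("0.79")),
    "llama-3.1-8b-instant": (Decimal("0.05"), Decimal("0.08")),
}

MILLION = Decimal(1_000_000)


def request_cost(model: str, input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the USD cost of one request, assuming purely linear billing."""
    input_price, output_price = PRICES_PER_MILLION[model]
    return (input_tokens * input_price + output_tokens * output_price) / MILLION


def tokens_per_dollar(price_per_million: Decimal) -> Decimal:
    """Return roughly how many tokens one dollar buys at the given price."""
    return MILLION / price_per_million


if __name__ == "__main__":
    # A request with 2,000 input tokens and 500 output tokens.
    print(request_cost("llama-3.3-70b-versatile", 2_000, 500))
    # The "tokens per $" figure for a $0.05 per-million price.
    print(tokens_per_dollar(Decimal("0.05")))
```

Under these assumptions, the "tokens per $" figures in the table are simply one million divided by the listed price, which is why they are marked as approximate.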

Text-to-Speech (TTS) Models

AI Model	Speed (Characters per Second)	Price (Per Million Characters)
PlayAI Dialog v1.0	140	$50.00

Automatic Speech Recognition (ASR) Models

AI Model	Speed Factor	Price (Per Hour Transcribed)
Whisper V3 Large	217x	$0.111*
Whisper Large v3 Turbo	228x	$0.04*
Distil-Whisper	250x	$0.02*
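
The following Python sketch estimates the cost and processing time for a transcription job. It rests on two assumptions that the table does not spell out: that the speed factor means audio duration divided by processing time (so 250x means one hour of audio is processed in about 14.4 seconds), and that billing scales linearly with audio length, with no minimum billed duration. The function name is hypothetical.

```python
# Speed factors and USD prices per hour of audio, copied from the table above.
ASR_MODELS = {
    "Whisper V3 Large": {"speed_factor": 217, "price_per_hour": 0.111},
    "Whisper Large v3 Turbo": {"speed_factor": 228, "price_per_hour": 0.04},
    "Distil-Whisper": {"speed_factor": 250, "price_per_hour": 0.02},
}

SECONDS_PER_HOUR = 3600


def transcription_estimate(model: str, audio_seconds: float) -> tuple[float, float]:
    """Return (estimated USD cost, estimated processing seconds).

    Assumes linear billing and that the speed factor is the ratio of
    audio duration to processing time.
    """
    entry = ASR_MODELS[model]
    cost = audio_seconds / SECONDS_PER_HOUR * entry["price_per_hour"]
    processing_seconds = audio_seconds / entry["speed_factor"]
    return cost, processing_seconds


if __name__ == "__main__":
    # Estimate for a 90-minute recording with each model.
    for name in ASR_MODELS:
        cost, seconds = transcription_estimate(name, 90 * 60)
        print(f"{name}: ${cost:.4f}, about {seconds:.1f} s of processing")
```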

Batch API

The Batch API is now available to Dev Tier customers and is currently offered at a 25% discount. Batch processing lets you run thousands of API requests at scale by submitting your workload as a batch to Groq and letting us process it with a 24-hour turnaround.

Learn more about Batch pricing and how to get started.
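
As a rough illustration of the discount, the Python sketch below applies 25% off to an on-demand per-million-token price. It assumes the discount applies uniformly to both input and output token prices, which this page does not confirm; check the Batch pricing details before relying on it. The function names are hypothetical.

```python
from decimal import Decimal

BATCH_DISCOUNT = Decimal("0.25")  # the 25% discount stated above
MILLION = Decimal(1_000_000)


def batch_price(on_demand_price_per_million: Decimal) -> Decimal:
    """Return the discounted per-million-token price (assumed uniform discount)."""
    return on_demand_price_per_million * (1 - BATCH_DISCOUNT)


def batch_savings(tokens: int, on_demand_price_per_million: Decimal) -> Decimal:
    """Return the USD saved by running `tokens` tokens through batch processing."""
    full_cost = tokens * on_demand_price_per_million / MILLION
    return full_cost * BATCH_DISCOUNT


if __name__ == "__main__":
    # On-demand input price for Llama 3.3 70B Versatile from the LLM table.
    on_demand = Decimal("0.59")
    print(batch_price(on_demand))
    print(batch_savings(100_000_000, on_demand))
```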

For enterprise API solutions or on-prem deployments, please fill out the form on our Enterprise Access Page.