Featured Models

These models are deployed for industry-leading speeds to excel at production tasks.

Mixtral MoE 8x7B Instruct

Mistral MoE 8x7B Instruct v0.1 model with Sparse Mixture of Experts.

Up to 200 tokens/sec

FireFunction V1

Fireworks' open-source function calling model.

Up to 434 tokens/sec

Llama 2 70B Chat

A fine-tuned version of Llama 2 70B, optimized for dialogue applications using Reinforcement Learnin...

Up to 200 tokens/sec
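The "up to" figures on the featured cards are peak rates, so they give a lower bound on generation time rather than an expected value. A minimal Python sketch of that arithmetic follows; the function name and the 1,000-token response length are illustrative assumptions, and real latency also depends on prompt processing, load, and network time.

```python
def min_generation_seconds(output_tokens: int, peak_tokens_per_sec: float) -> float:
    """Lower bound on generation time, assuming the peak rate is sustained throughout."""
    if peak_tokens_per_sec <= 0:
        raise ValueError("peak_tokens_per_sec must be positive")
    return output_tokens / peak_tokens_per_sec


# Peak rates copied from the featured cards.
PEAK_RATES = {
    "Mixtral MoE 8x7B Instruct": 200,
    "FireFunction V1": 434,
    "Llama 2 70B Chat": 200,
}

# Hypothetical response length, chosen only for illustration.
OUTPUT_TOKENS = 1000

for name, rate in PEAK_RATES.items():
    seconds = min_generation_seconds(OUTPUT_TOKENS, rate)
    print(f"{name}: at least {seconds:.2f} s for {OUTPUT_TOKENS} tokens")
```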

Serverless models are hosted by Fireworks. There is no need to configure hardware or deploy models, and usage is billed per token.
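Per-token billing means the cost of a request scales with the number of tokens it uses. The Python sketch below shows the arithmetic under stated assumptions: the price is a made-up placeholder (no prices are listed here), and prompt and completion tokens are assumed to be billed at the same rate, which may not match actual pricing.

```python
from decimal import Decimal


def estimate_cost(prompt_tokens: int, completion_tokens: int,
                  price_per_million: Decimal) -> Decimal:
    """Estimate request cost, assuming one flat rate for all tokens."""
    if prompt_tokens < 0 or completion_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total_tokens = prompt_tokens + completion_tokens
    return Decimal(total_tokens) / Decimal(1_000_000) * price_per_million


# Hypothetical price per million tokens; not taken from any price list.
HYPOTHETICAL_PRICE = Decimal("0.50")

cost = estimate_cost(prompt_tokens=1200, completion_tokens=300,
                     price_per_million=HYPOTHETICAL_PRICE)
print(f"Estimated cost: {cost}")
```

Decimal is used instead of float so that small per-request amounts accumulate without rounding drift.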

| Title | Description | Context | Action |
| --- | --- | --- | --- |
| Mixtral MoE 8x7B Instruct (Serverless) | Mistral MoE 8x7B Instruct v0.1 model with Sparse Mixture of Experts. Fine-tuned for instruction following. | 32,768 | Playground |
| FireFunction V1 (Serverless) | Fireworks' open-source function calling model. | 32,768 | Playground |
| Llama 2 70B Chat (Serverless) | A fine-tuned version of Llama 2 70B, optimized for dialogue applications using Reinforcement Learning from Human Feedback (RLHF). It performs comparably to ChatGPT according to human evaluations. | 4,096 | Playground |
| Mistral 7B Instruct (Serverless) | The Mistral-7B-Instruct-v0.1 Large Language Model (LLM) is an instruct fine-tuned version of the Mistral-7B-v0.1 generative text model using a variety of publicly available conversation datasets. | 32,768 | Playground |
| FireLLaVA-13B (Serverless) | | | Playground |
| Bleat (Serverless) | Bleat allows you to enable function calling in LLaMA 2 in a similar fashion to OpenAI's implementation for ChatGPT. | 4,096 | Playground |
| Chinese Llama 2 LoRA 7B (Serverless) | The LoRA version of Chinese-Llama-2, based on Llama-2-7b-hf. | 4,096 | Playground |
| Gemma 7B Instruct (Serverless) | Gemma 7B Instruct from Google. Gemma is provided under and subject to the Gemma Terms of Use found at ai.google.dev/gemma/terms | Unknown | Playground |
| Hermes 2 Pro Mistral 7b (Serverless) | The latest version in Nous Research's Hermes series of models, using an updated and cleaned version of the Hermes 2 dataset and now trained on a diverse and rich set of function calling and JSON mode samples. | Unknown | Playground |
| Japanese StableLM Instruct Beta 70B (Serverless) | japanese-stablelm-instruct-beta-70b is a 70B-parameter decoder-only language model based on japanese-stablelm-base-beta-70b and further fine-tuned on Databricks Dolly-15k, Anthropic HH, and other public data. | 4,096 | Playground |

All Image Models

All currently deployed image models.

| Title | Description | Context | Action |
| --- | --- | --- | --- |
| Mixtral MoE 8x7B Instruct (Serverless) | Mistral MoE 8x7B Instruct v0.1 model with Sparse Mixture of Experts. Fine-tuned for instruction following. | 32,768 | Playground |
| FireFunction V1 (Serverless) | Fireworks' open-source function calling model. | 32,768 | Playground |
| Llama 2 70B Chat (Serverless) | A fine-tuned version of Llama 2 70B, optimized for dialogue applications using Reinforcement Learning from Human Feedback (RLHF). It performs comparably to ChatGPT according to human evaluations. | 4,096 | Playground |
| Mistral 7B Instruct (Serverless) | The Mistral-7B-Instruct-v0.1 Large Language Model (LLM) is an instruct fine-tuned version of the Mistral-7B-v0.1 generative text model using a variety of publicly available conversation datasets. | 32,768 | Playground |
| FireLLaVA-13B (Serverless) | | | Playground |
| Bleat (Serverless) | Bleat allows you to enable function calling in LLaMA 2 in a similar fashion to OpenAI's implementation for ChatGPT. | 4,096 | Playground |
| Chinese Llama 2 LoRA 7B (Serverless) | The LoRA version of Chinese-Llama-2, based on Llama-2-7b-hf. | 4,096 | Playground |
| Gemma 7B Instruct (Serverless) | Gemma 7B Instruct from Google. Gemma is provided under and subject to the Gemma Terms of Use found at ai.google.dev/gemma/terms | Unknown | Playground |
| Hermes 2 Pro Mistral 7b (Serverless) | | Unknown | Playground |
| Japanese StableLM Instruct Beta 70B (Serverless) | | 4,096 | Playground |
