Using different models
smolagents provides a flexible framework that allows you to use various language models from different providers. This guide will show you how to use different model types with your agents.
Available model types
smolagents supports several model types out of the box:
InferenceClientModel: Uses Hugging Face's Inference API to access models
TransformersModel: Runs models locally using the Transformers library
VLLMModel: Uses vLLM for fast inference with optimized serving
MLXModel: Optimized for Apple Silicon devices using MLX
LiteLLMModel: Provides access to hundreds of LLMs through LiteLLM
LiteLLMRouterModel: Distributes requests among multiple models
OpenAIServerModel: Provides access to any provider that implements an OpenAI-compatible API
AzureOpenAIServerModel: Uses Azure's OpenAI service
AmazonBedrockServerModel: Connects to AWS Bedrock's API
All model classes support passing additional keyword arguments (like temperature, max_tokens, top_p, etc.) directly at instantiation time. These parameters are automatically forwarded to the underlying model's completion calls, allowing you to configure model behavior such as creativity, response length, and sampling strategies.
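For example, here is a minimal sketch of passing sampling parameters at instantiation; the model ID is only a placeholder and the specific keyword names follow the usual completion-API conventions:

```python
from smolagents import InferenceClientModel

# Extra keyword arguments are forwarded to the underlying completion calls.
model = InferenceClientModel(
    model_id="Qwen/Qwen2.5-Coder-32B-Instruct",  # placeholder model ID
    temperature=0.2,   # lower temperature for more deterministic output
    max_tokens=1024,   # cap the response length
    top_p=0.9,         # nucleus sampling threshold
)
```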
Using Google Gemini Models
As explained in the Google Gemini API documentation (https://ai.google.dev/gemini-api/docs/openai), Google provides an OpenAI-compatible API for Gemini models, allowing you to use the OpenAIServerModel with Gemini models by setting the appropriate base URL.
First, install the required dependencies:
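A minimal install sketch; the exact extra name ([openai]) is an assumption based on the OpenAI-compatible client being used under the hood:

```bash
pip install smolagents[openai]
```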
Then, get a Gemini API key and set it in your code:
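One way to do this, assuming the key was exported as an environment variable beforehand (the variable name is just a convention):

```python
import os

# Assumes the key was exported beforehand, e.g. `export GEMINI_API_KEY=...`;
# substitute your own secret management if you prefer.
GEMINI_API_KEY = os.environ["GEMINI_API_KEY"]
```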
Now, you can initialize the Gemini model by using the OpenAIServerModel class and setting the api_base parameter to the Gemini API base URL:
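A sketch along these lines; the base URL is Google's documented OpenAI-compatible endpoint, while the model ID is only an example and may need updating:

```python
from smolagents import OpenAIServerModel

model = OpenAIServerModel(
    model_id="gemini-2.0-flash",  # example Gemini model ID
    # Google's OpenAI-compatible endpoint for the Gemini API
    api_base="https://generativelanguage.googleapis.com/v1beta/openai/",
    api_key=GEMINI_API_KEY,
)
```

The resulting model can then be passed to an agent like any other smolagents model, for example as the model argument of a CodeAgent.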
Using OpenRouter Models
OpenRouter provides access to a wide variety of language models through a unified OpenAI-compatible API. You can use the OpenAIServerModel to connect to OpenRouter by setting the appropriate base URL.
First, install the required dependencies:
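Since OpenRouter exposes an OpenAI-compatible API, the same extra is assumed here:

```bash
pip install smolagents[openai]
```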
Then, get an OpenRouter API key and set it in your code:
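As above, a sketch that reads the key from an environment variable (the variable name is just a convention):

```python
import os

# Assumes the key was exported beforehand, e.g. `export OPENROUTER_API_KEY=...`.
OPENROUTER_API_KEY = os.environ["OPENROUTER_API_KEY"]
```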
Now, you can initialize any model available on OpenRouter using the OpenAIServerModel class:
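A sketch under the assumption that OpenRouter's OpenAI-compatible endpoint is used as the base URL; the model ID shown is only an example of OpenRouter's provider/model naming scheme:

```python
from smolagents import OpenAIServerModel

model = OpenAIServerModel(
    model_id="openai/gpt-4o",  # example OpenRouter model ID (provider/model)
    api_base="https://openrouter.ai/api/v1",  # OpenRouter's OpenAI-compatible endpoint
    api_key=OPENROUTER_API_KEY,
)
```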
Using xAI's Grok Models
xAI's Grok models can be accessed through LiteLLMModel. Some models (such as "grok-4" and "grok-3-mini") don't support the stop parameter, so you'll need to use REMOVE_PARAMETER to exclude it from API calls.
First, install the required dependencies:
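A minimal install sketch; the extra name ([litellm]) is assumed to pull in the LiteLLM dependency:

```bash
pip install smolagents[litellm]
```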
Then, get an xAI API key and set it in your code:
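Again, a sketch that reads the key from an environment variable (the variable name is just a convention):

```python
import os

# Assumes the key was exported beforehand, e.g. `export XAI_API_KEY=...`.
XAI_API_KEY = os.environ["XAI_API_KEY"]
```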
Now, you can initialize Grok models using the LiteLLMModel class and remove the stop parameter if applicable:
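A sketch under two assumptions: that REMOVE_PARAMETER is importable from the top-level smolagents package, and that LiteLLM routes "xai/"-prefixed model IDs to xAI; adjust both as needed:

```python
from smolagents import LiteLLMModel, REMOVE_PARAMETER  # import path assumed

model = LiteLLMModel(
    model_id="xai/grok-4",    # example Grok model ID with LiteLLM's "xai/" prefix
    api_key=XAI_API_KEY,
    stop=REMOVE_PARAMETER,    # exclude the unsupported stop parameter from API calls
)
```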