Overview

NebiusLLMService provides chat completion capabilities using Nebius Token Factory's API, which exposes an OpenAI-compatible interface. It supports streaming responses and function calling.

Nebius LLM API Reference

Pipecat’s API methods for Nebius integration

Example Implementation

Function calling example with Nebius

Nebius Documentation

Official Nebius documentation

Nebius Token Factory

Access models and manage API keys

Installation

To use Nebius LLM services, install the required dependencies:
pip install "pipecat-ai[nebius]"

Prerequisites

Nebius Account Setup

Before using Nebius LLM services, you need:
  1. Nebius Account: Sign up at Nebius
  2. API Key: Generate an API key from the Token Factory dashboard
  3. Model Selection: Choose from available models (default: openai/gpt-oss-120b)

Required Environment Variables

  • NEBIUS_API_KEY: Your Nebius API key for authentication
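Since a missing key typically only surfaces as an authentication error on the first request, it can help to fail fast at startup. The helper below is an illustrative sketch, not part of Pipecat; `require_nebius_api_key` is a hypothetical name:

```python
import os

def require_nebius_api_key() -> str:
    """Read NEBIUS_API_KEY from the environment, failing fast if unset."""
    key = os.environ.get("NEBIUS_API_KEY")
    if not key:
        raise RuntimeError(
            "NEBIUS_API_KEY is not set; generate a key in the Token Factory dashboard"
        )
    return key
```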

Configuration

api_key (str, required)
  The API key for accessing Nebius's API.
base_url (str, default: "https://api.tokenfactory.nebius.com/v1/")
  The base URL for the Nebius API. Override if using a different endpoint.
settings (NebiusLLMService.Settings, default: None)
  Runtime-configurable model settings. See Settings below.

Settings

Runtime-configurable settings passed via the settings constructor argument using NebiusLLMService.Settings(...). These can be updated mid-conversation with LLMUpdateSettingsFrame. See Service Settings for details.
Parameter          Type   Default                 Description
model              str    "openai/gpt-oss-120b"   Nebius model identifier. Check Nebius Token Factory for available models.
temperature        float  NOT_GIVEN               Sampling temperature (0.0 to 2.0). Lower values are more focused, higher values more creative.
max_tokens         int    NOT_GIVEN               Maximum tokens to generate.
top_p              float  NOT_GIVEN               Top-p (nucleus) sampling (0.0 to 1.0). Controls diversity of output.
frequency_penalty  float  NOT_GIVEN               Penalty for frequent tokens (-2.0 to 2.0). Positive values discourage repetition.
presence_penalty   float  NOT_GIVEN               Penalty for new topics (-2.0 to 2.0). Positive values encourage new topics.
NOT_GIVEN values are omitted from the API request entirely, letting the Nebius API use its own defaults. This is different from None, which would be sent explicitly.
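The distinction between NOT_GIVEN and None can be sketched with a sentinel-filtering pattern. This is an illustrative reimplementation of the idea, not Pipecat's actual code; `build_request_params` is a hypothetical helper:

```python
class _NotGiven:
    """Unique sentinel distinguishing 'parameter omitted' from 'parameter is None'."""
    def __repr__(self):
        return "NOT_GIVEN"

NOT_GIVEN = _NotGiven()

def build_request_params(**params):
    """Drop NOT_GIVEN values so the API falls back to its own defaults."""
    return {k: v for k, v in params.items() if v is not NOT_GIVEN}

params = build_request_params(
    model="openai/gpt-oss-120b",
    temperature=0.7,
    max_tokens=NOT_GIVEN,  # omitted from the request entirely
    top_p=None,            # sent explicitly as null
)
# params contains model, temperature, and top_p, but no max_tokens key
```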

Usage

Basic Setup

import os
from pipecat.services.nebius import NebiusLLMService

llm = NebiusLLMService(
    api_key=os.getenv("NEBIUS_API_KEY"),
)

With Custom Settings

import os
from pipecat.services.nebius import NebiusLLMService

llm = NebiusLLMService(
    api_key=os.getenv("NEBIUS_API_KEY"),
    settings=NebiusLLMService.Settings(
        model="openai/gpt-oss-120b",
        temperature=0.7,
        max_tokens=1000,
    ),
)

Updating Settings at Runtime

Model settings can be changed mid-conversation using LLMUpdateSettingsFrame:
from pipecat.frames.frames import LLMUpdateSettingsFrame
from pipecat.services.nebius import NebiusLLMService

await task.queue_frame(
    LLMUpdateSettingsFrame(
        delta=NebiusLLMService.Settings(
            temperature=0.3,
        )
    )
)
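The delta semantics can be sketched as a partial merge: only fields named in the delta overwrite the current settings, and everything else is left untouched. The dataclass and `apply_delta` helper below are illustrative assumptions, not Pipecat's implementation:

```python
from dataclasses import dataclass, fields
from typing import Optional

@dataclass
class Settings:
    # Mirrors a subset of the settings table above
    model: str = "openai/gpt-oss-120b"
    temperature: Optional[float] = None
    max_tokens: Optional[int] = None

def apply_delta(current: Settings, delta: dict) -> Settings:
    """Overwrite only the fields named in the delta; keep the rest as-is."""
    merged = {f.name: getattr(current, f.name) for f in fields(current)}
    merged.update(delta)
    return Settings(**merged)

current = Settings(temperature=0.7, max_tokens=1000)
updated = apply_delta(current, {"temperature": 0.3})
# temperature changes to 0.3; model and max_tokens are unchanged
```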

Notes

  • OpenAI Compatibility: Nebius’s API is OpenAI-compatible, allowing use of familiar patterns and parameters.
  • Function Calling: Supports OpenAI-style tool/function calling format.
  • Streaming: Supports streaming responses for real-time interaction.
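Because the API is OpenAI-compatible, requests follow the standard Chat Completions shape. The sketch below shows the assumed request layout against the documented base URL; it builds the URL and JSON body only and does not send anything:

```python
import json

BASE_URL = "https://api.tokenfactory.nebius.com/v1/"

def chat_completion_request(model, messages, stream=True, **extra):
    """Build the URL and JSON payload for an OpenAI-style chat completion call."""
    url = BASE_URL.rstrip("/") + "/chat/completions"
    payload = {"model": model, "messages": messages, "stream": stream, **extra}
    return url, json.dumps(payload)

url, body = chat_completion_request(
    "openai/gpt-oss-120b",
    [{"role": "user", "content": "Hello"}],
    temperature=0.7,
)
```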

Event Handlers

NebiusLLMService supports the following event handlers, inherited from LLMService:
Event                       Description
on_completion_timeout       Called when an LLM completion request times out
on_function_calls_started   Called when function calls are received and execution is about to start
@llm.event_handler("on_completion_timeout")
async def on_completion_timeout(service):
    print("LLM completion timed out")