Info: Get a 7-day free trial for LiteLLM Enterprise here. No call needed.
New Models / Updated Models
- New OpenAI `/image/variations` endpoint BETA support. Docs
- Topaz API support on the OpenAI `/image/variations` BETA endpoint. Docs
- Deepseek - r1 support w/ `reasoning_content` (Deepseek API, Vertex AI, Bedrock); see the sketch after this list
- Azure - Add Azure o1 pricing See Here
- Anthropic - handle `-latest` tag in model name for cost calculation
- Gemini-2.0-flash-thinking - add model pricing (it's 0.0) See Here
- Bedrock - add stability sd3 model pricing See Here (s/o Marty Sullivan)
- Bedrock - add us.amazon.nova-lite-v1:0 to model cost map See Here
- TogetherAI - add new together_ai llama3.3 models See Here
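The r1 `reasoning_content` support mentioned above can be exercised from the SDK. A minimal sketch, assuming the `deepseek/deepseek-reasoner` model id for the Deepseek API (Vertex AI and Bedrock use their own ids; check the docs for your provider):

```python
import litellm

# Ask r1 a question; the reasoning trace is returned separately from the answer
response = litellm.completion(
    model="deepseek/deepseek-reasoner",  # assumed model id for r1 on the Deepseek API
    messages=[{"role": "user", "content": "What is 15 * 17?"}],
)

print(response.choices[0].message.reasoning_content)  # model's reasoning trace
print(response.choices[0].message.content)            # final answer
```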
LLM Translation
- LM Studio -> fix async embedding call
- GPT-4o models - fix `response_format` translation
- Bedrock nova - expand supported document types to include .md, .csv, etc. Start Here
- Bedrock - docs on IAM role based access for bedrock - Start Here
- Bedrock - cache IAM role credentials when used
- Google AI Studio (`gemini/`) - support Gemini `frequency_penalty` and `presence_penalty` params
- Azure O1 - fix model name check
- WatsonX - ZenAPIKey support for WatsonX Docs
- Ollama Chat - support json schema response format Start Here
- Bedrock - return correct bedrock status code and error message if error during streaming
- Anthropic - support nested JSON schema on Anthropic calls
- OpenAI - `metadata` param preview support; see the sketch after this list
  - SDK - enable via `litellm.enable_preview_features = True`
  - PROXY - enable via `litellm_settings::enable_preview_features: true`
- Replicate - retry completion response on status=processing
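A minimal sketch of the `metadata` preview param from the SDK side; the model choice and metadata keys are illustrative, and the param is only forwarded once preview features are enabled:

```python
import litellm

# Preview features must be opted into before the metadata param is forwarded
litellm.enable_preview_features = True

response = litellm.completion(
    model="gpt-4o",  # illustrative model choice
    messages=[{"role": "user", "content": "hello"}],
    metadata={"run_id": "run-123"},  # illustrative keys
)
```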
Spend Tracking Improvements
- Bedrock - QA asserts all Bedrock regional models have the same `supported_` as the base model
- Bedrock - fix Bedrock Converse cost tracking w/ region name specified
- Spend Logs reliability fix - handle `user` passed in request body as an int instead of a string
- Ensure `base_model` cost tracking works across all endpoints
- Fixes for Image generation cost tracking
- Anthropic - fix Anthropic end-user cost tracking; see the sketch after this list
- JWT / OIDC Auth - add end-user id tracking from JWT auth
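A minimal sketch of end-user spend attribution via the OpenAI-compatible `user` param; the model id is illustrative. Note that `user` should be a string (the reliability fix above covers callers that send an int):

```python
import litellm

response = litellm.completion(
    model="claude-3-5-sonnet-20240620",  # illustrative model id
    messages=[{"role": "user", "content": "hello"}],
    user="customer-123",  # end-user id; spend is attributed to this id
)
```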
Management Endpoints / UI
- Allow a team member to become admin post-add (UI + endpoints)
- New edit/delete button for updating team membership on UI
- If team admin - show all team keys
- Model Hub - clarify cost of models is per 1m tokens
- Invitation Links - fix invalid url generated
- New - SpendLogs Table Viewer - allows proxy admin to view spend logs on UI
- New spend logs - allow proxy admin to ‘opt in’ to logging request/response in spend logs table - enables easier abuse detection
- Show country of origin in spend logs
- Add pagination + filtering by key name/team name
- `/key/delete` - allow team admin to delete team keys
- Internal User 'view' - fix spend calculation when team selected
- Model Analytics is now available on the free tier
- Usage page - show days when spend = 0, and round spend on charts to 2 significant figures
- Public Teams - allow admins to expose teams for new users to ‘join’ on UI - Start Here
- Guardrails
  - Set/edit guardrails on a virtual key
  - Allow setting guardrails on a team
  - Set guardrails on team create + edit page
- Support temporary budget increases on `/key/update` - new `temp_budget_increase` and `temp_budget_expiry` fields - Start Here; see the sketch after this list
- Support writing new key alias to AWS Secret Manager on key rotation. Start Here
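A minimal sketch of a temporary budget increase via `/key/update`; the proxy URL, admin key, target key, budget unit, and expiry format are all assumptions, so check the linked docs:

```python
import requests

resp = requests.post(
    "http://localhost:4000/key/update",  # assumed local proxy address
    headers={"Authorization": "Bearer sk-1234"},  # placeholder admin key
    json={
        "key": "sk-existing-key",                      # placeholder key to update
        "temp_budget_increase": 10.0,                  # assumed to be in USD
        "temp_budget_expiry": "2025-03-01T00:00:00Z",  # assumed timestamp format
    },
)
print(resp.json())
```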
Helm
- add securityContext and pull policy values to migration job (s/o https://github.com/Hexoplon)
- allow specifying envVars on values.yaml
- new helm lint test
Logging / Guardrail Integrations
- Log the used prompt when prompt management is used. Start Here
- Support s3 logging with team alias prefixes - Start Here
- Prometheus Start Here
  - Fix `litellm_llm_api_time_to_first_token_metric` not populating for Bedrock models
  - Emit remaining team budget metric on a regular basis (even when no call is made) - allows for more stable metrics on Grafana etc.
  - Add key- and team-level budget metrics
  - Emit `litellm_overhead_latency_metric`
  - Emit `litellm_team_budget_reset_at_metric` and `litellm_api_key_budget_remaining_hours_metric`
- Datadog - support logging spend tags to Datadog. Start Here (see the sketch after this list)
- Langfuse - fix logging request tags, read from standard logging payload
- GCS - don’t truncate payload on logging
- New GCS Pub/Sub logging support Start Here
- Add AIM Guardrails support Start Here
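A minimal sketch of the Datadog spend-tag logging from the SDK side, assuming the `metadata={"tags": [...]}` convention LiteLLM uses for other loggers; the tag format here is an assumption:

```python
import litellm

litellm.success_callback = ["datadog"]  # ship success logs, including spend, to Datadog

response = litellm.completion(
    model="gpt-4o-mini",  # illustrative model choice
    messages=[{"role": "user", "content": "hello"}],
    metadata={"tags": ["team:search", "env:prod"]},  # assumed tag format
)
```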
Security
- New Enterprise SLA for patching security vulnerabilities. See Here
- Hashicorp - support using vault namespace for TLS auth. Start Here
- Azure - DefaultAzureCredential support; see the sketch below
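A minimal sketch of the `DefaultAzureCredential` flow; the deployment name and endpoint are placeholders, and the `azure_ad_token_provider` param name is an assumption, so check the Azure provider docs:

```python
import litellm
from azure.identity import DefaultAzureCredential, get_bearer_token_provider

# Resolve credentials from the environment (env vars, managed identity, az login, ...)
token_provider = get_bearer_token_provider(
    DefaultAzureCredential(), "https://cognitiveservices.azure.com/.default"
)

response = litellm.completion(
    model="azure/my-gpt-4o-deployment",               # placeholder deployment name
    api_base="https://my-endpoint.openai.azure.com",  # placeholder endpoint
    azure_ad_token_provider=token_provider,           # assumed param name
    messages=[{"role": "user", "content": "hello"}],
)
```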
Health Checks
- Clean up pricing-only model names from wildcard route list - prevents bad health checks
- Allow specifying a health check model for wildcard routes - https://docs.litellm.ai/docs/proxy/health#wildcard-routes
- New `health_check_timeout` param with a default 1-minute upper bound, to prevent a bad model's health check from hanging and causing pod restarts. Start Here
- Datadog - add Datadog service health check + expose new `/health/services` endpoint. Start Here (see the sketch below)
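A minimal sketch of calling the new endpoint; the proxy URL, admin key, and `service` query param name are assumptions, so check the linked docs:

```python
import requests

resp = requests.get(
    "http://localhost:4000/health/services",      # assumed local proxy address
    params={"service": "datadog"},                # assumed query param name
    headers={"Authorization": "Bearer sk-1234"},  # placeholder admin key
)
print(resp.status_code, resp.json())
```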
Performance / Reliability Improvements
- 3x increase in RPS - moving to orjson for reading request body
- LLM Routing speedup - using cached get model group info
- SDK speedup - using cached get model info helper - reduces CPU work to get model info
- Proxy speedup - only read request body 1 time per request
- Infinite loop detection scripts added to codebase
- Bedrock - pure async image transformation requests
- Cooldowns - single deployment model group if 100% calls fail in high traffic - prevents an o1 outage from impacting other calls
- Response Headers - return the following headers (see the sketch after this list)
  - `x-litellm-timeout`
  - `x-litellm-attempted-retries`
  - `x-litellm-overhead-duration-ms`
  - `x-litellm-response-duration-ms`
- Ensure duplicate callbacks are not added to the proxy
- Requirements.txt - bump certifi version
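A minimal sketch of reading the new response headers through the OpenAI SDK's raw-response interface, pointed at a local proxy; the base URL, key, and model are placeholders:

```python
from openai import OpenAI

client = OpenAI(base_url="http://localhost:4000", api_key="sk-1234")  # placeholder proxy

# with_raw_response exposes the HTTP response, including LiteLLM's headers
raw = client.chat.completions.with_raw_response.create(
    model="gpt-4o-mini",  # illustrative model choice
    messages=[{"role": "user", "content": "hello"}],
)
print(raw.headers.get("x-litellm-overhead-duration-ms"))
print(raw.headers.get("x-litellm-response-duration-ms"))

completion = raw.parse()  # the usual parsed completion object
```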
General Proxy Improvements
- JWT / OIDC Auth - new `enforce_rbac` param; allows proxy admin to prevent any unmapped yet authenticated JWT tokens from calling the proxy. Start Here
- Fix custom OpenAPI schema generation for customized Swagger docs
- Request Headers - support reading the `x-litellm-timeout` param from request headers. Enables model timeout control when using Vercel's AI SDK + LiteLLM Proxy. Start Here (see the sketch after this list)
- JWT / OIDC Auth - new `role`-based permissions for model authentication. See Here
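A minimal sketch of sending the `x-litellm-timeout` request header through the OpenAI SDK; the base URL, key, model, and the assumption that the value is in seconds should all be verified against the linked docs:

```python
from openai import OpenAI

client = OpenAI(base_url="http://localhost:4000", api_key="sk-1234")  # placeholder proxy

response = client.chat.completions.create(
    model="gpt-4o-mini",  # illustrative model choice
    messages=[{"role": "user", "content": "hello"}],
    extra_headers={"x-litellm-timeout": "30"},  # assumed to be seconds
)
```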
Complete Git Diff
This is the diff between v1.57.8-stable and v1.59.8-stable.
Use this to see the changes in the codebase.