Configuring Custom Models: Vision & Parallel Tool Calls
Overview
This page documents configuration patterns for using custom models (e.g. Qwen, custom fine-tunes) with LiteLLM as a proxy and OpenCode as the agentic frontend.
When using non-standard models, both LiteLLM and OpenCode require explicit configuration to advertise and support advanced capabilities like vision (image input) and parallel tool calls. Without that configuration, these features fail silently.
Vision / Multimodal Support for Custom Models
Configuring image support for custom models in a llama.cpp + LiteLLM + OpenCode setup.
The Problem
LiteLLM does not infer vision support for arbitrary `custom_openai` models. OpenCode also performs its own preflight modality check and does not read LiteLLM's `supports_vision` metadata for custom providers. Both layers must be configured independently.
LiteLLM Side
- Add `supports_vision: true` under the LiteLLM model's `model_info` so LiteLLM and its model metadata advertise image capability (see the config sketch after this list)
- LiteLLM's OpenAI-compatible/custom OpenAI chat path does not normally strip OpenAI-style image blocks; it preserves `image_url` content and normalizes string image URLs into `{ "url": ... }`
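As a minimal sketch, a LiteLLM proxy `config.yaml` entry might look like the following. The `api_base`, `api_key`, and `openai/` route prefix are assumptions for a local llama.cpp server, not values from the original setup:

```yaml
model_list:
  - model_name: Qwen3.6-35B-A3B
    litellm_params:
      # Assumed OpenAI-compatible route to a local llama.cpp server;
      # adjust api_base / api_key to match your deployment.
      model: openai/Qwen3.6-35B-A3B
      api_base: http://localhost:8080/v1
      api_key: none
    model_info:
      # Advertise image capability in LiteLLM's model metadata
      supports_vision: true
```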
OpenCode Side
- For OpenCode custom providers, image support must be declared per model with `modalities.input` including `"image"`
- If `modalities` is set, include both `input` and `output`
- Without `modalities.input: ["text", "image"]`, attachments get replaced with an error: `ERROR: Cannot read image (this model does not support image input). Inform the user.`
Working OpenCode Model Shape
"Qwen3.6-35B-A3B": {
"name": "Qwen3.6-35B-A3B",
"limit": {
"context": 262144,
"output": 65536
},
"modalities": {
"input": ["text", "image"],
"output": ["text"]
}
}
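For context, here is a sketch of where that model entry sits inside `opencode.jsonc`. The `baseURL` is an assumed address for a local LiteLLM proxy; adjust it to your deployment:

```jsonc
{
  "provider": {
    "jamesravey_litellm": {
      "npm": "@ai-sdk/openai-compatible",
      "options": {
        // Assumed local LiteLLM proxy address
        "baseURL": "http://localhost:4000/v1"
      },
      "models": {
        "Qwen3.6-35B-A3B": {
          "name": "Qwen3.6-35B-A3B",
          "limit": { "context": 262144, "output": 65536 },
          "modalities": {
            "input": ["text", "image"],
            "output": ["text"]
          }
        }
      }
    }
  }
}
```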
Parallel Tool Calls
Enabling parallel tool calls for a custom model served through LiteLLM and consumed by OpenCode.
The Problem
Setting `parallel_tool_calls: true` in `model_info` is metadata only; it does not auto-forward to the API. Without explicit pass-through, the downstream inference server never receives the signal.
OpenCode Side
- No config change needed in `opencode.jsonc`
- OpenCode uses the AI SDK, which auto-detects parallel tool call capability from the provider
- The `tool_call` boolean in model config is for declaring tool support, not parallel behavior
- Your setup (the `jamesravey_litellm` provider with `@ai-sdk/openai-compatible`) handles this at the SDK level
LiteLLM Side
Fix: add `parallel_tool_calls` inside `litellm_params`:

```jsonc
"litellm_params": {
  ...
  "parallel_tool_calls": true
}
```
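To verify the pass-through end to end, one rough check is to send a request through the proxy that should trigger two independent tool calls and confirm the response carries more than one entry in `tool_calls`. The proxy URL, API key, and `get_weather` tool below are hypothetical:

```bash
curl -s http://localhost:4000/v1/chat/completions \
  -H "Authorization: Bearer sk-anything" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "Qwen3.6-35B-A3B",
    "messages": [{"role": "user", "content": "What is the weather in Paris and in Tokyo?"}],
    "tools": [{
      "type": "function",
      "function": {
        "name": "get_weather",
        "description": "Look up current weather for a city",
        "parameters": {
          "type": "object",
          "properties": { "city": { "type": "string" } },
          "required": ["city"]
        }
      }
    }]
  }' | jq '.choices[0].message.tool_calls'
```

If the flag is being forwarded and the downstream server supports it, the array should contain two calls (one per city); a single call or `null` suggests the signal is being dropped somewhere in the chain.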
Prerequisites
- LiteLLM must be v1.61.0+ (pass-through for parallel tool calls was added in that release)
- The downstream inference server (vLLM, TGI, etc.) must support parallel tool calls
Files Reference
- `~/.config/opencode/opencode.jsonc` — OpenCode model config (no changes needed for parallel tool calls)
- LiteLLM model config — add `parallel_tool_calls` to `litellm_params` and `supports_vision: true` to `model_info`