Hugging Face

Use the hf provider for Hugging Face Inference Providers.

hf:
  api_key: "${HF_TOKEN}"
  # default_provider: groq # optional: groq, fireworks-ai, cerebras, etc.

Specify models as hf.<model_name>[:provider]. If the provider suffix is omitted, Hugging Face routes the request automatically. Request options can be appended as a query string, as in the last example below.

fast-agent --model kimi
fast-agent --model kimi26instant
fast-agent --model hf.openai/gpt-oss-120b
fast-agent --model hf.moonshotai/kimi-k2-instruct-0905:groq
fast-agent --model "hf.moonshotai/Kimi-K2.6:novita?reasoning=on"
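
To make the shape of these model strings concrete, here is a minimal Python sketch that splits one into model name, optional provider, and query parameters. It is an illustration only, not fast-agent's own parser; the function name split_hf_model_spec and the returned dictionary keys are hypothetical.

```python
from urllib.parse import parse_qsl


def split_hf_model_spec(spec: str) -> dict:
    """Split 'hf.<model_name>[:provider][?query]' into its parts.

    Hypothetical helper for illustration; fast-agent's real parsing may differ.
    """
    if not spec.startswith("hf."):
        raise ValueError(f"not an hf model spec: {spec!r}")
    # Separate the optional query string from the model/provider part.
    body, _, query = spec[len("hf."):].partition("?")
    # The provider suffix, if present, follows the first colon.
    model, _, provider = body.partition(":")
    return {
        "model": model,
        "provider": provider or None,  # None means no suffix was given
        "params": dict(parse_qsl(query)),
    }


if __name__ == "__main__":
    print(split_hf_model_spec("hf.moonshotai/Kimi-K2.6:novita?reasoning=on"))
    print(split_hf_model_spec("hf.openai/gpt-oss-120b"))
```

A missing provider comes back as None, which corresponds to the auto-routed case described above.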

Curated aliases such as kimi, deepseek-hf, glm, and minimax bundle a provider choice and request defaults that have been tested with fast-agent features such as structured outputs and tool use. Capabilities can still vary by backing provider.

Kimi instant mode

Kimi models that support instant mode can disable reasoning with the instant query parameter:

fast-agent --model "hf.moonshotai/Kimi-K2.5?instant=on"  # thinking disabled
fast-agent --model "hf.moonshotai/Kimi-K2.5?instant=off" # thinking enabled

Hugging Face MCP authentication

HF_TOKEN is automatically applied when connecting to Hugging Face MCP servers:

hf.co and huggingface.co use Authorization: Bearer {HF_TOKEN}
*.hf.space uses both Authorization: Bearer {HF_TOKEN} and X-HF-Authorization: Bearer {HF_TOKEN}
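
The two host rules above can be sketched as a small Python function that picks headers from the server URL. This is a hedged illustration rather than fast-agent's implementation: the function name hf_auth_headers is hypothetical, and matching hf.co and huggingface.co as exact hostnames (not subdomains) is an assumption, since the rules above do not say how subdomains are treated.

```python
from urllib.parse import urlparse


def hf_auth_headers(url: str, token: str) -> dict:
    """Return the auth headers for a Hugging Face MCP server URL.

    Hypothetical helper; exact-host matching for hf.co / huggingface.co
    is an assumption made for this sketch.
    """
    host = (urlparse(url).hostname or "").lower()
    bearer = f"Bearer {token}"
    if host in ("hf.co", "huggingface.co"):
        return {"Authorization": bearer}
    if host.endswith(".hf.space"):
        # Spaces receive the token in both headers.
        return {"Authorization": bearer, "X-HF-Authorization": bearer}
    # Any other host gets no Hugging Face token.
    return {}
```

Returning an empty dictionary for other hosts keeps the token from being sent to servers outside the two patterns listed.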

Model aliases

Model Alias	Maps to
`deepseek-ai/deepseek-v4-pro`	`deepseek-ai/deepseek-v4-pro`
`deepseek-hf`	`hf.deepseek-ai/DeepSeek-V4-Pro:together`
`deepseek32`	`hf.deepseek-ai/DeepSeek-V3.2:fireworks-ai`
`deepseek4-hf`	`hf.deepseek-ai/DeepSeek-V4-Pro:together`
`deepseek4pro-hf`	`hf.deepseek-ai/DeepSeek-V4-Pro:together`
`deepseekv4pro-hf`	`hf.deepseek-ai/DeepSeek-V4-Pro:together`
`glm`	`hf.zai-org/GLM-5.1:together`
`glm47`	`hf.zai-org/GLM-4.7:cerebras`
`glm5`	`hf.zai-org/GLM-5:novita`
`glm51`	`hf.zai-org/GLM-5.1:together`
`gpt-oss`	`hf.openai/gpt-oss-120b:cerebras`
`gpt-oss-20b`	`hf.openai/gpt-oss-20b`
`kimi`	`hf.moonshotai/Kimi-K2.6:novita?temperature=1.0&top_p=0.95&reasoning=on`
`kimi-2.5`	`hf.moonshotai/Kimi-K2.5:novita?temperature=1.0&top_p=0.95&reasoning=on`
`kimi-2.6`	`hf.moonshotai/Kimi-K2.6:novita?temperature=1.0&top_p=0.95&reasoning=on`
`kimi25`	`hf.moonshotai/Kimi-K2.5:novita?temperature=1.0&top_p=0.95&reasoning=on`
`kimi25instant`	`hf.moonshotai/Kimi-K2.5:novita?temperature=0.6&top_p=0.95&reasoning=off`
`kimi26`	`hf.moonshotai/Kimi-K2.6:novita?temperature=1.0&top_p=0.95&reasoning=on`
`kimi26instant`	`hf.moonshotai/Kimi-K2.6:novita?temperature=0.6&top_p=0.95&reasoning=off`
`kimithink`	`hf.moonshotai/Kimi-K2.6:novita?temperature=1.0&top_p=0.95&reasoning=on`
`minimax`	`hf.MiniMaxAI/MiniMax-M2.7:fireworks-ai?temperature=1.0&top_p=0.95&top_k=40`
`minimax2.5`	`hf.MiniMaxAI/MiniMax-M2.5:novita?temperature=1.0&top_p=0.95&top_k=40`
`minimax21`	`hf.MiniMaxAI/MiniMax-M2.1:novita`
`minimax25`	`hf.MiniMaxAI/MiniMax-M2.5:fireworks-ai?temperature=1.0&top_p=0.95&top_k=40`
`minimax27`	`hf.MiniMaxAI/MiniMax-M2.7:fireworks-ai?temperature=1.0&top_p=0.95&top_k=40`
`moonshotai/kimi-k2`	`moonshotai/kimi-k2`
`moonshotai/kimi-k2-instruct-0905`	`moonshotai/kimi-k2-instruct-0905`
`moonshotai/kimi-k2-thinking`	`moonshotai/kimi-k2-thinking`
`moonshotai/kimi-k2.5`	`moonshotai/kimi-k2.5`
`moonshotai/kimi-k2.6`	`moonshotai/kimi-k2.6`
`qwen/qwen3.5-397b-a17b`	`qwen/qwen3.5-397b-a17b`
`qwen35`	`hf.Qwen/Qwen3.5-397B-A17B:novita?temperature=0.6&top_p=0.95&top_k=20&min_p=0.0&presence_penalty=0.0&repetition_penalty=1.0&reasoning=on`
`qwen35instruct`	`hf.Qwen/Qwen3.5-397B-A17B:novita?temperature=0.7&top_p=0.8&top_k=20&min_p=0.0&presence_penalty=1.5&repetition_penalty=1.0&reasoning=off`