We're thrilled to share that DeepInfra is now a supported Inference Provider on the Hugging Face Hub! DeepInfra joins our growing ecosystem, enhancing the breadth and capabilities of serverless inference directly on the Hub's model pages. Inference Providers are also seamlessly integrated into our client SDKs (for both JS and Python), making it super easy to use a wide variety of models with your preferred providers.

DeepInfra is a serverless AI inference platform offering some of the most cost-effective per-token pricing in the industry. With a catalog of over 100 models, DeepInfra makes it easy for developers to integrate a wide range of AI capabilities into their applications with minimal setup. DeepInfra supports a broad spectrum of model types, from LLMs to text-to-image, text-to-video, embeddings, and more.

As part of this initial integration, DeepInfra is launching support for conversational and text-generation tasks on Hugging Face, enabling access to popular open-weight LLMs such as DeepSeek V4, Kimi-K2.6, GLM-5.1, and many more. Support for additional tasks (text-to-image, text-to-video, embeddings, and more) will roll out soon!

Read more about how to use DeepInfra as an Inference Provider in its dedicated documentation page. See the full list of models supported by DeepInfra here. Follow DeepInfra on Hugging Face: https://huggingface.co/DeepInfra.

DeepInfra is available through the Hugging Face SDKs: huggingface_hub (>= 1.11.2) for Python and @huggingface/inference for JavaScript. The following examples show how to use DeepSeek V4 Pro through DeepInfra. Use a Hugging Face token to authenticate; the request will be routed to DeepInfra automatically.

Hugging Face Inference Providers are integrated into most agent harnesses, including Pi, OpenCode, Hermes Agents, OpenClaw, and more. This means you can plug DeepInfra-hosted models straight into your favorite tools without any extra glue code. Browse the full list of integrations here.

For direct requests, i.e. when you use a key from an inference provider, you are billed by the corresponding provider. For instance, if you use a DeepInfra API key, you're billed on your DeepInfra account. For routed requests, i.e. when you authenticate via the Hugging Face Hub, you only pay the standard provider API rates. There's no additional markup from us; we just pass through the provider costs directly. (In the future, we may establish revenue-sharing agreements with our provider partners.)

Important note ‼️ PRO users get $2 worth of Inference credits every month, usable across providers. 🔥

Subscribe to the Hugging Face PRO plan to get access to Inference credits, ZeroGPU, Spaces Dev Mode, 20x higher limits, and more. We also provide free inference with a small quota for signed-in free users, but please upgrade to PRO if you can!

We would love to get your feedback! Share your thoughts and/or comments here: https://huggingface.co/spaces/huggingface/HuggingDiscussions/discussions/49