Small but capable SLMs
Phi models are small but surprisingly capable language models built for edge devices, laptops, and cost-constrained environments where larger models are impractical. Ranging from 1.5B to 14B parameters, they are commonly used for on-device AI, mobile applications, and scenarios requiring low-latency local inference.
Download Phi models from Hugging Face (microsoft/phi-*) and run them locally with Ollama, llama.cpp, or the Hugging Face Transformers library. For cloud deployment, Phi models are available as serverless endpoints on Azure AI Studio. Local use requires no API key; the Azure endpoints require an Azure subscription.
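As an illustration, here is a minimal sketch of local inference with the Transformers library, assuming microsoft/phi-2 as a representative checkpoint (any other microsoft/phi-* model works the same way):

```python
# Minimal sketch: run a Phi model locally with Hugging Face Transformers.
from transformers import AutoModelForCausalLM, AutoTokenizer

# Example checkpoint; swap in any other microsoft/phi-* model ID.
model_id = "microsoft/phi-2"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Tokenize a prompt and generate a short completion locally.
prompt = "Explain why small language models suit edge devices:"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

The same model can also be pulled and run from the command line with Ollama or llama.cpp if you prefer a quantized, lower-memory setup.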