DeepSeek R1 Models: Now Running Directly on Your Copilot+ PC

Microsoft has announced the availability of DeepSeek R1 7B and 14B distilled models for Copilot+ PCs via Azure AI Foundry. The release expands AI capabilities at the edge, allowing researchers, developers, and enthusiasts to take advantage of large-scale machine learning models directly on their Copilot+ PCs. Here’s everything you need to know.
What Are DeepSeek R1 7B and 14B Models?
The DeepSeek R1 distilled models are versions optimized specifically for Copilot+ PCs. Initially, they are rolling out on Qualcomm Snapdragon X-based systems, with support for Intel Core Ultra 200V and AMD Ryzen platforms to follow. The biggest advantage of these models is that they allow complex multi-step reasoning tasks to run locally, reducing dependence on cloud-based processing without compromising efficiency.
How Do NPUs Improve AI Performance on Copilot+ PCs?
Neural Processing Units (NPUs) are at the core of Microsoft’s AI strategy. They allow DeepSeek models to run locally on PCs without significant power draw or performance trade-offs. Copilot+ PCs are equipped with NPUs capable of over 40 trillion operations per second (TOPS), which Microsoft claims can sustain AI workloads with minimal impact on battery life and thermal performance.
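To put 40 TOPS in perspective, here is a rough back-of-envelope estimate of the compute-bound throughput ceiling for a 14B-parameter model. All figures are illustrative assumptions, not Microsoft's numbers: it assumes roughly two operations (one multiply, one add) per parameter per generated token and ignores memory bandwidth, which usually dominates on-device inference.

```python
# Rough per-token compute budget for a hypothetical 14B-parameter model
# (illustrative arithmetic only; real throughput is far lower because
# decoding is typically memory-bandwidth-bound, not compute-bound).
params = 14e9                 # model parameters (assumption)
ops_per_token = 2 * params    # ~2 ops per parameter per token (assumption)
npu_tops = 40e12              # 40 trillion ops/second, the Copilot+ PC floor

seconds_per_token = ops_per_token / npu_tops
print(f"{seconds_per_token * 1e3:.2f} ms/token "
      f"(~{1 / seconds_per_token:.0f} tokens/s compute-bound ceiling)")
```

The gap between this theoretical ceiling and observed on-device throughput is a reminder that token generation is limited by how fast weights can be streamed from memory, which is exactly why low-bit quantization matters so much on NPUs.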
How Are DeepSeek R1 Models Optimized for Performance?
The 7B and 14B DeepSeek models use 4-bit block-wise quantization to optimize memory usage. The transformer block, responsible for processing context and token iteration, employs int4 per-channel quantization for weights alongside int16 activations. According to Microsoft, the 14B model achieves approximately eight tokens per second, with further optimizations in progress.
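The idea behind block-wise quantization is that each small block of weights gets its own scale, so an outlier in one block does not degrade precision everywhere else. The following is a minimal NumPy sketch of symmetric 4-bit block-wise weight quantization; the block size, rounding, and symmetric range are assumptions for illustration, not Microsoft's exact scheme.

```python
import numpy as np

def quantize_int4_blockwise(weights: np.ndarray, block_size: int = 32):
    """Quantize a 2-D weight matrix to int4 values with one scale per block.

    Each row (output channel) is split into blocks of `block_size`
    columns; every block gets its own scale (illustrative scheme,
    not the exact one used for the DeepSeek R1 builds).
    """
    rows, cols = weights.shape
    assert cols % block_size == 0, "cols must be divisible by block_size"
    blocks = weights.reshape(rows, cols // block_size, block_size)
    # Symmetric int4 range is [-8, 7]; map the block max to 7
    scales = np.abs(blocks).max(axis=-1, keepdims=True) / 7.0
    scales = np.where(scales == 0, 1.0, scales)  # avoid divide-by-zero
    q = np.clip(np.round(blocks / scales), -8, 7).astype(np.int8)
    return q, scales

def dequantize(q: np.ndarray, scales: np.ndarray, shape) -> np.ndarray:
    """Recover approximate float weights from int4 values and scales."""
    return (q * scales).reshape(shape).astype(np.float32)

# Quantize a small random weight matrix and inspect the reconstruction error
rng = np.random.default_rng(0)
w = rng.normal(size=(4, 64)).astype(np.float32)
q, s = quantize_int4_blockwise(w)
w_hat = dequantize(q, s, w.shape)
print("max abs error:", float(np.abs(w - w_hat).max()))
```

Note that this stores values as `int8` for simplicity; a real NPU kernel packs two int4 values per byte and fuses the dequantization into the matrix multiply.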
This approach builds on Microsoft’s work with Phi Silica, a platform for low-bit inference on NPUs. Techniques such as QuaRot and sliding window processing, previously used in optimizing the 1.5B DeepSeek model, have been applied to the larger variants.
How Can Developers Access DeepSeek R1 Models?
Developers can access DeepSeek R1 1.5B, 7B, and 14B models via Microsoft’s AI Toolkit for VS Code. The models, available in ONNX QDQ format, can be downloaded through Azure AI Foundry. The AI Toolkit also allows for local experimentation with DeepSeek models and provides cloud-based testing via Azure AI Foundry.
Furthermore, Microsoft is addressing the security concerns around AI adoption by providing comprehensive security for AI applications, including DeepSeek R1. To secure AI models, the company applies security testing such as red teaming and automated safety evaluations, along with built-in content filtering through Azure AI Content Safety. Additionally, the company offers tools like Microsoft Purview Data Loss Prevention (DLP) and Microsoft Defender for Cloud Apps to track third-party AI apps and block high-risk applications.