DeepSeek's open-source AI models have been shown running on Chinese-made Moore Threads GPUs, a development that could reduce reliance on NVIDIA hardware for AI inference tasks.
Technical Achievement
Moore Threads has deployed the DeepSeek-R1-Distill-Qwen-7B model on both its MTT S80 client graphics card and its MTT S4000 datacenter-grade graphics card. The implementation uses the Ollama framework, a lightweight tool for running large language models locally on various operating systems, combined with Moore Threads' proprietary inference engine optimizations.
- DeepSeek API Pricing: USD 2.2 per million output tokens
- OpenAI API Pricing: USD 60 per million output tokens
- Supported Hardware: MTT S80 (client GPU), MTT S4000 (datacenter GPU)
- Framework Support: Ollama (macOS, Linux, Windows)
- Compatible Models: DeepSeek-R1-Distill-Qwen-7B
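For readers who want a sense of what running the model through Ollama looks like in practice, here is a minimal sketch that queries a locally running Ollama server over its REST API. It is not Moore Threads' own setup, which has not been published in detail. The sketch assumes Ollama is listening on its default local port and that the model has been pulled under the tag `deepseek-r1:7b`; the helper names `build_generate_request` and `generate` are hypothetical and exist only for this illustration.

```python
import json
import urllib.request

# Assumption: Ollama is running locally on its default port.
OLLAMA_URL = "http://localhost:11434/api/generate"
# Assumption: the distilled Qwen-7B model was pulled under this tag.
MODEL_TAG = "deepseek-r1:7b"


def build_generate_request(prompt: str, model: str = MODEL_TAG) -> urllib.request.Request:
    """Build a non-streaming POST request for Ollama's /api/generate endpoint."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt: str, model: str = MODEL_TAG, timeout: float = 120.0) -> str:
    """Send the prompt to the local server and return the generated text."""
    request = build_generate_request(prompt, model)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.loads(response.read().decode("utf-8"))
    # With "stream": False, Ollama returns a single JSON object
    # whose "response" field holds the full completion.
    return body["response"]
```

With a server running, `generate("Summarize what a distilled model is.")` would return the model's reply as a string. Which GPU actually handles the request depends on how the local Ollama installation was built and configured, not on this client code.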
Market Impact
This development comes as DeepSeek has been making waves in the global AI community. The company's pricing strategy is particularly aggressive: it offers API services at USD 2.2 per million output tokens, about 27 times lower than OpenAI's rate of USD 60 per million output tokens. Major Chinese tech companies including Alibaba, ByteDance, Baidu, and JD Cloud have already integrated DeepSeek's models into their cloud services.
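The pricing gap can be checked with a few lines of arithmetic. The sketch below uses only the two per-million-output-token rates quoted in this article; the function names and the example workload size are hypothetical, and real bills would also include input-token charges that the article does not cover.

```python
# Rates quoted in the article, in USD per million output tokens.
DEEPSEEK_USD_PER_M_OUTPUT = 2.2
OPENAI_USD_PER_M_OUTPUT = 60.0


def output_cost_usd(output_tokens: int, usd_per_million: float) -> float:
    """Cost of generating `output_tokens` tokens at the given per-million rate."""
    return output_tokens / 1_000_000 * usd_per_million


def price_ratio() -> float:
    """How many times more the OpenAI rate costs than the DeepSeek rate."""
    return OPENAI_USD_PER_M_OUTPUT / DEEPSEEK_USD_PER_M_OUTPUT


if __name__ == "__main__":
    tokens = 5_000_000  # hypothetical workload size
    print(f"DeepSeek: USD {output_cost_usd(tokens, DEEPSEEK_USD_PER_M_OUTPUT):.2f}")
    print(f"OpenAI:   USD {output_cost_usd(tokens, OPENAI_USD_PER_M_OUTPUT):.2f}")
    print(f"Ratio:    {price_ratio():.1f}x")
```

For a workload of five million output tokens, the quoted rates work out to roughly USD 11 versus USD 300.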
Performance and Integration
While specific performance metrics haven't been disclosed, Moore Threads claims excellent results from its custom computational optimizations and improved memory management. The company also says its GPUs are CUDA-compatible, and it describes the deployment as particularly beneficial for Chinese-language applications. This integration represents a significant step forward in China's domestic AI hardware capabilities.
Future Implications
The successful deployment of DeepSeek models on Moore Threads GPUs signals a potential shift in the AI hardware landscape. This development could lead to more affordable and accessible AI implementation options, particularly in the Chinese market. However, the demonstrations so far have been limited to distilled models, and comprehensive performance comparisons with AMD, Apple, or NVIDIA solutions have yet to be published.