DeepSeek: AI Breakthrough or Security Threat from China?
DeepSeek is the new AI kid on the block, claiming to rival OpenAI's models. It is a Chinese artificial intelligence (AI) startup founded in 2023 by entrepreneur Liang Wenfeng. Initially a research branch of the hedge fund High-Flyer, DeepSeek was established to develop AI models with a focus on artificial general intelligence.
But what's all the frenzy about DeepSeek? The company recently released a new set of models that it claims perform on par with those from OpenAI, Anthropic, and Meta, trained at a fraction of the cost and with less hardware; it even claims to be outperforming OpenAI's DALL-E. It's worth noting that the highest-ranked DeepSeek model on Hugging Face currently sits in 252nd place, so take from that what you will.
Figure: DeepSeek's model rankings on Hugging Face
With the recent TikTok ban, it is clear that the United States Congress is concerned about China's espionage activities, and officials have reason to believe that Chinese-owned apps are being used as a proxy to monitor US users. The same national security concerns apply to DeepSeek, which routes data abroad and could potentially be used for mass espionage and the monitoring of Americans. In this post, we examine the concerns around DeepSeek's Chinese origin and its handling of data, and we weigh the risks of using the DeepSeek app or website against downloading a local model.
Using the DeepSeek App or Website
Using DeepSeek's app or website is perhaps the most severe threat and, ironically, it is how the average American can access the most powerful models. If the Chinese government is using DeepSeek as a proxy to spy on Americans, then we are in hot water, because by using these models you are literally handing over your information for free. With every piece of information provided, a profile can be built for each person, and that profile can be used in all sorts of ways:
Political Motives: With enough information about you, targeted misinformation campaigns become far more sophisticated. Combine that with deepfakes, and it becomes a lethal combination.
Intellectual Property: Using the DeepSeek app or website means there is a chance you will prompt it with your own intellectual property or your employer's. Multiply that risk by millions of users, and the potential leak could be enormous.
DeepSeek's Privacy Policy
Most of the information in DeepSeek’s privacy policy is fairly standard, but a few details stand out, ones we haven’t explicitly seen in other AI companies’ policies. (Of course, this doesn’t necessarily mean those companies aren’t collecting similar data.)
One particularly striking point is the collection of “keystroke patterns or rhythms.” This might sound mundane at first, but research such as the work by A. Peacock and colleagues suggests that typing cadence can be as personal as a fingerprint or voiceprint. In other words, keystroke dynamics have the potential to become a form of secondary digital identification. When a company knows not just what you type but how you type it (your speed, pauses, and other subtle patterns), it could theoretically use this data to identify or track you across different sessions or systems.
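To make the idea concrete, here is a minimal, purely illustrative sketch of how raw key events could be turned into a timing profile. The event list below is hypothetical; a real system would capture thousands of events client-side and extract far richer features.

# Illustrative sketch: turning hypothetical (key, timestamp_ms) events
# into the inter-key timing features that keystroke-dynamics research uses.
events = [("h", 0), ("e", 142), ("l", 260), ("l", 395), ("o", 521)]

# Flight times (gaps between consecutive keys) are surprisingly stable
# per person, which is what makes typing cadence identifying.
intervals = [t2 - t1 for (_, t1), (_, t2) in zip(events, events[1:])]
mean = sum(intervals) / len(intervals)
print(f"inter-key intervals (ms): {intervals}, mean: {mean:.1f} ms")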
DeepSeek's privacy policy doesn't specify exactly how these keystroke patterns are used, whether purely for analytics, fraud detection, or something else. But the mere act of collecting such uniquely identifying information underlines the need for transparency about how it's stored, who can access it, and what safeguards are in place to prevent abuse. In an era when privacy is increasingly difficult to safeguard, details like these merit a closer look and thoughtful discussion.
Data Storage
"The personal information we collect may be stored on servers located outside of your home country, in fact, we use secure servers in the People’s Republic of China".
A quick IP lookup shows DeepSeek's servers resolving to Hong Kong, placing the data exchanged with the service under Chinese government jurisdiction. Whether or not it is actively monitored remains unclear, but the risk is still there.
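For the curious, this kind of check takes a few lines of standard-library Python. The hostname below is an assumption; substitute whichever DeepSeek domain you want to verify.

# Minimal sketch: resolve a domain and print its address. The result can
# then be fed to a WHOIS or IP-geolocation service to see where it is hosted.
import socket

ip = socket.gethostbyname("chat.deepseek.com")  # hostname is an assumption
print(ip)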
This raises serious security concerns, particularly given China's extensive track record of technological espionage. Anyone using DeepSeek for work is almost certain to share some proprietary data, which could ultimately end up in the wrong hands. Consequently, all of your information effectively sits behind the Great Firewall of China, making it vulnerable to government oversight and other potential threats. This isn't just a privacy issue; it's a national security risk if sensitive data from millions of users can be accessed by a foreign state.
Using the Less Powerful Models Locally
If you have GPUs and are okay with running the less powerful DeepSeek models, you are relatively safer. Using a library like Hugging Face Transformers, you can download the weights locally and maintain control of your data.
What Are Model Weights?
Model weights are the numerical parameters of a machine learning model that are tuned during training to capture the relationships between inputs and outputs. They are usually floating-point numbers stored in files and loaded during inference. (Inference refers to using a trained model to make predictions on new, unseen data.)
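A quick sketch makes this tangible. The toy layer below is not a DeepSeek model, just a minimal example (assuming PyTorch is installed) of what "weights" are: named tensors of floating-point numbers.

# A tiny model: its weights are just named arrays of floats.
import torch

layer = torch.nn.Linear(4, 2)  # toy layer, not a real language model
for name, param in layer.named_parameters():
    print(name, tuple(param.shape), param.dtype)  # e.g. weight (2, 4) torch.float32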
That being said, although it is possible to hide binary data within weights, doing so would also require custom code to load it. Since the projects are open source, such malicious behavior would be discovered eventually. In DeepSeek's case, it would utterly ruin their reputation and potentially get them shut down, so the risk of this is very low.
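To illustrate why hidden payloads are possible in principle, and why they still need custom loader code, here is a toy, self-contained sketch that smuggles one byte into the least-significant mantissa bits of some float32 values. Nothing here is specific to DeepSeek; it is just the general idea.

# Toy steganography sketch: hide the byte b"A" in the low bits of floats.
import numpy as np

weights = np.random.randn(8).astype(np.float32)            # stand-in "weights"
bits = np.unpackbits(np.frombuffer(b"A", dtype=np.uint8))  # 8 payload bits

raw = weights.view(np.uint32)
raw = (raw & ~np.uint32(1)) | bits.astype(np.uint32)       # overwrite each LSB
stego = raw.view(np.float32)                               # still looks like normal floats

# Recovering the payload requires knowing the scheme, i.e. custom code.
recovered = np.packbits((stego.view(np.uint32) & 1).astype(np.uint8)).tobytes()
print(recovered)  # b'A'

Flipping the lowest mantissa bit perturbs each value negligibly, which is why the weights themselves look innocent; the giveaway would be the custom extraction code, and that is exactly what open-source scrutiny catches.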
There is also a risk when you use a library such as Transformers to load DeepSeek and enable the trust_remote_code=True option, as shown below:
from transformers import AutoModelForCausalLM

# trust_remote_code=True tells Transformers to run the repo's own modeling code
model = AutoModelForCausalLM.from_pretrained("deepseek-ai/DeepSeek-V2", trust_remote_code=True)
By setting trust_remote_code=True, the Transformers library will dynamically download and execute custom Python code from DeepSeek's repository on the Hugging Face Hub. This can be risky if the code originates from an untrusted or malicious source, because it runs on your local machine just like any other Python script. However, these repositories are open source, so you can and should inspect the code before allowing it to run. So while there is an inherent security risk, you also have direct oversight of what is being loaded from the remote repository.
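One way to do that inspection, sketched below under the assumption that the huggingface_hub package is installed, is to download the repository's files without executing anything, then read the custom .py files before ever passing trust_remote_code=True.

# Fetch the repo's code and config for review -- nothing is executed here.
from huggingface_hub import snapshot_download

local_dir = snapshot_download("deepseek-ai/DeepSeek-V2",
                              allow_patterns=["*.py", "*.json"])
print(local_dir)  # open and review the .py files here before loading the model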
Unlike other commentaries that simply warn users off DeepSeek, here we’ve outlined how running the model locally can reduce exposure – a solution-oriented approach unique to our analysis.
That said, this option requires some technical knowledge and some GPUs. The vast majority of people will probably not have these resources or want to go through the trouble, so they may default to using the web interface.
Conclusion
In conclusion, the biggest risk of using DeepSeek lies in its web interface. While DeepSeek is unlikely to hijack your browser or execute remote code maliciously, it is collecting data, and if the company is affiliated with the Chinese government, there is significant concern about what happens to that information. Such data exposure could aid espionage or influence operations, which is why experts urge caution. If Chinese AI platforms like DeepSeek continue to proliferate, will we see a bifurcation of AI ecosystems, China versus the West? And what would that mean for global AI governance?
Yes, it's cool to use new AI tools, especially ones claiming to rival OpenAI. But given the political climate and the security concerns, it's prudent to be careful. If you have the technical know-how, running the less powerful models locally is a much safer approach because you maintain control over your data and avoid any potential mass collection of information through the online platform. The question then becomes: how much do you care about your data, and how far are you willing to go to protect it?
Ultimately, proceed with caution. DeepSeek might be an exciting leap in AI technology, but from an AI security standpoint, that leap could come with big strings attached. Use the app or website at your own risk, and if possible, consider the local model option; your data security might just depend on it.