© 2025 CoolTechZone - Latest tech news,
product reviews, and analyses.

DeepSeek might have unlawfully used OpenAI’s data to train its R1 model


Microsoft and OpenAI are currently investigating whether their Chinese competitor DeepSeek might have used OpenAI’s models to train its own chatbot.

The Chinese start-up launched its AI chatbot, DeepSeek, last week. It is built on a large language model (LLM) and is similar to AI tools like ChatGPT, Gemini, Llama 4 and Claude.

Security researchers from Microsoft suspect that DeepSeek might have exfiltrated large amounts of data through OpenAI’s API in the fall of 2024 to train its R1 model. This would go against OpenAI’s terms of service, which prohibit companies from using OpenAI’s API to develop AI models that compete with OpenAI.

Software developers can pay for a license to use the API to integrate OpenAI’s proprietary artificial intelligence models into their own applications. However, the terms of use state companies aren’t allowed to “automatically or programmatically extract data or output.”

According to the Financial Times, OpenAI has evidence that DeepSeek has violated its terms of service. An insider with knowledge of the case tells the news outlet that DeepSeek might have used a technique called ‘distillation.’ Simply put, this means that one AI model is trained on the output of another model in order to develop similar capabilities.
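To make the idea of distillation more concrete, here is a toy sketch in Python. It is an illustration only and says nothing about what DeepSeek or OpenAI actually did. The `teacher` function, the `distill` helper and all numbers are hypothetical: the "teacher" is a simple formula standing in for a model that can only be queried, and the "student" is a two-parameter model trained purely on the teacher's answers.

```python
# Toy illustration only: neither "model" resembles a real LLM.

def teacher(x: float) -> float:
    """Hypothetical stand-in for a model that can only be queried for outputs."""
    return 3.0 * x + 2.0


def distill(queries, epochs=2000, lr=0.01):
    """Fit a student y = w*x + b to the teacher's outputs by gradient descent."""
    # Step 1: send each query to the teacher and record its output.
    labels = [teacher(x) for x in queries]

    # Step 2: train the student on the (query, teacher output) pairs,
    # minimising the mean squared difference between student and teacher.
    w, b = 0.0, 0.0
    n = len(queries)
    for _ in range(epochs):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(queries, labels)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(queries, labels)) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


queries = [i / 10 for i in range(-20, 21)]  # 41 inputs between -2 and 2
w, b = distill(queries)
print(f"student parameters: w={w:.3f}, b={b:.3f}")
```

The student ends up with parameters close to the teacher's 3 and 2 without ever seeing how the teacher was built. It only sees the teacher's answers, which is the core of the technique described above.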

“There’s substantial evidence that what DeepSeek did here is they distilled knowledge out of OpenAI models, and I don’t think OpenAI is very happy about this,” David Sacks, US President Donald Trump’s advisor on AI and cryptocurrency, told the Financial Times.

OpenAI didn’t directly address Sacks’ statement about DeepSeek. “We know PRC [People’s Republic of China, ed.] based companies, and others, are constantly trying to distill the models of leading US AI companies,” an OpenAI spokesperson said in a statement to the media.

“As the leading builder of AI, we engage in countermeasures to protect our IP, including a careful process for which frontier capabilities to include in released models, and believe as we go forward that it is critically important that we are working closely with the US government to best protect the most capable models from efforts by adversaries and competitors to take US technology,” the spokesperson added.

OpenAI itself has been accused of copyright infringement, with lawsuits from The New York Times and prominent authors accusing the company of using their articles and books to train ChatGPT without their consent.

