DeepSeek has gone viral.
Chinese AI lab DeepSeek entered the mainstream consciousness this week after its chatbot app rose to the top of the Apple App Store charts. DeepSeek's AI models, which were trained using compute-efficient techniques, have led analysts and engineers to question whether the U.S. can maintain its lead in the AI race and whether demand for AI chips will hold up.
But where did DeepSeek come from, and how did it rise to international fame so quickly?
DeepSeek's trader origins
DeepSeek is backed by High-Flyer Capital Management, a Chinese quantitative hedge fund that uses AI to inform its trading decisions.
AI enthusiast Liang Wenfeng co-founded High-Flyer in 2015. Wenfeng, who began working on trading while a student at Zhejiang University, launched High-Flyer Capital Management in 2019 as a hedge fund focused on developing AI algorithms.
In 2023, High-Flyer started DeepSeek as a lab dedicated to researching AI tools separate from its financial business. With High-Flyer as one of its investors, the lab was spun off into its own company, also called DeepSeek.
From day one, DeepSeek built its own data center clusters for model training. But like other AI companies in China, DeepSeek has been affected by U.S. export bans. To train one of its more recent models, the company had to use Nvidia H800 chips, a less powerful version of the H100 chip available to U.S. companies.
DeepSeek's technical team is said to skew young. The company reportedly recruits PhD researchers aggressively from top Chinese universities. DeepSeek also hires people without a computer science background to help its technology better understand a wide range of subjects, according to The New York Times.
DeepSeek's powerful models
DeepSeek unveiled its first set of models (DeepSeek Coder, DeepSeek LLM, and DeepSeek Chat) in November 2023. The lab started to attract attention with the models that followed.
DeepSeek-V2, a general-purpose text and image analysis system, performed well on various AI benchmarks and was much cheaper to run than comparable models at the time. It forced DeepSeek's domestic competitors, including ByteDance and Alibaba, to cut prices for some of their models and make others completely free.
DeepSeek-V3, released in December 2024, only added to DeepSeek's notoriety.
According to DeepSeek's internal benchmark testing, DeepSeek V3 outperforms both downloadable, openly available models such as Meta's Llama and models that can only be accessed through an API, such as OpenAI's GPT-4o.
Similarly notable is DeepSeek's R1 "reasoning" model. Released in January, R1 performs as well as OpenAI's o1 model on key benchmarks, DeepSeek claims.
Because R1 is a reasoning model, it effectively checks itself, which helps it avoid some of the pitfalls that normally trip up models. Reasoning models take a little longer to reach a solution than a typical non-reasoning model. The upside is that they tend to be more reliable in domains such as physics, science, and mathematics.
However, R1, DeepSeek V3, and DeepSeek's other models have a drawback. Because they are Chinese-developed AI, they are subject to benchmarking by China's internet regulator to ensure that their responses "embody core socialist values." In DeepSeek's chatbot app, for example, R1 will not answer questions about Tiananmen Square or Taiwan's autonomy.
A disruptive approach
If DeepSeek has a business model, it is not clear what that model is, exactly. The company prices its products and services far below market value and gives others away for free.
According to DeepSeek, efficiency breakthroughs have allowed it to maintain extreme cost competitiveness. Some experts, however, dispute the figures the company has provided.
In any case, developers have taken to DeepSeek's models, which are not open source as the phrase is generally understood but are available under permissive licenses that allow commercial use. According to Clem Delangue, the CEO of Hugging Face, one of the platforms that hosts DeepSeek's models, developers on Hugging Face have created more than 500 "derivative" models of R1, which have been downloaded 2.5 million times combined.
DeepSeek's success against larger and more established rivals has been described as "upending AI" and ushering in "a new era of AI brinkmanship." The company's success was at least partially responsible for Nvidia's stock price falling by 18% on Monday and for drawing a public response from OpenAI CEO Sam Altman.
It is not clear what DeepSeek's future holds. Improved models are a given. But the U.S. government appears to be growing wary of what it perceives as harmful foreign influence.