DeepSeek

127 posts
@deepseek_ai
Unravel the mystery of AGI with curiosity. Answer the essential question with long-termism.
deepseek.com · Joined October 2023

DeepSeek’s posts

Pinned
To prevent any potential harm, we reiterate that this is our sole official account on Twitter/X. Any accounts:
- representing us
- using identical avatars
- using similar names
are impersonations. Please stay vigilant to avoid being misled!
🚀 DeepSeek-R1 is here!
⚡ Performance on par with OpenAI-o1
📖 Fully open-source model & technical report
🏆 MIT licensed: Distill & commercialize freely!
🌐 Website & API are live now! Try DeepThink at chat.deepseek.com today!
🐋 1/n
🚀 Introducing DeepSeek-V3! Biggest leap forward yet:
⚡ 60 tokens/second (3x faster than V2!)
💪 Enhanced capabilities
🛠 API compatibility intact
🌍 Fully open-source models & papers
🐋 1/n
📜 License Update!
🔄 DeepSeek-R1 is now MIT licensed for clear open access
🔓 Open for the community to leverage model weights & outputs
🛠️ API outputs can now be used for fine-tuning & distillation
🐋 3/n
🔥 Bonus: Open-Source Distilled Models!
🔬 Distilled from DeepSeek-R1, 6 small models fully open-sourced
📏 32B & 70B models on par with OpenAI-o1-mini
🤝 Empowering the open-source community
🌍 Pushing the boundaries of **open AI**!
🐋 2/n
DeepSeek-Coder-V2: First Open Source Model Beats GPT4-Turbo in Coding and Math
> Excels in coding and math, beating GPT4-Turbo, Claude3-Opus, Gemini-1.5Pro, Codestral.
> Supports 338 programming languages and 128K context length.
> Fully open-sourced with two sizes: 230B (also
🎉 DeepSeek-VL2 is here! Our next-gen vision-language model enters the MoE era.
🤖 DeepSeek-MoE arch + dynamic image tiling
⚡ 3B/16B/27B sizes for flexible use
🏆 Outstanding performance across all benchmarks
🧵 1/n
💰 API Pricing Update
🎉 Until Feb 8: same as V2!
🤯 From Feb 8 onwards:
Input: $0.27/million tokens ($0.07/million tokens with cache hits)
Output: $1.10/million tokens
🔥 Still the best value in the market!
🐋 3/n
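As a quick sanity check on the rates in the pricing post above, here is a back-of-the-envelope cost estimate. The function name, variable names, and example request sizes are ours, not from the post:

```python
# Illustrative cost calculator for the post-Feb-8 rates quoted above.
# Prices are USD per million tokens.
INPUT_PRICE = 0.27         # input, cache miss
INPUT_CACHED_PRICE = 0.07  # input, cache hit
OUTPUT_PRICE = 1.10        # output

def request_cost(input_tokens, output_tokens, cached_tokens=0):
    """Estimate the cost of one API call in USD."""
    uncached = input_tokens - cached_tokens
    return (uncached * INPUT_PRICE
            + cached_tokens * INPUT_CACHED_PRICE
            + output_tokens * OUTPUT_PRICE) / 1_000_000

# e.g. a 100K-token prompt, 90% served from cache, with a 1K-token answer:
print(round(request_cost(100_000, 1_000, cached_tokens=90_000), 6))
```

At these rates the hypothetical request above costs about a cent, which is the point of the "best value" claim.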
🚀 Launching DeepSeek-V2: The Cutting-Edge Open-Source MoE Model!
🌟 Highlights:
> Places top 3 in AlignBench, surpassing GPT-4 and close to GPT-4-Turbo.
> Ranks top-tier in MT-Bench, rivaling LLaMA3-70B and outperforming Mixtral 8x22B.
> Specializes in math, code and reasoning.
🚀 DeepSeekMath: Approaching Mathematical Reasoning Capability of GPT-4 with a 7B Model.
Highlights:
- Continue pre-training DeepSeek-Coder-Base-v1.5 7B with 120B math tokens from Common Crawl.
- Introduce GRPO, a variant of PPO, that enhances mathematical reasoning and reduces
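The core idea of the GRPO variant mentioned above, as described in the DeepSeekMath paper, is to replace PPO's learned value-network baseline with a group-relative one: sample several outputs per prompt and normalize each reward by the group's statistics. A minimal sketch of just that advantage computation, with our own function and variable names:

```python
import statistics

def group_relative_advantages(rewards):
    """GRPO-style advantages: normalize each sampled output's reward by the
    mean and std of its group (all samples drawn for the same prompt).
    This replaces PPO's learned value baseline; names here are ours."""
    mean = statistics.mean(rewards)
    std = statistics.pstdev(rewards) or 1.0  # guard against zero spread
    return [(r - mean) / std for r in rewards]

# Four sampled answers to one math prompt, scored 0/1 for correctness:
print(group_relative_advantages([1.0, 0.0, 0.0, 1.0]))
```

Because the baseline comes from the group itself, no separate value network needs to be trained, which is where the memory savings the post alludes to come from.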
🌌 Open-source spirit + Longtermism to inclusive AGI
🌟 DeepSeek’s mission is unwavering. We’re thrilled to share our progress with the community and see the gap between open and closed models narrowing.
🚀 This is just the beginning! Look forward to multimodal support and
DeepSeek has not issued any cryptocurrency. Currently, there is only one official account on the Twitter platform. We will not contact anyone through other accounts. Please stay vigilant and guard against potential scams.
🚀 Introducing Janus: a revolutionary autoregressive framework for multimodal AI! By decoupling visual encoding & unifying processing in a single transformer, it outperforms previous models in both understanding & generation.
⚡️ Powerful, simple, flexible, & next-gen ready! 🔥
🎉Exciting news! We open-sourced the DeepSeek-V2-0628 checkpoint, the No.1 open-source model on the LMSYS Chatbot Arena Leaderboard. Detailed Arena Ranking: Overall No.11, Hard Prompts No.3, Coding No.3, Longer Query No.4, Math No.7. DeepSeek-V2-0628 is released at
🚀 Exciting news! We’ve officially launched DeepSeek-V2.5 – a powerful combination of DeepSeek-V2-0628 and DeepSeek-Coder-V2-0724! Now, with enhanced writing, instruction-following, and human preference alignment, it’s available on Web and API. Enjoy seamless Function Calling,
🎉Exciting news! DeepSeek API now launches context caching on disk, with no code changes required! This new feature automatically caches frequently referenced contexts on distributed storage, slashing API costs by up to 90%. For a 128K prompt with high reference, the first token
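To make the caching savings concrete, here is a rough estimate using the discounted cache-hit input rate from the pricing post earlier on this page ($0.27/M full vs $0.07/M cached). These rates and all names are illustrative; actual savings depend on the live price sheet and your cache-hit ratio:

```python
def cached_input_cost(tokens, hit_ratio, full_price, cached_price):
    """Input cost in USD when a fraction of the prompt hits the cache.
    Prices are per million tokens; function and argument names are ours."""
    hit = tokens * hit_ratio
    miss = tokens - hit
    return (hit * cached_price + miss * full_price) / 1_000_000

# A 128K-token prompt: first call misses the cache, repeats fully hit it.
first = cached_input_cost(128_000, 0.0, 0.27, 0.07)
later = cached_input_cost(128_000, 1.0, 0.27, 0.07)
print(round(first, 5), round(later, 5), round(1 - later / first, 2))
```

At these example rates a fully cached repeat is about 74% cheaper on input; heavier discounts on the cached rate push the figure toward the "up to 90%" the post cites.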
📢 After 3 months, the AI Mathematical Olympiad (AIMO) on Kaggle has announced the winners! 🎉 We're thrilled to see the Top 4 teams all chose DeepSeekMath-7B as their base model, with Numina achieving 29/50 correct answers! 👏 Even Terence Tao was amazed. 🤯