AI News Daily 02-04

AI News | Daily Briefing | All-Net Data Aggregation | Cutting-Edge Science Exploration | Industry Free Speech | Open Source Innovation Power | AI and Human Future

Today’s Summary

OpenAI releases Codex desktop, supporting multi-agent independent thread execution.
Zhipu GLM-5 and MiniMax M2.2 to be released before Chinese New Year, focusing on programming and reasoning.
Tencent CL-bench reveals models solve only 17% of tasks through in-context learning.
Musk merges SpaceX and xAI, valued at $1.25 trillion, pushing for space-based computing power.
Vibe Coding's first anniversary: embrace LLMs completely and forget coding.

Product and Feature Updates

OpenAI just dropped its new Codex desktop app! 🤯 This isn’t your average Q&A code generator; it’s a full-on command center built for running multiple agents. You can run lots of tasks at once, with each agent working in its own independent thread, all neatly organized by project. Plus, Git worktrees let these agents work in isolated copies of the repo, and you can even craft custom Skills that sync everywhere. Pretty sweet for workflow, right?
Big news on the model front! Zhipu AI’s GLM-5 is set to drop before February 15th (just in time for Chinese New Year, maybe? 😉), aiming for some serious breakthroughs in creative writing, programming, and reasoning. Not to be outdone, MiniMax’s M2.2 is also on the way with seriously boosted programming chops, and it’s being hyped as a programmer’s ‘secret weapon.’ DeepSeek, meanwhile, only gave its V3 series a minor refresh, keeping us on the edge of our seats for its trillion-parameter monster. Keep an eye out, too, because ByteDance and Alibaba are also cooking up some new models! 🚀
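For the curious, here’s a rough sketch of the Git worktree idea from the Codex item above: one repo, one extra checkout per agent, each on its own branch. It only builds the `git worktree add` command and doesn’t run it. The `agent/<name>` branch naming, the sibling-folder layout, and the `/work/myapp` path are all made up for illustration; nothing here is how Codex actually does it.

```python
from pathlib import Path


def worktree_add_command(repo: Path, agent: str, base: str = "main") -> list[str]:
    """Build a `git worktree add` command giving one agent its own checkout.

    Assumptions (illustrative only): the branch is named agent/<name> and
    the new working directory sits next to the main repo folder.
    """
    worktree_dir = repo.parent / f"{repo.name}-{agent}"
    return [
        "git", "-C", str(repo),   # run git as if started inside the repo
        "worktree", "add",
        "-b", f"agent/{agent}",   # create a new branch for this agent
        str(worktree_dir),        # separate working directory for the agent
        base,                     # commit-ish the new branch starts from
    ]


if __name__ == "__main__":
    # Print one command per hypothetical agent instead of executing it.
    for name in ("refactor", "tests"):
        print(" ".join(worktree_add_command(Path("/work/myapp"), name)))
```

Each worktree gets its own files and branch while sharing the same underlying repository, so parallel agents don’t overwrite each other’s edits.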

Cutting-Edge Research

Heads up, research nerds! Tencent Hunyuan just dropped its new CL-bench evaluation benchmark, which is a big deal since it’s Yao Shunyu’s first co-authored paper after joining Tencent. 📝 This benchmark is super specific: it tests whether models can actually learn new knowledge from the context they’re given and apply it correctly. And the results? A bit of a reality check! 😬 On average, models solved a measly 17.2% of tasks, with even the top dog, GPT-5.1, hitting just 23.7%. Basically, it’s screaming: ‘Models still haven’t truly figured out how to use context!’
Moving beyond just bug fixes, ProjDevBench just rolled out! 🚀 This new benchmark isn’t messing around; it specifically evaluates AI’s capability for end-to-end project development, from initial requirements all the way to a complete repo. It’s packed with 20 programming problems across 8 categories, mixing old-school online judge (OJ) testing with modern LLM code review. The kicker? Six coding agents together managed a dismal 27.38% pass rate. Yikes! Turns out, complex system design is still a HUGE weak spot for them. 😬
Here’s a fresh take in cognitive modeling: reinforcement learning is now being used to train LLMs to explain human decisions! 🧠 Researchers are using outcome-based reinforcement learning to guide LLMs to produce explicit reasoning chains. The big idea? Nail predictive accuracy AND make those explanations super readable. This means the model won’t just be a black box that predicts a choice; it’ll actually spell out why a person made a certain call. Pretty cool, huh?
Alright, tech detectives! The secret behind RLVR training instability has finally been cracked. 🕵️‍♂️ While reinforcement learning with verifiable rewards (RLVR) is awesome for boosting reasoning, training on MoE architectures keeps hitting a wall. Researchers have introduced a ‘target-level hacking’ framework to break down why this happens. The bombshell? It all boils down to token-level credit mismatch creating bogus signals, causing a crazy divergence between training and inference. Mind blown! 🤯

Industry Outlook and Social Impact

Boom! 🚀 The big news we’ve been waiting for: Elon Musk has officially announced the merger of SpaceX and xAI! This colossal move rockets their combined valuation to a mind-blowing $1.25 trillion. In an internal letter, Musk spilled the beans on pushing forward with his plan to deploy data centers in space. He’s convinced that space-based AI is the only path to true scalability, with a vision to launch millions of satellites to forge orbital data centers. Talk about reaching for the stars: he’s literally aiming for a Kardashev Type II civilization! 🌌
So, SpaceX isn’t just merging; it’s also applying to launch millions of computing satellites! 🛰️ But hold up, this isn’t about better internet. The real core of this wild plan is to build straight-up orbital data centers. We’re talking a projected total computing power of 80 EFLOPS for this constellation! By using the super-cold vacuum of space, the plan is to sidestep those pesky heat dissipation problems. Expect this ambitious project to kick off in 2028 and be fully operational by 2030. This could be a game-changer, potentially unleashing a ‘dimension-reduction attack’ on traditional IDC vendors. 🤯
Tencent Hunyuan is on a talent spree, reeling in another top-tier scientist! 🤩 Tsinghua Ph.D. Pang Tianyu has officially come aboard as Chief Research Scientist for Hunyuan’s multimodal department. He’s set to dive deep into reinforcement learning, bringing his expertise from his previous gig at Sea AI Lab in Singapore. This is a huge catch, following hot on the heels of Yao Shunyu’s arrival! 🔥

Open Source TOP Projects

superpowers: This is a killer agent skill framework and software development methodology, already boasting a whopping ⭐43217 stars! 🚀 Check out the project on GitHub. It’s designed to help devs build even more powerful AI agent systems. Pretty neat, right?
dexter: Get this: an autonomous agent specifically crafted for deep financial research, already rocking 🔥9951 stars! 📈 You can snag all the details on GitHub. It’s a smart analysis tool built just for the finance sector. Talk about specialized!
ccpm: This bad boy is a Claude Code project management system that uses GitHub Issues and Git worktrees to pull off parallel agent execution. It’s already scored ⭐6563 stars! 🛠️ Head over to GitHub to see how it makes multi-agent collaboration way more efficient. Seriously streamlines things!
vm0: Wanna automate workflows using just natural language? This project pitches itself as the simplest way to do it! 🤖 It’s already snagged ⭐585 stars. It lets you define your workflows in everyday language. Super user-friendly!
review-prompts: Need some help with AI code reviews? This is a dedicated collection of prompts just for that! 🧐 It’s already got ⭐235 stars. You can grab the full content on GitHub. Handy!

Social Media Shares

Can you believe it? Andrej Karpathy’s Vibe Coding concept just hit its one-year anniversary! 🥳 This whole philosophy, introduced last year, is about totally embracing LLMs and basically forgetting that code even exists. Karpathy’s take? Today’s models are so ridiculously powerful that he’s literally just chatting with Composer using only his voice. Talk about next-level! 🎤
So, what’s the word on the street about the Codex App experience? Users are saying it’s like having a super-focused, silent engineer on your team. 🤫 It just grinds away, gets the job done, and doesn’t ask for any credit. You won’t find any flashy dashboards here, unlike those other third-party apps. Apparently, it’s seriously lacking the ‘emotional value’ that Claude Code provides. Some folks miss the bells and whistles, it seems! 🤷‍♀️
A big shout-out from OpenAI developers: they’re pushing for a unified Skills directory path! 📂 The plea is for all agents to consistently use the .agents/skills folder to stash their skill files. The current situation, with every tool using its own location, is creating a total mess of duplicated directories. Let’s get organized, people! 🧹
Ever wonder where you stand in the AI game? Someone just shared Brex’s four-level AI capability classification for employees, and it’s pretty spot-on! 📊 Here’s the breakdown: Users can ask questions, Advocates push tools, Builders actually create stuff, and Natives just naturally weave AI into their everyday routines. Go on, see where you fit in! 😉
Alright, coders and aspiring coders! There’s a super hot discussion brewing on Hacker News about strategies for learning to code in the AI era. 🔥 The main takeaways are solid: you gotta nail the fundamentals of algorithms and architecture, treat LLMs like your cool mentor (not the absolute authority), and remember that deliberate struggle is actually the secret sauce for real learning. On the flip side, some folks are legit worried that coding itself might just become commoditized. 🤔 Food for thought!
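To make the unified Skills path from the OpenAI developers’ plea a bit more concrete, here’s a tiny sketch of what ‘everyone looks in .agents/skills’ could mean in practice. Only the `.agents/skills` path comes from the post; the `list_skills` helper and the one-subfolder-per-skill layout are assumptions made up for this example.

```python
from pathlib import Path

# The shared location proposed in the post, relative to a project root.
SKILLS_DIR = Path(".agents") / "skills"


def list_skills(project_root: Path) -> list[str]:
    """Return the names of skill folders under <project_root>/.agents/skills.

    Assumption (hypothetical): each skill lives in its own subdirectory.
    Returns an empty list when the shared folder does not exist.
    """
    skills_root = project_root / SKILLS_DIR
    if not skills_root.is_dir():
        return []
    return sorted(p.name for p in skills_root.iterdir() if p.is_dir())


if __name__ == "__main__":
    # Look for skills in the current working directory.
    print(list_skills(Path.cwd()))
```

With a single agreed-upon folder, any agent could run the same lookup instead of each tool scanning its own private directory.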

AI News Daily Audio Version

🎙️ Xiaoyuzhou	📹 Douyin
Next Life Tavern	Self-Media Account
