AI Insights Daily 2025/6/11
AI Product and Feature Updates
- The Doubao large model family is set to drop a bombshell at the 2025 FORCE Conference: the brand-new Doubao video generation model. Think of it as a creative magic wand. Thanks to an efficient architecture and unified multi-task modeling, it supports seamless multi-shot storytelling, responds precisely to prompts involving multiple actions, and controls the camera like a pro. It can generate high-quality video in a range of styles, from realistic to anime, which makes it a video creator’s dream come true.
- xAI’s Grok is shaking things up by taking over X’s recommendation algorithm and reworking how comments are sorted. The platform will prioritize high-quality content instead of just looking at follower count. That is a big opportunity for small accounts and talented newcomers to get exposure, and the stated goal is a fairer, more open content ecosystem where good posts don’t go unnoticed.
- The Doubao App also recently upgraded its “one-sentence photo editing” feature. Powered by the SeedEdit 3.0 model, it adds one-click text insertion and replacement, texture and style transfer, and improved local image editing. It’s like having a professional photo editor in your pocket: even regular users can create personalized photos without special skills, turning “editing noobs” into “editing masters”.
- Apple unveiled a “killer” feature for iOS 26 at WWDC 2025: Visual Intelligence. It lets you ask questions about, or run a search on, any image or information on your screen, and it can automatically pick out event details. It’s basically a “smart eye” for your phone. Because the AI recognizes on-screen content right away, interaction becomes noticeably more convenient; it can even extract event information and add it to your calendar.
- Great news! Immersive Translate just got a major update and can now translate Twitter (X) videos in real time. Even when a video has no original subtitles, it can display Chinese and English subtitles side by side, so language barriers no longer get in the way when you browse videos on X. It’s a godsend for cross-cultural communication.
AI Cutting-Edge Research
- The University of Hong Kong and Huawei Noah’s Ark Lab have jointly released FUDOKI, a multimodal model built on a non-masked discrete flow matching architecture. By moving away from the constraints of traditional autoregressive models, it aims for more flexible and efficient multimodal generation and understanding. Its parallel denoising mechanism is reported to improve performance on complex reasoning and generation tasks, especially image generation, and the team presents it as a step toward artificial general intelligence.
- A research team from the Hong Kong University of Science and Technology and Kuaishou Technology has released EvoSearch (Evolutionary Search), a breath of fresh air for AI image and video generation. Instead of the usual “bigger model, more compute” mindset, it borrows ideas from Darwinian evolution and applies them during the generation process, so that “small” models can produce high-quality images and videos that rival or even surpass the “big guys”. The technique could usher in an era of “intelligent evolution” for AI creation by letting models unlock more of their potential at inference time. The project homepage, code, and paper have been released: https://tinnerhrhe.github.io/evosearch/, https://github.com/tinnerhrhe/EvoSearch-codes, https://arxiv.org/abs/2505.17618.
- A paper titled “Play to Generalize: Learning to Reason Through Game Play” reports an exciting finding: multimodal large language models (MLLMs) can significantly improve their cross-domain multimodal reasoning by playing simple arcade games, even surpassing specialized models trained on domain-specific data. This points to a fun new direction for building general AI capabilities: letting AI get smarter through play.
- A new paper called “Dreamland” proposes a hybrid framework that combines physics simulators with large generative models. The goal is to create highly controllable, realistic dynamic virtual worlds. Beyond the reported gains in image quality and controllability, such worlds could serve as an ideal “playground” and “laboratory” for training embodied AI agents, helping them learn to act in the real world.
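The EvoSearch item above describes evolutionary search at inference time. As a rough illustration of that general idea only, here is a toy select-and-mutate loop in Python. It is not the authors’ implementation: `toy_reward`, `evolutionary_search`, and every parameter below are hypothetical, the candidates are plain lists of numbers standing in for initial noise, and the reward is a simple stand-in for whatever quality score a real system would use.

```python
import random


def toy_reward(candidate):
    # Hypothetical stand-in for a learned reward model: higher is better,
    # with the maximum (0.0) reached when every value equals 0.5.
    return -sum((x - 0.5) ** 2 for x in candidate)


def evolutionary_search(reward_fn, dim=8, population=16, elites=4,
                        generations=20, sigma=0.1, seed=0):
    rng = random.Random(seed)
    # Start from random candidates (stand-ins for initial noise vectors).
    pop = [[rng.gauss(0.0, 1.0) for _ in range(dim)]
           for _ in range(population)]
    history = []
    for _ in range(generations):
        # Selection: rank by reward and keep the best few unchanged.
        pop.sort(key=reward_fn, reverse=True)
        history.append(reward_fn(pop[0]))
        parents = pop[:elites]
        # Mutation: refill the population with perturbed copies of parents.
        children = [[x + rng.gauss(0.0, sigma) for x in rng.choice(parents)]
                    for _ in range(population - elites)]
        pop = parents + children
    best = max(pop, key=reward_fn)
    history.append(reward_fn(best))
    return best, history


if __name__ == "__main__":
    best, history = evolutionary_search(toy_reward)
    print("first-generation best reward:", history[0])
    print("final best reward:", history[-1])
```

Because the best candidates are carried over unchanged each round, the best reward in this sketch never gets worse from one generation to the next. The extra compute is spent at inference time, on scoring and mutating candidates, rather than on training a larger model.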
AI Industry Outlook and Social Impact
- Li Auto recently overhauled its organizational structure and officially set up two new second-level departments: “Spatial Robotics” and “Wearable Robotics”. This is more than a reshuffle; it signals Li Auto’s shift from a traditional car manufacturer to a builder of a smart mobility ecosystem. Using robotics technology, the company aims to build a complete smart-life service system that covers both the “third space” inside the car and smart wearable devices outside it. That could give Li Auto a differentiated edge in a fiercely competitive market and make the “third space” strategy more than just a concept.
- Ohio State University announced that, starting this year, it will require all students to receive artificial intelligence (AI) training, a skill set tailor-made for the future workplace. The school launched the “AI Fluency” program, which integrates AI education into undergraduate courses so that students learn to combine their professional knowledge with AI technology. The school also stresses that students must not use generative AI to cheat, and it is strengthening teacher training to maintain academic integrity. The aim is for every graduate to apply AI effectively in their field. The move also responds to the Ohio AI Education Alliance’s push for AI education in K-12 schools, making AI a true “super assistant” for everyone.
- The well-known thinker Li Jigang argues that as AI becomes more efficient and powerful, human judgment, taste, and understanding of what things are for will matter more than ever. AI can generate thousands of solutions and execute them perfectly, but it cannot replace humans in making choices, defining beauty, or understanding the complexity and depth of human nature. It’s a reminder that what is truly valuable in the AI era may be the “human-only skills” that AI cannot reach.
Open Source TOP Projects
- Xiaohongshu’s hi lab team recently delivered a big gift: dots.llm1, its first open-source large language model. This Mixture of Experts (MoE) model has 142 billion parameters and, after training on massive amounts of real data, performs on par with Alibaba’s Qwen2.5-72B, making it a dark horse of the model world. The release shows Xiaohongshu’s technical ambition in AI, and it is also meant to support more intelligent services and invite developers to join the “chorus” of AI research.
- Two AI-related projects have been trending on GitHub recently. The first, “newsnow” (10,785 stars), offers an elegant reading experience for real-time trending news, making it quick and convenient to keep up. It’s a godsend for news junkies. The second, “GenAI_Agents” (12,884 stars), provides tutorials and implementations of generative AI agent techniques, from basic to advanced, to help developers build more intelligent interactive AI systems.
Social Media Sharing
- Gorden Sun shared Mirage, a virtual human model, on social media. It’s basically a magician of “digital avatars”: driven by audio alone, it generates vivid, expressive, lip-synced virtual human videos that look very lifelike. Gorden Sun also noted that the product’s detailed technical report is a valuable reference for researchers. It looks set to trigger another “arms race” in virtual human technology.
- Sam Altman announced on X that the price of o3 has been cut by 80%, which is basically a gift to users. He said he looks forward to seeing what people do with it and hinted that users will also be happy with o3-pro pricing. The OpenAI CEO seems to be encouraging everyone to explore what AI can do at a much lower cost.
- Ryan ᵐᶠᵉʳ 🦄d/acc offered a thought-provoking take on the next generation of entrepreneurs: they should not be bound to imitating past success stories such as Jobs, nor be held back by a narrow diet of low-quality input. Instead, they should be true to themselves and explore freely with their own “vibe” and a playful spirit. In other words, don’t be someone else’s shadow; go write your own “rules of the game”.
- User wwwgoubuli shared an interesting shift in how AI is used in day-to-day work. At first, his remote team members did not dare to rely fully on AI for fear of being seen as slacking off. After he repeatedly shared the “right way” to use AI, the team gradually loosened up. As a result, code comments, conventions, and overall code quality improved significantly, and colleagues became more confident. It’s a textbook case of AI boosting team efficiency once the “AI anxiety” is gone.
Listen to the Audio Version
🎙️ Xiaoyuzhou | 📹 Douyin |
---|---|
Next Life Tavern | Next Life Intelligence Station |