Episodes
-
INSANE AI news: OpenAI Operator, DeepSeek R1, UI-TARS, Hunyuan 3D, Imagen 3 002, Kimi k1.5, TokenVerse
Note that most of the examples are visual. See my Youtube video for the best experience.
This is a public episode. If you would like to discuss this with other subscribers or get access to bonus episodes, visit aisearch.substack.com -
INSANE AI news: Luma Ray2, Vidu 2.0, ChatGPT Tasks, Microsoft Phi-4, rStar-Math, MangaNinja, Nvidia Sana 4K images, Unitree demos, SVFR face restoration, & more!
See my Youtube for the full video.
-
SPAR3D realtime 3D models, TransPixar transparent videos, Nvidia Digits supercomputer, Hailuo Minimax consistent characters, Stereocrafter 2D to 3D videos, VideoAnydoor full video editing, SE01 robot, & more!
See my Youtube for the full video.
-
Add any person or style to videos with Hunyuan LoRAs, TangoFlux text to audio generator, Pixverse 3.5, 3DTrajMaster for full motion control, PERSE editable 3D heads
See my Youtube for the full video.
-
Open-source (DeepSeek V3) beats GPT, AniDoc cartoon colorizer, image to realistic 3D heads, AI designs cars & clothes, interactive 3D models, & more!
See my Youtube for the full video.
-
OpenAI unveils o3, which beats humans at the ARC-AGI benchmark. Google has introduced Gemini Flash Thinking and an updated version of its video generation model, Veo 2, which can produce high-quality videos up to 4K resolution. Other updates include CAP4D, which creates realistic 4D avatars from reference images, new methods for generating 3D videos, a hyperrealistic simulator called Genesis, and more!
See my Youtube for the full video.
-
Gemini 2.0 & OpenAI SORA are out! MSFT Trellis makes insane 3D models, DiffSensei makes full comics, New AI Video to Audio & more!
See my Youtube for the full video.
-
AI makes any video game, open-source Hunyuan video beats all, OpenAI releases o1 pro, Google's Gemini tops the charts, AI predicts extreme weather & more!
See my Youtube for the full video.
-
This week in AI: AI makes 4D videos, AI separates video into layers, new open-source AI video, SORA gets leaked, realistic 3D textures, & more!
See my Youtube for the full video.
-
This week in AI: AnimateAnything for full video control, AI Matrix for infinite 3D worlds, BiomedParse for medical image analysis, DeepSeek open-source model beats o1 & more.
See my Youtube for the full video.
-
This week in AI, Alibaba Cloud unveiled a new open-source AI model that rivals GPT-4 in programming capabilities, while Google released a cutting-edge AI model that topped the Chatbot Arena leaderboard. Researchers also made breakthroughs in AI-assisted surgery, image editing, and video editing, while Nvidia unveiled a novel method for seamless image editing using prompts. Additionally, new AI tools were introduced for tracking account activity and for creating 3D and 4D scenes from single images.
See my Youtube for the full video.
-
Huge AI updates this week. Flux Ultra and Raw modes have been released for high-resolution images. Nvidia's ConsiStory generates consistent images of a subject from different text prompts. InstantIR restores damaged images with incredible detail and realism. MoGe turns 2D images into 3D point maps. X-Portrait 2 animates static portraits with realistic expressions. MVPaint generates high-quality 3D textures for 3D models. Microsoft's OmniParser helps computers understand and interact with user interfaces... and more!
See my Youtube for the full video.
-
There have been huge updates in AI this week, including OpenAI's integration of ChatGPT with search, Nvidia's Hover model for humanoid robots, a new video interpolation tool called Framer, a realtime AI-generated Minecraft simulation, plus an AI that can recreate smells.
See my Youtube for the full video.
-
There have been huge updates in AI this week, including Motion Inversion for transferring video motion, enhancing filmmakers' control over camera movements. Anthropic's AI agent, Claude, can interact with computers like a human, automating complex tasks. Alibaba's Tora enables drawing paths for object motion in videos. DeepMind's Vidpanos creates panoramic videos from regular footage. We have 2 new open-source video generators, RhymesAI's Allegro and Genmo's Mochi 1. AiOS detects human poses in videos, and Harvard's AI detects cancer with 96% accuracy. Ideogram Canvas offers advanced image editing features.
See my Youtube for the full video.
-
Researchers have developed a realtime AI-generated Counter-Strike game. Google's new RF-inversion method simplifies image editing, and Animate-X provides open-source animation for non-human characters. HALLO2 generates high-resolution videos from text prompts, and Google updates NotebookLM with new audio features. Archetype AI's Newton excels in understanding physical phenomena through sensor data. NVIDIA's Llama-3.1-Nemotron-70B-Instruct outperforms larger models in key benchmarks, and Mistral AI introduces small language models for edge devices.
See my Youtube for the full video.
-
Scientists have developed an AI-powered tongue that accurately identifies subtle differences in food and beverages. Meanwhile, Pyramid Flow is a new open-source AI video generator that can create high-quality videos. Geoffrey Hinton and Demis Hassabis received Nobel Prizes for their contributions to AI. OpenAI is planning to establish its own data center in Texas to reduce reliance on Microsoft. NVIDIA introduced EdgeRunner, an AI model that generates detailed 3D mesh models.
See my Youtube for the full video.
-
Meta announces their video generator called Movie Gen. Black Forest Labs released FLUX1.1 [pro], an AI image model that's significantly faster and offers improved quality. Microsoft Copilot releases free Voice feature. OpenAI's Canvas enhances ChatGPT's writing and coding capabilities, while new developer tools improve AI application efficiency and cost-effectiveness. NVIDIA's NVLM-1.0-D-72B excels in text and image interpretation. Pika 1.5 enhances video creativity with 'Pikaffects'.
See my Youtube for the full video.
-
Researchers have developed MIMO, an AI that can replace a person in a video using just one photo, simplifying video editing processes. OpenAI has introduced Advanced Voice Mode. Google’s updated Gemini AI models offer better performance and reduced costs. TxGNN, an AI model, identifies drugs for rare diseases. Key members have left OpenAI, signaling leadership changes. Meta released Llama 3.2 for interpreting images and text, along with Orion augmented reality glasses.
-
This week in AI news, Alibaba launched the open-source Qwen 2.5, which includes specialized models for coding and mathematics, outperforming larger models with fewer parameters. Kolors introduced a virtual try-on tool for online shopping, enhancing user experience by allowing photo uploads for realistic outfit previews. YouTube announced new AI features for creators, such as imaginative video backgrounds and automatic dubbing. Google developed an AI model to recognize whale sounds, aiding conservation efforts by tracking whale populations. WonderWorld introduced a system that creates customizable 3D scenes from a single image in under 10 seconds. Finally, Kling AI released version 1.5, enhancing its video generation capabilities with improved quality and new interactive tools.
Check out our Youtube for more AI news & reviews!
-
Google has introduced the "Audio Overview" feature, which transforms documents into engaging audio discussions, allowing users to listen to AI-generated summaries that connect topics, though it may have inaccuracies. OpenAI's new o1 model excels in reasoning and science at the PhD level. Meanwhile, French startup Mistral released Pixtral 12B, a free multimodal AI model that processes both images and text, and Chai Discovery launched Chai-1, an AI model for predicting molecular structures to aid drug discovery. Finally, a study shows AI-generated research ideas are perceived as more novel than those from human experts, raising questions about AI's role in fostering creativity.
Check out our Youtube for more AI news & reviews!