Avsnitt
-
Amazon has launched Q, an AI-powered assistant for businesses and developers that offers advanced capabilities such as code generation, testing, debugging, reasoning, and agents for step-by-step planning.
Microsoft's $1 billion investment in OpenAI was triggered by fears of falling behind Google in the AI race. The investment has helped Microsoft catch up and be seen as more of a leader in AI, with OpenAI's models integrated into their products.
A new dataset for the Global Artificial Intelligence Championship Math 2024 has been created, consisting of 387 math problems curated by professional math problem writers from prestigious institutions.
Three AI research papers were discussed, including a new approach to evaluating large language models using a panel of diverse models, a method for real-time, controllable motion generation, and the use of ranked list truncation for large language model-based re-ranking.
Contact: [email protected]
Timestamps:
00:34 Introduction
01:36 Amazon Q, a generative AI-powered assistant for businesses and developers
03:08 Microsoftβs OpenAI investment was triggered by Google fears, emails reveal
05:12 A Dataset for The Global Artificial Intelligence Championship Math 2024
06:21 Fake sponsor
08:25 Ranked List Truncation for Large Language Model-based Re-Ranking
10:04 Replacing Judges with Juries: Evaluating LLM Generations with a Panel of Diverse Models
11:35 MotionLCM: Real-time Controllable Motion Generation via Latent Consistency Model
13:16 Outro
-
Cohere Command R & R+ now available on Amazon for enterprise-grade workloads and multilingual support.
Big tech companies dominating AI lobbying efforts in Washington, potentially leading to weak regulations.
Multi-token prediction proposed as a new way of training large language models, resulting in higher sample efficiency and faster inference.
KANs, a new type of neural network with learnable activation functions on edges or weights, outperform MLPs in accuracy and interpretability, and can help scientists discover mathematical and physical laws.
Contact: [email protected]
Timestamps:
00:34 Introduction
01:54 Cohere Command R & R+ now available on Amazon
03:25 Thereβs an AI Lobbying Frenzy in Washington. Big Tech Is Dominating
05:22 THE 150X PGVECTOR SPEEDUP: A YEAR-IN-REVIEW
06:31 Fake sponsor
08:04 Better & Faster Large Language Models via Multi-token Prediction
09:55 KAN: Kolmogorov-Arnold Networks
11:51 Iterative Reasoning Preference Optimization
13:43 Outro
-
Saknas det avsnitt?
-
OpenAI partners with the Financial Times to enhance ChatGPT with their award-winning journalism and develop new AI products and features for FT readers.
GitHub announces the technical preview of GitHub Copilot Workspace, a Copilot-native developer environment that could revolutionize the way developers work.
Memary, an open-source long-term memory system for autonomous agents, solves the problem of limited context windows for agents by allowing them to store a large corpus of information in knowledge graphs and retrieve only relevant information for meaningful responses.
The papers discussed in this episode showcase the latest advancements in AI research, including AdvPrompter, HaLo-NeRF, and PLLaVA, which address issues related to large language models, digital exploration of large-scale tourist landmarks, and video understanding.
Contact: [email protected]
Timestamps:
00:34 Introduction
01:51 Weβre bringing the Financial Timesβ world-class journalism to ChatGPT
02:54 GitHub Copilot Workspace: Welcome to the Copilot-native developer environment
04:43 memary: Open-Source Longterm Memory for Autonomous Agents
05:55 Fake sponsor
07:44 AdvPrompter: Fast Adaptive Adversarial Prompting for LLMs
09:26 HaLo-NeRF: Learning Geometry-Guided Semantics for Exploring Unconstrained Photo Collections
11:17 PLLaVA : Parameter-free LLaVA Extension from Images to Videos for Video Dense Captioning
13:18 Outro
-
SenseTime's new AI model, SenseNova 5.0, beats GPT-4 Turbo across key benchmarks, suggesting China's AI may be closer to competing with the US than previously thought.
Apple is in talks with OpenAI to potentially integrate their features into iOS 18, which could trigger a new era of AI adoption.
"Toward Inference-optimal Mixture-of-Expert Large Language Models" proposes a new scaling law for MoE-based LLMs to efficiently scale without sacrificing performance.
"How to Train Data-Efficient LLMs" investigates data-efficient approaches for pre-training LLMs, which can significantly reduce the amount of data needed to train LLMs.
Contact: [email protected]
Timestamps:
00:34 Introduction
01:30 Chinese AI model bests GPT-4 Turbo
02:35 Apple Intensifies Talks With OpenAI for iPhone Generative AI Features
04:17 OSWorld: Benchmarking Multimodal Agents for Open-Ended Tasks in Real Computer Environments
05:33 Fake sponsor
07:55 Toward Inference-optimal Mixture-of-Expert Large Language Models
09:21 Scaling Laws For Dense Retrieval
11:01 How to Train Data-Efficient LLMs
12:50 Outro
-
Microsoft's investment in AI is paying off, with a 17% jump in revenue and a 20% increase in profit for the first three months of the year.
Apple has released eight small AI language models aimed at on-device use, using a "layer-wise scaling strategy" to improve performance and transparency.
Multi-Head Mixture-of-Experts is a new approach to address issues with Sparse Mixtures of Experts, outperforming existing models on three different tasks.
Stream of Search (SoS) is a new technique for teaching language models to search, resulting in improved search accuracy and the ability to solve previously unsolved problems.
Contact: [email protected]
Timestamps:
00:34 Introduction
01:28 Microsoft Reports Rising Revenues as A.I. Investments Bear Fruit
03:14 Apple releases eight small AI language models aimed at on-device use
05:00 Fake sponsor
07:01 Multi-Head Mixture-of-Experts
08:43 Achieving >97% on GSM8K: Deeply Understanding the Problems Makes LLMs Perfect Reasoners
10:30 Stream of Search (SoS): Learning to Search in Language
12:31 Outro
-
Meta's aggressive AI investments have caused a 13% plunge in their stock, threatening to wipe out almost $163 billion from their market value.
TSMC's new A16 manufacturing process promises to outperform its predecessor, N2P, by a significant margin, with an up to 10% higher clock rate at the same voltage and a 15% - 20% lower power consumption at the same frequency and complexity.
The Instruction Hierarchy proposes a data generation method to demonstrate hierarchical instruction following behavior, which drastically increases robustness for LLMs against attacks.
SPLATE is a lightweight adaptation of the ColBERTv2 model that improves the efficiency of late interaction retrieval, particularly for running ColBERT on CPU environments.
Contact: [email protected]
Timestamps:
00:34 Introduction
01:27 Metaβs stock plunges on βaggressiveβ AI spending plans
02:49 TSMC unveils 1.6nm process technology with backside power delivery, rivals Intel's competing design
04:48 tiny-gpu
05:59 Fake sponsor
07:35 The Instruction Hierarchy: Training LLMs to Prioritize Privileged Instructions
08:43 A Reproducibility Study of PLAID
10:18 SPLATE: Sparse Late Interaction Retrieval
12:00 Outro
-
Perplexity becomes an AI unicorn with a new $63 million funding round.
NVIDIA acquires Run:ai, an Israeli startup that provides Kubernetes-based workload management and orchestration software for AI computing resources.
Llama-3 language model reaches the top-5 of the LM arena leaderboard.
New AI research papers explore efficient language models, LLMs that can read your minds, and mixtures of experts.
Contact: [email protected]
Timestamps:
00:34 Introduction
01:44 Perplexity becomes an AI unicorn with new $63 million funding round
03:20 NVIDIA to Acquire GPU Orchestration Software Provider Run:ai
05:24 Llama 3 on top-5 of LM arena leaderboard
06:48 Fake sponsor
08:54 OpenELM: An Efficient Language Model Family with Open-source Training and Inference Framework
10:32 SnapKV: LLM Knows What You are Looking for Before Generation
12:37 Multi-Head Mixture-of-Experts
14:37 Outro
-
Microsoft has launched its smallest AI model yet, the Phi-3 Mini, which is designed to be smaller and cheaper to run than its larger counterparts.
SoftBank plans to invest nearly $1 billion in Nvidia's chips to bolster its computing facilities and develop its own generative AI, giving Japan a strong domestic player in the AI space.
HuggingFace has released FineWeb, a dataset consisting of more than 15 trillion tokens of cleaned and deduplicated English web data from CommonCrawl, which outperforms models trained on other commonly used high-quality web datasets.
The papers discussed in this episode cover topics such as extending embedding models for long context retrieval, automating graphic design using large multimodal models, and Microsoft's innovative approach to training the Phi-3 Mini AI model.
Contact: [email protected]
Timestamps:
00:34 Introduction
01:35 Microsoft launches Phi-3, its smallest AI model yet
03:10 SoftBank will reportedly invest nearly $1 billion in AI push, tapping Nvidiaβs chips
05:11 HuggingFace Releases FineWeb: 15 Trillion tokens to train on
06:02 Fake sponsor
08:15 Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone
09:42 LongEmbed: Extending Embedding Models for Long Context Retrieval
11:04 Graphic Design with Large Multimodal Model
12:53 Outro
-
Google merges Android, Chrome, and hardware divisions to deliver higher quality products and experiences for users and partners, with a focus on AI innovation.
Boston Dynamics introduces the electric Atlas robot, designed for real-world applications and stronger, more dexterous, and more agile than its predecessors.
"Towards Large Language Models as Copilots for Theorem Proving in Lean" explores using large language models to assist humans in theorem proving.
"AutoCrawler: A Progressive Understanding Web Agent for Web Crawler Generation" introduces AutoCrawler, a framework for generating web crawlers that leverages the power of large language models to handle diverse and changing web environments more efficiently.
Contact: [email protected]
Timestamps:
00:34 Introduction
01:32 Google merges the Android, Chrome, and hardware divisions
03:02 New Atlas Robot from Boston Dynamics
05:01 Karpathi On Llama3
06:19 Fake sponsor
08:14 Towards Large Language Models as Copilots for Theorem Proving in Lean
09:47 AutoCrawler: A Progressive Understanding Web Agent for Web Crawler Generation
11:21 Groma: Localized Visual Tokenization for Grounding Multimodal Large Language Models
12:58 Outro
-
Meta announces the release of Llama 3, their new open-source language model with improved reasoning and instruction-following capabilities.
Microsoft invests $1.5 billion in UAE-based AI firm G42, with concerns over its China links requiring negotiations with the Biden administration.
Researchers present "Dynamic Typography," an automated text animation scheme that combines deforming letters to convey semantic meaning and infusing them with movement based on user prompts.
The AI Safety Benchmark from MLCommons is a tool to assess the safety risks of AI systems that use chat-tuned language models, covering 7 of the 13 hazard categories identified by the working group.
Contact: [email protected]
Timestamps:
00:34 Introduction
01:47 Meta Announces Llama 3
03:11 Microsoft invests $1.5B in UAE AI firm
04:59 Randar: A Minecraft exploit that uses LLL lattice reduction to crack server RNG
06:23 Fake sponsor
08:08 Dynamic Typography: Bringing Text to Life via Video Diffusion Prior
09:38 Introducing v0.5 of the AI Safety Benchmark from MLCommons
11:15 BLINK: Multimodal Large Language Models Can See but Not Perceive
13:05 Outro
-
Boston Dynamics has revealed their new Atlas robot, which boasts impressive dexterity and agility, and is designed for real-world applications.
Stable Assistant, a chatbot powered by Stability AI's text and image generation technology, is now available via an API on the Stability AI developer platform, and features Stable Diffusion 3 and Stable LM 2 12B.
Google DeepMind's "Many-Shot In-Context Learning" proposes a new method of learning from a few examples in a specific context, and found that Reinforced and Unsupervised ICL settings can be quite effective in the many-shot regime.
AWS AI Labs' "Fewer Truncations Improve Language Modeling" introduces a new method called Best-fit Packing that packs documents into training sequences through length-aware combinatorial optimization, and achieved superior performance compared to concatenation.
Contact: [email protected]
Timestamps:
00:34 Introduction
01:25 Boston Dynamics reveals the new Atlas robot
02:47 Stable Diffusion 3 API now available as Stable Assistant effort looms
04:54 Cyc: history's forgotten AI project
06:12 Fake sponsor
08:02 Many-Shot In-Context Learning
09:59 Fewer Truncations Improve Language Modeling
11:38 Can Language Models Solve Olympiad Programming?
13:12 Outro
-
Adobe is introducing new AI-powered tools to their video editing software, including the ability to extend video clips, add or remove objects from scenes, and generate B-roll footage using prompts.
Amazon's Bedrock platform is adding all three versions of Anthropic's Claude 3 AI model, enhancing the ability of customers to rapidly test, build, and deploy generative AI applications across their organizations.
"The Illusion of State in State-Space Models" challenges the assumption that SSMs are inherently better at state tracking than transformers.
"Megalodon" proposes a new neural architecture for efficient sequence modeling, allowing for unlimited context length and better efficiency than Transformers.
Contact: [email protected]
Timestamps:
00:34 Introduction
01:35 Adobe previews AI video features
02:56 Amazon Puts All Three Claude AI Models on Bedrock
05:07 Automating Complex Business Workflows with Cohere: Multi-Step Tool Use in Action
07:02 Fake sponsor
09:02 The Illusion of State in State-Space Models
10:58 Generative Information Retrieval Evaluation
12:49 Megalodon: Efficient LLM Pretraining and Inference with Unlimited Context Length
14:26 Outro
-
Reka Core, a comprehensive multimodal solution, is one of only two commercially available models that can handle input from text, images, videos, and audio.
OpenAI's Batch API promises to save costs and increase rate limits on certain async tasks like summarization, translation, and image classification.
COCONut is the largest and most comprehensive segmentation dataset to date, with high-quality annotations and harmonized segmentation types.
DR-PO algorithm directly resets the policy optimizer to the states in the offline dataset, leading to better generative models that are fine-tuned to human preferences.
Contact: [email protected]
Timestamps:
00:34 Introduction
01:52 Reka Core: Our Frontier Class Multimodal Language Model
03:43 OpenAI Batch API
05:14 OpenAI fires two researchers for leaking info
06:42 Fake sponsor
08:54 COCONut: Modernizing COCO Segmentation
10:27 Dataset Reset Policy Optimization for RLHF
12:22 Probing the 3D Awareness of Visual Foundation Models
13:58 Outro
-
Meta is testing an AI-powered search bar in Instagram, which could improve the quality of search and help users discover new content on the platform.
Grok-1.5V is a new multimodal model that can process a wide variety of visual information and outperforms its peers in the new RealWorldQA benchmark.
"Scaling (Down) CLIP" explores the performance of the Contrastive Language-Image Pre-training (CLIP) when scaled down to limited computation budgets, and shows that smaller datasets and models can still achieve comparable performance.
"Pre-training Small Base LMs with Fewer Tokens" investigates a simple approach called Inheritune to develop a small base language model (LM) from a larger existing LM, which can effectively match the val loss of their bigger counterparts when trained from scratch for the same number of training steps.
Contact: [email protected]
Timestamps:
00:34 Introduction
01:40 Meta is testing an AI-powered search bar in Instagram
03:02 Grok-1.5 Vision Preview
04:56 Visualizing Attention, a Transformer's Heart
06:12 Fake sponsor
08:27 Scaling (Down) CLIP: A Comprehensive Analysis of Data, Architecture, and Training Strategies
10:11 Pre-training Small Base LMs with Fewer Tokens
11:58 Flying with Photons: Rendering Novel Views of Propagating Light
13:57 Outro
-
The Ai Pin, a new device that offloads smartphone tasks, is discussed, funded by OpenAI's Sam Altman and other companies.
A Twitter thread about OpenAI's spider problem is shared, raising questions about the consequences of AI technology.
The paper "Adapting LLaMA Decoder to Vision Transformer" explores adapting decoder-only Transformers to computer vision, resulting in the creation of iLLaMA.
The paper "Exploring Concept Depth" studies how large language models acquire knowledge at different depths, with implications for understanding learning processes and designing models.
Contact: [email protected]
Timestamps:
00:34 Introduction
01:33 This Artificially Intelligent Pin Wants to Free You From Your Phone
03:32 Anyone got a contact at OpenAI. They have a spider problem.
04:47 STORM: Synthesis of Topic Outlines through Retrieval and Multi-perspective Question Asking
06:28 Fake sponsor
08:23 Adapting LLaMA Decoder to Vision Transformer
10:11 RULER: What's the Real Context Size of Your Long-Context Language Models?
12:05 Exploring Concept Depth: How Large Language Models Acquire Knowledge at Different Layers?
13:42 Outro
-
Meta's Training and Inference Accelerator promises significant performance improvements for AI workloads.
Avi Wigderson receives the Turing Award for his contributions to the theory of computation and randomness in computation.
Intel's Meteor Lake iGPU and Mistral 8x22B offer exciting advancements in the GPU market and language models.
MuPT and Eagle and Finch present new models for music generation and sequence modeling, respectively.
Contact: [email protected]
Timestamps:
00:34 Introduction
01:37 Our next-generation Meta Training and Inference Accelerator
03:02 ACM A.M. Turing Award Honors Avi Wigderson for Foundational Contributions to the Theory of Computation
05:05 Intelβs Ambitious Meteor Lake iGPU
06:06 Mistral 8x22B
07:34 Fake sponsor
09:34 MuPT: A Generative Symbolic Music Pretrained Transformer
11:11 Eagle and Finch: RWKV with Matrix-Valued States and Dynamic Recurrence
12:49 Outro
-
Meta is launching its Llama 3 open source LLM with 140 billion parameters, catching up to OpenAI's ChatGPT.
Intel's Gaudi 3 AI accelerator is breaking down proprietary walls to bring choice to enterprise GenAI market, with OEMs like Dell and Lenovo adopting it.
Microsoft Research's Direct Nash Optimization (DNO) is a new approach to improving language models, achieving state-of-the-art win-rates against GPT-4-Turbo.
UniFL is a unified framework that uses feedback learning to enhance diffusion models, improving both the quality of generated models and their acceleration.
Contact: [email protected]
Timestamps:
00:34 Introduction
01:47 Meta confirms that its Llama 3 open source LLM is coming in the next month
03:17 Intel Breaks Down Proprietary Walls to Bring Choice to Enterprise GenAI Market
05:18 QCon London: Meta Used Monolithic Architecture to Ship Threads in Only Five Months
06:42 Fake sponsor
09:10 Direct Nash Optimization: Teaching Language Models to Self-Improve with General Preferences
10:48 UniFL: Improve Stable Diffusion via Unified Feedback Learning
12:24 MiniCPM: Unveiling the Potential of Small Language Models with Scalable Training Strategies
14:13 Outro
-
Sam Altman and Jony Ive are raising funding for a secret AI device company that could challenge the conventional smartphone experience and explore new interaction modalities with artificial intelligence. Tesla is unveiling its new 'robotaxi' on August 8th, which is specifically designed for ridesharing and could potentially shift the company's focus away from achieving self-driving on their existing fleet. "Stream of Search (SoS): Learning to Search in Language" is a paper that proposes a new approach to teach language models how to search by representing the search process in language, as a flattened string. "AutoWebGLM: Bootstrap and Reinforce a Large Language Model-based Web Navigating Agent" is a paper that introduces an automated web navigation agent that uses a Large Language Model (LLM) to browse the web, which outperforms the state-of-the-art chatbot-based and rule-based methods.
Contact: [email protected]
Timestamps:
00:34 Introduction
01:30 Sam Altman and Jony Ive Raising Funding for Secret AI Device Company
02:58 Tesla is unveiling its new βrobotaxiβ on August 8
04:27 llm.c by Andrej Karpathy
05:43 Fake sponsor
07:41 Stream of Search (SoS): Learning to Search in Language
09:11 Chinese Tiny LLM: Pretraining a Chinese-Centric Large Language Model
10:51 AutoWebGLM: Bootstrap And Reinforce A Large Language Model-based Web Navigating Agent
12:52 Outro
-
Tesla and OpenAI are in a talent war for AI experts, with OpenAI offering salaries of up to $925,000.
YouTube has warned that OpenAI's training of its text-to-video AI model using YouTube videos would violate its policies.
Three AI research papers are discussed, including a new approach to unsupervised domain adaptation for ranking, a more efficient method for dense retrieval using bit vectors, and a new approach to representation finetuning for language models.
The episode includes humorous banter and a quirky sponsor segment for a mosquito repellent that doesn't work.
Contact: [email protected]
Timestamps:
00:34 Introduction
01:26 Tesla vs OpenAI Talent Wars
02:51 YouTube Says OpenAI Training Sora With Its Videos Would Break Rules
04:25 Your guide to AI: April 2024
06:26 Fake sponsor
08:34 ReFT: Representation Finetuning for Language Models
09:58 Efficient Multi-Vector Dense Retrieval Using Bit Vectors
11:54 DUQGen: Effective Unsupervised Domain Adaptation of Neural Rankers by Diversifying Synthetic Query Generation
13:51 Outro
-
Command R+ is a new language model designed for enterprise-grade workloads that outperforms similar models in the scalable market category and offers multilingual coverage in 10 key languages to support global business operations.
JetMoE-8B is a new model that was trained with less than $0.1 million cost and outperformed LLaMA2-7B from Meta AI, who has multi-billion-dollar training resources.
Mixture-of-Depths is a new method proposed for transformer-based language models that dynamically allocates compute to specific positions in a sequence, optimizing the allocation along the sequence for different layers across the model depth.
Think-and-Execute is a new framework that aims to improve algorithmic reasoning in large language models by decomposing the reasoning process into two steps: discovering task-level logic that is shared across all instances for solving a given task and expressing it with pseudocode, and simulating the generated pseudocode to execute the code.
Contact: [email protected]
Timestamps:
00:34 Introduction
01:42 Introducing Command R+: A Scalable LLM Built for Business
03:38 JetMoE: Reaching LLaMA2 Performance with 0.1M Dollars
05:08 AI & the Web: Understanding and managing the impact of Machine Learning models on the Web
06:37 Fake sponsor
08:44 Do language models plan ahead for future tokens?
10:04 Mixture-of-Depths: Dynamically allocating compute in transformer-based language models
11:33 Language Models as Compilers: Simulating Pseudocode Execution Improves Algorithmic Reasoning in Language Models
13:40 Outro
- Visa fler