11: CVPR Workshop on Autonomous Driving Keynote by Ashok Elluswamy, a Tesla engineer

Episodes

LoRA
2 sep 2023· Argmax
We talk about LoRA (Low-Rank Adaptation), a technique for fine-tuning Transformers. We are also on YouTube now! Check out the video here: https://youtu.be/lLzHr0VFi3Y
15: InstructGPT
28 mar 2023· Argmax
In this episode we discuss the paper "Training language models to follow instructions with human feedback" by Ouyang et al. (2022). We discuss the RLHF paradigm and how important RL is to tuning GPT.
14: Whisper
17 mar 2023· Argmax
This week we talk about Whisper. It is a weakly supervised speech recognition model.
13: AlphaTensor
11 mar 2023· Argmax
We talk about AlphaTensor, and how researchers were able to find a new algorithm for matrix multiplication.
12: SIRENs
25 oct 2022· Argmax
In this episode we talked about "Implicit Neural Representations with Periodic Activation Functions" and the strength of periodic non-linearities.
11: CVPR Workshop on Autonomous Driving Keynote by Ashok Elluswamy, a Tesla engineer
30 sep 2022· Argmax
In this episode we discuss this video: https://youtu.be/jPCV4GKX9Dw

How Tesla approaches collision detection with novel methods.
10: Outracing champion Gran Turismo drivers with deep reinforcement learning
23 aug 2022· Argmax
We discuss Sony AI's accomplishment of creating a novel AI agent that can beat professional racers in Gran Turismo. Some topics include:
- The crafting of rewards to make the agent behave nicely
- What is QR-SAC?
- How to deal with "rare" experiences in the replay buffer

Link to paper: https://www.nature.com/articles/s41586-021-04357-7
9: Heads-Up Limit Hold'em Poker Is Solved
29 jul 2022· Argmax
Today we talk about recent AI advances in poker, specifically the use of counterfactual regret minimization to solve the game of two-player Limit Texas Hold'em.
8: GATO (A Generalist Agent)
29 jul 2022· Argmax
Today we talk about GATO, a multi-modal, multi-task, multi-embodiment generalist agent.
7: Deep Unsupervised Learning Using Nonequilibrium Thermodynamics (Diffusion Models)
14 jun 2022· Argmax
We start talking about diffusion models as a technique for generative deep learning.
6: Deep Reinforcement Learning at the Edge of the Statistical Precipice
6 jun 2022· Argmax
We discuss this NeurIPS Outstanding Paper Award winner, covering important topics around evaluation metrics and reproducibility.
5: QMIX
26 apr 2022· Argmax
We talk about QMIX https://arxiv.org/abs/1803.11485 as an example of Deep Multi-agent RL.
4: Can Neural Nets Learn the Same Model Twice?
6 apr 2022· Argmax
Today's paper: Can Neural Nets Learn the Same Model Twice? Investigating Reproducibility and Double Descent from the Decision Boundary Perspective (https://arxiv.org/pdf/2203.08124.pdf)

Summary:
A discussion of reproducibility and double descent through visualizations of decision boundaries.

Highlights of the discussion:
- Relationship between model performance and reproducibility
- Which models are robust and reproducible
- How they calculate the various scores
3: VICReg
21 mar 2022· Argmax
Today's paper: VICReg (https://arxiv.org/abs/2105.04906)

Summary of the paper
VICReg prevents representation collapse with a loss that combines variance, invariance, and covariance terms. It does not require negative samples and achieves great performance on downstream tasks.

Highlights of discussion
- The VICReg architecture (Figure 1)
- Sensitivity to hyperparameters (Table 7)
- Top 5 metric usefulness
2: data2vec
7 mar 2022· Argmax
Today's paper: data2vec (https://arxiv.org/abs/2202.03555)

Summary of the paper
A multimodal self-supervised learning (SSL) algorithm that predicts latent representations for different types of input.
Highlights of discussion
- What are the motivations for SSL and multimodal learning?
- How does the student-teacher learning work?
- What are the similarities and differences between ViT, BYOL, and reinforcement learning algorithms?
1: Reward is Enough
21 feb 2022· Argmax
This is the first episode of Argmax! We talk about our motivations for doing a podcast, and what we hope listeners will get out of it.

Today's paper: Reward is Enough

Summary of the paper
The authors present the Reward is Enough hypothesis: Intelligence, and its associated abilities, can be understood as subserving the maximisation of reward by an agent acting in its environment.
Highlights of discussion
- High-level overview of reinforcement learning
- How evolution can be encoded as a reward maximization problem
- What is the one reward signal we are trying to optimize?