Episodes

  • Many, many signs of life for preference fine-tuning beyond spoofing chat evaluation tools.
    This is AI generated audio with Python and 11Labs.
    Source code: https://github.com/natolambert/interconnects-tools
    Original post: https://www.interconnects.ai/p/how-rlhf-works-2

    00:00 How RLHF works, part 2: A thin line between useful and lobotomized
    04:27 The chattiness paradox
    08:09 The mechanism for making models chattier
    10:42 Next steps for RLHF research

    Fig 1: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/rlhf/img_012.webp
    Fig 2: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/rlhf/img_018.png
    Fig 3: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/rlhf/img_025.png

  • Models that seem totally out of scope from recent open LLMs give us a sneak peek of where the industry will be in 6 to 18 months.
    This is AI generated audio with Python and 11Labs.
    Source code: https://github.com/natolambert/interconnects-tools
    Original post: https://www.interconnects.ai/p/phi-3-and-arctic-llms

    0:00 Phi 3 and Arctic: Outlier LMs are hints
    1:01 Arctic & open mixture of expert trends
    6:10 Phi 3, synthetic data, and small models

    Fig 1: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/phi3/img_004.png
    Fig 2: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/phi3/img_008.png
    Fig 3: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/phi3/img_018.png

  • Certain definitions of AGI are backing people into a pseudo-religious corner.
    This is AI generated audio with Python and 11Labs.
    Source code: https://github.com/natolambert/interconnects-tools
    Original post: https://www.interconnects.ai/p/agi-is-what-you-want-it-to-be

    00:00 AGI is what you want it to be
    04:01 RL still rules the AGI discourse
    05:43 Modern AGI tests
    07:37 Agency and shifting goalposts

    Fig 1: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/agi/img_018.png
    Fig 2: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/agi/img_020.png

  • Meta shows that scaling won't be a limit for open LLM players in the near future.
    This is AI generated audio with Python and 11Labs.
    Source code: https://github.com/natolambert/interconnects-tools
    Original post: https://www.interconnects.ai/p/llama-3-and-scaling-open-llms

    00:00 Llama 3; scaling open LLMs to AGI
    01:44 Pretraining, data, and basic evals
    06:06 Alignment and human evaluations
    10:08 Chatting with Meta AI and Llama 3 70B Instruct
    11:55 Same Llama license (mostly)
    12:52 The healthy open LLM ecosystem

    Fig 1: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/llama3/img_011.jpeg
    Fig 2: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/llama3/img_013.png
    Fig 3: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/llama3/img_015.png
    Fig 4: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/llama3/img_020.png
    Fig 5: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/llama3/img_036.png
    Fig 6: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/llama3/img_040.png
    Fig 7: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/llama3/img_046.jpeg
    Fig 8: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/llama3/img_061.png
    Fig 9: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/llama3/img_063.webp
    Fig 10: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/llama3/img_066.png
    Fig 11: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/llama3/img_068.jpeg

  • Integrating some non-computing science into reinforcement learning from human feedback can give us the models we want.
    This is AI generated audio with Python and 11Labs.
    Source code: https://github.com/natolambert/interconnects-tools
    Original post: https://www.interconnects.ai/p/reinventing-llm-alignment

    0:00 Stop "reinventing" everything to "solve" AI alignment
    2:19 Social Choice for AI Alignment: Dealing with Diverse Human Feedback
    7:03 OLMo 1.7 7B: A truly open model with actually good benchmarks


    Fig 1: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/reinvention/img_013.png
    Fig 2: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/reinvention/img_015.png
    Fig 3: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/reinvention/img_018.png
    Fig 4: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/reinvention/img_024.png
    Fig 5: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/reinvention/img_027.png

  • Modeling the compute versus performance tradeoff of many open LLMs.
    This is AI generated audio with Python and 11Labs.
    Source code: https://github.com/natolambert/interconnects-tools
    Original post: https://www.interconnects.ai/p/compute-efficient-open-llms

    0:00 The end of the "best open LLM"
    3:05 Compute efficient open LLMs

    Fig 1: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/scaling/img_004.jpeg
    Fig 2: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/scaling/img_009.png
    Fig 3: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/scaling/img_014.png
    Fig 4: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/scaling/img_016.png
    Fig 5: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/scaling/img_018.png
    Fig 6: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/scaling/img_020.png
    Fig 7: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/scaling/img_022.png
    Fig 8: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/scaling/img_024.png
    Fig 9: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/scaling/img_028.png

  • Last minute title change from: The tech industry can't agree on what open-source AI means. That's the process.
    How to read what multiple people mean by the word openness and see through the PR speak.
    This is AI generated audio with Python and 11Labs.
    Source code: https://github.com/natolambert/interconnects-tools
    Original post: https://www.interconnects.ai/p/flavors-of-open-source-ai

    0:00 The tech industry can't agree on what open-source AI means. That's the process.
    2:45 1. Effective Accelerationists, Techno-Optimists, capitalists, etc.
    3:39 2. Scientists, promoting understanding and transparency
    5:16 3. Inclusion, public interest, and fighting concentration of power
    6:19 4. Freedom advocates
    7:25 Dissecting "openness"

    Fig 1: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/openness/img_004.png

  • Databricks' new model is surpassing the performance of Mixtral and Llama 2 while still being in a size category that's reasonably accessible.
    This is AI generated audio with Python and 11Labs.
    Source code: https://github.com/natolambert/interconnects-tools
    Original post: https://www.interconnects.ai/p/databricks-dbrx-open-llm

    00:00 DBRX: The new best open model and Databricks' ML strategy
    03:36 The DBRX narrative
    07:33 Databricks' open LLM (and AI) strategy
    09:42 Playing with DBRX Instruct
    14:54 Digging for details

    Fig 1: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/dbrx/img_007.png
    Fig 2: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/dbrx/img_012.png
    Fig 3: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/dbrx/img_023.png
    Fig 4: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/dbrx/img_045.png
    Fig 5: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/dbrx/img_047.png
    Fig 6: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/dbrx/img_059.png
    Fig 7: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/dbrx/img_066.jpeg
    Fig 8: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/dbrx/img_068.png

  • Evaluation is not only getting harder with modern LLMs, it's getting harder because it means something different.
    This is AI generated audio with Python and 11Labs. Music generated by Meta's MusicGen.
    Source code: https://github.com/natolambert/interconnects-tools
    Original post: https://www.interconnects.ai/p/evaluations-trust-performance-and-price

    00:00 Evaluations: Trust, performance, and price (bonus, announcing RewardBench)
    03:14 The rising price of evaluation
    05:40 Announcing RewardBench: The First reward model evaluation tool
    08:37 Updates to RLHF evaluation tools

    YouTube code intro: https://youtu.be/CAaHAfCqrBA

    Figure 1: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/evals/img_026.png
    Figure 2: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/evals/img_030.png
    Figure 3: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/evals/img_034.png
    Figure 4: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/evals/img_040.png

  • Where moats are tested now that so many people have trained GPT4 class models. Claude 3, Gemini 1.5, Inflection 2.5, and Mistral Large are here to party.
    This is AI generated audio with Python and 11Labs. Music generated by Meta's MusicGen.
    Source code: https://github.com/natolambert/interconnects-tools
    Original post: https://www.interconnects.ai/p/gpt4-commoditization-and-moats

    00:00 Building LLM moats despite the commoditization of GPT4
    04:38 The Open's opportunities
    08:02 It's amazing people still think LLMs aren't going to be useful
    09:50 Things that are coming

    Figure 1: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/moats/img_004.png
    Figure 2: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/moats/img_028.png
    Figure 3: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/moats/img_032.png

  • A proposal for a new definition of an "open source" LLM and why no definition will ever just work.
    This is AI generated audio with Python and 11Labs. Music generated by Meta's MusicGen.
    Source code: https://github.com/natolambert/interconnects-tools
    Original post: https://www.interconnects.ai/p/an-open-source-llm

    00:00 The koan of an open-source LLM
    03:22 A new naming scheme for open LLMs
    07:09 Pivot points and politics
    08:16 Claude 3, arms race, commoditization, and national security
    10:01 Doomers debunking bio risks of LLMs themselves
    11:21 Mistral's perceived reversal and the EU
    13:22 Messy points: Transparency, safety, and copyright
    13:32 The muddling of transparency
    15:22 The muddling of "safety"
    16:30 The muddling of licenses and copyright
    20:12 Vibes points and next steps

    Figure 1: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/open-source/img_046.png
    Figure 2: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/open-source/img_064.png

  • Louis has recently been founding Synth Labs, a new startup focused on synthetic data for alignment, and is a researcher at EleutherAI. This interview should speak for itself, and it’ll need re-listens, even for me. The list of topics we cover touches on pretty much every major and minor issue facing model fine-tuning. Please reach out or comment if there’s a paper we mention that I didn’t link. Happy to dig it up for you. This post is very technical. If you’re having a hard time with it, I suggest you listen to my RLHF 201 post on Latent Space first.

    Full transcript available here: https://www.interconnects.ai/p/rlhf-interview-1-louis

    00:00:00: Introduction
    00:01:24: Gemini News and RLHF’s Part in it
    00:09:05: Long Context, In-Context, and Multimodal RLHF
    00:21:20: What are people missing about RLHF these days?
    00:30:30: OpenAI's Influence and the Need for Alternatives
    00:39:20: Synth Labs and the Future of Alignment
    00:55:00: Evaluation Talk p2: Open-ended Evaluation and Data Diversity
    00:59:20: Algorithm Roundup: PPO, DPO, KTO, IPO
    01:18:38: CarperAI, Early Days of RLHF, Reflecting on ChatGPT

  • Basic tips on how to assess inbound ML content and cultivate your news feed.
    This is AI generated audio with Python and 11Labs. Music generated by Meta's MusicGen.
    Source code: https://github.com/natolambert/interconnects-tools
    Original post: https://www.interconnects.ai/p/making-a-ml-feed

    00:00 How I assess all these AI releases
    01:22 1. Model access and demos are king of credibility
    02:31 2. Focus your feed on depth or breadth
    03:09 3. Examples of using the model normally show it's usable, shockingly
    04:10 4. Leaderboards as the single leading claim is often anti-signal
    05:00 5. Basic deep learning conceptual checks will often save you
    06:13 6. If it's not even remotely reproducible or verifiable, it's not science
    07:10 7. Don't over-index on Twitter
    08:32 8. Data sharing, licenses, communication clarity, and small things add up
    08:58 9. Research papers, technical reports, blog posts, and Tweets all serve different purposes
    09:49 10. Socialize your information and build relationships

  • Google rejoins the open model party and gets some backlash for a frequent problem for generative AI.
    This is AI generated audio with Python and 11Labs. Music generated by Meta's MusicGen.
    Source code: https://github.com/natolambert/interconnects-tools
    Original post: https://www.interconnects.ai/p/gemma-google-ships-it

    00:00 Google ships it: Gemma open LLMs and Gemini backlash
    03:12 Getting to know Gemma
    07:11 Alignment details
    08:55 Aside: What is REINFORCE? Some history of RL
    11:08 Implementation details and RLHF
    12:18 Terms of use: RAIL Licenses history repeated
    14:05 Is Google back on top? Gemini's woes

    Figure 1: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/gemma/img_008.webp
    Figure 2: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/gemma/img_014.png
    Figure 3: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/gemma/img_035.png
    Figure 4: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/gemma/img_051.png
    Figure 5: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/gemma/img_055.png

  • 10 Sora and Gemini 1.5 follow-ups: code-base in context, deepfakes, pixel-peeping, inference costs, and more
    This is AI generated audio with Python and 11Labs. Music generated by Meta's MusicGen.
    Source code: https://github.com/natolambert/interconnects-tools
    Original post: https://www.interconnects.ai/p/sora-gemini-follow-up

    00:00 10 Sora and Gemini 1.5 follow-ups: code-base in context, deepfakes, pixel-peeping, inference costs, and more
    00:46 1. Deepfake detection of Sora
    01:59 2. Playing with long-context, problem settings, and prompting
    03:39 3. Gemini paper snooping: contamination and citation games
    05:42 4. Training data and token estimates of YouTube
    07:42 5. Unlocking model-based RL and downstream research
    08:52 6. Midjourney style matching, V-JEPA, replicating Sora in the open
    10:09 7. Architectures and academic links
    10:57 8. Pixel peeping from the arts
    11:58 9. Inference costs
    13:24 10. Pressure on Llama and Mistral
    14:03 11. Sound effects, physics, and the complete picture

    Figure 1: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/sora-2/img_003.png
    Figure 2: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/sora-2/img_007.mp4
    Figure 3: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/sora-2/img_009.mp4
    Figure 4: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/sora-2/img_011.mp4
    Figure 5: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/sora-2/img_037.mp4
    Figure 6: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/sora-2/img_044.png
    Figure 7: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/sora-2/img_047.png
    Figure 8: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/sora-2/img_049.mp4

  • Emergency blog! Three things you need to know from the ML world that arrived yesterday.
    This is AI generated audio with Python and 11Labs. Music generated by Meta's MusicGen.
    Source code: https://github.com/natolambert/interconnects-tools
    Original post: https://www.interconnects.ai/p/sora-gemini-and-mistral-next

    0:00 OpenAI's Sora for video, Gemini 1.5, and a secret Mistral model
    0:53 Sora: OpenAI's text-to-video model
    4:59 Gemini 1.5: Google's effectively infinite context length
    8:01 Mistral-next: Another funny release method

    Figure 1: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/sora-gemini-mistral/img_015.png
    Figure 2: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/sora-gemini-mistral/img_023.png
    Figure 3: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/sora-gemini-mistral/img_026.png
    Figure 4: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/sora-gemini-mistral/img_036.png

  • In an era dominated by direct preference optimization and LLM-as-a-judge, why do we still need a model to output only a scalar reward?
    This is AI generated audio with Python and 11Labs. Music generated by Meta's MusicGen.
    Source code: https://github.com/natolambert/interconnects-tools
    Original post: In an era dominated by direct preference optimization and LLM-as-a-judge, why do we still need a model to output only a scalar reward?

    Podcast figures:
    Figure 1: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/reward-models/img_004.png
    Figure 2: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/reward-models/img_009.png

    0:00 Why reward models are still key to understanding alignment

  • Scale's making over $750 million per year selling data for RLHF, who's coming to take it?
    This is AI generated audio with Python and 11Labs. Music generated by Meta's MusicGen.
    Source code: https://github.com/natolambert/interconnects-tools
    Original post: https://www.interconnects.ai/p/alignment-as-a-service

    00:00 Alignment-as-a-Service upstarts taking on Scale AI
    04:25 The competition with humans-in-the-loop
    06:05 Scaling Alignment-as-a-Service via AI feedback

    Podcast figures:
    Figure 1: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/aaas/img_008.png

  • A small model at the beginning of big changes.
    This is AI generated audio with Python and 11Labs.
    Source code: https://github.com/natolambert/interconnects-tools
    Original post: https://www.interconnects.ai/p/olmo

    0:00 Open Language Models (OLMos) and the LLM landscape
    6:24 Thought experiments
    7:51 The LLM landscape heading into 2024

    Figure 1: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/olmo/img_010.png

  • Note: some of the audio in the second half is a little wonky, but the general voice was upgraded so hopefully it's a little less "poppy" until then!
    I'm trying to fix little pronunciation problems on a weekly basis. Thanks to my early fans! It'll keep improving. E.g. some of the months were wonky.

    When what seems like pure LLM black magic is actually supported by the literature.
    This is AI generated audio with Python and 11Labs.
    Source code: https://github.com/natolambert/interconnects-tools
    Original post: https://www.interconnects.ai/p/model-merging

    00:00 Model merging lessons in The Waifu Research Department
    02:21 How and why does model merging work?
    07:13 Aside: merging vs. ensembles vs. mixture of experts
    08:21 Why are people doing this?
    11:22 Tools & Links
    11:51 Brief (visual) literature review
    12:07 Full model merging and recent methods
    15:55 Weight averaging during pretraining
    17:18 LoRA merging
    17:53 More background

    Figure 1: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/model-merging/img_005.png
    Figure 2: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/model-merging/img_016.png
    Figure 3: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/model-merging/img_042.png
    Figure 4: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/model-merging/img_051.png
    Figure 5: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/model-merging/img_055.png
    Figure 6: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/model-merging/img_058.png
    Figure 7: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/model-merging/img_060.png
    Figure 8: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/model-merging/img_062.png
    Figure 9: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/model-merging/img_065.png
    Figure 10: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/model-merging/img_075.png
    Figure 11: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/model-merging/img_077.png
    Figure 12: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/model-merging/img_084.png