DeepLearning.AI - The Batch | DeepLearning.AI

Data Points

A new technique to build simple but powerful reasoning models: Gemini 2.0 Pro experimental is here, Flash is generally available

Claude’s new method to thwart universal jailbreaks. DeepMind team shares recipes for model scaling. Copilot adds agent mode, previews more autonomous tools. π0 robotics foundation models are now open source.

Comic-style illustration of a confident woman and man standing beside bold ‘10X’ text on a bright background.

The Batch Newsletter

o3-mini Puts Reasoning in High Gear, How to Train for Computer Use, Gemini 2.0 Thinks Faster, More-Responsive Voice Interactions

The Batch AI News and Insights: A “10x engineer” — a widely accepted concept in tech — purportedly has 10 times the impact of the average engineer.

Letters

How AI can make you a 10x professional: Every profession can become more efficient and strategic by applying more intelligence.

A “10x engineer” — a widely accepted concept in tech — purportedly has 10 times the impact of the average engineer.

Diagram illustrating Moshi’s use of an LLM to process user audio input, inner monologue, and output.

Machine Learning Research

Okay, But Please Don’t Stop Talking: Moshi, an open alternative to OpenAI’s Realtime API for Speech

Even cutting-edge, end-to-end, speech-to-speech systems like ChatGPT’s Advanced Voice Mode tend to get interrupted by interjections like “I see” and “uh-huh” that keep human conversations going. Researchers built an open alternative that’s designed to go with the flow of overlapping speech.

Line charts showing performance improvements in math and science with 2.0 Flash Thinking models.

Machine Learning Research

Gemini Thinks Faster: Google’s Gemini 2.0 Flash Thinking advances in reasoning, outperforms DeepSeek-R1

Google updated the December-vintage reasoning model Gemini 2.0 Flash Thinking and other Flash models, gaining ground on OpenAI o1 and DeepSeek-R1.

Flowchart illustrating the automation of opening, editing, and saving a Word document using PyAutoGUI.

Machine Learning Research

Training for Computer Use: UI-TARS shows strong computer use capabilities in benchmarks

As Anthropic, Google, OpenAI, and others roll out agents that are capable of computer use, new work shows how underlying models can be trained to do this.

Bar chart animation showing accuracy improvements in AIME 2024 competition math models.

Machine Learning Research

Reasoning in High Gear: o3-mini, a faster, more affordable reasoning model for coding, math, and science

OpenAI introduced a successor to its o1 models that’s faster, less expensive, and especially strong in coding, math, and science.

Data Points

Deep research brings PhD analysis to ChatGPT: YuE’s music model released under Apache 2.0 open license

Qwen updates its many multimodal models. Nvidia’s Eagle vision-language models are small but sharp. Tülu open post-training recipe whips Llama 3.1 405B into shape. Microsoft Azure is of two minds regarding DeepSeek R1.

Data Points

Open-R1 is building a training pipeline and datasets for reasoning models: Canvas now has o1, GPT-4o has a new knowledge cutoff

Wiz finds DeepSeek’s unprotected user database. Open weights model Mistral Small gets an update. Janus-Pro is DeepSeek’s top multimodal model. Yoshua Bengio’s team releases long-awaited AI safety report.

Letters

Three Takeaways from DeepSeek’s Big Week: Innovations by China’s AI powerhouse DeepSeek highlight major shifts in the international scene

The buzz over DeepSeek this week crystallized, for many people, a few important trends that have been happening in plain sight.

The Batch Newsletter

Reinforcement Learning Heats Up, White House Orders Muscular AI Policy, Computer Use Gains Momentum, Fine Control of Fine-Tuning

The Batch AI News and Insights: The buzz over DeepSeek this week crystallized, for many people, a few important trends that have been happening in plain sight.

Bar chart comparing active vs. random sampling effects on length, diversity, and toxicity after fine-tuning.

Machine Learning Research

Fine-Tuning Fine Points: Active inheritance, a smarter way to fine-tune models on synthetic data

The practice of fine-tuning models on synthetic data is becoming well established. But synthetic training data, even if it represents the training task well, may include characteristics like toxicity that impart unwelcome properties to the trained model’s output...