Machine Learning Research
Making LLMs Explainable: Google’s Gemma Scope probes how large language models think
Researchers have probed the inner workings of individual layers of large language models. A new tool applies this approach to all layers.
Machine Learning Research
How often do large language models make up information when they generate text based on a retrieved document? A study evaluated the tendency of popular models to hallucinate while performing retrieval-augmented generation (RAG).
Machine Learning Research
A new model generates tokens faster than current transformers, especially when processing long inputs.
Machine Learning Research
Researchers introduced a model that handles an unprecedented number of input and output types, including many related to performing computer vision tasks.
Generative AI
An agentic coding assistant boosted the state of the art on an important benchmark by more than 30 percent.
Machine Learning Research
When training vision-language models, developers often remove lower-quality examples from the training set. But keeping only the highest-quality examples may not be ideal, researchers found.
Generative AI
Alibaba followed up its open-weights Qwen2 large language models with specialized variations.
Generative AI
Image generation continued its rapid march forward with a new version of Google’s flagship text-to-image model.
Generative AI
While some observers argue that large language models can’t produce truly original output, new work prompted them to generate novel scientific research.
Machine Learning Research
Literary works are challenging to translate. Their relative length, cultural nuances, idiomatic expressions...
Generative AI
A new company with deep roots in generative AI made an eye-catching debut.
Machine Learning Research
A seemingly innocuous form of expression, ASCII art opens a new vector for jailbreak attacks on large language models (LLMs), enabling them to generate outputs that their developers tuned them to avoid producing.