
Multimodal
One Model Does It All: Multi-task AI models got more sophisticated in 2022.
Individual deep learning models proved their mettle in hundreds of tasks. The scope of multi-task models expanded dramatically in the past year.
Multimodal
Individual deep learning models proved their mettle in hundreds of tasks. The scope of multi-task models expanded dramatically in the past year.
Machine Learning Research
Large language models are trained only to predict the next word based on previous ones. Yet, given a modest fine-tuning set, they acquire enough information to learn how to perform tasks such as answering questions.
Machine Learning Research
If a robot can predict what it’s likely to see next, it may have a better basis for choosing an appropriate action — but it has to predict quickly. Transformers, for all their utility in computer vision, aren’t well suited to this because of their steep computational and memory requirements...
Machine Learning Research
Given a number of images of the same scene, a neural network can synthesize images from novel vantage points, but it can take hours to train. A new approach cuts training time to a few minutes.
Machine Learning Research
While neural networks perform well on image, text, and audio datasets, they fall behind decision trees and their variations for tabular datasets. New research looked into why.
Machine Learning Research
Neural networks can describe in words what’s happening in pictures and videos — but can they make sensible guesses about things that happened before or will happen afterward? Researchers probed this ability.
Machine Learning Research
The route to improving transformer-based language models like GPT-3 and Gopher, which are trained on immense quantities of text scraped from the web, has been to increase their size. But research shows that, given a processing budget, bigger doesn’t necessarily mean better.
Machine Learning Research
Recent work showed that models for multilingual machine translation can increase the number of languages they translate by scraping the web for pairs of equivalent sentences in different languages. A new study radically expanded the language repertoire through training on untranslated web text.
Machine Learning Research
Sentence pairs that have equivalent meanings in different languages — typically used to train machine translation systems — have been available in sufficient quantities for only around 100 languages. New work doubled that number and produced a more capable model.
Machine Learning Research
Only a week ago, researchers unveiled a system that generates a few seconds of video based on a text prompt. New work enables a text-to-video system to produce an entire visual narrative from several sentences of text.
Machine Learning Research
In spoken conversation, people naturally take turns amid interjections and other patterns that aren’t strictly verbal. A new approach generated natural-sounding audio dialogs without training on text transcriptions that mark when one party should stop speaking and the other should chime in.
Machine Learning Research
Text-to-image generators like DALL·E 2, Midjourney, and Stable Diffusion are winning art contests and worrying artists. A new approach brings the magic of text-to-image generation to video.