AI learns how vision and sound are connected, without human intervention

This new machine-learning model can match corresponding audio and visual data, which could someday help robots interact in the real world.

Adam Zewe | MIT News • mit
May 22, 2025 ~7 min

Hybrid AI model crafts smooth, high-quality videos in seconds

The CausVid generative AI tool uses a diffusion model to teach an autoregressive (frame-by-frame) system to rapidly produce stable, high-resolution videos.

Alex Shipps | MIT CSAIL • mit
May 6, 2025 ~6 min

Combining next-token prediction and video diffusion in computer vision and robotics

A new method can train a neural network to sort corrupted data while anticipating next steps. It can make flexible plans for robots, generate high-quality video, and help AI agents navigate digital environments.

Alex Shipps | MIT CSAIL • mit
Oct. 16, 2024 ~8 min

Study: AI could lead to inconsistent outcomes in home surveillance

Researchers find large language models make inconsistent decisions about whether to call the police when analyzing surveillance videos.

Adam Zewe | MIT News • mit
Sept. 19, 2024 ~8 min

Looking for a specific action in a video? This AI-based method can find it for you

A new approach could streamline virtual training processes or aid clinicians in reviewing diagnostic videos.

Adam Zewe | MIT News • mit
May 29, 2024 ~7 min

A system to keep cloud-based gamers in sync

By synchronizing media streams transmitted from the cloud to two devices, researchers could improve cloud gaming and AR/VR applications.

Adam Zewe | MIT News • mit
Aug. 31, 2023 ~8 min

In machine learning, synthetic data can offer real performance improvements

Models trained on synthetic data can be more accurate than other models in some cases, which could eliminate some privacy, copyright, and ethical concerns from using real data.

Adam Zewe | MIT News Office • mit
Nov. 3, 2022 ~8 min

Artificial intelligence system learns concepts shared across video, audio, and text

A machine-learning model can identify the action in a video clip and label it, without the help of humans.

Adam Zewe | MIT News Office • mit
May 4, 2022 ~7 min

Security tool guarantees privacy in surveillance footage

“Privid” could help officials gather secure public health data or enable transportation departments to monitor the density and flow of pedestrians, without learning personal information about people.

Rachel Gordon | MIT CSAIL • mit
March 28, 2022 ~8 min

How to reduce the environmental impact of your next virtual meeting

Study uncovers overlooked environmental impacts of internet use by estimating associated carbon, land, and water footprints.

Kelley Travers | MIT Energy Initiative • mit
March 4, 2021 ~9 min
