MIT researchers advance automated interpretability in AI models

MAIA is a multimodal agent that can iteratively design experiments to better understand various components of AI systems.

Rachel Gordon | MIT CSAIL
July 23, 2024 ~10 min

Researchers leverage shadows to model 3D scenes, including objects blocked from view

This technique could lead to safer autonomous vehicles, more efficient AR/VR headsets, or faster warehouse robots.

Adam Zewe | MIT News
June 18, 2024 ~8 min


Understanding the visual knowledge of language models

LLMs trained primarily on text can generate complex visual concepts through code, refining their output with self-correction. Researchers used these illustrations to train an image-free computer vision system to recognize real photos.

Alex Shipps | MIT CSAIL
June 17, 2024 ~6 min

Researchers use large language models to help robots navigate

The method uses language-based inputs instead of costly visual data to direct a robot through a multistep navigation task.

Adam Zewe | MIT News
June 12, 2024 ~7 min

New algorithm discovers language just by watching videos

DenseAV, developed at MIT, learns to parse and understand the meaning of language just by watching videos of people talking, with potential applications in multimedia search, language learning, and robotics.

Rachel Gordon | MIT CSAIL
June 11, 2024 ~9 min

New computer vision method helps speed up screening of electronic materials

The technique characterizes a material’s electronic properties 85 times faster than conventional methods.

Jennifer Chu | MIT News
June 11, 2024 ~8 min

Looking for a specific action in a video? This AI-based method can find it for you

A new approach could streamline virtual training processes or aid clinicians in reviewing diagnostic videos.

Adam Zewe | MIT News
May 29, 2024 ~7 min

Controlled diffusion model can change material properties in images

The “Alchemist” system adjusts the material attributes of specific objects within images, with potential uses in adapting video game models to different environments, fine-tuning VFX, and diversifying robotic training data.

Alex Shipps | MIT CSAIL
May 28, 2024 ~8 min


Natural language boosts LLM performance in coding, planning, and robotics

Three neurosymbolic methods help language models find better abstractions within natural language, then use those representations to execute complex tasks.

Alex Shipps | MIT CSAIL
May 1, 2024 ~13 min

AI generates high-quality images 30 times faster in a single step

A novel method makes tools like Stable Diffusion and DALL-E 3 faster by simplifying the image-generating process to a single step while maintaining or improving image quality.

Rachel Gordon | MIT CSAIL
March 21, 2024 ~7 min
