Yilun Du

Email: yilundu [at] mit [dot] edu
Twitter: https://twitter.com/du_yilun
Github: https://github.com/yilundu
CV: CV

I am a final-year PhD student at MIT EECS, advised by Prof. Leslie Kaelbling, Prof. Tomas Lozano-Perez and Prof. Joshua B. Tenenbaum. Previously, I obtained my bachelor's degree from MIT, was a research fellow at OpenAI, was an intern and visiting researcher at FAIR and Google DeepMind, and won a gold medal at the International Biology Olympiad. My research focuses on generative models, decision making, robot learning, embodied agents, and the applications of such tools to scientific domains.

My research is driven by the goal of developing intelligent embodied agents that interact with the physical world, and has primarily focused on generative AI as an approach towards this goal. A major challenge in applying generative AI in this setting is the scarcity of decision-making data and the need to generalize well to previously unseen situations. My work addresses this by constructing composable generative models, using the idea of learning energy landscapes to generalize beyond the limited data that is available. Such composable models enable compositional visual generation and compositional scene understanding. My work has further focused on how such compositional models can synthesize new trajectories in trajectory planning, enabling flexible adaptation to novel goals and rewards both in synthesized videos and on real robots. Finally, an energy optimization perspective on prediction enables us to combine the strengths of large pre-trained models at prediction time, enabling both hierarchical planning and multimodal perception. More broadly, I am interested in constructing a decentralized generative architecture for decision-making, consisting of a society of different multimodal models, each with separate responsibilities such as 3D perception, memory, and auditory understanding, which jointly cooperate to make decisions in an environment. I am also interested in additional techniques to improve generative models, such as reinforcement learning training, as well as broader applications of my research in scientific domains such as computational biology.

News

I am looking for academic positions this upcoming year! You can find my research statement here.
We are organizing a workshop on foundation models for decision making at NeurIPS 2023 and a workshop on generative models for decision making at ICLR 2024!
I recently gave talks summarizing my work on compositional generative models and on using video models in robotics.
Check out a list of our work on energy-based models!

Research Highlights

Generative Modeling: constructing generative models of the world.
Perception and Scene Understanding: inferring the 3D / visual structure of the world.
Interactive Learning: building agents which may interact in the world.

Publications

Topics: Generative Modeling / Perception and Scene Understanding / Interactive Learning (* indicates equal contribution and † indicates equal advising)

Compositional Generative Modeling: A Single Model is Not All You Need

Yilun Du, Leslie Kaelbling

Video Language Planning

Yilun Du, Sherry Yang, Pete Florence, Fei Xia, Ayzaan Wahid, Brian Ichter, Pierre Sermanet, Tianhe Yu, Pieter Abbeel, Joshua B. Tenenbaum, Leslie Kaelbling, Andy Zeng, Jonathan Tompson

Learning to Act from Actionless Video through Dense Correspondences

Po-Chen Ko, Jiayuan Mao, Yilun Du, Shao-Hua Sun, Joshua B. Tenenbaum

Learning Interactive Real-World Simulators

Mengjiao Yang, Yilun Du, Kamyar Ghasemipour, Jonathan Tompson, Leslie Kaelbling, Dale Schuurmans, Pieter Abbeel

Compositional Generative Inverse Design

Tailin Wu*, Takashi Maruyama*, Long Wei*, Tao Zhang*, Yilun Du*, Gianluca Iaccarino, Jure Leskovec

Probabilistic Adaptation of Text-to-Video Models

Mengjiao Yang*, Yilun Du*, Bo Dai, Dale Schuurmans, Joshua B. Tenenbaum, Pieter Abbeel

Training Diffusion Models with Reinforcement Learning

Kevin Black*, Michael Janner*, Yilun Du, Ilya Kostrikov, Sergey Levine

Learning to Jointly Understand Visual and Tactile Signals

Yichen Li, Yilun Du, Chao Liu, Chao Liu, Francis Williams, Michael Foshey, Benjamin Eckart, Jan Kautz, Joshua B. Tenenbaum, Antonio Torralba, Wojciech Matusik

Building Cooperative Embodied Agents Modularly with Large Language Models

Hongxin Zhang, Weihua Du, Jiaming Shan, Qinhong Zhou, Yilun Du, Joshua B. Tenenbaum, Tianmin Shu, Chuang Gan

HAZARD Challenge: Embodied Decision Making in Dynamically Changing Environments

Qinhong Zhou, Sunli Chen, Yisong Wang, Haozhe Xu, Weihua Du, Hongxin Zhang, Yilun Du, Joshua B. Tenenbaum, Chuang Gan

Improving Factuality and Reasoning in Language Models through Multiagent Debate

Yilun Du, Shuang Li, Antonio Torralba, Joshua B. Tenenbaum, Igor Mordatch

Compositional Foundation Models for Hierarchical Planning

Anurag Ajay*, Seungwook Han*, Yilun Du*, Shuang Li, Abhi Gupta, Tommi Jaakkola, Joshua B. Tenenbaum, Leslie Kaelbling, Akash Srivastava, Pulkit Agrawal

Learning Universal Policies via Text-Guided Video Generation

Yilun Du*, Mengjiao Yang*, Bo Dai, Hanjun Dai, Ofir Nachum, Joshua B. Tenenbaum, Dale Schuurmans, Pieter Abbeel

DiffuseBot: Breeding Soft Robots With Physics-Augmented Generative Diffusion Models

Tsun-Hsuan Wang, Juntian Zheng, Pingchuan Ma, Yilun Du, Byungchul Kim, Andrew Spielberg, Joshua Tenenbaum, Chuang Gan, Daniela Rus

Adaptive Online Replanning with Diffusion Models

Siyuan Zhou, Yilun Du, Shun Zhang, Mengdi Xu, Yikang Shen, Wei Xiao, Dit-Yan Yeung, Chuang Gan

3D-LLM: Injecting the 3D World into Large Language Models

Yining Hong, Haoyu Zhen, Peihao Chen, Shuhong Zheng, Yilun Du, Zhenfang Chen, Chuang Gan

FlowCam: Training Generalizable 3D Radiance Fields without Camera Poses via Pixel-Aligned Scene Flow

Cameron Smith, Yilun Du, Ayush Tewari, Vincent Sitzmann

Secure Out-of-Distribution Task Generalization with Energy-Based Models

Shengzhuang Chen, Long-Kai Huang, Jonathan Schwarz, Yilun Du, Ying Wei

Compositional Diffusion-Based Continuous Constraint Solvers

Zhutian Yang, Jiayuan Mao, Yilun Du, Jiajun Wu, Joshua B. Tenenbaum, Tomas Lozano-Perez, Leslie Kaelbling

Unsupervised Compositional Concepts Discovery with Text-to-Image Generative Models

Nan Liu*, Yilun Du*, Shuang Li*, Joshua B. Tenenbaum, Antonio Torralba

Foundation Models for Decision Making: Problems, Methods, and Opportunities

Mengjiao Yang, Ofir Nachum, Yilun Du, Jason Wei, Pieter Abbeel, Dale Schuurmans

Reduce, Reuse, Recycle: Compositional Generation with Energy-Based Diffusion Models and MCMC

Yilun Du, Conor Durkan, Robin Strudel, Joshua B. Tenenbaum, Sander Dieleman, Rob Fergus, Jascha Sohl-Dickstein, Arnaud Doucet, Will Grathwohl

Inferring Relational Potentials in Interacting Systems

Armand Comas, Yilun Du, Christian Fernandez, Sandesh Ghimire, Mario Sznaier, Joshua B. Tenenbaum, Octavia Camps

NeuSE: Neural SE(3)-Equivariant Embedding for Consistent Spatial Understanding with Objects

Jiahui Fu, Yilun Du, Kurran Singh, Joshua B. Tenenbaum, John J. Leonard

StructDiffusion: Language-Guided Creation of Physically-Valid Structures using Unseen Objects

Weiyu Liu, Yilun Du, Tucker Hermans, Sonia Chernova, Chris Paxton

Diffusion Policy: Visuomotor Policy Learning via Action Diffusion

Cheng Chi, Siyuan Feng, Yilun Du, Zhenjia Xu, Eric Cousineau, Benjamin Burchfiel, Shuran Song

Learning to Render Novel Views from Wide-Baseline Stereo Pairs

Yilun Du, Cameron Smith, Ayush Tewari†, Vincent Sitzmann†

3D Concept Learning and Reasoning from Multi-View Images

Yining Hong, Chunru Lin, Yilun Du, Zhenfang Chen, Joshua B. Tenenbaum, Chuang Gan

Is Conditional Generative Modeling all You Need for Decision-Making?

Anurag Ajay*, Yilun Du*, Abhi Gupta*, Joshua B. Tenenbaum, Tommi S. Jaakkola, Pulkit Agrawal

Composing Ensembles of Pre-trained Models via Iterative Consensus

Shuang Li*, Yilun Du*, Joshua B. Tenenbaum, Antonio Torralba, Igor Mordatch

Planning with Sequence Models through Iterative Energy Minimization

Hongyi Chen*, Yilun Du*, Yiye Chen*, Joshua B. Tenenbaum, Patricio Antonio Vela

Seeing 3D Objects in a Single Image via Self-Supervised Static-Dynamic Disentanglement

Prafull Sharma, Ayush Tewari, Yilun Du, Sergey Zakharov, Rares Ambrus, Adrien Gaidon, William T. Freeman, Fredo Durand, Joshua B. Tenenbaum, Vincent Sitzmann

Local Neural Descriptor Fields: Locally Conditioned Object Representations for Manipulation

Ethan Chun, Yilun Du, Anthony Simeonov, Tomas Lozano-Perez, Leslie Kaelbling

Visibility-Aware Navigation Among Movable Objects

Jose Iturralde*, Aiden Curtis*, Yilun Du, Leslie Kaelbling, Tomas Lozano-Perez

Language Models Generalize Beyond Natural Proteins

Robert Verkuil*, Ori Kabeli*, Yilun Du, Basile Wicky, Lukas Milles, Justas Dauparas, David Baker, Sergey Ovchinnikov, Tom Sercu, Alexander Rives

Self-conditioned Embedding Diffusion for Text Generation

Robin Strudel, Corentin Tallec, Florent Altche, Yilun Du, Yaroslav Ganin, Arthur Mensch, Will Grathwohl, Nikolay Savinov, Sander Dieleman, Laurent Sifre, Remi Leblond

SE(3)-Equivariant Relational Rearrangement with Neural Descriptor Fields

Anthony Simeonov*, Yilun Du*, Yen-Chen Lin, Alberto Rodriguez, Leslie Kaelbling, Tomas Lozano-Perez, Pulkit Agrawal

MIRA: Mental Imagery for Robotic Affordances

Yen-Chen Lin, Pete Florence, Andy Zeng, Jonathan T. Barron, Yilun Du, Wei-Chiu Ma, Anthony Simeonov, Alberto Rodriguez, Phillip Isola

Shuang Li, Xavier Puig, Chris Paxton, Yilun Du, Clinton Wang, Linxi Fan, Tao Chen, De-An Huang, Ekin Akyurek, Anima Anandkumar⁺, Jacob Andreas⁺, Igor Mordatch⁺, Antonio Torralba⁺, Yuke Zhu⁺

Nan Liu, Shuang Li, Yilun Du*, Antonio Torralba, Joshua B. Tenenbaum

Michael Janner, Yilun Du, Joshua B. Tenenbaum, Sergey Levine

Anthony Simeonov, Yilun Du, Andrea Tagliasacchi, Joshua B. Tenenbaum, Alberto Rodriguez, Pulkit Agrawal⁺, Vincent Sitzmann⁺

Nan Liu, Shuang Li, Yilun Du*, Joshua B. Tenenbaum, Antonio Torralba