Jeremy Collins

Hey! I'm Jeremy, and I'm a Machine Learning PhD student at Georgia Tech advised by Animesh Garg.

My research focuses on scaling robot learning by harnessing diverse sources of data. Specifically, I'm leveraging internet-scale human videos, designing efficient data augmentation algorithms, and building accessible platforms for large-scale teleoperation, with the goal of enabling robots to generalize to new tasks and environments and operate reliably in the real world.

Publications

TAPESTRY: A Single Backbone for Video Generation and Robot Control

Jeremy A. Collins, Seungjae Lee, Rachanon Wachakorn, Krishnan Srinivasan, Animesh Garg, Vitor Campagnolo Guizilini, Paarth Shah
In Submission
Paper | Website

TAPESTRY treats robot actions as single patch tokens in the video model's latent space. Video latents and action patches share the same patch embedding layers and the same diffusion backbone. At test time, the model generates an action chunk from pure noise in a single DDIM step, taking 84 ms on consumer hardware with no video generation required.
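
As a rough illustration of what a single DDIM step from pure noise involves, the sketch below applies the standard deterministic DDIM update once, jumping straight from the noisiest timestep to a clean sample. It is not TAPESTRY's code: the noise-prediction parameterization, the `alpha_bar_T` value, the 7-dimensional toy action, and the function name `ddim_single_step` are all assumptions made for illustration.

```python
import math
import random


def ddim_single_step(x_T, eps_pred, alpha_bar_T):
    """One deterministic DDIM update (eta = 0) from timestep T straight to 0.

    x_T: noisy sample as a list of floats.
    eps_pred: the model's noise estimate for x_T (same length).
    alpha_bar_T: cumulative noise-schedule product at T (assumed value).
    """
    scale = math.sqrt(alpha_bar_T)
    sigma = math.sqrt(1.0 - alpha_bar_T)
    # Invert x_T = scale * x_0 + sigma * eps to estimate the clean sample.
    return [(x - sigma * e) / scale for x, e in zip(x_T, eps_pred)]


# Toy check: a hypothetical 7-dimensional action and an oracle noise estimate.
random.seed(0)
alpha_bar_T = 0.01  # assumed: the final timestep is almost pure noise
action = [0.1 * i for i in range(7)]
eps = [random.gauss(0.0, 1.0) for _ in action]
x_T = [
    math.sqrt(alpha_bar_T) * a + math.sqrt(1.0 - alpha_bar_T) * e
    for a, e in zip(action, eps)
]
recovered = ddim_single_step(x_T, eps, alpha_bar_T)
```

With a learned model, `eps_pred` would come from one forward pass of the network rather than from the true noise used in this toy check.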

AMPLIFY: Actionless Motion Priors for Robot Learning from Videos

Jeremy A Collins*, Loránd Cheng*, Kunal Aneja, Albert Wilcox, Benjamin Joffe, Animesh Garg
ICRA 2026
Paper | Code | Website | Tweet | Podcast

AMPLIFY is a novel framework that leverages large-scale video data by encoding visual dynamics into compact, discrete motion tokens derived from keypoint trajectories. Our modular approach decouples the challenges of learning what motion defines a task from how robots can perform it.

COBALT: Crowdsourcing Robot Learning via Cloud-Based Teleoperation with Smartphones

Ayush Agarwal*, Ansh Gandhi*, Jeremy A Collins, Omar Rayyan, Aryan Sarswat, Ranjani Koushik, Kaavya Menon, Masoud Moghani, Ajay Mandlekar, Animesh Garg
ICRA 2026
Paper | Website

COBALT democratizes robot learning by letting hundreds of users worldwide teleoperate cloud-hosted, vectorized simulators from everyday devices. A suite of metrics and a structured training curriculum ensure the collection of large, high-quality datasets for both simulated and real-world robots.

FLASH: Flow-Based Language-Annotated Grasp Synthesis for Dexterous Hands

Hrishit Leen, Jeremy A Collins, Kunal Aneja, Nhi Nguyen, Priyadarshini Tamilselvan, Sri Siddarth Chakaravarthy P, Animesh Garg
CoRL 2025 Workshop on Dexterous Manipulation (Spotlight)
Paper | Poster

FLASH is a method for language-conditioned dexterous grasping that jointly models task intent and physical contact quality for robot hands.

RoCoDA: Counterfactual Data Augmentation for Data-Efficient Robot Learning from Demonstrations

Ezra Ameperosa*, Jeremy A Collins*, Mrinal Jain, Animesh Garg
ICRA 2025
Paper | Code | Website

RoCoDA is a data augmentation framework for imitation learning that unifies the concepts of invariance, equivariance, and causality, improving generalization and sample efficiency across manipulation tasks.

DexMOTS: Dexterous Manipulation with Differentiable Simulation

Krishnan Srinivasan, Jeremy A Collins, Eric Heiden, Ian Ng, Jeannette Bohg, Animesh Garg
ISRR 2024
Paper

DexMOTS leverages object-centric motion trajectories and differentiable simulation to efficiently learn dexterous manipulation policies, improving policy performance in contact-rich tasks.

ForceSight: Text-Guided Mobile Manipulation with Visual-Force Goals

Jeremy A Collins*, Cody Houff*, You Liang Tan*, Charles C Kemp
ICRA 2024, CoRL 2023 Demo
Paper | Website | Video

ForceSight is an RGBD-adapted, text-conditioned vision transformer. Given an RGBD image and a text prompt, ForceSight produces visual-force goals for a mobile manipulator and generalizes to novel environments and unseen object instances.

Visual Contact Pressure Estimation for Grippers in the Wild

Jeremy A Collins, Cody Houff, Patrick Grady, Charles C Kemp
IROS 2023
Paper | Video

ViPER is a model that leverages multiple sources of data to visually estimate contact pressure on robotic grippers, enabling precise manipulation in varied, uncontrolled environments.

PressureVision++: Estimating Fingertip Pressure from Diverse RGB Images

Patrick Grady, Jeremy A Collins, Chengcheng Tang, Christopher D Twigg, James Hays, Charles C Kemp
WACV 2024, CVPR 2023 Demo
Paper | Website | Video

PressureVision++ visually estimates fingertip contact pressure using weak supervision and adversarial domain adaptation, enabling mixed reality interfaces with everyday surfaces.

Force/Torque Sensing for Soft Grippers using an External Camera

Jeremy A Collins, Patrick Grady, Charles C Kemp
ICRA 2023
Paper | Code | Video

VFTS is a deep learning approach that uses a wrist-mounted camera to visually estimate 6-axis force and torque on robotic grippers.

Visual Pressure Estimation and Control for Soft Robotic Grippers

Patrick Grady, Jeremy A Collins, Samarth Brahmbhatt, Christopher D Twigg, Chengcheng Tang, James Hays, Charles C Kemp
IROS 2022
Paper | Code | Video

VPEC is a model that infers and controls the pressure applied by soft robotic grippers using a single RGB image, enabling precise manipulation of small objects such as coins and microSD cards.

Tendon-Driven Soft Robotic Gripper for Blackberry Harvesting

Anthony L Gunderman, Jeremy A Collins, Andrea L Myers, Renee T Threlfall, Yue Chen
RA-L 2022
Paper

We develop a tendon-driven soft robotic gripper with active force feedback that gently harvests blackberries, minimizing postharvest damage.

Selected Personal Projects

Video Prediction Using Stable Diffusion

Code | Video

Trained a video prediction model that autoregressively predicts and then denoises embeddings from Stable Diffusion.

Learning Robotic Tasks from Video Demonstrations

Code | Video

Implemented several models to learn a control policy from expert video demonstrations, achieving an 85% success rate in pick-and-place tasks.

Distracted Driver Detection

Code | Video

Created a model that visually classifies driver behavior in real time.

Midnight Stretch: A Semi-Autonomous Robotic Caregiver

Code | Video

Midnight Stretch is a robotic system that autonomously detects falls in older adults. Features include human motion detection, voice activation, and a web interface for video calls.

Vision-Based Robotic Navigation

Video

Developed an algorithm enabling a mobile robot to detect and follow objects, avoid obstacles, and navigate complex environments.

CuberBot

Code | Video

Designed robotic hands with tendon-actuated fingers to autonomously solve a Rubik's cube.

Fractal Generation

Code

I made this to get some intuition for recursion. Did you know the "B." in Benoit B. Mandelbrot stands for Benoit B. Mandelbrot?

Education

Georgia Institute of Technology

Aug. 2023 - Present
Ph.D. in Machine Learning

Georgia Institute of Technology

Aug. 2021 - May 2023
M.S. in Robotics | GPA: 4.00

University of Arkansas

Aug. 2017 - May 2021
B.S. in Mechanical Engineering, Minor in Mathematics | GPA: 4.00

Experience

Toyota Research Institute

Eleuther AI

Dorabot, Inc.

Marshalltown Company

J.B. Hunt Transport Services, Inc.

DaVoice