Jesse Thomason GLAMOR Lab Sponsors

Current

USC-Capital One Center for Responsible AI and Decision Making in Finance (CREDIF) grant for Human-in-the-loop Multi-Agentic AI Systems for Task-oriented Dialogue. PI: Jesse Thomason; Aug 2025–Jul 2026
We propose to empower multi-agentic AI systems with transparency and explainability by surfacing their reasoning processes in natural language, including frictive dialogue with human users to clarify information, agent-agent introspection to overcome single-agent uncertainties, and human-initiated interruption for human-in-the-loop control.
DARPA ARC Safe and Assured Foundation Robots for Open Environments (SAFRON) grant for Assuring Pretrained Robot Policy Behavior by Combining Statistical ML and Software Testing Methods (Award HR0011-25-3-0154). PI: Jesse Thomason; Co-PIs: Souti Chattopadhyay, William G.J. Halfond; Jul 2025–Jun 2026
We propose a method to assure that the robot behaves consistently with the expectation of a human operator by combining the strengths of statistical machine learning and systematic robustness assurance methods and applying them to pretrained vision-language-action (VLA) policies. We propose methods to quantify 1) whether inputs to such controllers are out-of-distribution and could lead to unpredictable behavior and 2) whether inputs are likely to lead to policy behavior that is not consistent with human expectation.
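As an illustrative sketch only (not the project's assurance method), one standard statistical check of the first kind fits a Gaussian to feature embeddings of in-distribution policy inputs and scores new inputs by Mahalanobis distance; the feature dimension and synthetic data below are placeholders for embeddings from a VLA policy's encoder.

import numpy as np

# Stand-in for embeddings of in-distribution inputs to a pretrained VLA policy
# (in practice these would come from the policy's vision-language encoder).
rng = np.random.default_rng(0)
train_features = rng.normal(size=(500, 32))

mu = train_features.mean(axis=0)
cov = np.cov(train_features, rowvar=False) + 1e-6 * np.eye(32)  # regularized covariance
cov_inv = np.linalg.inv(cov)

def ood_score(x):
    """Mahalanobis distance to the training distribution; larger = more out-of-distribution."""
    diff = x - mu
    return float(np.sqrt(diff @ cov_inv @ diff))

print(ood_score(rng.normal(size=32)))        # near the training distribution
print(ood_score(rng.normal(size=32) + 5.0))  # shifted input scores much higher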
IARPA Bias Effects and Notable Generative AI Limitations (BENGAL) grant (in final negotiations with USC) for Enabling Efficient Unlearning in Pretrained Large Language Models through Information Localization. PI: Jesse Thomason; Co-PI: Robin Jia; Jan 2025–Jan 2027
The main objective of the proposed work is to enable safe information flow in sensitive environments by developing algorithms to identify and "unlearn" specified information in large language models (LLMs). Pretrained LLMs have been shown to memorize sensitive information from their training data verbatim. We propose to formally define LLM memorization and introduce standardized evaluation protocols for methods that localize such memorized information and methods that then remove information from specified parameters in a pretrained model.
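As a minimal, illustrative sketch (not the proposed formal definition or evaluation protocol), one common memorization check prompts a model with the prefix of a training passage and tests whether greedy decoding reproduces the held-out suffix verbatim; the model name and example text below are placeholders.

from transformers import AutoModelForCausalLM, AutoTokenizer

def reproduces_suffix(model, tokenizer, prefix, suffix):
    """True if greedy decoding of the prefix reproduces the suffix token-for-token."""
    inputs = tokenizer(prefix, return_tensors="pt")
    suffix_ids = tokenizer(suffix, add_special_tokens=False)["input_ids"]
    out = model.generate(**inputs, max_new_tokens=len(suffix_ids), do_sample=False)
    continuation = out[0, inputs["input_ids"].shape[1]:].tolist()
    return continuation == suffix_ids

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
print(reproduces_suffix(model, tokenizer,
                        "We hold these truths to be self-evident, that all men",
                        " are created equal"))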
Army Research Laboratory: Army Artificial Intelligence Innovation Institute (A2I2) grant for Communicating with Natural Language Dialogue for Teams of Intelligent Systems and Humans (Award W911NF-23-2-0010). PI: David Traum; Co-PI: Jesse Thomason; Feb 2022–Nov 2025
We investigate closing the perception-action-communication loop between heterogeneous agents and human teammates using natural language dialogue. We leverage natural language generation and understanding to enable interpretable, dialogue-based communication in mixed teams of agents and humans.
PSALM-V: Automating Symbolic Planning in Interactive Visual Environments with Large Language Models
Wang Zhu, Miaosen Chai, Ishika Singh, Robin Jia, and Jesse Thomason.
arXiv, 2025.
categories: neurosymbolic, physical robots, language and planning
preprint paper
@article{zhu:psalmv,
  title={{PSALM-V}: Automating Symbolic Planning in Interactive Visual Environments with Large Language Models},
  author={Wang Zhu and Miaosen Chai and Ishika Singh and Robin Jia and Jesse Thomason},
  journal={arXiv},
  year={2025},
  url={https://arxiv.org/abs/2506.20097}
}
Language Models can Infer Action Semantics for Classical Planners from Environment Feedback
Wang Zhu, Ishika Singh, Robin Jia, and Jesse Thomason.
North American Chapter of the Association for Computational Linguistics (NAACL), 2025.
categories: neurosymbolic, language and planning
conference paper
@inproceedings{zhu:psalm,
  title={Language Models can Infer Action Semantics for Classical Planners from Environment Feedback},
  author={Wang Zhu and Ishika Singh and Robin Jia and Jesse Thomason},
  booktitle={North American Chapter of the Association for Computational Linguistics (NAACL)},
  year={2025},
  url={https://arxiv.org/abs/2406.02791}
}
Efficient Evaluation of Multi-Task Robot Policies With Active Experiment Selection
Abrar Anwar, Rohan Gupta, Zain Merchant, Sayan Ghosh, Willie Neiswanger, and Jesse Thomason.
Conference on Robot Learning (CoRL), 2025.
categories: physical robots, evaluation

Also presented at the Workshop on Robot Evaluation for the Real World (RobotEvaluation@RSS2025), 2025.
conference paper
@inproceedings{anwar:efficientroboeval,
  title={Efficient Evaluation of Multi-Task Robot Policies With Active Experiment Selection},
  author={Abrar Anwar and Rohan Gupta and Zain Merchant and Sayan Ghosh and Willie Neiswanger and Jesse Thomason},
  booktitle={Conference on Robot Learning (CoRL)},
  year={2025},
  url={https://arxiv.org/abs/2502.09829}
}
| RobotEvaluation@RSS2025 website
TwoStep: Multi-agent Task Planning using Classical Planners and Large Language Models
David Bai, Ishika Singh, David Traum, and Jesse Thomason.
Workshop on Language and Semantics of Task and Motion Planning @ ICRA, 2025.
categories: language and planning, neurosymbolic
workshop paper | website
@inproceedings{singh:twostep,
  title={{TwoStep}: Multi-agent Task Planning using Classical Planners and Large Language Models},
  author={David Bai and Ishika Singh and David Traum and Jesse Thomason},
  booktitle={Workshop on Language and Semantics of Task and Motion Planning @ ICRA},
  year={2025},
  url={https://arxiv.org/abs/2403.17246}
}
ReWiND: Language-Guided Rewards Teach Robot Policies without New Demonstrations
Jiahui Zhang, Yusen Luo, Abrar Anwar, Sumedh Anand Sontakke, Joseph J Lim, Jesse Thomason, Erdem Biyik, and Jesse Zhang.
Conference on Robot Learning (CoRL), 2025.
categories: reinforcement learning, physical robots, language and robotics

Also presented at the 2nd Workshop on Out-of-Distribution Generalization in Robotics (OODWorkshop@RSS25), 2025. Best Paper Award.
conference paper | website
@inproceedings{zhang:rewind,
  title={{ReWiND}: Language-Guided Rewards Teach Robot Policies without New Demonstrations},
  author={Jiahui Zhang and Yusen Luo and Abrar Anwar and Sumedh Anand Sontakke and Joseph J Lim and Jesse Thomason and Erdem Biyik and Jesse Zhang},
  booktitle={Conference on Robot Learning (CoRL)},
  year={2025},
  url={https://arxiv.org/abs/2505.10911}
}
| OODWorkshop@RSS25 website
Contrast Sets for Evaluating Language-Guided Robot Policies
Abrar Anwar, Rohan Gupta, and Jesse Thomason.
Conference on Robot Learning (CoRL), 2024.
categories: evaluation, language and robotics, physical robots
conference paper
@inproceedings{anwar:robotcontrasteval,
  title={Contrast Sets for Evaluating Language-Guided Robot Policies},
  author={Abrar Anwar and Rohan Gupta and Jesse Thomason},
  booktitle={Conference on Robot Learning (CoRL)},
  year={2024},
  url={https://arxiv.org/abs/2406.13636}
}

Completed

DARPA Friction and Accountability in Conversational Transactions (FACT) AI Exploration grant for BECAREFUL: Building Embodied Conversational Agent Reliability by Exerting Friction through Uncertain Language (Award HR00112490376). PI: Dilek Hakkani-Tur; Co-PIs: Gokhan Tur, Malihe Alikhani, Jesse Thomason, Julia Hockenmaier; Mar 2024–Aug 2025
The objective of BECAREFUL is to enhance decision-making mechanisms for conversational embodied AI agents by reducing user over-reliance on possible misinformation from AI systems caused by hallucination, sycophancy, or misunderstanding of the user in the low-bandwidth, unreliable communication situations common in humanitarian relief and search-and-rescue. Our approach involves creating dialogue systems that can track intentions based on the interaction history, assess the accountability of potential actions and responses, and encourage critical thinking by introducing friction when appropriate.
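As a toy illustration (our own sketch, not the BECAREFUL system), the friction idea can be reduced to: when the agent's confidence in its interpretation of the user's intent falls below a threshold, it asks for confirmation instead of acting. The threshold and data structure below are hypothetical.

from dataclasses import dataclass

@dataclass
class Interpretation:
    action: str        # the action the agent believes the user wants
    confidence: float  # assumed to come from the dialogue model

FRICTION_THRESHOLD = 0.7  # hypothetical value

def respond(interp):
    if interp.confidence < FRICTION_THRESHOLD:
        # Positive friction: slow down and invite the user to verify.
        return f"I think you want me to {interp.action}. Can you confirm before I proceed?"
    return f"Proceeding to {interp.action}."

print(respond(Interpretation("reroute the convoy to checkpoint B", 0.55)))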
Can VLMs Recall Factual Associations From Visual References?
Dhananjay Ashok, Ashutosh Chaubey, Hirona J. Arai, Jonathan May, and Jesse Thomason.
Findings of Empirical Methods in Natural Language Processing (EMNLP Findings), 2025.
categories: language and vision, interpretability, evaluation
conference paper
@inproceedings{ashock:vlmfactvizrefs,
  title={Can {VLM}s Recall Factual Associations From Visual References?},
  author={Dhananjay Ashok and Ashutosh Chaubey and Hirona J. Arai and Jonathan May and Jesse Thomason},
  booktitle={Findings of Empirical Methods in Natural Language Processing (EMNLP Findings)},
  year={2025},
  url={https://arxiv.org/abs/2508.18297}
}
Adjust for Trust: Mitigating Trust-Induced Inappropriate Reliance on AI Assistance
Tejas Srinivasan and Jesse Thomason.
arXiv, 2025.
categories: AI trust, dialogue
preprint paper
@article{srinivasan:adjustfortrust,
  title={Adjust for Trust: Mitigating Trust-Induced Inappropriate Reliance on AI Assistance},
  author={Tejas Srinivasan and Jesse Thomason},
  journal={arXiv},
  year={2025},
  url={https://arxiv.org/abs/2502.13321}
}
From Calibration to Collaboration: LLM Uncertainty Quantification Should Be More Human-Centered
Siddartha Devic, Tejas Srinivasan, Jesse Thomason, Willie Neiswanger, and Vatsal Sharan.
arXiv, 2025.
categories: evaluation, AI trust, interpretability
preprint paper
@article{devic:calibtocollab,
  title={From Calibration to Collaboration: {LLM} Uncertainty Quantification Should Be More Human-Centered},
  author={Siddartha Devic and Tejas Srinivasan and Jesse Thomason and Willie Neiswanger and Vatsal Sharan},
  journal={arXiv},
  year={2025},
  url={https://arxiv.org/abs/2506.07461}
}
Better Slow than Sorry: Introducing Positive Friction for Reliable Dialogue Systems
Mert Inan, Anthony Sicilia, Suvodip Dey, Vardhan Dongre, Tejas Srinivasan, Jesse Thomason, Gokhan Tur, Dilek Hakkani-Tur, and Malihe Alikhani.
arXiv, 2025.
categories: dialogue, language and action
preprint paper
@article{inan:slowthansorry,
  title={Better Slow than Sorry: Introducing Positive Friction for Reliable Dialogue Systems},
  author={Mert Inan and Anthony Sicilia and Suvodip Dey and Vardhan Dongre and Tejas Srinivasan and Jesse Thomason and Gokhan Tur and Dilek Hakkani-Tur and Malihe Alikhani},
  journal={arXiv},
  year={2025},
  url={https://arxiv.org/abs/2501.17348}
}
USC Undergraduate Research Associates Program (URAP) gift for Glass is to Shatter as Rubber is to Bounce: Analogies for Natural Language Processing. PI: Jesse Thomason; Aug 2023–Apr 2024
We propose to collect and curate a benchmark of linear word analogies to test the understanding capabilities of modern large language models. Benchmarks that probe physical and social understanding are difficult and expensive to create, so we propose analogies as a minimal probe that requires little human effort to construct, and we will collect analogies that are easy for humans but hard for models via an online game.
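To illustrate what a linear word analogy test looks like (a textbook sketch, not the proposed benchmark), the classic vector-offset method solves a : b :: c : ? by finding the nearest neighbor of b - a + c; the toy vectors below are placeholders for pretrained word embeddings.

import numpy as np

# Toy embedding table; real use would load pretrained word vectors.
emb = {
    "glass":   np.array([1.0, 0.1, 0.0]),
    "shatter": np.array([1.0, 0.9, 0.1]),
    "rubber":  np.array([0.1, 0.1, 1.0]),
    "bounce":  np.array([0.1, 0.9, 1.1]),
    "melt":    np.array([0.9, 0.5, 0.5]),
}

def solve_analogy(a, b, c):
    """a : b :: c : ?  via nearest neighbor to b - a + c under cosine similarity."""
    target = emb[b] - emb[a] + emb[c]
    cos = lambda u, v: u @ v / (np.linalg.norm(u) * np.linalg.norm(v))
    candidates = {w: v for w, v in emb.items() if w not in (a, b, c)}
    return max(candidates, key=lambda w: cos(candidates[w], target))

print(solve_analogy("glass", "shatter", "rubber"))  # expected: "bounce"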
Laboratory for Analytic Sciences grant for Multimodal Transformers with Compositional Modules for Continual Learning. PI: Mohammad Rostami; Co-PI: Jesse Thomason; Jan 2023–Apr 2023
We aim to develop a method for learning compositional, adapter-based modules for multimodal transformers in continual learning (CL) settings. We will develop this method for language and vision classification tasks, then explore applying it to audiovisual classification and visuolinguistic, sequential decision making.
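As a minimal sketch of the kind of adapter module referred to above (a generic illustration, not this project's compositional architecture), a bottleneck adapter adds a small trainable residual block after a frozen transformer layer; the dimensions below are illustrative.

import torch
import torch.nn as nn

class BottleneckAdapter(nn.Module):
    """Down-project, nonlinearity, up-project, residual connection."""
    def __init__(self, hidden_dim: int, bottleneck_dim: int = 64):
        super().__init__()
        self.down = nn.Linear(hidden_dim, bottleneck_dim)
        self.up = nn.Linear(bottleneck_dim, hidden_dim)
        self.act = nn.GELU()

    def forward(self, x):
        return x + self.up(self.act(self.down(x)))

# One adapter per task (or composed across tasks) can be inserted after each frozen
# transformer layer; only adapter parameters are trained, limiting forgetting.
adapter = BottleneckAdapter(hidden_dim=768)
print(adapter(torch.randn(2, 16, 768)).shape)  # torch.Size([2, 16, 768])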
Task-Attentive Transformer Architecture for Continual Learning of Vision-and-Language Tasks Using Knowledge Distillation
Yuliang Cai, Jesse Thomason, and Mohammad Rostami.
Findings of Empirical Methods in Natural Language Processing (EMNLP Findings), 2023.
categories: continual learning, language and vision
conference paper
@inproceedings{cai:taskattentive,
  title={Task-Attentive Transformer Architecture for Continual Learning of Vision-and-Language Tasks Using Knowledge Distillation},
  author={Yuliang Cai and Jesse Thomason and Mohammad Rostami},
  booktitle={Findings of Empirical Methods in Natural Language Processing (EMNLP Findings)},
  year={2023},
  url={https://arxiv.org/abs/2303.14423}
}
I2I: Initializing Adapters with Improvised Knowledge
Tejas Srinivasan, Furong Jia, Mohammad Rostami, and Jesse Thomason.
Conference on Lifelong Learning Agents (CoLLAs), 2023.
categories: language and vision, continual learning
conference paper
@inproceedings{srinivasan:i2i,
  title={{I2I}: Initializing Adapters with Improvised Knowledge},
  author={Tejas Srinivasan and Furong Jia and Mohammad Rostami and Jesse Thomason},
  booktitle={Conference on Lifelong Learning Agents (CoLLAs)},
  year={2023},
  url={https://arxiv.org/abs/2304.02168}
}
NSF Convergence Accelerator Track H grant for Determining Community Needs for Accessibility Tools that Facilitate Programming Education and Workforce Readiness for Persons with Disabilities (Award 2236320). PI: Maja Matarić; Co-PIs: Stephen Aguilar, Sook-Lei Liew, Gisele Ragusa, Jesse Thomason; Dec 2022–Nov 2024
The objective of this planning grant is to develop a means of ameliorating the negative labor outcomes faced by persons with disabilities (PWD) with a set of early prototypes of multimodal interfaces (e.g., speech, eye tracking, pedals) that enable PWD to learn programming skills. Our project aims to develop a means for PWD to train for, and ultimately enter, the tech workforce, bridging the programming career gap that currently blocks most PWD from such career paths.
Subaward from UPenn's Alzheimer's Disease Research Center, supported by the NIH 'Penn Artificial Intelligence and Technology Collaboratory for Healthy Aging,' for An Accessible Machine Learning-Based ADRD Screening Tool for Families and Caregivers (Award 5-P30-AG-073105-02). PI: Maja Matarić; Co-PI: Jesse Thomason; Dec 2022–Nov 2024
The objective of the Penn AI Tech Collab is to perform early detection of dementia symptoms by leveraging multimodal speech, gaze, and pencil pressure inputs during a standard dementia diagnostic suite of tasks embedded in an easy-to-use mobile application.
USC Undergraduate Research Associates Program (URAP) gift for Language-Guided Mobile Manipulators. PI: Jesse Thomason; Aug 2022–Apr 2023
This work will build a real-world dataset to investigate challenges of robot learning with human collaborators. We will leverage our robotics infrastructure at USC to collect data and demonstrate learned robot policies that collaborate efficiently with humans.
Laboratory for Analytic Sciences grant for Continual Learning of Few Shot Learners for Natural Language Processing. PI: Mohammad Rostami; Co-PI: Jesse Thomason; May 2022–Dec 2022
The proposed work aims to enable NLP models to learn a downstream task using only a few task-specific annotated data points, relaxing the need to generate large-scale annotated datasets. We expect the learned model to exhibit strong few-shot generalization as a result of positive knowledge transfer from previously learned tasks when a new task is learned, while not suffering from catastrophic forgetting of those past tasks.
Curriculum Learning for Data-Efficient Vision-Language Alignment
Tejas Srinivasan, Xiang Ren, and Jesse Thomason.
Workshop on Open-Domain Reasoning Under Multi-Modal Settings (ODRUM) @ CVPR, 2023.
categories: language and vision
workshop paper
@inproceedings{srinivasan:tonics,
  title={Curriculum Learning for Data-Efficient Vision-Language Alignment},
  author={Tejas Srinivasan and Xiang Ren and Jesse Thomason},
  booktitle={Workshop on Open-Domain Reasoning Under Multi-Modal Settings (ODRUM) @ CVPR},
  year={2023},
  url={https://arxiv.org/abs/2207.14525}
}
CLiMB: A Continual Learning Benchmark for Vision-and-Language Tasks
Tejas Srinivasan, Ting-Yun Chang, Leticia Leonor Pinto Alva, Georgios Chochlakis, Mohammad Rostami, and Jesse Thomason.
Neural Information Processing Systems (NeurIPS), 2022.
categories: benchmark, language and vision, continual learning
conference paper | source
@inproceedings{srinivasan:climb,
  title={{CLiMB}: A Continual Learning Benchmark for Vision-and-Language Tasks},
  author={Tejas Srinivasan and Ting-Yun Chang and Leticia Leonor Pinto Alva and Georgios Chochlakis and Mohammad Rostami and Jesse Thomason},
  booktitle={Neural Information Processing Systems (NeurIPS)},
  year={2022},
  url={https://arxiv.org/abs/2206.09059}
}
Amazon AWS Credits for Amazon Visiting Academics gift for Language-Guided Mobile Manipulators. PI: Jesse Thomason; Sep 2021–Aug 2022
We will overcome noisy actuators and sensors while executing language instructions by building and maintaining a visuo-linguistic memory. By combining language and multimodal sensory inputs, this memory will ground language instructions in physical observations.