Aikyam Lab (ऐक्यम्) — University of Virginia
Trustworthy AI · Aikyam Lab
Publications
2026
arXiv'26
Interpretability
Agent
LLM Reasoning
STRUCTUREDAGENT: Planning with AND/OR Trees for Long-Horizon Web Tasks
E. Lobo, X. Chen, J. Meng, N. Xi, Y. Jiao, C. Agarwal, Y. Zick, Y. Gao
arXiv 2026
PDF
arXiv'26
Explainability
GNNs
OOD
Quantifying Explanation Quality in Graph Neural Networks using Out-of-Distribution Generalization
D. Zhang, S. Betala, C. Agarwal
arXiv 2026
PDF
arXiv'26
Interpretability
Unlearning
LLMs
Towards Understanding Unlearning Difficulty: A Mechanistic Perspective and Circuit-Guided Difficulty Metric
J. Cheng, Z. Chen, C. Agarwal, H. Amiri
arXiv 2026
PDF
IUI'26
XAI
LLM Reasoning
Interactive
Improving Human Verification of LLM Reasoning through Interactive Explanation Interfaces
R. Zhou, G. Nguyen, N. Kharya, A. T. Nguyen, C. Agarwal
IUI 2026
PDF
Video
HuggingFace
AAAI'26 Oral 🏆
Alignment
Probing
LLMs
Polarity-Aware Probing for Quantifying Latent Alignment in Language Models
S. Sadiekh, E. Ericheva, C. Agarwal
AAAI 2026 — Oral Presentation
PDF
Code
Video
arXiv'26
Multilingual
LLM Reasoning
CURE-Med: Curriculum-Informed Reinforcement Learning for Multilingual Medical Reasoning
E. Onyame, A. Ghosh, S. Baidya, S. Saha, X. Chen, C. Agarwal
arXiv 2026
Website
PDF
GitHub
Dataset
2025
arXiv
Multimodal Explainability
Interpretability
Multimodal AI
Rethinking Explainability in the Era of Multimodal AI
C. Agarwal
arXiv 2025
PDF
arXiv
Benchmark
GNNs
Multimodal LLMs
A Graph Talks, But Who’s Listening? Rethinking Evaluations for Graph-Language Models
S. Petkar, H. Aakash K, A. Vempati, A. Sinha, P. Kumaraguru, C. Agarwal
arXiv 2025
PDF
Code
HuggingFace
arXiv
Multilingual
Healthcare
LLMs
CLINIC: Evaluating Multilingual Trustworthiness in Language Models for Healthcare
A. Ghosh, S. Sridhar, R.K. Ravi, M. Muhsin, S. Saha, C. Agarwal
arXiv 2025
PDF
Code
HuggingFace
EMNLP'25
Multilingual
Reasoning
Survey
The Multilingual Mind: A Survey of Multilingual Reasoning in Language Models
A. Ghosh, D. Datta, S. Saha, C. Agarwal
EMNLP 2025
PDF
EMNLP'25
Hallucination
Video
Multimodal
EGOILLUSION: Benchmarking Hallucinations in Egocentric Video Understanding
A. Seth, U. Tyagi, R. Selvakumar, N. Anand, S. Kumar, S. Ghosh, R. Duraiswami, C. Agarwal, D. Manocha
EMNLP 2025 — Main Conference
PDF
Website
EMNLP'25
Hallucination
Vision-Language
VQA
HALLUCINOGEN: Benchmarking Hallucination in Implicit Reasoning within Large Vision Language Models
A. Seth, D. Manocha, C. Agarwal
UncertaiNLP Workshop @ EMNLP 2025
PDF
Code
NAACL'25 Oral
Memorization
Attribution
LLMs
Analyzing Memorization in Large Language Models through the Lens of Model Attribution
T. R. Menta, S. Agrawal, C. Agarwal
NAACL 2025 — Oral 🏆
PDF
Code
NAACL'25
Privacy
NLP
Unlearnable
Towards Operationalizing Right to Data Protection
A. Java, S. Shahid, C. Agarwal
NAACL 2025
PDF
Code
NAACL'25
Chain-of-Thought
Fine-tuning
On the Impact of Fine-tuning on Chain-of-Thought Reasoning
E. Lobo, C. Agarwal, H. Lakkaraju
NAACL 2025
PDF
2024
AISTATS'24 · NeurIPS Spotlight
Uncertainty
NL Explanations
Quantifying Uncertainty in Natural Language Explanations of Large Language Models
S. H. Tanneru, C. Agarwal, H. Lakkaraju
AISTATS 2024
PDF
Code
NeurIPS'24
Safety
Medicine
LLMs
Towards Safe Large Language Models for Medicine
T. Han, A. Kumar, C. Agarwal, H. Lakkaraju
NeurIPS 2024
PDF
Code
ICML'24
Truthfulness
Prompting
Understanding the Effects of Iterative Prompting on Truthfulness
S. Krishna, C. Agarwal, H. Lakkaraju
ICML 2024
PDF
COLM'24
Robustness
Certified Safety
Certifying LLM Safety Against Adversarial Prompting
A. Kumar, C. Agarwal, S. Srinivas, A. Li, S. Feizi, H. Lakkaraju
COLM 2024
PDF
Code
2022 – 2023 · Selected
NeurIPS'22
XAI
Benchmark
OpenXAI: Towards a Transparent Evaluation of Post hoc Model Explanations
C. Agarwal, S. Krishna, E. Saxena, M. Pawelczyk, et al.
NeurIPS Datasets & Benchmarks 2022 · 218 ★
PDF
Code
CVPR'22
XAI
Training Dynamics
Estimating Example Difficulty Using Variance of Gradients
C. Agarwal, D. D'souza, S. Hooker
CVPR 2022 · 58 ★
PDF
Code
Nature'23
GNN
XAI
Evaluating Explainability for Graph Neural Networks
C. Agarwal, O. Queen, H. Lakkaraju, M. Zitnik
Nature Scientific Data 2023 · 142 ★
PDF
Code
CVPR'23
Debiasing
Vision-Language
DeAR: Debiasing Vision-Language Models with Additive Residuals
A. Seth, M. Hemani, C. Agarwal
CVPR 2023
PDF
ICLR'23
Unlearning
GNN
GNNDelete: A General Strategy for Unlearning in Graph Neural Networks
J. Cheng, G. Dasoulas*, H. He*, C. Agarwal, M. Zitnik
ICLR 2023
PDF
Code
ICLR'23
RL
Explanations
Explaining RL Decisions with Trajectories
S. Deshmukh*, A. Dasgupta*, B. Krishnamurthy, N. Jiang, C. Agarwal, et al.
ICLR 2023
PDF
Code
SIGIR'23
IR
Interpretability
Explain like I am BM25: Interpreting a Dense Model's Ranked-List with a Sparse Approximation
M. Llordes, D. Ganguly, S. Bhatia, C. Agarwal
SIGIR 2023
PDF
Code
View All Papers on Google Scholar →