Sarthak Kumar Maharana

I'm a CS PhD candidate at the University of Texas at Dallas (UTD), advised by Dr. Yunhui Guo. Before this, I obtained my MS in Electrical Engineering from the University of Southern California (USC) and a Bachelor's degree with honors in Electrical and Electronics Engineering from IIIT Bhubaneswar (IIIT-Bh), India.

Here are some of the key aspects driving my research:

Foundation models have shown great promise across a variety of tasks. How can we improve their robustness and generalization to an open and dynamic world?
Building AI systems that can continually learn and adapt while retaining knowledge of previously learned tasks.
Data- and label-efficient learning paradigms.

During my Master's, I worked closely with Dr. Yonggang Shi. Previously, I had also worked with Dr. Shri Narayanan. As an undergraduate, I was fortunate to work with Dr. Ren Hongliang (NUS), Dr. Prasanta Kumar Ghosh (IISc), and Dr. Aurobinda Routray (IIT-Kharagpur).

I have published at top-tier ML, computer vision, and signal processing conferences such as ICCV, NeurIPS (2x), AAAI, ECCV, and ICASSP (2x).

I'm happy to chat and discuss potential collaborations. Feel free to contact me.

Email / CV / Google Scholar / GitHub / LinkedIn

June '25	BATCLIP has been accepted to ICCV 2025. See you in the gorgeous Hawai'i! 🌴
May '25	We're organizing the 1st Workshop on Multimodal Continual Learning at ICCV 2025!
May '25	Crushed my quals — officially a PhD candidate now!
Mar '25	Excited to be co-organizing the 2nd Workshop on Test-Time Adaptation: Putting Updates to the Test at ICML 2025!
Feb '25	This summer, I'll be joining Dolby Laboratories as a PhD Research Intern!
Dec '24	PALM has been accepted to AAAI 2025 for an Oral presentation!
Nov '24	Serving as a CVPR 2025 reviewer.
Oct '24	Variational Diffusion Unlearning (VDU) is accepted to the NeurIPS SafeGenAI workshop 2024!
Sep '24	Our paper on submodular optimization for active 3D object detection has been accepted to NeurIPS 2024!
Aug '24	Serving as a reviewer for ICLR 2025.
Jul '24	Our paper on DNN watermarking has been accepted to ECCV 2024!
May '24	Serving as a reviewer for BMVC 2024.
Mar '24	Serving as a reviewer for CVPR 2024 Workshop on Test-Time Adaptation: Model, Adapt Thyself! (MAT).
Feb '24	Serving as a reviewer for ECCV 2024.
Jan '24	Our paper on SSL features for dysarthric speech has been accepted to the SASB workshop @ ICASSP 2024!
Jan '24	I'm glad to have been selected to attend the MLx Representation Learning and Generative AI Oxford Summer School.

First-author works are highlighted.

AVROBUSTBENCH: Benchmarking the Robustness of Audio-Visual Recognition Models at Test-Time
Sarthak Kumar Maharana, Saksham Singh Kushwaha, Baoming Zhang, Adrian Rodriguez, Songtao Wei, Yapeng Tian, Yunhui Guo
Under Review

[arXiv] [Code] [Datasets] [Demo]

A comprehensive benchmark designed to evaluate the test-time robustness of audio-visual recognition models.

SELECT: A Submodular Approach for Active LiDAR Semantic Segmentation
Ruiyu Mao, Sarthak Kumar Maharana, Xulong Tang, Yunhui Guo
Under Review

[arXiv]

A voxel-centric submodular approach tailored for active LiDAR semantic segmentation.

BATCLIP: Bimodal Online Test-Time Adaptation for CLIP
Sarthak Kumar Maharana, Baoming Zhang, Leonid Karlinsky, Rogerio Feris, Yunhui Guo
In ICCV 2025

[Paper] [Project] [Code]

Bimodal online test-time adaptation method to improve CLIP's robustness to common corruptions. Also extends to domain generalization settings.

PALM: Pushing Adaptive Learning Rate Mechanisms for Continual Test-Time Adaptation
Sarthak Kumar Maharana, Baoming Zhang, Yunhui Guo
In AAAI 2025 (Oral)

[Paper] [Project] [Code]

Adaptive learning rate continual test-time adaptation method based on model prediction uncertainty and parameter sensitivity to rapid distributional shifts.

Variational Diffusion Unlearning: A Variational Inference Framework for Unlearning in Diffusion Models
Subhodip Panda, MS Varun, Shreyans Jain, Sarthak Kumar Maharana, Prathosh AP
In NeurIPS Safe Generative AI Workshop 2024

[Paper]

Machine unlearning of user-specific classes/concepts in pre-trained diffusion models (DDPMs).

STONE: A Submodular Optimization Framework for Active 3D Object Detection
Ruiyu Mao, Sarthak Kumar Maharana, Rishabh K Iyer, Yunhui Guo
In NeurIPS 2024

[Paper] [Code]

A submodular optimization scheme to handle data imbalance and label distributional coverage for active 3D object detection.

Not Just Change the Labels, Learn the Features: Watermarking Deep Neural Networks with Multi-View Data
Yuxuan Li, Sarthak Kumar Maharana, Yunhui Guo
In ECCV 2024

[Paper] [Code]

Novel watermarking technique based on multi-view data for defending against model extraction attacks.

Acoustic-to-Articulatory Inversion for Dysarthric Speech: Are Pre-Trained Self-Supervised Representations Favorable?
Sarthak Kumar Maharana, Krishna Kamal Adidam, Shoumik Nandi, Ajitesh Srivastava
In ICASSP 2024 Workshop on Self-supervision in Audio, Speech, and Beyond (SASB)

[Paper] [Poster]

Effectiveness of pre-trained self-supervised learning representations for acoustic-to-articulatory inversion of dysarthric speech.

Acoustic-to-Articulatory Inversion for Dysarthric Speech by Using Cross-Corpus Acoustic-Articulatory Data
Sarthak Kumar Maharana, Aravind Illa, Renuka Mannem, Yamini Bellur, Veeramani Preethish Kumar, Seena Vengalil, Kiran Polavarapu, Nalini Atchayaram, Prasanta Kumar Ghosh
In ICASSP 2021

[BibTeX] [Paper] [Code] [Video]

Joint and multi-corpus training for acoustic-to-articulatory inversion of dysarthric speech, using x-vectors, under low-resource data conditions.

Harmonics analysis of a PV integrated hysteresis current control inverter connected with grid and without grid
Jayanta Kumar Sahu, Sudhakar Sahu, J.P Patra, Sarthak Kumar Maharana, Bhagabat Panda
In ICSSIT 2019

[BibTeX] [Paper]

Harmonics analysis of a PV integrated hysteresis current control inverter connected with grid and without grid.

Workshop Co-organizer: 2nd Workshop on Test-Time Adaptation: Putting Updates to the Test @ ICML 2025, 1st Workshop on Multimodal Continual Learning @ ICCV 2025.
Reviewer: CVPR 2025, ICLR 2025, NeurIPS Workshops 2024, BMVC 2024, ECCV 2024, CVPR Workshops 2024, AAAI 2024.
Building CORD.ai, a deep learning research community, as a core member and volunteer researcher.

I'm a cis male.
I consider myself lucky to have grown up in two beautiful cities in India, Bangalore and Bhubaneswar, which have shaped much of my character and growth. I've also spent two quality years in the vibrant, diverse, gently warm, and sprawling city of Los Angeles, California. I absolutely look forward to living in new places and experiencing different cultures.
I'm a HUGE fan of the classical formats of cricket. You'd often find me watching old Test match highlights or SRT straight drives. Nothing can get more sublime than that, I bet! I don't consider IPL/T20 cricket a thing AT ALL.
I think mobile photography is like a side gig for me? My phone instantly comes out the moment my eyes catch sight of a beautiful view.
I also spend a lot of time on quality humor, specifically dark humor. We could talk about that later.

Source code by Jon Barron.