Sarthak Kumar Maharana

I'm a first-year Computer Science PhD student at the University of Texas at Dallas (UTD), advised by Dr. Yunhui Guo. Before this, I obtained my MS in Electrical Engineering from the University of Southern California (USC) and a Bachelor of Technology (BTech, Honors) in Electrical and Electronics Engineering from the International Institute of Information Technology Bhubaneswar (IIIT-Bh), India.

During my master's, I worked closely with Dr. Yonggang Shi. Previously, I also worked with Dr. Shrikanth Narayanan. During my undergraduate studies, I was fortunate enough to work with Dr. Ren Hongliang (NUS), Dr. Prasanta Kumar Ghosh (IISc), and Dr. Aurobinda Routray (IIT-Kharagpur).

My current research focuses on computer vision and learning. Specifically, I'm interested in making models robust and adaptive to rapid distributional shifts (continual/lifelong learning). Steering towards privacy and real-world machine perception systems, I'm also actively working at the intersection of continual learning and machine unlearning.

I'm happy to chat and discuss potential collaborations. Feel free to contact me.

Email  /  CV  /  Google Scholar  /  GitHub  /  LinkedIn


Jul '24  

Our paper on DNN watermarking has been accepted to ECCV 2024!

May '24  

Serving as a reviewer for BMVC 2024.

Mar '24  

Serving as a reviewer for CVPR 2024 Workshop on Test-Time Adaptation: Model, Adapt Thyself! (MAT).

Feb '24  

Serving as a reviewer for ECCV 2024.

Jan '24  

Our paper on SSL features for dysarthric speech has been accepted to the SASB workshop @ ICASSP 2024!

Jan '24  

I'm glad to have been selected to attend the MLx Representation Learning and Generative AI Oxford Summer School.

Aug '23  

Started PhD @ UTD!

May '23  

Graduated from USC with an MS in Electrical Engineering!

Mar '23  

Accepted the CS PhD offer from UTD!

Aug '21  

Started MS in Electrical Engineering at USC!

Jun '21

Virtually presented the paper on acoustic-to-articulatory inversion of dysarthric speech at IEEE ICASSP 2021.

Mar '21  

Our paper on acoustic-to-articulatory inversion using cross-corpus data was accepted to IEEE ICASSP 2021!

Jun '20  

Graduated from IIIT-Bh with a BTech (Honors) in Electrical and Electronics Engineering.

  • Continual/lifelong learning.
  • Data and parameter-efficient deep learning, model robustness, and adaptation.
  • General ML and computer vision.
  • Human-centered AI, which includes multi-modal machine learning with applications to speech and medical images.

First-author works are highlighted.

PALM: Pushing Adaptive Learning Rate Mechanisms for Continual Test-Time Adaptation
Sarthak Kumar Maharana, Baoming Zhang, Yunhui Guo

[arXiv]

Adaptive learning rate method for continual test-time adaptation, based on model prediction uncertainty and parameter sensitivity to rapid distributional shifts.

Not Just Change the Labels, Learn the Features: Watermarking Deep Neural Networks with Multi-View Data
Yuxuan Li, Sarthak Kumar Maharana, Yunhui Guo
European Conference on Computer Vision (ECCV) 2024

[arXiv]

Novel watermarking technique based on multi-view data for defending against model extraction attacks.

Acoustic-to-Articulatory Inversion for Dysarthric Speech: Are Pre-Trained Self-Supervised Representations Favorable?
Sarthak Kumar Maharana, Krishna Kamal Adidam, Shoumik Nandi, Ajitesh Srivastava
IEEE International Conference on Acoustics, Speech, and Signal Processing Workshops (ICASSPW) 2024

[Paper] [Poster]

Effectiveness of pre-trained self-supervised learning representations for acoustic-to-articulatory inversion of dysarthric speech.

Acoustic-to-Articulatory Inversion for Dysarthric Speech by Using Cross-Corpus Acoustic-Articulatory Data
Sarthak Kumar Maharana, Aravind Illa, Renuka Mannem, Yamini Bellur, Veeramani Preethish Kumar, Seena Vengalil, Kiran Polavarapu, Nalini Atchayaram, Prasanta Kumar Ghosh
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP) 2021

[BibTeX] [Paper] [Code] [Video]

Joint and multi-corpus training for acoustic-to-articulatory inversion of dysarthric speech, using x-vectors, under low-resource data conditions.

Harmonics analysis of a PV integrated hysteresis current control inverter connected with grid and without grid
Jayanta Kumar Sahu, Sudhakar Sahu, J. P. Patra, Sarthak Kumar Maharana, Bhagabat Panda
IEEE International Conference on Smart Systems and Inventive Technology (ICSSIT) 2019

[BibTeX] [Paper]

Harmonics analysis of a PV integrated hysteresis current control inverter connected with grid and without grid.

The University of Texas at Dallas
Research Assistant
Richardson, TX
Aug 2023 - Present

  • Supervisor - Dr. Yunhui Guo
  • Activities -
    • Currently working on problems related to efficient model fine-tuning and continual test-time domain adaptation.

University of Southern California
Student Researcher
Los Angeles, CA
May 2022 - Jul 2023

  • Supervisor - Dr. Yonggang Shi
  • Activities -
    • Developed an end-to-end, general-purpose software tool to automate the reconstruction of fiber bundles in the brainstem of the human brain, using diffusion MRI images, for the HCP Aging dataset (to be publicly released soon).
    • Leveraged deep learning-based registration and label fusion methods to automatically generate the anatomical ROIs that are critical for fiber bundle reconstruction.

University of Southern California
Student Researcher
Los Angeles, CA
Dec 2021 - Dec 2022

  • Supervisor - Dr. Shrikanth (Shri) Narayanan
  • Activities -
    • Performed speaker recognition from rt-MRI videos, based on an unsupervised disentangled representation learning scheme.
    • Contributed to a method for generating embeddings from 2D sagittal-view rt-MRI videos that distinguish between speakers based on their articulatory representations from vocal tract landmarks.

National University of Singapore
Part-time Research Assistant
Remote
Jul 2020 - Apr 2021

  • Supervisor - Dr. Ren Hongliang
  • Activities -
    • Experimented with different encoder-decoder architectures (e.g., LinkNet) by plugging in spatio-temporal modules (e.g., ConvLSTM) to perform pixel-wise prediction of the needle trajectory in ultrasound images during a kidney biopsy.
    • Proposed integrating a Dynamic Graph Message Passing Network (DGMN) into a Dual Graph Convolutional Network (DGCN), for efficient semantic segmentation, to model long-range dependencies in OCT images.

Indian Institute of Science
Bachelor's Thesis and Student Researcher
Bangalore, India
Dec 2019 - Sep 2020

  • Supervisor - Dr. Prasanta Kumar Ghosh
  • Activities -
    • Studied the performance of an acoustic-to-articulatory inversion (AAI) model on dysarthric speech when the model was trained in a corpus-dependent manner, using either a matched low-resource dysarthric corpus or a mismatched cross-corpus with rich acoustic-articulatory data.
    • Investigated the benefit of utilizing cross-corpus acoustic-articulatory data, through transfer learning and joint training techniques, for the articulatory predictions of dysarthric subjects.

Indian Institute of Technology Kharagpur
Summer Research Intern
Kharagpur, India
May 2019 - Jul 2019

  • Supervisor - Dr. Aurobinda Routray
  • Activities -
    • Developed an in-house, multi-phase template-matching algorithm to detect breaths in speech recordings using end-to-end deep neural networks.
    • As post-processing, employed a heuristic to join closely spaced predicted breath segments and to remove segments below a certain threshold, reducing misclassification errors.

  • Fully funded tuition, with a stipend, to pursue a CS PhD at UTD.
  • Governing Body Merit Scholarship (April 2021).
    • Awarded to the top 3 students of each department at IIIT-Bh. Received for the academic year 2019-2020.
  • Indian Academy of Sciences (IAS) - Summer Research Fellowship (April 2019).
    • An annual research fellowship program (<10% selection rate) conducted by the Indian Academy of Sciences, under IISc Bangalore.
  • Reviewer - BMVC 2024, ECCV 2024, CVPR Workshops 2024
  • Building CORD.ai, a deep learning research community, as a core member and volunteer researcher.
  • Course Mentor/Grader for graduate level EE 541: An Introduction to Deep Learning (Spring 2022).
  • USC IEEE Graduate Society - Member, working to strengthen the academic and social growth of members and to host workshops.
  • PyCon India 2020 - Content writer for social media handles. Helped the promotions team reach out to organizations and colleges, interacted with individuals who have contributed to the language, and worked on creating virtual swag.
  • I'm a cis male.
  • I consider myself lucky to have grown up in two beautiful cities in India - Bangalore and Bhubaneswar - which have infused in me a lot of character and development. I've also spent two quality years in the vibrant, diverse, gently warm, and sprawling city of Los Angeles, California. I absolutely look forward to staying in new places and experiencing different cultures.
  • I'm a HUGE fan of the classical formats of cricket. You'd often find me watching old Test match highlights or SRT straight drives. Nothing can get more sublime than that. I bet! I don't consider IPL/T20 cricket a thing AT ALL.
  • I think mobile photography is like a side gig for me? My phone instantly comes out the moment my eyes catch sight of a beautiful view.
  • I also spend a lot of time on quality humor - dark humor, to be specific. We could talk about that later.

Source code by Jon Barron.