I'm a first-year Computer Science PhD student at the University of Texas at Dallas (UTD), advised by Dr. Yunhui Guo. Before this, I obtained my MS in Electrical Engineering from the University of Southern California (USC) and a Bachelor of Technology (BTech) with honors in Electrical and Electronics Engineering from the International Institute of Information Technology Bhubaneswar (IIIT-Bh), India.
My current research focuses on computer vision and learning. Specifically, I'm interested in making models robust and adaptive to rapid distributional shifts (continual/lifelong learning).
Steering towards privacy and real-world machine perception systems, I'm also actively working at the intersection of continual learning and machine unlearning.
I'm happy to chat and discuss potential collaborations. Feel free to contact me.
Real-world vision models in dynamic environments face rapid shifts in domain distributions, leading to decreased recognition performance. Continual test-time adaptation (CTTA) directly adjusts a pre-trained source discriminative model to these changing domains using test data. A highly effective CTTA method involves applying layer-wise adaptive learning rates and selectively adapting pre-trained layers. However, this approach suffers from poor estimation of domain shift and from inaccuracies arising from pseudo-labels. In this work, we aim to overcome these limitations by identifying layers through the quantification of model prediction uncertainty without relying on pseudo-labels. We utilize the magnitude of gradients as a metric, calculated by backpropagating the KL divergence between the softmax output and a uniform distribution, to select layers for further adaptation. Subsequently, for the parameters exclusively belonging to these selected layers, with the remaining ones frozen, we evaluate their sensitivity in order to approximate the domain shift, and then adjust their learning rates accordingly. Overall, this approach leads to a more robust and stable optimization than prior approaches. We conduct extensive image classification experiments on CIFAR-10C, CIFAR-100C, and ImageNet-C and demonstrate the efficacy of our method against standard benchmarks and prior methods.
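The layer-scoring idea in the abstract above can be illustrated with a minimal sketch. It is not the paper's implementation: the toy two-layer ReLU network, the KL direction KL(softmax || uniform), the use of an L1 norm, and the hypothetical `select_layers` threshold rule (including which side of the threshold is kept) are all assumptions made for illustration. The gradients are derived by hand so that only the standard library is needed.

```python
import math

def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

def matvec(W, x):
    return [sum(w * v for w, v in zip(row, x)) for row in W]

def layer_grad_norms(W1, W2, x):
    """Return KL(softmax(z) || uniform) and the L1 gradient norm per layer.

    Toy model (an assumption): z = W2 @ relu(W1 @ x), no biases.
    """
    pre = matvec(W1, x)
    h = [max(0.0, v) for v in pre]
    p = softmax(matvec(W2, h))
    entropy = -sum(pj * math.log(pj) for pj in p)
    kl = math.log(len(p)) - entropy  # KL(p || u) = log K - H(p)
    # Gradient w.r.t. the logits: dKL/dz_j = p_j * (log p_j + H(p))
    dz = [pj * (math.log(pj) + entropy) for pj in p]
    # grad W2 = outer(dz, h); the L1 norm of an outer product factorizes
    norm_w2 = sum(abs(d) for d in dz) * sum(abs(v) for v in h)
    # Backpropagate through W2 and the ReLU to reach W1
    dh = [sum(W2[j][i] * dz[j] for j in range(len(dz))) for i in range(len(h))]
    dpre = [d if a > 0 else 0.0 for d, a in zip(dh, pre)]
    norm_w1 = sum(abs(d) for d in dpre) * sum(abs(v) for v in x)
    return kl, {"layer1": norm_w1, "layer2": norm_w2}

def select_layers(norms, threshold):
    # Hypothetical rule: keep layers whose score is at or below the threshold.
    return [name for name, value in norms.items() if value <= threshold]

if __name__ == "__main__":
    W1 = [[1.0, 0.0], [0.0, 1.0]]
    W2 = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]
    kl, norms = layer_grad_norms(W1, W2, [1.0, 2.0])
    print(kl, norms, select_layers(norms, threshold=1.0))
```

No labels or pseudo-labels appear anywhere in the score: it depends only on how far the prediction is from uniform, which is the property the abstract emphasizes.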
With the increasing prevalence of Machine Learning as a Service (MLaaS) platforms, there is a growing focus on deep neural network (DNN) watermarking techniques. These methods are used to facilitate the verification of ownership for a target DNN model to protect intellectual property. One of the most widely employed watermarking techniques involves embedding a trigger set into the source model. Unfortunately, existing methodologies based on trigger sets are still susceptible to functionality-stealing attacks, potentially enabling adversaries to steal the functionality of the source model without a reliable means of verifying ownership. In this paper, we first analyze trigger set-based watermarking methods from a novel feature learning perspective. Specifically, we demonstrate that by selecting data exhibiting multiple features, also referred to as multi-view data, it becomes feasible to effectively defend against functionality-stealing attacks. Based on this perspective, we introduce a novel watermarking technique based on Multi-view dATa, called MAT, for efficiently embedding watermarks within DNNs. This approach involves constructing a trigger set with multi-view data and incorporating a simple feature-based regularization method for training the source model. We validate our method across various benchmarks and demonstrate its efficacy in defending against model extraction attacks, surpassing relevant baselines by a significant margin.
Acoustic-to-articulatory inversion (AAI) involves mapping from the acoustic to the articulatory space. Signal-processing features like MFCCs have been widely used for the AAI task. For subjects with dysarthric speech, AAI is challenging because of imprecise and indistinct pronunciation. In this work, we perform AAI for dysarthric speech using representations from pre-trained self-supervised learning (SSL) models. We demonstrate the impact of different pre-trained features on this challenging AAI task under low-resource conditions. In addition, we also condition the extracted SSL features on x-vectors to train a BLSTM network. In the seen case, we experiment with three AAI training schemes (subject-specific, pooled, and fine-tuned).
In this work, we focus on estimating articulatory movements from acoustic features, known as acoustic-to-articulatory inversion (AAI), for dysarthric patients with amyotrophic lateral sclerosis (ALS). Compared with healthy speech, AAI on dysarthric speech poses two potential challenges. Due to speech impairment, the pronunciation of dysarthric patients is unclear and inaccurate, which could impact the AAI performance. In addition, acoustic-articulatory data from dysarthric patients is limited due to the difficulty in recording. These challenges motivate us to utilize cross-corpus acoustic-articulatory data. In this study, we propose an AAI model that conditions on speaker information using x-vectors at the input and produces multi-target articulatory trajectory outputs for each corpus separately.
Generally, two devices are responsible for the generation of time-variant power: alternators and inverters. Harmonics are unwanted signals generally created at the output of the inverter. In this paper, hysteresis current control (HCC) inverters are described. The HCC inverters are studied both with and without a grid connection and are integrated with a photovoltaic panel. They are connected to the grid with the help of a phase-locked loop. Finally, the total harmonic distortion (THD) is calculated for each configuration, and the results are compared on that basis.
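For readers unfamiliar with hysteresis current control, the switching rule and a distortion estimate can be sketched as follows. This is only an illustration under stated assumptions, not the model from the paper: a single-phase two-level inverter feeding a series RL load (no grid, phase-locked loop, or photovoltaic source), forward-Euler integration, and made-up parameter values. The distortion figure is computed in the time domain as everything that is not the fundamental, so it also counts switching ripple and any DC offset.

```python
import math

def simulate_hcc(v_dc=400.0, r=1.0, l=10e-3, i_peak=10.0, f=50.0,
                 band=0.5, dt=1e-6, cycles=1):
    """Simulate a two-level inverter with hysteresis current control.

    All parameter values are illustrative assumptions.
    Returns the sampled load current and the sinusoidal reference.
    """
    n = round(cycles / (f * dt))
    i, s = 0.0, 1  # load current and switch state (+1 or -1)
    currents, refs = [], []
    for k in range(n):
        i_ref = i_peak * math.sin(2 * math.pi * f * k * dt)
        err = i_ref - i
        if err > band:       # current too low: apply +v_dc
            s = 1
        elif err < -band:    # current too high: apply -v_dc
            s = -1
        currents.append(i)
        refs.append(i_ref)
        i += dt * (s * v_dc - r * i) / l  # L di/dt = v - R i
    return currents, refs

def thd(samples):
    """Ratio of non-fundamental RMS to fundamental RMS.

    Assumes the samples span exactly one fundamental period.
    """
    n = len(samples)
    a = 2.0 / n * sum(x * math.cos(2 * math.pi * k / n) for k, x in enumerate(samples))
    b = 2.0 / n * sum(x * math.sin(2 * math.pi * k / n) for k, x in enumerate(samples))
    fund_sq = (a * a + b * b) / 2.0           # mean square of the fundamental
    total_sq = sum(x * x for x in samples) / n
    return math.sqrt(max(0.0, total_sq - fund_sq) / fund_sq)

if __name__ == "__main__":
    currents, refs = simulate_hcc()
    print("max tracking error:", max(abs(a - b) for a, b in zip(currents, refs)))
    print("distortion ratio:", thd(currents))
```

Narrowing `band` reduces the ripple and therefore the distortion, at the cost of a higher switching frequency, which is the basic trade-off of this control scheme.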
The University of Texas at Dallas Graduate Research Assistant Richardson, TX
Aug 2023 - Present
Developed an end-to-end general software tool to automate the reconstruction of fiber bundles in the brainstem of the human brain, using diffusion MRI images, for the HCP Aging dataset (to be publicly released soon).
Leveraged deep learning based registration and label fusion methods to automatically generate the anatomical ROIs that are critical for fiber bundle reconstruction.
University of Southern California Student Researcher Los Angeles, CA
Dec 2021 - Dec 2022
Performed speaker recognition from rt-MRI videos, based on an unsupervised disentangled representation learning scheme.
Contributed to generating embeddings from 2D sagittal-view rt-MRI videos to distinguish between speakers based on their articulatory representations from vocal tract landmarks.
National University of Singapore Part-time Research Assistant Remote
July 2020 - Apr 2021
Experimented with different encoder-decoder architectures (e.g., LinkNet) by plugging in spatio-temporal modules (e.g., convLSTM) to perform pixel-wise prediction of the needle trajectory in ultrasound images during a kidney biopsy.
Proposed the integration of a DGMN (Dynamic Graph Message Passing) network in DGCN (Dual Graph Convolutional Network), for efficient semantic segmentation, to model long-range dependencies in an OCT image.
Indian Institute of Science Bachelor's Thesis and Student Researcher Bangalore, India
Dec 2019 - Sep 2020
Studied the acoustic-to-articulatory inversion (AAI) model's performance on dysarthric speech when the model was trained in a corpus-dependent manner using a matched low-resource dysarthric corpus or using a mismatched cross-corpus with rich acoustic-articulatory data.
Investigated the benefit of utilizing cross-corpus acoustic-articulatory data using transfer learning and joint-training techniques for the articulatory predictions of dysarthric subjects.
Indian Institute of Technology Kharagpur Summer Research Intern Kharagpur, India
May 2019 - Jul 2019
Developed an in-house, multi-phase template-matching algorithm to detect breaths in speech recordings using end-to-end deep neural networks.
As post-processing, employed a heuristic that joins closely spaced predicted breath segments and removes segments below a certain threshold, to reduce misclassification errors.
Fully funded tuition, with a stipend, to pursue a CS PhD at UTD.
Governing Body Merit Scholarship (April 2021).
   Awarded to top 3 students of each department at IIIT-Bh.
Received for the academic year 2019-2020.
Indian Academy of Sciences (IAS) - Summer Research Fellowship (April 2019)
   An annual research fellowship program (<10% selection rate) conducted by the Indian Academy of Sciences, under IISc Bangalore.
Reviewer - ECCV 2024, CVPR Workshops 2024
Building CORD.ai, a deep learning research community, as a core member and volunteer researcher.
Course Mentor/Grader for graduate level EE 541: An Introduction to Deep Learning (Spring 2022).
USC IEEE Graduate Society - Member; helped strengthen the academic and social growth of members and hosted workshops.
PyCon India 2020 - Content writer for social media handles; helped the promotions team reach out to organizations and colleges, interacted with individuals who have contributed to the language, and worked on creating virtual swag.
I'm a cis male.
I consider myself lucky to have grown up in two beautiful cities in India, Bangalore and Bhubaneswar, which have shaped much of my character and development. I've also spent two quality years in the vibrant, diverse, gently warm, and sprawling city of Los Angeles, California. I absolutely look forward to staying in new places and experiencing different cultures.
I'm a HUGE fan of the classical formats of cricket. You'd often find me watching old test match highlights or SRT straight drives. Nothing can get more sublime than that. I bet! I don't consider IPL/T20 cricket as a thing AT ALL.
I think mobile photography is like a side gig for me? My phone instantly comes out the moment my eyes catch sight of a beautiful view.
I also spend a lot of time on quality humor, dark humor to be specific. We could talk about that later.