Adarsh K.

San Francisco, California, United States

Experience & Education

  • Dashworks

Publications

  • Accelerating Deep Learning Inference via Learned Caches

    HotCloud, USENIX ATC

    Over the last few years, Deep Neural Networks (DNNs) have become ubiquitous owing to their high accuracy on real-world tasks. However, this increase in accuracy comes at the cost of computationally expensive models, leading to higher prediction latencies. Prior efforts to reduce this latency, such as quantization, model distillation, and anytime prediction models, typically trade off accuracy for performance. In this work, we observe that caching intermediate layer outputs can help us avoid running all the layers of a DNN for a sizeable fraction of inference requests. We find that this can potentially reduce the number of effective layers by half for 91.58% of CIFAR-10 requests run on ResNet-18. We present Freeze Inference, a system that introduces approximate caching at each intermediate layer, and we discuss techniques to reduce the cache size and improve the cache hit rate. Finally, we discuss some of the open research challenges in realizing such a design.

    A minimal code sketch of this idea appears after the publications list.

  • Accelerating Deep Learning Inference via Learned Caches

    Under Submission

    Deep Neural Networks (DNNs) are witnessing increased adoption in multiple domains owing to their high accuracy in solving real-world problems. However, this high accuracy has been achieved by building deeper networks, posing a fundamental challenge to the low-latency inference desired by user-facing applications. Current low-latency solutions trade off accuracy or fail to exploit the inherent temporal locality in prediction serving workloads. We observe that caching hidden layer outputs of the DNN can introduce a form of late-binding where inference requests only consume the amount of computation needed. This enables a mechanism for achieving low latencies, coupled with an ability to exploit temporal locality. However, traditional caching approaches incur high memory overheads and lookup latencies, leading us to design learned caches: caches that consist of simple ML models that are continuously updated. We present the design of GATI, an end-to-end prediction serving system that incorporates learned caches for low-latency DNN inference. Results show that GATI can reduce inference latency by up to 7.69× on realistic workloads.

    A minimal code sketch of a learned cache appears after the publications list.

  • Can Adversarial Weight Perturbations Inject Neural Backdoors?

    CIKM 2020

    Adversarial machine learning has exposed several security hazards of neural models and has become an important research topic in recent times. Thus far, the concept of an “adversarial perturbation” has exclusively been used with reference to the input space, referring to a small, imperceptible change which can cause an ML model to err. In this work we extend the idea of “adversarial perturbations” to the space of model weights, specifically to inject backdoors in trained DNNs, which exposes a security risk of using publicly available trained models. Here, injecting a backdoor refers to obtaining a desired outcome from the model when a trigger pattern is added to the input, while retaining the original model predictions on a non-triggered input. From the perspective of an adversary, we characterize these adversarial perturbations to be constrained within an ℓ∞ norm around the original model weights. We introduce adversarial perturbations in the model weights using a composite loss on the predictions of the original model and the desired trigger through projected gradient descent. We empirically show that these adversarial weight perturbations exist universally across several computer vision and natural language processing tasks. Our results show that backdoors can be successfully injected with a very small average relative change in model weight values for several applications.

    A minimal code sketch of this attack appears after the publications list.

  • Doing More by Doing Less: How structured partial backpropagation improves Deep Learning clusters

    DistributedML, CoNEXT 2021

    Many organizations employ compute clusters equipped with accelerators such as GPUs and TPUs for training deep learning models in a distributed fashion. Training is resource-intensive, consuming significant compute, memory, and network resources. Many prior works explore how to reduce the training resource footprint without impacting quality, but their focus on a subset of the bottlenecks (typically only the network) limits their ability to improve overall cluster utilization. In this work, we exploit the unique characteristics of deep learning workloads to propose Structured Partial Backpropagation (SPB), a technique that systematically controls the amount of backpropagation at individual workers in distributed training. This simultaneously reduces network bandwidth, compute utilization, and memory footprint while preserving model quality. To efficiently leverage the benefits of SPB at the cluster level, we introduce Jigsaw, an SPB-aware scheduler, which schedules at the iteration level for Deep Learning Training (DLT) jobs.

    A minimal code sketch of SPB appears after the publications list.

  • MA-DST: Multi-Attention-Based Scalable Dialog State Tracking

    AAAI 2020, NeurIPS 2020

    Task-oriented dialog agents provide a natural language interface for users to complete their goal. Dialog State Tracking (DST), which is often a core component of these systems, tracks the system’s understanding of the user’s goal throughout the conversation. To enable accurate multi-domain DST, the model needs to encode dependencies between past utterances and slot semantics and understand the dialog context, including long-range cross-domain references. We introduce a novel architecture for this task to encode the conversation history and slot semantics more robustly by using attention mechanisms at multiple granularities. In particular, we use cross-attention to model relationships between the context and slots at different semantic levels and self-attention to resolve cross-domain co-references. In addition, our proposed architecture does not rely on knowing the domain ontologies beforehand and can also be used in a zero-shot setting for new domains or unseen slot values. Our model improves the joint goal accuracy by 5% (absolute) in the full-data setting and by up to 2% (absolute) in the zero-shot setting over the present state of the art on the MultiWOZ 2.1 dataset.

    A minimal code sketch of the cross-attention idea appears after the publications list.

  • MultiWOZ 2.1: A Consolidated Multi-Domain Dialogue Dataset with State Corrections and State Tracking Baselines

    LREC 2020

    MultiWOZ 2.0 (Budzianowski et al., 2018) is a recently released multi-domain dialogue dataset spanning 7 distinct domains and containing over 10,000 dialogues. Though immensely useful and one of the largest resources of its kind to date, MultiWOZ 2.0 has a few shortcomings. Firstly, there is substantial noise in the dialogue state annotations and dialogue utterances which negatively impacts the performance of state-tracking models. Secondly, follow-up work (Lee et al., 2019) has augmented the original dataset with user dialogue acts. This leads to multiple co-existent versions of the same dataset with minor modifications. In this work we tackle the aforementioned issues by introducing MultiWOZ 2.1. To fix the noisy state annotations, we use crowdsourced workers to re-annotate state and utterances based on the original utterances in the dataset. This correction process results in changes to over 32% of state annotations across 40% of the dialogue turns. In addition, we fix 146 dialogue utterances by canonicalizing slot values in the utterances to the values in the dataset ontology. To address the second problem, we combine the contributions of the follow-up works into MultiWOZ 2.1. Hence, our dataset also includes user dialogue acts as well as multiple slot descriptions per dialogue state slot. We then benchmark a number of state-of-the-art dialogue state tracking models on the MultiWOZ 2.1 dataset and show the joint state tracking performance on the corrected state annotations. We are publicly releasing MultiWOZ 2.1 to the community, hoping that this dataset resource will allow for more effective models across various dialogue sub-problems to be built in the future.

    A minimal sketch of slot-value canonicalization appears after the publications list.

  • Translating Web Search Queries into Natural Language Questions

    LREC 2018

    Users often query a search engine with a specific question in mind, and these queries are often keywords or sub-sentential fragments. In this paper, we propose a method to generate a well-formed natural language question from a given keyword-based query, with the same question intent as the query. Converting a keyword-based web query into a well-formed question has many applications in search engines, Community Question Answering (CQA) websites, and bot communication. We find a synergy between the query-to-question problem and the standard machine translation (MT) task. We use both Statistical MT (SMT) and Neural MT (NMT) models to generate questions from queries, and observe that the MT models perform well on both automatic and human evaluation.

    A minimal sketch of this framing appears after the publications list.

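Illustrative Code Sketches

The sketches below are hedged illustrations of the ideas in the publications above, not the authors' implementations. All model sizes, thresholds, trigger patterns, and helper names are assumptions made for the examples.

  • Approximate caching at intermediate layers (Freeze Inference). A coarsely quantized activation serves as a cache key; on a hit the remaining layers are skipped, and on a miss each layer's key is filled with the final prediction. The quantization scheme and layer sizes are illustrative.

    import torch
    import torch.nn as nn

    layers = nn.ModuleList(
        [nn.Sequential(nn.Linear(32, 32), nn.ReLU()) for _ in range(4)])
    head = nn.Linear(32, 10)
    caches = [{} for _ in layers]  # one approximate cache per layer

    def quantize_key(x, n_bins=8):
        # Coarse quantization so that similar activations collide on one key.
        return tuple(torch.bucketize(x, torch.linspace(0, 3, n_bins)).flatten().tolist())

    def infer(x):
        keys = []
        for layer, cache in zip(layers, caches):
            x = layer(x)
            key = quantize_key(x)
            if key in cache:              # cache hit: skip all remaining layers
                return cache[key]
            keys.append(key)
        pred = head(x).argmax(dim=-1).item()
        for cache, key in zip(caches, keys):  # populate caches on a miss
            cache[key] = pred
        return pred

    infer(torch.randn(1, 32))
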
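  • Learned caches (GATI). Instead of a lookup table, a small model at each hidden layer predicts the final label, and inference exits early once a prediction is confident enough. The tiny linear cache models and the 0.9 threshold are illustrative assumptions.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    backbone = nn.ModuleList(
        [nn.Sequential(nn.Linear(32, 32), nn.ReLU()) for _ in range(4)])
    head = nn.Linear(32, 10)
    # One small "cache model" per layer (untrained here; in the paper these
    # are trained to mimic the full network and continuously updated).
    cache_models = nn.ModuleList([nn.Linear(32, 10) for _ in backbone])

    def infer(x, threshold=0.9):
        for layer, cache_model in zip(backbone, cache_models):
            x = layer(x)
            probs = F.softmax(cache_model(x), dim=-1)
            conf, pred = probs.max(dim=-1)
            if conf.item() >= threshold:  # confident learned-cache "hit"
                return pred.item()
        return head(x).argmax(dim=-1).item()  # fall through to the full model

    infer(torch.randn(1, 32))
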
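  • Adversarial weight perturbations. Projected gradient descent on the weights with a composite loss: stay close to the original model on clean inputs while forcing a target class on triggered inputs, with the perturbation clamped to an ℓ∞ ball. The stand-in model, trigger pattern, epsilon, and step count are all hypothetical.

    import copy
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    model = nn.Linear(784, 10)        # stand-in for a trained DNN
    original = copy.deepcopy(model)   # frozen reference for clean behavior
    for p in original.parameters():
        p.requires_grad_(False)

    epsilon, step, target = 0.01, 1e-3, 3
    x_clean = torch.randn(64, 784)
    trigger = torch.zeros(784)
    trigger[:16] = 1.0                # hypothetical trigger pattern

    for _ in range(100):
        # Composite loss: match the original model on clean inputs and
        # predict the target class on triggered inputs.
        loss = F.kl_div(F.log_softmax(model(x_clean), dim=-1),
                        F.softmax(original(x_clean), dim=-1),
                        reduction="batchmean") \
             + F.cross_entropy(model(x_clean + trigger),
                               torch.full((64,), target, dtype=torch.long))
        loss.backward()
        with torch.no_grad():
            for p, p0 in zip(model.parameters(), original.parameters()):
                p -= step * p.grad.sign()                        # PGD step
                p.copy_(p0 + (p - p0).clamp(-epsilon, epsilon))  # project to the ℓ∞ ball
                p.grad = None
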
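  • Structured partial backpropagation (SPB). For a given worker and iteration, gradients are computed only for layers past a cutoff, shrinking compute, activation memory, and the gradient volume to synchronize. The fixed per-call cutoff is a stand-in for Jigsaw's per-iteration scheduling decisions.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    model = nn.Sequential(
        *[nn.Sequential(nn.Linear(32, 32), nn.ReLU()) for _ in range(6)],
        nn.Linear(32, 10))

    def partial_backprop_step(x, y, cutoff):
        # Freeze parameters below the cutoff for this iteration only; since
        # the input needs no gradient either, autograd stops the backward
        # pass at the cutoff layer instead of traversing the whole network.
        for i, layer in enumerate(model):
            for p in layer.parameters():
                p.requires_grad_(i >= cutoff)
        loss = F.cross_entropy(model(x), y)
        loss.backward()
        return loss.item()

    partial_backprop_step(torch.randn(8, 32), torch.randint(0, 10, (8,)), cutoff=4)
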
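  • Cross-attention between slots and dialog context (MA-DST). The encoded slot description attends over the encoded conversation history so the model can focus on the tokens relevant to that slot; the paper applies such attention at multiple granularities, while this shows a single layer with illustrative dimensions.

    import torch
    import torch.nn as nn

    d = 64
    attn = nn.MultiheadAttention(embed_dim=d, num_heads=4, batch_first=True)

    history = torch.randn(1, 50, d)  # encoded dialog history (batch, tokens, dim)
    slot = torch.randn(1, 1, d)      # encoded slot description, e.g. "hotel-pricerange"

    # Cross-attention: slot as the query, history as keys and values.
    slot_in_context, attn_weights = attn(query=slot, key=history, value=history)
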
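  • Slot-value canonicalization (MultiWOZ 2.1). Free-text slot values are normalized to the canonical values in the dataset ontology, and unmatched values are flagged for re-annotation. The tiny ontology and alias table are hypothetical examples, not the dataset's.

    ONTOLOGY = {"hotel-pricerange": {"cheap", "moderate", "expensive"}}
    ALIASES = {"moderately priced": "moderate", "inexpensive": "cheap"}

    def canonicalize(slot, value):
        value = value.strip().lower()
        value = ALIASES.get(value, value)
        # None signals a value outside the ontology, i.e. needs re-annotation.
        return value if value in ONTOLOGY.get(slot, ()) else None

    assert canonicalize("hotel-pricerange", "Moderately priced") == "moderate"
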
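  • Query-to-question as translation (LREC 2018). A modern stand-in for the paper's SMT/NMT systems: a generic pretrained encoder-decoder, fine-tuned on (query, question) pairs, would map keyword queries to well-formed questions. The checkpoint and task prefix here are hypothetical, and an off-the-shelf model would need that fine-tuning before producing sensible output.

    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tok = AutoTokenizer.from_pretrained("t5-small")            # stand-in checkpoint
    model = AutoModelForSeq2SeqLM.from_pretrained("t5-small")  # assumes fine-tuning

    query = "cheapest flights nyc to sfo december"
    inputs = tok("query to question: " + query, return_tensors="pt")
    out = model.generate(**inputs, max_new_tokens=32)
    print(tok.decode(out[0], skip_special_tokens=True))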

Honors & Awards

  • Travel Grant AAAI 2020

    AAAI

  • Special CS Scholarship, University of Wisconsin-Madison

    University of Wisconsin, Madison

  • Excellence Award for Innovation

    Microsoft

  • Institute Silver Medal

    Indian Institute of Technology, Guwahati

  • Xerox Research Health Challenge

    Xerox Research

    Invited to present my work at the Xerox Research Health Challenge.

  • Institute Merit Scholarship

    IIT Guwahati

    Awarded the Institute Merit Scholarship consecutively in 2014, 2015, and 2016 for being the department topper.

  • Travel Grant MLSys Conference

    UW Madison

  • Travel Grant NeurIPS 2020

    -

  • Travel Grant USENIX ATC 2019

    UW Madison
