46 research outputs found

Gradient-based Inference for Networks with Output Constraints

Author: Carbonell Jaime
Lee Jay Yoon
Mehta Sanket Vaibhav
Tristan Jean-Baptiste
Wick Michael
Publication date: 22/04/2019

Practitioners apply neural networks to increasingly complex problems in natural language processing, such as syntactic parsing and semantic role labeling that have rich output structures. Many such structured-prediction problems require deterministic constraints on the output values; for example, in sequence-to-sequence syntactic parsing, we require that the sequential outputs encode valid trees. While hidden units might capture such properties, the network is not always able to learn such constraints from the training data alone, and practitioners must then resort to post-processing. In this paper, we present an inference method for neural networks that enforces deterministic constraints on outputs without performing rule-based post-processing or expensive discrete search. Instead, in the spirit of gradient-based training, we enforce constraints with gradient-based inference (GBI): for each input at test time, we nudge continuous model weights until the network's unconstrained inference procedure generates an output that satisfies the constraints. We study the efficacy of GBI on three tasks with hard constraints: semantic role labeling, syntactic parsing, and sequence transduction. In each case, the algorithm not only satisfies constraints but improves accuracy, even when the underlying network is state-of-the-art.

Comment: AAAI 201
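The test-time loop described in this abstract (decode, check the constraints, nudge the weights, decode again) can be illustrated with a toy sketch. This is not the authors' implementation: the two-parameter logistic tagger, the "at most one positive label" constraint, the learning rate, and every function name below are assumptions chosen only to make the loop concrete. The update lowers the model's log-probability of the currently decoded output, scaled by the number of violations, until unconstrained decoding satisfies the constraint or the step budget runs out.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def decode(w, b, xs):
    """Unconstrained inference: label 1 wherever the logit w*x + b is positive."""
    return [1 if w * x + b > 0 else 0 for x in xs]


def violations(labels):
    """Hypothetical hard constraint: at most one position may carry label 1."""
    return max(0, sum(labels) - 1)


def gbi(w, b, xs, lr=0.5, max_steps=50):
    """Toy gradient-based inference for a single input.

    w and b are local copies, so the trained weights are untouched
    and every input starts from the same trained model.
    """
    for _ in range(max_steps):
        labels = decode(w, b, xs)
        g = violations(labels)
        if g == 0:
            break  # the unconstrained decoder now emits a valid output
        # Gradient of log p(labels | xs) for independent logistic outputs.
        residuals = [y - sigmoid(w * x + b) for x, y in zip(xs, labels)]
        grad_w = sum(r * x for r, x in zip(residuals, xs))
        grad_b = sum(residuals)
        # Descend on the violating output's log-probability, scaled by g.
        w -= lr * g * grad_w
        b -= lr * g * grad_b
    return decode(w, b, xs), w, b


if __name__ == "__main__":
    xs = [2.0, 1.0, -1.0]  # made-up input features
    print("before:", decode(1.0, -0.5, xs))
    labels, w, b = gbi(1.0, -0.5, xs)
    print("after: ", labels, "violations:", violations(labels))
```

The sketch may return a still-violating output if `max_steps` is exhausted, so a caller would check `violations` on the result. It also omits any term that keeps the nudged weights close to the trained ones.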

arXiv.org e-Print Archive

Association for the Advancement of Artificial Intelligence: AAAI Publications

Towards Semi-Supervised Learning for Deep Semantic Role Labeling

Author: Carbonell Jaime
Lee Jay Yoon
Mehta Sanket Vaibhav
Publication date: 01/01/2018

Neural models have achieved state-of-the-art performance on Semantic Role Labeling (SRL). However, these models require an immense amount of semantic-role corpora and are thus not well suited for low-resource languages or domains. The paper proposes a semi-supervised semantic role labeling method that outperforms the state of the art when SRL training corpora are limited. The method explicitly enforces syntactic constraints by augmenting the training objective with a syntactic-inconsistency loss component and uses SRL-unlabeled instances to train a joint-objective LSTM. On the English section of CoNLL-2012, the proposed semi-supervised training with 1% and 10% SRL-labeled data and varying amounts of SRL-unlabeled data achieves +1.58 and +0.78 F1, respectively, over the pre-trained models that were trained on a SOTA architecture with ELMo on the same SRL-labeled data. Additionally, by using the syntactic-inconsistency loss at inference time, the proposed model achieves +3.67 and +2.1 F1 over the pre-trained model on 1% and 10% SRL-labeled data, respectively.

Comment: EMNLP 201

arXiv.org e-Print Archive

Crossref

Assessment of clinical and functional outcomes after single dose injection of autologous platelet rich plasma in patients with chronic lateral epicondylitis: a prospective and brief follow up study

Author: Bajaj Sanket
Garg Rohit N.
Mehta Nirali
Patil Hrishikesh
Publication venue: Medip Academy
Publication date: 26/10/2023

Background: Lateral epicondylitis is a chronic, painful, and debilitating elbow condition. The introduction of platelet-rich plasma (PRP) as an adjunct to conservative and operative treatment has revolutionized research on this topic. PRP is considered to be the ideal autologous biological blood-derived product, which helps in regenerating degenerated tissue rather than just repairing it and helps in relieving pain and improving function. Methods: This is a prospective study in which 40 patients diagnosed with tennis elbow, in whom other conservative treatment modalities had failed, were enrolled, treated with a single-dose injection of autologous PRP, and evaluated for clinical and functional outcomes using the visual analogue scale (VAS) and disabilities of arm, shoulder, and hand (DASH) scores at follow-up. Results: Of the 40 patients enrolled, 15 were male and 25 were female. The mean age of the population was 45.88±8.87 years. All patients showed statistically significant improvements in mean VAS and DASH scores (p<0.001) at each follow-up compared with baseline, with VAS and DASH score improvements of more than 77% and 65%, respectively, at final follow-up. Conclusion: Our study concludes that a single local injection of autologous PRP appears to be a promising and safe modality of treatment in lateral epicondylitis, helping to improve pain as well as clinical and functional outcomes.

International Journal of Research in Orthopaedics

An Introduction to Lifelong Supervised Learning

Author: Abdelsalam Mohamed
Chandar Sarath
Faramarzi Mojtaba
Janarthanan Janarthanan
Malviya Pranshu
Mehta Sanket Vaibhav
Sodhani Shagun
Publication date: 12/07/2022

This primer is an attempt to provide a detailed summary of the different facets of lifelong learning. We start with Chapter 2, which provides a high-level overview of lifelong learning systems. In this chapter, we discuss prominent scenarios in lifelong learning (Section 2.4), provide a high-level organization of different lifelong learning approaches (Section 2.5), enumerate the desiderata for an ideal lifelong learning system (Section 2.6), discuss how lifelong learning is related to other learning paradigms (Section 2.7), and describe common metrics used to evaluate lifelong learning systems (Section 2.8). This chapter is more useful for readers who are new to lifelong learning and want to get introduced to the field without focusing on specific approaches or benchmarks. The remaining chapters focus on specific aspects (either learning algorithms or benchmarks) and are more useful for readers who are looking for specific approaches or benchmarks. Chapter 3 focuses on regularization-based approaches that do not assume access to any data from previous tasks. Chapter 4 discusses memory-based approaches that typically use a replay buffer or an episodic memory to save a subset of data across different tasks. Chapter 5 focuses on different architecture families (and their instantiations) that have been proposed for training lifelong learning systems. Following these different classes of learning algorithms, we discuss the commonly used evaluation benchmarks and metrics for lifelong learning (Chapter 6) and wrap up with a discussion of future challenges and important research directions in Chapter 7.

Comment: Lifelong Learning Prime

arXiv.org e-Print Archive

Making Scalable Meta Learning Practical

Author: Ahn Hwijeen
Choe Sang Keun
Mehta Sanket Vaibhav
Neiswanger Willie
Strubell Emma
Xie Pengtao
Xing Eric
Publication date: 23/10/2023

Despite its flexibility to learn diverse inductive biases in machine learning programs, meta learning (i.e., learning to learn) has long been recognized to suffer from poor scalability due to its tremendous compute/memory costs, training instability, and a lack of efficient distributed training support. In this work, we focus on making scalable meta learning practical by introducing SAMA, which combines advances in both implicit differentiation algorithms and systems. Specifically, SAMA is designed to flexibly support a broad range of adaptive optimizers in the base level of meta learning programs, while reducing computational burden by avoiding explicit computation of second-order gradient information and exploiting efficient distributed training techniques implemented for first-order gradients. Evaluated on multiple large-scale meta learning benchmarks, SAMA showcases up to a 1.7/4.8x increase in throughput and a 2.0/3.8x decrease in memory consumption on single-/multi-GPU setups, respectively, compared to other baseline meta learning algorithms. Furthermore, we show that SAMA-based data optimization leads to consistent improvements in text classification accuracy with BERT and RoBERTa large language models, and achieves state-of-the-art results in both small- and large-scale data pruning on image classification tasks, demonstrating the practical applicability of scalable meta learning across language and vision domains.

arXiv.org e-Print Archive

Assessment of Lumbar Lordosis and Lumbar Core Strength in Information Technology Professionals

Author: Ashok Shayam
Parag Sancheti
Rachana Dabadghav
Roma Satish Mehta
Sanket Nagrale
Savita Rairikar
Publication venue: Korean Society of Spine Surgery
Publication date: 01/06/2016

Study Design: Observational study. Purpose: To correlate lumbar lordosis and lumbar core strength in information technology (IT) professionals. Overview of Literature: IT professionals have to work for long hours in a sitting position, which can affect lumbar lordosis and lumbar core strength. Methods: Flexicurve was used to assess lumbar lordosis, and pressure biofeedback was used to assess lumbar core strength in the IT professionals. All subjects, both male and female, with and without complaints of low back pain and working for two or more years were included, and subjects with a history of spinal surgery or spinal deformity were excluded from the study. Analysis was done using Pearson's correlation. Results: For the IT workers, no correlation was seen between lumbar lordosis and lumbar core strength (r=-0.04); however, a weak negative correlation was seen in IT professionals who complained of pain (r=-0.12), while there was no correlation between lumbar lordosis and lumbar core strength in IT professionals who had no complaints of pain (r=0.007). Conclusions: The study shows that there is no correlation between lumbar lordosis and lumbar core strength in IT professionals, but a weak negative correlation was seen in IT professionals who complained of pain.
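For readers unfamiliar with the statistic reported above, Pearson's r can be computed as in the following minimal sketch. The function name and the sample numbers are assumptions made up for illustration; they are not data from the study, and the study's actual analysis software is not stated in the abstract.

```python
import math


def pearson_r(xs, ys):
    """Pearson's correlation coefficient between two equal-length samples."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two samples of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return sxy / math.sqrt(sxx * syy)


if __name__ == "__main__":
    # Made-up illustrative values, NOT measurements from the study.
    measure_a = [31.0, 28.5, 35.2, 30.1, 33.4]
    measure_b = [42.0, 38.0, 40.0, 44.0, 39.0]
    print(round(pearson_r(measure_a, measure_b), 3))
```

Values of r near zero, such as those reported in the abstract, indicate little or no linear relationship between the two measures.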

Directory of Open Access Journals