Revisiting the Importance of Encoding Logic Rules in Sentiment Classification
We analyze the performance of different sentiment classification models on
syntactically complex inputs like A-but-B sentences. The first contribution of
this analysis addresses reproducible research: to meaningfully compare
different models, their accuracies must be averaged over far more random seeds
than what has traditionally been reported. With proper averaging in place, we
notice that the distillation model described in arXiv:1603.06318v4 [cs.LG],
which incorporates explicit logic rules for sentiment classification, is
ineffective. In contrast, using contextualized ELMo embeddings
(arXiv:1802.05365v2 [cs.CL]) instead of logic rules yields significantly better
performance. Additionally, we provide analysis and visualizations that
demonstrate ELMo's ability to implicitly learn logic rules. Finally, a
crowdsourced analysis reveals how ELMo outperforms baseline models even on
sentences with ambiguous sentiment labels.
Comment: EMNLP 2018 Camera Ready
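The averaging protocol this abstract argues for can be sketched as follows. This is a minimal illustration, not the authors' code: `train_and_evaluate` is a hypothetical stand-in that simulates an accuracy per seed, and the choice of 100 seeds is an assumption made only for the example.

```python
import random
import statistics


def train_and_evaluate(seed: int) -> float:
    """Hypothetical stand-in for one training run; returns a simulated accuracy."""
    rng = random.Random(seed)
    # Simulated seed-to-seed noise around an arbitrary base accuracy.
    return 0.85 + rng.uniform(-0.01, 0.01)


def summarize_over_seeds(run, seeds):
    """Run `run(seed)` for every seed and return (mean, sample std) of the accuracies."""
    accuracies = [run(seed) for seed in seeds]
    mean = statistics.mean(accuracies)
    # The sample standard deviation needs at least two runs.
    std = statistics.stdev(accuracies) if len(accuracies) > 1 else 0.0
    return mean, std


mean_acc, std_acc = summarize_over_seeds(train_and_evaluate, range(100))
print(f"accuracy: {mean_acc:.4f} +/- {std_acc:.4f} over 100 seeds")
```

Reporting the spread alongside the mean makes it easier to judge whether a gap between two models exceeds seed-to-seed variation.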
Developing Instrumentation for Multi-parametric Investigation of Mechanisms of Mechanosensitivity in Ion Channels
Mechanosensitive (MS) channels are implicated in pathologies of the renal and pulmonary systems. Abnormal activity in MS channels reduces cell viability, causing a variety of pathologies. MS channels are also responsible for the sensation of pain and hearing. Despite the vital importance of MS channels, very little is known about their gating mechanisms. Attempts to study these mechanisms are severely limited by the lack of suitable instrumentation. A better understanding of the structure-function interaction of MS channels is necessary to find pharmacological leads for the pathologies. Activation data based on indirect activation of MS channels using hypo- or hyper-osmotic solutions or viscous drag are confounded by factors like membrane stretch and cytoskeletal stress. Traditional patch clamp does not allow direct access to the cell by other probes. While a planar patch clamp chip may allow for such access, most existing planar patch clamp chips are focused on high-throughput screening for pharmaceutical targets and have designs that limit multi-parametric studies. We present here instrumentation that combines atomic force microscopy with cellular electrophysiology based on a planar patch clamp approach. The instrumentation allows multi-parametric studies on single cells and provides unique insights into mechanisms of activation of not just MS channels, but ion channels in general, by combining cellular electrophysiology, optical microscopy and atomic force microscopy. Using HaCaT cells as our model system, we have obtained functional maps of the distribution of MS channels across the cell surface. The maps reveal that the distribution of MS channels on HaCaT cells is highly non-uniform and that the channels are present in small clusters instead of being dispersed as single entities.
Our results using direct mechanical stimulation of single cells reveal that a threshold stress level is required to activate MS channels and that the stress has a limited spatial range. Investigation of the kinetics of the electrical response to direct mechanical stimulation reveals that the MS channels respond to the mechanical signal after a small time lag, which we attribute to the conformational changes necessary while the channel is being gated. We hope that the insights gained from studying the mechanosensitive channels of HaCaT cells will also advance the understanding of MS channels in general. Apart from opening new avenues in MS channel research, the instrumentation can also be useful in studying the dynamics and gating of ligand-gated channels by appropriately tagging the AFM cantilever. With further improvements in the speed of AFM imaging, it will also be possible to observe the gating of channels in real time at the molecular scale by imaging the channel on the cell while the channel is being gated.
Towards Robust Long-form Text Generation Systems
Text generation is an important emerging AI technology that has seen significant research advances in recent years. Due to its closeness to how humans communicate, mastering text generation technology can unlock several important applications such as intelligent chat-bots, creative writing assistance, or newer applications like task-agnostic few-shot learning. Most recently, the rapid scaling of large language models (LLMs) has resulted in systems like ChatGPT, capable of generating fluent, coherent and human-like text. However, despite their remarkable capabilities, LLMs still suffer from several limitations, particularly when generating long-form text. In particular, (1) long-form generated text is filled with factual inconsistencies with respect to world knowledge and the input prompt; (2) it is difficult to accurately evaluate the quality of long-form generated text; (3) it is difficult to identify whether a piece of long-form text was AI-generated, a task necessary to prevent widespread misinformation and plagiarism.
In this thesis I design algorithms aimed at making progress on these three issues in current LLMs. I will first describe a retrieval-augmented system we built for long-form question answering, to improve factual correctness of long-form generated text (issue #1). However, a careful empirical analysis reveals issues related to input/output consistency of generated text, and an inherent difficulty in evaluation. I will then describe our model RankGen, which uses large-scale contrastive learning on documents to generate text more faithful to the input, significantly outperforming competing long-form text generation methods. Next, I will describe our efforts to improve human evaluation of long-form generation (issue #2) by proposing the LongEval guidelines. LongEval is a set of three simple empirically-motivated ideas to make human evaluation of long-form generation more consistent, less expensive, and cognitively easier for evaluators. Finally, I describe my work on AI-generated text detection (issue #3), and showcase the brittleness of existing methods to paraphrasing attacks I designed. I will describe a simple new AI-generated text detection algorithm using information retrieval, which is significantly more robust to paraphrasing attacks.
I conclude this thesis with some future research directions that I am excited about, including plan-based long-form text generation, and a deeper dive into understanding large language model training dynamics.
"Does human papilloma virus play a role in the histogenesis of the orthokeratinised jaw cyst?"
A research report submitted to the Faculty of Health Sciences, University of the Witwatersrand, Johannesburg, in partial fulfilment of the requirements for the degree of Master of Science in Dentistry. Johannesburg, 2015
Objectives: To analyse the clinico-pathological features of orthokeratinised jaw cysts (OJCs) and to determine whether human papillomavirus (HPV) DNA can be detected in OJCs.
Material and methods: The clinical and radiological information of 30 patients diagnosed with OJCs was reviewed and the respective histology samples were studied for light microscopic features characteristic of HPV infection. The 30 OJCs were further evaluated for the presence of HPV by using consensus HPV polymerase chain reaction (PCR).
Results: Patients with OJC ranged in age from 13 to 71 years (mean, 30.9 ± 12.9 years). There was a predilection for males (21/30). Most OJCs were found in the mandible (80%) and 44.8% were associated with an impacted tooth. Koilocyte-like characteristics were identified in 70% of cases, while 43.3% of cases showed a verruciform pattern of hyperkeratosis. All 30 OJCs were negative for HPV-DNA.
Conclusion: HPV infection does not appear to play a role in the histogenesis of the OJC and is not responsible for the wart-like histological changes that may be encountered in OJCs.
A Study of All-Convolutional Encoders for Connectionist Temporal Classification
Connectionist temporal classification (CTC) is a popular sequence prediction
approach for automatic speech recognition that is typically used with models
based on recurrent neural networks (RNNs). We explore whether deep
convolutional neural networks (CNNs) can be used effectively instead of RNNs as
the "encoder" in CTC. CNNs lack an explicit representation of the entire
sequence, but have the advantage that they are much faster to train. We present
an exploration of CNNs as encoders for CTC models, in the context of
character-based (lexicon-free) automatic speech recognition. In particular, we
explore a range of one-dimensional convolutional layers, which are particularly
efficient. We compare the performance of our CNN-based models against typical
RNN-based models in terms of training time, decoding time, model size and word
error rate (WER) on the Switchboard Eval2000 corpus. We find that our CNN-based
models are close in performance to LSTMs, while not matching them, and are much
faster to train and decode.
Comment: Accepted to ICASSP-201
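For readers unfamiliar with the word error rate (WER) metric named in this abstract, here is a minimal sketch of the standard definition: the word-level edit distance between a reference transcript and a hypothesis, divided by the number of reference words. The function name `word_error_rate` is hypothetical, and the sketch scores a single utterance. It is not the evaluation tooling used in the paper.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # i deletions turn the reference prefix into an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


print(word_error_rate("the cat sat", "the dog sat"))
```

Because insertions count as errors, WER can exceed 1.0 when the hypothesis is much longer than the reference.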