Revisiting the Importance of Encoding Logic Rules in Sentiment Classification
We analyze the performance of different sentiment classification models on
syntactically complex inputs like A-but-B sentences. The first contribution of
this analysis addresses reproducible research: to meaningfully compare
different models, their accuracies must be averaged over far more random seeds
than what has traditionally been reported. With proper averaging in place, we
notice that the distillation model described in arXiv:1603.06318v4 [cs.LG],
which incorporates explicit logic rules for sentiment classification, is
ineffective. In contrast, using contextualized ELMo embeddings
(arXiv:1802.05365v2 [cs.CL]) instead of logic rules yields significantly better
performance. Additionally, we provide analysis and visualizations that
demonstrate ELMo's ability to implicitly learn logic rules. Finally, a
crowdsourced analysis reveals how ELMo outperforms baseline models even on
sentences with ambiguous sentiment labels. Comment: EMNLP 2018 Camera Read
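The seed-averaging point in the abstract above can be illustrated with a minimal sketch. The `evaluate` function is a hypothetical stand-in that simulates a test accuracy for a given seed; it is not the authors' training code, and the number of seeds (100) is an arbitrary choice for illustration.

```python
import random
import statistics


def evaluate(seed):
    """Hypothetical stand-in for training and testing one model with a seed."""
    rng = random.Random(seed)
    # Simulated test accuracy; a real run would train and evaluate a model here.
    return 0.85 + rng.uniform(-0.02, 0.02)


def average_over_seeds(eval_fn, seeds):
    """Return the mean and sample standard deviation of accuracy across seeds."""
    accuracies = [eval_fn(s) for s in seeds]
    return statistics.mean(accuracies), statistics.stdev(accuracies)


mean_acc, std_acc = average_over_seeds(evaluate, range(100))
```

Reporting the standard deviation alongside the mean makes it easier to judge whether a gap between two models exceeds seed-to-seed noise.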
Dual Language Models for Code Switched Speech Recognition
In this work, we present a simple and elegant approach to language modeling
for bilingual code-switched text. Since code-switching is a blend of two or
more different languages, a standard bilingual language model can be improved
upon by using structures of the monolingual language models. We propose a novel
technique called dual language models, which involves building two
complementary monolingual language models and combining them using a
probabilistic model for switching between the two. We evaluate the efficacy of
our approach using a conversational Mandarin-English speech corpus. We demonstrate
the robustness of our model by showing significant improvements in perplexity
measures over the standard bilingual language model without the use of any
external information. Similar consistent improvements are also reflected in
automatic speech recognition error rates. Comment: Accepted at Interspeech 201
Does Confidence Reporting from the Crowd Benefit Crowdsourcing Performance?
We explore the design of an effective crowdsourcing system for an M-ary
classification task. Crowd workers complete simple binary microtasks whose
results are aggregated to give the final classification decision. We consider
the scenario where the workers have a reject option so that they are allowed to
skip microtasks when they are unable to or choose not to respond to binary
microtasks. Additionally, the workers report quantized confidence levels when
they are able to submit definitive answers. We present an aggregation approach
using a weighted majority voting rule, where each worker's response is assigned
an optimized weight to maximize the crowd's classification performance. We obtain a
counterintuitive result that the classification performance does not benefit
from workers reporting quantized confidence. Therefore, the crowdsourcing
system designer should employ the reject option without requiring confidence
reporting. Comment: 6 pages, 4 figures, SocialSens 2017. arXiv admin note: text overlap
with arXiv:1602.0057
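As a rough illustration of the weighted majority voting rule with a reject option described in the abstract above, here is a minimal sketch for a single binary microtask. The encoding (+1 / -1 for answers, 0 for a skipped task), the example weights, and the tie-breaking rule are all assumptions made for illustration; the paper's optimized weights are not reproduced here.

```python
def weighted_majority_vote(responses, weights):
    """Aggregate binary answers (+1 / -1); 0 marks a skipped microtask.

    Skipped answers contribute nothing to the weighted sum.
    Ties are broken in favour of +1 (an arbitrary choice for this sketch).
    """
    total = sum(w * r for r, w in zip(responses, weights))
    return 1 if total >= 0 else -1


# Hypothetical worker answers and weights for one microtask.
responses = [1, -1, 0, 1, -1]
weights = [0.5, 2.0, 1.0, 0.5, 0.3]
decision = weighted_majority_vote(responses, weights)
```

Here the weighted sum is negative, so the aggregated decision is -1 even though the +1 and -1 votes are equal in number.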
iFair: Learning Individually Fair Data Representations for Algorithmic Decision Making
People are rated and ranked, towards algorithmic decision making in an
increasing number of applications, typically based on machine learning.
Research on how to incorporate fairness into such tasks has prevalently pursued
the paradigm of group fairness: giving adequate success rates to specifically
protected groups. In contrast, the alternative paradigm of individual fairness
has received relatively little attention, and this paper advances this less
explored direction. The paper introduces a method for probabilistically mapping
user records into a low-rank representation that reconciles individual fairness
and the utility of classifiers and rankings in downstream applications. Our
notion of individual fairness requires that users who are similar in all
task-relevant attributes such as job qualification, and disregarding all
potentially discriminating attributes such as gender, should have similar
outcomes. We demonstrate the versatility of our method by applying it to
classification and learning-to-rank tasks on a variety of real-world datasets.
Our experiments show substantial improvements over the best prior work for this
setting. Comment: Accepted at ICDE 2019. Please cite the ICDE 2019 proceedings versio
Majorana Fermion Quantum Mechanics for Higher Rank Tensors
We study quantum mechanical models in which the dynamical degrees of freedom
are real fermionic tensors of rank five and higher. They are the non-random
counterparts of the Sachdev-Ye-Kitaev (SYK) models where the Hamiltonian
couples six or more fermions. For the tensors of rank five, there is a unique
symmetric sixth-order Hamiltonian leading to a solvable large N
limit dominated by the melonic diagrams. We solve for the complete energy
spectrum of this model when and deduce exact expressions for all the
eigenvalues. The subset of states which are gauge invariant exhibit
degeneracies related to the discrete symmetries of the gauged model. We also
study quantum chaos properties of the tensor model and compare them with those
of the SYK model. For there is a rapidly growing number of
invariant tensor interactions. We focus on those of them that are
maximally single-trace - their stranded diagrams stay connected when any set of
colors is erased. We present a general discussion of why the tensor
models with maximally single-trace interactions have large N limits dominated
by the melonic diagrams. We solve the large N Schwinger-Dyson equations for
the higher rank Majorana tensor models and show that they match those of the
corresponding SYK models exactly. We also study other gauge invariant operators
present in the tensor models. Comment: 36 pages, 19 figures, 2 tables, v3: some clarifications and
references adde
Matrix and tensor comparisons of genomic profiles to predict cancer survival and drug targets
Dissertation. Despite recent large-scale profiling efforts, the best predictor of a glioblastoma (GBM) brain cancer patient's survival remains the patient's age at diagnosis. The best predictor of an ovarian serous cystadenocarcinoma (OV) patient's survival remains the tumor's stage, an assessment - numbering I to IV - of the spread of the cancer. To identify DNA copy-number alterations (CNAs) that might predict GBM or OV patients' survival, we comparatively modeled matched genomic profiles from The Cancer Genome Atlas (TCGA). Generalized singular value decomposition (GSVD) of patient-matched but probe-independent GBM and normal profiles uncovered a previously unknown global pattern of tumor-exclusive co-occurring CNAs that is correlated with, and possibly causally related to, GBM patients' survival and response to chemotherapy. This suggests that the GBM survival phenotype is an outcome of its global genotype. The GSVD, formulated as a framework for comparatively modeling two composite datasets, removes from the pattern variations that occur in the normal human genome (e.g., female-specific X chromosome amplification) and experimental variations, without a priori knowledge of these variations. The pattern is independent of age, and combined with age, makes a better predictor than age alone. The pattern suggests previously unrecognized targets for personalized GBM drug therapy, the kinase TLK2 and the methyltransferase METTL2A. A novel tensor GSVD of patient- and platform-matched OV and normal genomic profiles revealed multiple chromosome arm-wide patterns of CNAs that are correlated with OV patients' survival. These indicate several previously unrecognized subtypes of OV. The tensor GSVD is an exact simultaneous decomposition of two high-dimensional datasets arranged in higher-order tensors. The tensor GSVD generalizes the GSVD, which is limited to two second-order tensors, i.e., matrices.
The chromosome arm-wide patterns of CNAs are independent of the OV tumor stage. Combined with stage, each of the patterns makes a better predictor than stage alone. We conclude that the GSVD and the novel tensor GSVD can uncover the relations, and possibly causal coordinations, between different recorded aspects of the same medical phenomenon. GSVD and tensor GSVD comparisons can be used to determine one patient's medical status in relation to other patients in a set, and inform the patient's prognosis, and possibly also treatment
Corneal Biomechanics as a Function of Race
Corneal biomechanical properties are known to vary across age, gender, and race. This study aims to explore the differences in corneal biomechanics between different races, in vivo, using corneal deformation response to an applied air puff with the CorVis ST. This preliminary prospective study focuses on young normal subjects, ages 18-30. Thus far, 16 Caucasian subjects and 23 South Asian subjects have been enrolled, and three measurements were taken of each eye with the CorVis ST, as well as Pentacam, Ocular Response Analyzer (ORA), Goldmann Applanation Tonometer (GAT), and Pascal Dynamic Contour Tonometer (DCT). The subjects’ data was compared to the other race and to an existing database of CorVis exams from Italian and Brazilian subjects, matched by biomechanically corrected IOP, central corneal thickness, and age. The stiffness parameter (SP), corneal velocity, deformation amplitude (DA) ratio, and maximum inverse radius were compared between groups. ANOVA tests were performed between groups for each of these parameters using Statistical Analysis Software (SAS). As greater stiffness is associated with greater resistance to deformation, a stiffer cornea would have a higher stiffness parameter, lower corneal velocity, smaller deformation amplitude ratio, and smaller maximum inverse radius. Significant differences (p≤0.05) were found between the Caucasian subjects and the mixed-race database with regard to SP and corneal velocity, with Caucasian subjects having a greater SP and lower velocity, and therefore a stiffer cornea. South Asian subjects had significantly higher SP and significantly lower corneal velocity than the mixed-race database, showing that South Asians had stiffer corneas than the subjects in the database. Caucasian subjects had significantly lower DA ratio and maximum inverse radius than the South Asian subjects. These are the most sensitive CorVis parameters, and the results show that South Asian subjects have softer, more compliant corneas.
These results are notable because these differences in corneal biomechanics by race are evident even with a small number of subjects and in a young population. Corneal biomechanical properties affect the accuracy of IOP measurements, disease development, and response to surgery, so further exploring corneal biomechanical differences by race could be very valuable. The Ohio State University College of Engineering Undergraduate Research Scholarship. No embargo. Academic Major: Biomedical Engineerin
Smartphone Gesture-Based Authentication
In this research, we consider the problem of authentication on a smartphone based on gestures, that is, movements of the phone. We collected accelerometer data from a number of subjects and analyze it using a variety of machine learning techniques, including support vector machines (SVM) and convolutional neural networks (CNN). We analyze both the fraud rate (or false accept rate) and insult rate (or false reject rate) in each case
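The two error rates named in the abstract above can be computed as in the following minimal sketch. It assumes each authentication attempt yields a score where higher means "more likely the genuine user" and that an attempt is accepted when its score meets a threshold; the scores, labels, and threshold below are made-up illustration values, not data from the study.

```python
def fraud_and_insult_rates(scores, is_genuine, threshold):
    """Return (fraud_rate, insult_rate) for a given acceptance threshold.

    scores: classifier scores, higher means more likely the genuine user.
    is_genuine: parallel list of booleans (True = genuine attempt).
    """
    impostor = [s for s, g in zip(scores, is_genuine) if not g]
    genuine = [s for s, g in zip(scores, is_genuine) if g]
    # Fraud (false accept): an impostor attempt scores at or above the threshold.
    fraud_rate = sum(s >= threshold for s in impostor) / len(impostor)
    # Insult (false reject): a genuine attempt scores below the threshold.
    insult_rate = sum(s < threshold for s in genuine) / len(genuine)
    return fraud_rate, insult_rate


# Hypothetical scores: three genuine attempts followed by three impostor attempts.
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
labels = [True, True, True, False, False, False]
fraud, insult = fraud_and_insult_rates(scores, labels, threshold=0.5)
```

Raising the threshold lowers the fraud rate at the cost of a higher insult rate, which is why the two are usually reported together.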
- …