
    Revisiting the Importance of Encoding Logic Rules in Sentiment Classification

    We analyze the performance of different sentiment classification models on syntactically complex inputs like A-but-B sentences. The first contribution of this analysis addresses reproducible research: to meaningfully compare different models, their accuracies must be averaged over far more random seeds than what has traditionally been reported. With proper averaging in place, we notice that the distillation model described in arXiv:1603.06318v4 [cs.LG], which incorporates explicit logic rules for sentiment classification, is ineffective. In contrast, using contextualized ELMo embeddings (arXiv:1802.05365v2 [cs.CL]) instead of logic rules yields significantly better performance. Additionally, we provide analysis and visualizations that demonstrate ELMo's ability to implicitly learn logic rules. Finally, a crowdsourced analysis reveals how ELMo outperforms baseline models even on sentences with ambiguous sentiment labels. Comment: EMNLP 2018 Camera Read
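
The seed-averaging point in the abstract above can be illustrated with a minimal sketch. The function name `summarize_runs` and all accuracy values are hypothetical placeholders, not results from the paper; the sketch only shows how a mean and spread over seeds might be reported instead of a single run.

```python
import statistics


def summarize_runs(accuracies):
    """Return (mean, sample standard deviation) of per-seed accuracies."""
    if len(accuracies) < 2:
        raise ValueError("need at least two seeds to estimate spread")
    return statistics.mean(accuracies), statistics.stdev(accuracies)


# Hypothetical accuracies from five random seeds per model (illustrative only).
baseline = [0.850, 0.870, 0.860, 0.840, 0.880]
variant = [0.860, 0.850, 0.870, 0.865, 0.855]

for name, runs in [("baseline", baseline), ("variant", variant)]:
    mean, std = summarize_runs(runs)
    print(f"{name}: mean={mean:.3f} std={std:.3f} over {len(runs)} seeds")
```

With only a handful of seeds the spread can be as large as the gap between models, which is why a single-run comparison may be misleading.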

    Dual Language Models for Code Switched Speech Recognition

    In this work, we present a simple and elegant approach to language modeling for bilingual code-switched text. Since code-switching is a blend of two or more different languages, a standard bilingual language model can be improved upon by using structures of the monolingual language models. We propose a novel technique called dual language models, which involves building two complementary monolingual language models and combining them using a probabilistic model for switching between the two. We evaluate the efficacy of our approach using a conversational Mandarin-English speech corpus. We prove the robustness of our model by showing significant improvements in perplexity measures over the standard bilingual language model without the use of any external information. Similar consistent improvements are also reflected in automatic speech recognition error rates. Comment: Accepted at Interspeech 201
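
The idea of combining two monolingual models through a switching probability can be sketched as a toy example. This is not the construction from the paper, whose details the abstract does not give; the function `dual_lm_prob`, the fixed `p_switch`, the `"<s>"` entry context, and the tiny bigram tables (with romanized placeholder tokens) are all assumptions made for illustration.

```python
def dual_lm_prob(prev, nxt, lm_a, lm_b, p_switch):
    """Probability of `nxt` given `prev` under a toy two-model mixture.

    Each model maps a context word to a dict of next-word probabilities and
    has a "<s>" context that is used when a switch enters that language.
    """
    # The language of `prev` is whichever model has it as a context word.
    same, other = (lm_a, lm_b) if prev in lm_a else (lm_b, lm_a)
    if nxt in same.get(prev, {}):
        # Stay in the current language.
        return (1.0 - p_switch) * same[prev][nxt]
    # Switch: restart from the other model's entry distribution.
    return p_switch * other["<s>"].get(nxt, 0.0)


# Hypothetical monolingual bigram tables (each row sums to 1).
lm_en = {"<s>": {"i": 0.5, "like": 0.5}, "i": {"like": 1.0}, "like": {"i": 1.0}}
lm_zh = {"<s>": {"wo": 0.5, "xihuan": 0.5}, "wo": {"xihuan": 1.0}, "xihuan": {"wo": 1.0}}

vocab = ["i", "like", "wo", "xihuan"]
total = sum(dual_lm_prob("i", w, lm_en, lm_zh, 0.2) for w in vocab)
print(f"P(. | 'i') sums to {total:.3f}")
```

Because each monolingual row is normalized, the combined distribution stays normalized: mass `1 - p_switch` goes to same-language continuations and `p_switch` to the other language.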

    Does Confidence Reporting from the Crowd Benefit Crowdsourcing Performance?

    We explore the design of an effective crowdsourcing system for an $M$-ary classification task. Crowd workers complete simple binary microtasks whose results are aggregated to give the final classification decision. We consider the scenario where the workers have a reject option so that they are allowed to skip microtasks when they are unable to or choose not to respond to binary microtasks. Additionally, the workers report quantized confidence levels when they are able to submit definitive answers. We present an aggregation approach using a weighted majority voting rule, where each worker's response is assigned an optimized weight to maximize the crowd's classification performance. We obtain a counterintuitive result that the classification performance does not benefit from workers reporting quantized confidence. Therefore, the crowdsourcing system designer should employ the reject option without requiring confidence reporting. Comment: 6 pages, 4 figures, SocialSens 2017. arXiv admin note: text overlap with arXiv:1602.0057
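
A weighted majority vote with a reject option, as described in the abstract above, might look like the following sketch for a single binary microtask. The encoding (+1, -1, and 0 for a skipped task), the tie-breaking rule, and the weights are assumptions for illustration; they are not the optimized weights derived in the paper.

```python
def weighted_majority(responses, weights):
    """Aggregate binary votes for one microtask.

    responses: +1 or -1 per worker, or 0 if the worker skipped the task.
    weights:   one non-negative weight per worker.
    Returns +1 or -1; a zero score (tie or all skipped) defaults to +1,
    which is an arbitrary choice made for this sketch.
    """
    if len(responses) != len(weights):
        raise ValueError("responses and weights must have the same length")
    score = sum(w * r for r, w in zip(responses, weights))
    return 1 if score >= 0 else -1


# Hypothetical votes: the third worker used the reject option.
votes = [1, -1, 0, -1]
weights = [2.0, 0.5, 1.0, 0.5]

print(weighted_majority(votes, weights))          # weighted decision
print(weighted_majority(votes, [1.0] * 4))        # plain majority vote
```

A skipped microtask contributes nothing to the score, so a worker's weight only matters on the tasks that worker actually answers.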

    iFair: Learning Individually Fair Data Representations for Algorithmic Decision Making

    People are rated and ranked, towards algorithmic decision making in an increasing number of applications, typically based on machine learning. Research on how to incorporate fairness into such tasks has prevalently pursued the paradigm of group fairness: giving adequate success rates to specifically protected groups. In contrast, the alternative paradigm of individual fairness has received relatively little attention, and this paper advances this less explored direction. The paper introduces a method for probabilistically mapping user records into a low-rank representation that reconciles individual fairness and the utility of classifiers and rankings in downstream applications. Our notion of individual fairness requires that users who are similar in all task-relevant attributes such as job qualification, and disregarding all potentially discriminating attributes such as gender, should have similar outcomes. We demonstrate the versatility of our method by applying it to classification and learning-to-rank tasks on a variety of real-world datasets. Our experiments show substantial improvements over the best prior work for this setting. Comment: Accepted at ICDE 2019. Please cite the ICDE 2019 proceedings versio

    Majorana Fermion Quantum Mechanics for Higher Rank Tensors

    We study quantum mechanical models in which the dynamical degrees of freedom are real fermionic tensors of rank five and higher. They are the non-random counterparts of the Sachdev-Ye-Kitaev (SYK) models where the Hamiltonian couples six or more fermions. For the tensors of rank five, there is a unique $O(N)^5$ symmetric sixth-order Hamiltonian leading to a solvable large $N$ limit dominated by the melonic diagrams. We solve for the complete energy spectrum of this model when $N=2$ and deduce exact expressions for all the eigenvalues. The subset of states which are gauge invariant exhibit degeneracies related to the discrete symmetries of the gauged model. We also study quantum chaos properties of the tensor model and compare them with those of the $q=6$ SYK model. For $q>6$ there is a rapidly growing number of $O(N)^{q-1}$ invariant tensor interactions. We focus on those of them that are maximally single-trace - their stranded diagrams stay connected when any set of $q-3$ colors is erased. We present a general discussion of why the tensor models with maximally single-trace interactions have large $N$ limits dominated by the melonic diagrams. We solve the large $N$ Schwinger-Dyson equations for the higher rank Majorana tensor models and show that they match those of the corresponding SYK models exactly. We also study other gauge invariant operators present in the tensor models. Comment: 36 pages, 19 figures, 2 tables, v3: some clarifications and references adde

    Matrix and tensor comparisons of genomic profiles to predict cancer survival and drug targets

    Dissertation. Despite recent large-scale profiling efforts, the best predictor of a glioblastoma (GBM) brain cancer patient's survival remains the patient's age at diagnosis. The best predictor of an ovarian serous cystadenocarcinoma (OV) patient's survival remains the tumor's stage, an assessment - numbering I to IV - of the spread of the cancer. To identify DNA copy-number alterations (CNAs) that might predict GBM or OV patients' survival, we comparatively modeled matched genomic profiles from The Cancer Genome Atlas (TCGA). Generalized singular value decomposition (GSVD) of patient-matched but probe-independent GBM and normal profiles uncovered a previously unknown global pattern of tumor-exclusive co-occurring CNAs that is correlated with, and possibly causally related to, GBM patients' survival and response to chemotherapy. This suggests that the GBM survival phenotype is an outcome of its global genotype. The GSVD, formulated as a framework for comparatively modeling two composite datasets, removes from the pattern variations that occur in the normal human genome (e.g., female-specific X chromosome amplification) and experimental variations, without a-priori knowledge of these variations. The pattern is independent of age, and combined with age, makes a better predictor than age alone. The pattern suggests previously unrecognized targets for personalized GBM drug therapy, the kinase TLK2 and the methyltransferase METTL2A. A novel tensor GSVD of patient- and platform-matched OV and normal genomic profiles revealed multiple chromosome arm-wide patterns of CNAs that are correlated with OV patients' survival. These indicate several, previously unrecognized, subtypes of OV. The tensor GSVD is an exact simultaneous decomposition of two high-dimensional datasets arranged in higher-order tensors. The tensor GSVD generalizes the GSVD, which is limited to two second-order tensors, i.e., matrices.
The chromosome arm-wide patterns of CNAs are independent of the OV tumor stage. Combined with stage, each of the patterns makes a better predictor than stage alone. We conclude that the GSVD and the novel tensor GSVD can uncover the relations, and possibly causal coordinations, between different recorded aspects of the same medical phenomenon. GSVD and tensor GSVD comparisons can be used to determine one patient's medical status in relation to other patients in a set, and inform the patient's prognosis, and possibly also treatment.

    Corneal Biomechanics as a Function of Race

    Corneal biomechanical properties are known to vary across age, gender, and race. This study aims to explore the differences in corneal biomechanics between different races, in vivo, using corneal deformation response to an applied air puff with the CorVis ST. This preliminary prospective study focuses on young normal subjects, ages 18-30. Thus far, 16 Caucasian subjects and 23 South Asian subjects have been enrolled, and three measurements were taken of each eye with the CorVis ST, as well as Pentacam, Ocular Response Analyzer (ORA), Goldmann Applanation Tonometer (GAT), and Pascal Dynamic Contour Tonometer (DCT). The subjects’ data was compared to the other race and to an existing database of CorVis exams from Italian and Brazilian subjects, matched by biomechanically corrected IOP, central corneal thickness, and age. The stiffness parameter (SP), corneal velocity, deformation amplitude (DA) ratio, and maximum inverse radius were compared between groups. ANOVA tests were performed between groups for each of these parameters using Statistical Analysis Software (SAS). As greater stiffness is associated with greater resistance to deformation, a stiffer cornea would have a higher stiffness parameter, lower corneal velocity, smaller deformation amplitude ratio, and smaller maximum inverse radius. Significant differences (p≤0.05) were found between the Caucasian subjects and the mixed-race database with regards to SP and corneal velocity, with Caucasian subjects having a greater SP and lower velocity, and therefore a stiffer cornea. South Asian subjects had significantly higher SP and significantly lower corneal velocity than the mixed-race database, showing that South Asians had stiffer corneas than the subjects in the database. Caucasian subjects had significantly lower DA ratio and maximum inverse radius than the South Asian subjects. 
These are the most sensitive CorVis parameters, and the results show that South Asian subjects have softer, more compliant corneas. These results are notable because these differences in corneal biomechanics by race are evident even with a small number of subjects and in a young population. Corneal biomechanical properties affect the accuracy of IOP measurements, disease development, and response to surgery, so further exploring corneal biomechanical differences by race could be very valuable. The Ohio State University College of Engineering Undergraduate Research Scholarship. No embargo. Academic Major: Biomedical Engineerin

    Smartphone Gesture-Based Authentication

    In this research, we consider the problem of authentication on a smartphone based on gestures, that is, movements of the phone. Accelerometer data was collected from a number of subjects, and we analyze this data using a variety of machine learning techniques, including support vector machines (SVM) and convolutional neural networks (CNN). We analyze both the fraud rate (or false accept rate) and insult rate (or false reject rate) in each case.
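
The two error rates named in the abstract above can be computed from classifier scores as in this minimal sketch. The function name `error_rates`, the acceptance rule (score at or above a threshold), and the sample scores are hypothetical and are not taken from the study.

```python
def error_rates(scores, is_genuine, threshold):
    """Return (fraud_rate, insult_rate) at a given score threshold.

    fraud rate  = fraction of impostor attempts accepted (false accepts)
    insult rate = fraction of genuine attempts rejected (false rejects)
    An attempt is accepted when its score is >= threshold.
    """
    impostor = [s for s, g in zip(scores, is_genuine) if not g]
    genuine = [s for s, g in zip(scores, is_genuine) if g]
    if not impostor or not genuine:
        raise ValueError("need at least one genuine and one impostor attempt")
    fraud = sum(s >= threshold for s in impostor) / len(impostor)
    insult = sum(s < threshold for s in genuine) / len(genuine)
    return fraud, insult


# Hypothetical scores: first three attempts are genuine, last three are impostors.
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
labels = [True, True, True, False, False, False]

fraud, insult = error_rates(scores, labels, threshold=0.5)
print(f"fraud rate={fraud:.2f} insult rate={insult:.2f}")
```

Raising the threshold lowers the fraud rate at the cost of a higher insult rate, so the two are usually reported together.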