70 research outputs found

Ontology-Aware Token Embeddings for Prepositional Phrase Attachment

Author: Ammar Waleed
Dasigi Pradeep
Dyer Chris
Hovy Eduard
Publication date: 01/01/2017

Type-level word embeddings use the same set of parameters to represent all instances of a word regardless of its context, ignoring the inherent lexical ambiguity in language. Instead, we embed semantic concepts (or synsets) as defined in WordNet and represent a word token in a particular context by estimating a distribution over relevant semantic concepts. We use the new, context-sensitive embeddings in a model for predicting prepositional phrase (PP) attachments and jointly learn the concept embeddings and model parameters. We show that using context-sensitive embeddings improves the accuracy of the PP attachment model by 5.4 absolute percentage points, which amounts to a 34.4% relative reduction in errors. Comment: ACL 201

arXiv.org e-Print Archive

Crossref
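
The abstract of "Ontology-Aware Token Embeddings for Prepositional Phrase Attachment" describes representing a word token by a distribution over its WordNet concepts. The following is a minimal sketch of that idea only, not the authors' model: the dot-product scoring against a context vector, the two-dimensional vectors, and the sense labels `bank_river` and `bank_finance` are all assumptions made for illustration.

```python
import math


def softmax(scores):
    """Turn arbitrary scores into a probability distribution."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def token_embedding(concept_vectors, context_vector):
    """Return (weights, embedding) for one word token in one context.

    Assumption: each concept is scored by its dot product with the
    context vector; the abstract does not specify the scoring function.
    """
    scores = [
        sum(a * b for a, b in zip(vec, context_vector))
        for vec in concept_vectors
    ]
    weights = softmax(scores)
    dim = len(context_vector)
    # The token embedding is the expected concept vector under the weights.
    embedding = [
        sum(w * vec[i] for w, vec in zip(weights, concept_vectors))
        for i in range(dim)
    ]
    return weights, embedding


# Hypothetical concept vectors for two senses of the word "bank".
concepts = {"bank_river": [1.0, 0.0], "bank_finance": [0.0, 1.0]}
context = [0.0, 2.0]  # hypothetical context leaning toward the finance sense
weights, embedding = token_embedding(list(concepts.values()), context)
```

With this context the finance sense receives the larger weight, so the same word would get a different vector in a river-related context, which is the contrast with type-level embeddings that the abstract draws.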

Basepoint dependence of the unipotent fundamental group of P^1_{Q_p} \ {0, 1, ∞}

Author: Dasigi N.
Publication venue: UCL (University College London)
Publication date: 28/01/2012

Let X be the scheme P^1_{Q_p} \ {0, 1, ∞}. We can assign a fundamental group to each rational basepoint on this scheme. These groups are non-canonically isomorphic, so they need not have isomorphic Galois actions. We study a description of this map from points to groups with Galois action, in terms of non-abelian cohomology. Using this description, we see that the fundamental groups associated to different basepoints are not isomorphic.

UCL Discovery

CODACT: Towards Identifying Orthographic Variants in Dialectal Arabic

Author: Dasigi Pradeep
Diab Mona
Publication venue: Columbia University Libraries/Information Services
Publication date: 01/01/2011

Dialectal Arabic (DA) is the spoken vernacular for over 300M people worldwide. DA is emerging as the form of Arabic written in online communication: chats, emails, blogs, etc. However, most existing NLP tools for Arabic are designed for processing Modern Standard Arabic, a variety that is more formal and scripted. Apart from the genre variation that is a hindrance for any language processing, even in English, DA has no orthographic standard, compared to MSA that has a standard orthography and script. Accordingly, a word may be written in many possible inconsistent spellings rendering the processing of DA very challenging. To solve this problem, such inconsistencies have to be normalized. This work is the first step towards addressing this problem, as we attempt to identify spelling variants in a given textual document. We present an unsupervised clustering approach that addresses the problem of identifying orthographic variants in DA. We employ different similarity measures that exploit string similarity and contextual semantic similarity. To our knowledge this is the first attempt at solving the problem for DA. Our approaches are tested on data in two dialects of Arabic - Egyptian and Levantine. Our system achieves the highest Entropy of 0.19 for Egyptian (corresponding to 68% cluster precision) and Levantine (corresponding to 64% cluster precision) respectively. This constitutes a significant reduction in entropy (from 0.47 for Egyptian and 0.51 for Levantine) and improvement in cluster precision (from 29% for both) from the baseline.

Columbia University Academic Commons

On the Relationship between Parsimonious Covering and Boolean Minimization

Author: Dasigi Venu
Thirunarayan Krishnaprasad
Publication venue: CORE Scholar
Publication date: 01/05/1991

Minimization of Boolean switching functions is a basic problem in the design of logic circuits. The designer first comes up with a switching function expressed in terms of several binary input variables that satisfies the desired functionality, and then attempts to minimize the function as a sum of products or product of sums. It turns out that a sum of products form of a switching function that has no redundancy is a union of prime implicants of the function. In this paper we would like to explicate some of the relationships of the Boolean minimization problem to a formalization of abductive inference called parsimonious covering. Abductive inference often occurs in diagnostic problems such as finding the causes of circuit faults [Reiter, 87] or determining the diseases causing the symptoms reported by a patient [Peng and Reggia, 90]. Parsimonious covering involves covering all observed facts by means of a parsimonious set of explanations that can account for the observations. The relationship of parsimonious covering to Boolean minimization has been noted by the developers of the theory; we intend to pursue a detailed mapping here.

CORE
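
The covering idea in the abstract of "On the Relationship between Parsimonious Covering and Boolean Minimization" can be illustrated with a small brute-force search. This is a hedged sketch, not the paper's formalization: it assumes irredundancy (no explanation can be dropped from the cover) as the parsimony criterion, and the disorder names `d1`..`d4` and manifestation names `m1`..`m3` are hypothetical.

```python
from itertools import combinations


def is_cover(causes, effects_of, observed):
    """True if the chosen causes jointly explain every observation."""
    explained = set()
    for cause in causes:
        explained |= effects_of[cause]
    return observed <= explained


def irredundant_covers(effects_of, observed):
    """Enumerate covers from which no cause can be removed.

    Candidates are visited by increasing size, so any candidate that
    contains an already accepted cover is redundant and is skipped.
    """
    found = []
    names = sorted(effects_of)
    for size in range(1, len(names) + 1):
        for combo in combinations(names, size):
            candidate = frozenset(combo)
            if any(smaller <= candidate for smaller in found):
                continue
            if is_cover(candidate, effects_of, observed):
                found.append(candidate)
    return found


# Hypothetical causal knowledge: which manifestations each cause can produce.
effects_of = {
    "d1": {"m1", "m2"},
    "d2": {"m2", "m3"},
    "d3": {"m1", "m3"},
    "d4": {"m1", "m2", "m3"},
}
observed = {"m1", "m2", "m3"}
covers = irredundant_covers(effects_of, observed)
```

Read against the Boolean side of the analogy, the observations play the role of the terms that must be covered and each cause plays the role of an implicant, so an irredundant cover corresponds to a sum of products with no removable term. The exhaustive search is exponential and is meant only for tiny examples.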

Genre Independent Subgroup Detection in Online Discussion Threads: A Pilot Study of Implicit Attitude using Latent Textual Semantics

Author: Dasigi Pradeep
Diab Mona
Guo Weiwei
Publication venue: Columbia University Libraries/Information Services
Publication date: 01/01/2012

We describe an unsupervised approach to the problem of automatically detecting subgroups of people holding similar opinions in a discussion thread. An intuitive way of identifying this is to detect the attitudes of discussants towards each other or named entities or topics mentioned in the discussion. Sentiment tags play an important role in this detection, but we also note another dimension to the detection of people’s attitudes in a discussion: if two persons share the same opinion, they tend to use similar language content. We consider the latter to be an implicit attitude. In this paper, we investigate the impact of implicit and explicit attitude in two genres of social media discussion data, more formal Wikipedia discussions and a debate discussion forum that is much more informal. Experimental results strongly suggest that implicit attitude is an important complement for explicit attitudes (expressed via sentiment) and it can improve the sub-group detection performance independent of genre.

CiteSeerX

Columbia University Academic Commons

TRAM: Bridging Trust Regions and Sharpness Aware Minimization

Author: Dasigi Pradeep
Peng Hao
Saphra Naomi
Sherborne Tom
Publication date: 12/03/2024

Sharpness-aware minimization (SAM) reports improving domain generalization by reducing the loss surface curvature in the parameter space. However, generalization during fine-tuning is often more dependent on the transferability of representations in the function space. Trust-region methods (TR) target this goal by regularizing representation curvature to reduce catastrophic forgetting of pre-trained task-agnostic information while adopting task-specific skills. We consider unifying these strategies for low curvature in both parameter space and function space to improve out-of-domain (OOD) generalization. We propose Trust Region Aware Minimization (TRAM), a SAM algorithm fine-tuning for low parameter sharpness and smooth, informative representations preserving pre-trained structure. TRAM uses a trust region bound to inform the SAM adversarial neighborhood, introducing an awareness of function curvature within optimization for flatter minima. We empirically validate TRAM in vision (cross-dataset adaptation) and text (OOD language modeling, zero-shot cross-lingual transfer) tasks where robust domain transfer and representation generality are critical. TRAM outperforms SAM- and TR-based optimization across all tasks, notably surpassing competing methods for hard transfer between anticorrelated domains. TRAM establishes a novel standard in fine-tuning for domain-generalizable models with minimal additional computation over previous sharpness-aware methods. Comment: Camera Ready for ICLR 2024 (Accepted as Spotlight). 21 pages, 14 tables, 2 figures

arXiv.org e-Print Archive
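
For readers unfamiliar with the SAM update that the TRAM abstract builds on, here is a minimal sketch of a plain SAM step on a toy quadratic loss. It is not TRAM: the trust region bound described in the abstract is not modeled, and the loss function, the learning rate `lr`, and the neighborhood radius `rho` are assumptions chosen for illustration.

```python
import math


def loss(w):
    """Toy quadratic loss with different curvature per coordinate."""
    return w[0] ** 2 + 4.0 * w[1] ** 2


def grad(w):
    """Analytic gradient of the toy loss."""
    return [2.0 * w[0], 8.0 * w[1]]


def sam_step(w, grad_fn, lr=0.1, rho=0.05):
    """One plain SAM update.

    1. Move to the approximate worst-case point within an L2 ball of
       radius rho by stepping along the normalized gradient.
    2. Apply the gradient measured there to the original weights.
    """
    g = grad_fn(w)
    norm = math.sqrt(sum(x * x for x in g))
    if norm == 0.0:
        return list(w)  # stationary point: the perturbation is undefined
    perturbed = [wi + rho * gi / norm for wi, gi in zip(w, g)]
    g_perturbed = grad_fn(perturbed)
    return [wi - lr * gi for wi, gi in zip(w, g_perturbed)]


w = [1.0, 1.0]
start_loss = loss(w)
for _ in range(50):
    w = sam_step(w, grad)
end_loss = loss(w)
```

Each step costs two gradient evaluations instead of one. Because the gradient is always taken at a perturbed point, the iterates settle in a small neighborhood of the minimum whose size scales with `rho` rather than converging exactly to it.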

Subgroup Detection in Ideological Discussions

Author: Abu-Jbara Amjad
Dasigi Pradeep
Diab Mona
Radev Dragomir R.
Publication venue: Columbia University Libraries/Information Services
Publication date: 01/01/2012

The rapid and continuous growth of social networking sites has led to the emergence of many communities of communicating groups. Many of these groups discuss ideological and political topics. It is not uncommon that the participants in such discussions split into two or more subgroups. The members of each subgroup share the same opinion toward the discussion topic and are more likely to agree with members of the same subgroup and disagree with members from opposing subgroups. In this paper, we propose an unsupervised approach for automatically detecting discussant subgroups in online communities. We analyze the text exchanged between the participants of a discussion to identify the attitude they carry toward each other and towards the various aspects of the discussion topic. We use attitude predictions to construct an attitude vector for each discussant. We use clustering techniques to cluster these vectors and, hence, determine the subgroup membership of each participant. We compare our methods to text clustering and other baselines, and show that our method achieves promising results.

CiteSeerX

Columbia University Academic Commons

Beyond Educational Videogames to Educational Systems-That-Incorporate Videogames: A Case Study of a System for Learning about Energy

Author: Arias Rodrigo
Dasigi Meghana
Flor Nick
Hayden Megan
Mesibov Melinda
Sweeney Keara
Publication venue: HICSS Conference Office
Publication date: 01/01/2017

A common goal for designers of educational videogames is to make learning fun. Unfortunately, the result is often a game that tries to combine the fun aspects of videogames with learning elements, but that is neither fun nor effective for learning. In this paper we present our discovery of an alternative approach: a system that combines both education and entertainment, but that separates them into different modules that are loosely coupled. Entertainment motivates education through a reward mechanism, where performance in the education module yields tokens that can be redeemed for in-game assets in the entertainment module. We present a case study of our specific implementation of this system, and we discuss how it can be generalized to motivate the learning of any topic where performance can be measured. This research contributes to our understanding of designing cognitive artifacts, and to our understanding of designing educational systems as distributed services.

Crossref

ScholarSpace at University of Hawai'i at Manoa

AIS Electronic Library (AISeL)

Toward a multi-sensor-based approach to automatic text classification

Author: Dasigi V. R.
Mann R. C.
Publication venue: Office of Scientific and Technical Information (OSTI)
Publication date: 01/10/1995

Many automatic text indexing and retrieval methods use a term-document matrix that is automatically derived from the text in question. Latent Semantic Indexing (LSI) is a method, recently proposed in the Information Retrieval (IR) literature, for approximating a large and sparse term-document matrix with a relatively small number of factors, and is based on a solid mathematical foundation. LSI appears to be quite useful in the problem of text information retrieval, rather than text classification. In this report, we outline a method that attempts to combine the strength of the LSI method with that of neural networks, in addressing the problem of text classification. In doing so, we also indicate ways to improve performance by adding "logical sensors" to the neural network, something that is hard to do with the LSI method when employed by itself. The various programs that can be used in testing the system with the TIPSTER data set are described. Preliminary results are summarized, but much work remains to be done.

UNT Digital Library