Bridging Vision and Language over Time with Neural Cross-modal Embeddings
Giving computers the ability to understand multimedia content is one of the goals
of Artificial Intelligence systems. While humans excel at this task, it remains a challenge
for machines, requiring the bridging of vision and language, which inherently have heterogeneous
computational representations. Cross-modal embeddings tackle this challenge
by learning a common space that unifies these representations. However, to grasp
the semantics of an image, one must look beyond the pixels and consider its semantic
and temporal context, defined by the image's textual descriptions and its
time dimension, respectively. External causes (e.g. emerging events) change the
way humans interpret and describe the same visual element over time, leading to the
evolution of visual-textual correlations.
In this thesis we investigate models that capture patterns of visual and textual interactions
over time by incorporating time into cross-modal embeddings: 1) in a relative manner,
where pairwise temporal correlations aid data structuring, yielding a
model that provides better visual-textual correspondences on dynamic corpora, and 2) in
a diachronic manner, where the temporal dimension is fully preserved, thus capturing
the evolution of visual-textual correlations under a principled approach that jointly models
vision+language+time. Rich insights stemming from data evolution were extracted from
a large-scale dataset spanning 20 years. Additionally, towards improving the effectiveness of these
embedding learning models, we proposed a novel loss function that increases the expressiveness
of the standard triplet loss by making it adaptive to the data at hand. With our
adaptive triplet loss, in which triplet-specific constraints are inferred and scheduled, we
achieved state-of-the-art performance on the standard cross-modal retrieval task.
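The adaptive triplet-loss idea can be illustrated with a toy sketch in pure Python. This is not the thesis's actual formulation: the `adaptive_margin` function and its `base`/`scale` parameters are illustrative assumptions, standing in for the inferred, scheduled per-triplet constraints the abstract describes.

```python
def squared_dist(a, b):
    """Squared Euclidean distance between two embedding vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def triplet_loss(anchor, positive, negative, margin):
    """Standard triplet loss: push the negative at least `margin`
    farther from the anchor than the positive."""
    return max(0.0, squared_dist(anchor, positive)
                    - squared_dist(anchor, negative) + margin)

def adaptive_margin(positive, negative, base=0.2, scale=0.5):
    """Toy adaptive variant: derive a per-triplet margin from how far
    apart the positive and negative already are, instead of using one
    fixed constant for every triplet."""
    return base + scale * squared_dist(positive, negative)

anchor, pos, neg = [0.0, 0.0], [0.1, 0.0], [0.3, 0.0]
fixed_loss = triplet_loss(anchor, pos, neg, margin=0.2)
adaptive_loss = triplet_loss(anchor, pos, neg,
                             margin=adaptive_margin(pos, neg))
```

Because the adaptive margin grows when positive and negative are well separated, the same triplet can incur a larger penalty than under a fixed margin, adapting the constraint to the data at hand.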
Syntactic and Semantic Analysis and Visualization of Unstructured English Texts
People have complex thoughts, and they often express them in complex sentences in natural language. This complexity may facilitate efficient communication among an audience that shares the same knowledge base. For a different or new audience, however, such compositions become cumbersome to understand and analyze. Analyzing such compositions using syntactic or semantic measures is a challenging job and forms the base step for natural language processing.
In this dissertation I explore and propose a number of new techniques to analyze and visualize the syntactic and semantic patterns of unstructured English texts.
The syntactic analysis is done through a proposed visualization technique which categorizes and compares different English compositions based on their different reading complexity metrics. For the semantic analysis I use Latent Semantic Analysis (LSA) to analyze the hidden patterns in complex compositions. I have used this technique to analyze comments from a social visualization website to detect irrelevant ones (e.g., spam). The patterns of collaborations are also studied through statistical analysis.
Word sense disambiguation is used to determine the correct sense of a word in a sentence or composition. Applying a textual similarity measure, built on different word similarity measures and word sense disambiguation, to collaborative text snippets from a social collaborative environment reveals a direction for untangling the complex hidden patterns of collaboration.
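A classic, simple instance of word sense disambiguation is the simplified Lesk algorithm: pick the sense whose dictionary gloss shares the most words with the surrounding context. The sketch below assumes a toy gloss dictionary; the `sense_glosses` entries are invented for illustration, not taken from the dissertation.

```python
def simplified_lesk(word, context, sense_glosses):
    """Return the sense whose gloss overlaps most with the context words."""
    context_words = set(context.lower().split())
    best_sense, best_overlap = None, -1
    for sense, gloss in sense_glosses.items():
        overlap = len(context_words & set(gloss.lower().split()))
        if overlap > best_overlap:
            best_sense, best_overlap = sense, overlap
    return best_sense

# Toy gloss dictionary for the ambiguous word "bank".
glosses = {
    "bank_finance": "an institution that accepts deposits and lends money",
    "bank_river": "sloping land beside a body of water",
}
sense = simplified_lesk("bank",
                        "he sat on the bank of the river near the water",
                        glosses)
```

In practice glosses would come from a lexical resource such as WordNet, and the overlap measure would typically be refined (stopword removal, stemming, weighted similarity).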
Knowledge-Based Techniques for Scholarly Data Access: Towards Automatic Curation
Accessing up-to-date and quality scientific literature is a critical preliminary step in any research activity.
Identifying scholarly literature relevant to a given task or application is, however, a complex and time-consuming activity.
Despite the large number of tools developed over the years to support scholars in surveying the literature, such as Google Scholar, Microsoft Academic Search, and others, the best way to access quality papers remains asking a domain expert who is actively involved in the field and knows its research trends and directions.
State-of-the-art systems, in fact, either do not allow exploratory search activities, such as identifying the active research directions within a given topic, or do not offer proactive features, such as content recommendation, both of which are critical to researchers.
To overcome these limitations, we strongly advocate a paradigm shift in the development of scholarly data access tools: moving from traditional information retrieval and filtering tools towards automated agents able to make sense of the textual content of published papers and therefore monitor the state of the art.
Building such a system is, however, a complex task that entails tackling non-trivial problems in the fields of Natural Language Processing, Big Data Analysis, User Modelling, and Information Filtering.
In this work, we introduce the concept of an Automatic Curator System and present its fundamental components.
Holistic recommender systems for software engineering
The knowledge possessed by developers is often not sufficient to overcome a programming problem. Short of talking to teammates, when available, developers often gather additional knowledge from development artifacts (e.g., project documentation), as well as online resources. The web has become an essential component of the modern developer's daily life, providing a plethora of information from sources like forums, tutorials, Q&A websites, API documentation, and even video tutorials. Recommender Systems for Software Engineering (RSSE) provide developers with assistance to navigate the information space, automatically suggest useful items, and reduce the time required to locate the needed information. Current RSSEs consider development artifacts as containers of homogeneous information in the form of pure text. However, text is a means to represent heterogeneous information provided by, for example, natural language, source code, interchange formats (e.g., XML, JSON), and stack traces. Interpreting the information from a purely textual point of view misses the intrinsic heterogeneity of the artifacts, thus leading to a reductionist approach. We propose the concept of Holistic Recommender Systems for Software Engineering (H-RSSE), i.e., RSSEs that go beyond the textual interpretation of the information contained in development artifacts. Our thesis is that modeling and aggregating information in a holistic fashion enables novel and advanced analyses of development artifacts. To validate our thesis we developed a framework to extract, model and analyze information contained in development artifacts into a reusable meta-information model. We show how RSSEs benefit from a meta-information model, since it enables customized and novel analyses built on top of our framework. The information can thus be reinterpreted from a holistic point of view, preserving its multi-dimensionality and opening the path towards the concept of holistic recommender systems for software engineering.
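As a rough illustration of why the textual view is reductionist, a few crude heuristics already separate segment kinds that a pure-text RSSE would conflate. This is only a sketch: the labels and patterns are assumptions for illustration, not the framework's actual meta-information model.

```python
import re

def classify_segment(text):
    """Tag a development-artifact segment by kind using crude textual
    heuristics, instead of treating every segment as plain prose."""
    # Java-style stack-trace frame: "at pkg.Cls.method(File.java:42)"
    if re.search(r"^\s*at [\w.$]+\([\w.]+:\d+\)", text, re.MULTILINE):
        return "stack_trace"
    stripped = text.strip()
    # JSON/XML payloads tend to be wrapped in braces or angle brackets
    if stripped.startswith(("{", "<")) and stripped.endswith(("}", ">")):
        return "interchange_format"
    # A language keyword or trailing semicolon hints at source code
    if re.search(r"\b(def|class|void|return)\b", text) or stripped.endswith(";"):
        return "source_code"
    return "natural_language"
```

A holistic system would go well beyond such pattern matching, but even this toy tagger shows that the kinds carry different structure worth modeling separately.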
Omani Undergraduate Students’, Teachers’ and Tutors’ Metalinguistic Understanding of Cohesion and Coherence in EFL Academic Writing and their Perspectives of Teaching Cohesion and Coherence
My interpretive study aims to explore how EFL university students verbally articulate their understanding of cohesion and coherence, how they perceive the teaching of cohesion and coherence and how they reflect on the way they have attempted to actualise cohesion and coherence in their EFL academic texts. The study also looks at how their writing teachers and tutors metalinguistically understand cohesion and coherence, and how they perceive issues related to the teaching/tutoring of cohesion and coherence. It has researched the situated realities of students, teachers and tutors through semi-structured interviews, and is informed by Halliday and Hasan’s (1976) taxonomy on cohesion and coherence. Further, the study employs text analysis of students’ essays to find out how well students write with cohesion and coherence. It explores the diversity, density and accuracy of cohesion devices as well as some coherence-related concepts (i.e. text unity, content, logic, writer’s subject and background knowledge and relationships with the reader). The study is largely qualitative, but it also has a quantitative element. It implements the triangulation of different sources of information: three participant groups (students, teachers and tutors) and two research methods (semi-structured interviews and text analysis). The study findings indicate that the students found it hard to verbally articulate what cohesion and coherence are, and defined the two terms through referring more to concepts that were related to coherence than cohesion. They also struggled with writing cohesive and coherent texts. There were also some synergies and discrepancies between teachers and tutors in how they metalinguistically understood cohesion and coherence, and how they perceived issues related to the teaching of cohesion and coherence. The study offers a deep discussion on its findings regarding its context and the prevailing body of research in the area. 
Its discussion focuses on researching cohesion, coherence and metalinguistic understanding in writing. It also discusses the characteristics of the students' writing with regard to cohesion and coherence, the influence of Arabic on the cohesion and coherence of their EFL academic writing, and the teaching of cohesion and coherence. The study offers significant implications that inform practice, decision making and future research.
Combining granularity-based topic-dependent and topic-independent evidences for opinion detection
Opinion mining, a sub-discipline within Information Retrieval (IR) and computational linguistics, refers to the computational techniques for extracting, classifying, understanding, and assessing the opinions expressed in various sources such as online news, social media comments, and other user-generated content. It is also known by many other terms, such as opinion finding, opinion detection, sentiment analysis, sentiment classification, polarity detection, etc. Defined in a more specific and simpler context, opinion mining is the task of retrieving opinions matching an information need expressed by the user in the form of a query. There are many problems and challenges associated with opinion mining, and in this thesis we focus on some of them. One of the major challenges is finding opinions specifically about the given topic (query). A document may contain information on many topics at once, and may contain opinionated text about each of them or only a few. It therefore becomes very important to select the segments of the document that are relevant to the topic, along with their corresponding opinions. We address this problem at two levels of granularity: sentences and passages. In our first, sentence-level approach, we use semantic relations from WordNet to find the association between topic and opinion. In our second, passage-level approach, we use a more robust IR model, namely the language model, to tackle this problem.
The basic idea behind both contributions to topic-opinion association is that if a document contains more textual segments (sentences or passages) that are both opinionated and relevant to the topic, it is more opinionated than a document with fewer such segments. Most machine-learning-based approaches to opinion mining are domain-dependent, i.e. their performance varies from one domain to another. A domain- or topic-independent approach, on the other hand, is more general and can maintain its effectiveness across domains; however, domain-independent approaches generally suffer from poor performance. Developing an approach that is both effective and general is a major challenge in the field of opinion mining. Our contributions in this thesis include the development of an approach that uses simple heuristic functions to find opinionated documents. Entity-based opinion mining is becoming very popular among researchers in the IR community. It aims to identify the entities relevant to a given topic and to extract the opinions associated with them from a set of text documents. However, identifying entities and determining their relevance is itself a difficult task. We propose a system that takes into account both the information in the current news article and in relevant earlier articles in order to detect the most important entities in the current news. In addition, we present our framework for opinion analysis and related tasks. This framework is based on content evidence and social evidence from the blogosphere for the tasks of opinion finding, opinion prediction, and multidimensional review ranking. This early contribution lays the groundwork for our future work.
Our evaluations use the TREC 2006 Blog collection and the TREC 2004 Novelty track collection, and most were performed within the framework of the TREC Blog track.
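The segment-counting intuition behind the topic-opinion association can be sketched as a toy scorer. The relevance and opinion tests below (term overlap against a query and a small opinion lexicon) are invented stand-ins, not the thesis's WordNet-based or language-model-based methods.

```python
def opinion_score(segments, query_terms, opinion_lexicon):
    """Score a document by counting segments that are both relevant to
    the query (toy test: shares a query term) and opinionated (toy
    test: contains a lexicon word)."""
    score = 0
    for seg in segments:
        words = set(seg.lower().split())
        relevant = bool(words & set(query_terms))
        opinionated = bool(words & set(opinion_lexicon))
        if relevant and opinionated:
            score += 1
    return score

doc = [
    "the new phone camera is excellent",
    "the phone ships with a charger",
    "battery life is terrible on this phone",
]
score = opinion_score(doc, {"phone"}, {"excellent", "terrible", "awful", "great"})
```

Under this model a document with more opinionated, topic-relevant segments outranks one with fewer, which is exactly the ordering principle the two contributions share.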
A Supreme Battle in Metaphor: A Critical Metaphor Analysis of the Culture War in Lawrence v. Texas
This work explores how metaphor, specifically conceptual metaphor, is used to create the argumentative context, carry meaning, supply the enthymematic structure of the arguments, and transform the sexual autonomy controversy within the U.S. Supreme Court's opinion in Lawrence v. Texas. Research questions that guide this work include:
Is there evidence that the metaphor of culture war drives the argumentative context in the Lawrence opinion and is carried by other metaphorical constructions?
How is the social controversy over sexual autonomy advanced by conceptual metaphors in this legal text?
What are the dominant metaphors used to argue for sexual autonomy? What are the dominant metaphors used to resist the advance of sexual autonomy?
Does Critical Metaphor Analysis add substantially to our understanding of the argumentative strategy of both sides in the controversy?
Theoretical assumptions made in this study are in line with George Lakoff and Mark Johnson's conceptual metaphor theory and the embodied nature of cognition. The methodology selected to analyze the conceptual metaphors used in Lawrence v. Texas to argue about sexual autonomy is a variant of the Critical Metaphor Analysis method practiced by Jonathan Charteris-Black.
The textual analysis of Justice Kennedy's majority opinion and Justice Scalia's dissent reveals that liberty functions as the chief metaphor. Although the metaphor culture war is used explicitly by Justice Scalia only twice, and not at all used explicitly by Justice Kennedy, this metaphor as a descriptor of the nature of the argument of Lawrence v. Texas is indeed supported throughout the arguments by other related conceptual metaphors. Despite the absence of the culture war metaphor in Justice Kennedy's argument, the critical analysis of metaphor makes transparent the way he actually waged a very sophisticated rhetorical battle through metaphor in order to advance sexual autonomy. It also demonstrates that Justice Scalia's charge of the Court's engagement in culture war is not arbitrary, but supportable.
This study demonstrates the theoretical and methodological synthesis possible in using Critical Metaphor Analysis on legal texts, and gives ample evidence of the impact cognitive metaphor theory has on advancing understanding of both how a text works and what a text means. Critical Metaphor Analysis facilitates a level of intellectual rigor, as it does not require adopting an a priori ideological stance. Instead the analysis is grounded in the cognitive workings of our shared human minds and bodily experiences as expressed in our use of conceptual metaphor.
This work is a synthesizing demonstration of the need for critical rhetorical analysis of important judicial texts that will clarify the ongoing role the courts are playing in the interpreting and shaping of our corporate life as a Nation.
Designing Exploratory Search Systems that Stimulate Memory and Reduce Cognitive Load
From music fans finding new songs in a genre to graphic designers brainstorming ways to depict a message and journalists scrutinizing documents for angles, people often conduct exploratory searches to understand complex topics. In contrast to traditional search, which is done to quickly answer simple questions, exploratory search is an iterative learning process that involves understanding an information space in order to find useful pieces of information.
Exploratory search is composed of two closely related sub-processes: (1) information foraging, choosing sources and collecting information, and (2) sensemaking, organizing this information into a mental framework. Both of these sub-processes are cognitively taxing and rely heavily on our memory. For information foraging, users need to read long, complex resources and recognize useful pieces of information. For sensemaking, as users encounter more information, it becomes harder to relate new information to their current knowledge.
The spreading activation theory of memory posits that the information we encounter materializes in our working memory, which spreads activation into our long-term memory, enabling us to recall related semantic information to make sense of newly found information. From this theory, this thesis introduces three strategies for creating organizations that better stimulate memory: (1) constructing overviews that are association networks that mimic our memory's structure, (2) incorporating our prior knowledge in these overviews, and (3) providing concrete information to help us make sense of abstract ideas. This thesis demonstrates how to employ these strategies through three exploratory search systems across three domains: (A) SymbolFinder helps graphic designers explore visual symbols for abstract concepts, (B) TastePaths helps music fans explore artists within a genre, and (C) AngleKindling helps journalists explore story angles for a press release. Through this body of work, I demonstrate that by designing exploratory search systems to stimulate our memory, we can make acquiring and making sense of knowledge less cognitively demanding.
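The spreading-activation mechanism the thesis builds on can be sketched minimally: activation injected at seed concepts flows along association links, attenuated at each hop. The decay factor, step count, and toy memory graph below are illustrative assumptions, not the actual models used by the systems described.

```python
def spread_activation(graph, seeds, decay=0.5, steps=2):
    """Propagate activation from seed nodes through an association
    network, attenuating by `decay` at each hop and accumulating
    activation at each node."""
    activation = dict(seeds)
    for _ in range(steps):
        nxt = dict(activation)
        for node, level in activation.items():
            for neighbour in graph.get(node, []):
                nxt[neighbour] = nxt.get(neighbour, 0.0) + level * decay
        activation = nxt
    return activation

# Toy association network standing in for long-term memory.
memory = {
    "jazz": ["saxophone", "improvisation"],
    "saxophone": ["brass"],
}
act = spread_activation(memory, {"jazz": 1.0})
```

Concepts one hop from the seed receive strong activation while those two hops away receive a weaker trace, which is the property the overview-construction strategy exploits: an association-network overview keeps strongly activated neighbours close at hand.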