2,589 research outputs found

    Bridging Vision and Language over Time with Neural Cross-modal Embeddings

    Giving computers the ability to understand multimedia content is one of the goals of Artificial Intelligence systems. While humans excel at this task, it remains a challenge for machines, requiring a bridge between vision and language, which inherently have heterogeneous computational representations. Cross-modal embeddings tackle this challenge by learning a common space that unifies these representations. However, to grasp the semantics of an image, one must look beyond the pixels and consider its semantic and temporal context, defined by the image's textual descriptions and its time dimension, respectively. External causes (e.g. emerging events) change the way humans interpret and describe the same visual element over time, leading to the evolution of visual-textual correlations. In this thesis we investigate models that capture patterns of visual and textual interactions over time by incorporating time in cross-modal embeddings: 1) in a relative manner, where pairwise temporal correlations aid data structuring, yielding a model that provides better visual-textual correspondences on dynamic corpora, and 2) in a diachronic manner, where the temporal dimension is fully preserved, thus capturing the evolution of visual-textual correlations under a principled approach that jointly models vision, language, and time. Rich insights into data evolution were extracted from a large-scale dataset spanning 20 years. Additionally, to improve the effectiveness of these embedding learning models, we propose a novel loss function that increases the expressiveness of the standard triplet loss by making it adaptive to the data at hand. With our adaptive triplet loss, in which triplet-specific constraints are inferred and scheduled, we achieved state-of-the-art performance on the standard cross-modal retrieval task.
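The adaptive triplet loss described above can be illustrated with a small sketch. The abstract does not spell out the exact margin schedule, so `adaptive_margin` below (its name and its `base`/`scale` parameters are assumptions) only illustrates the general idea of inferring a per-triplet constraint from the data rather than fixing one global margin:

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin):
    """Standard triplet loss: the matching caption (positive) should sit
    closer to the image (anchor) than a mismatched one (negative),
    by at least `margin`."""
    d_pos = np.linalg.norm(anchor - positive)
    d_neg = np.linalg.norm(anchor - negative)
    return max(0.0, d_pos - d_neg + margin)

def adaptive_margin(anchor, negative, base=0.2, scale=0.5):
    """Hypothetical per-triplet margin: the closer the negative already is
    to the anchor (a 'hard' triplet), the larger the margin demanded."""
    d_neg = np.linalg.norm(anchor - negative)
    return base + scale / (1.0 + d_neg)

# Toy 2-D embeddings in the shared visual-textual space.
anchor   = np.array([0.0, 0.0])
positive = np.array([0.0, 1.0])
hard_neg = np.array([0.0, 1.2])   # barely further than the positive
easy_neg = np.array([0.0, 3.0])   # already far away

loss_hard = triplet_loss(anchor, positive, hard_neg, margin=0.5)   # positive loss
loss_easy = triplet_loss(anchor, positive, easy_neg, margin=0.5)   # zero loss
loss_hard_adaptive = triplet_loss(anchor, positive, hard_neg,
                                  margin=adaptive_margin(anchor, hard_neg))
```

With a fixed margin, the easy triplet contributes nothing while the hard one does; the adaptive variant additionally scales how much is demanded of each triplet.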

    Syntactic and Semantic Analysis and Visualization of Unstructured English Texts

    People have complex thoughts, and they often express those thoughts in complex sentences of natural language. This complexity may facilitate efficient communication among an audience with a shared knowledge base, but for a different or new audience such compositions become cumbersome to understand and analyze. Analyzing such compositions with syntactic or semantic measures is a challenging job and forms a base step for natural language processing. In this dissertation I explore and propose a number of new techniques to analyze and visualize the syntactic and semantic patterns of unstructured English texts. The syntactic analysis is done through a proposed visualization technique that categorizes and compares different English compositions based on their reading complexity metrics. For the semantic analysis I use Latent Semantic Analysis (LSA) to uncover hidden patterns in complex compositions; I have applied this technique to comments from a social visualization web site to detect irrelevant ones (e.g., spam). The patterns of collaboration are also studied through statistical analysis. Word sense disambiguation is used to determine the correct sense of a word in a sentence or composition. A textual similarity measure, built on different word similarity measures and word sense disambiguation and applied to collaborative text snippets from a social collaborative environment, points toward a way of untangling the complex hidden patterns of collaboration.
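The LSA step mentioned above can be sketched in a few lines. The term-comment counts, the choice of two latent dimensions, and the centroid-based relevance score below are all toy assumptions for illustration, not the dissertation's actual pipeline:

```python
import numpy as np

# Tiny term-comment count matrix: rows = terms, columns = comments.
# Comments 0-2 discuss the visualization; comment 3 is off-topic (spam-like).
terms = ["chart", "data", "color", "cheap", "pills"]
X = np.array([
    [2, 1, 1, 0],   # chart
    [1, 2, 0, 0],   # data
    [1, 0, 2, 0],   # color
    [0, 0, 0, 3],   # cheap
    [0, 0, 0, 2],   # pills
], dtype=float)

# LSA: truncated SVD keeps the k strongest latent "topics".
U, s, Vt = np.linalg.svd(X, full_matrices=False)
k = 2
docs = (np.diag(s[:k]) @ Vt[:k]).T   # comment vectors in latent space

def cosine(u, v):
    return u @ v / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-12)

# Score each comment against the centroid of the known on-topic ones;
# a low score flags a comment as likely irrelevant.
centroid = docs[:3].mean(axis=0)
scores = [cosine(d, centroid) for d in docs]
```

Because the spam comment shares no latent structure with the on-topic ones, its score collapses toward zero while the others stay near one.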

    Knowledge-Based Techniques for Scholarly Data Access: Towards Automatic Curation

    Accessing up-to-date, quality scientific literature is a critical preliminary step in any research activity. Identifying the scholarly literature relevant to a given task or application is, however, a complex and time-consuming activity. Despite the large number of tools developed over the years to support scholars in surveying the literature, such as Google Scholar, Microsoft Academic Search, and others, the best way to access quality papers remains asking a domain expert who is actively involved in the field and knows its research trends and directions. State-of-the-art systems, in fact, either do not allow exploratory search, such as identifying the active research directions within a given topic, or do not offer proactive features, such as content recommendation, both of which are critical to researchers. To overcome these limitations, we advocate a paradigm shift in the development of scholarly data access tools: moving from traditional information retrieval and filtering tools towards automated agents able to make sense of the textual content of published papers and thereby monitor the state of the art. Building such a system is, however, a complex task that implies tackling non-trivial problems in the fields of Natural Language Processing, Big Data Analysis, User Modelling, and Information Filtering. In this work, we introduce the concept of an Automatic Curator System and present its fundamental components. (Dottorato di ricerca in Informatica. De Nart, Dari)

    Holistic recommender systems for software engineering

    The knowledge possessed by developers is often not sufficient to overcome a programming problem. Short of talking to teammates, when available, developers often gather additional knowledge from development artifacts (e.g., project documentation), as well as online resources. The web has become an essential component in the modern developer's daily life, providing a plethora of information from sources like forums, tutorials, Q&A websites, API documentation, and even video tutorials. Recommender Systems for Software Engineering (RSSE) provide developers with assistance to navigate the information space, automatically suggest useful items, and reduce the time required to locate the needed information. Current RSSEs consider development artifacts as containers of homogeneous information in the form of pure text. However, text is a means to represent heterogeneous information provided by, for example, natural language, source code, interchange formats (e.g., XML, JSON), and stack traces. Interpreting the information from a purely textual point of view misses the intrinsic heterogeneity of the artifacts, thus leading to a reductionist approach. We propose the concept of Holistic Recommender Systems for Software Engineering (H-RSSE), i.e., RSSEs that go beyond the textual interpretation of the information contained in development artifacts. Our thesis is that modeling and aggregating information in a holistic fashion enables novel and advanced analyses of development artifacts. To validate our thesis we developed a framework to extract, model, and analyze information contained in development artifacts in a reusable meta-information model. We show how RSSEs benefit from a meta-information model, since it enables customized and novel analyses built on top of our framework. Information can thus be reinterpreted from a holistic point of view, preserving its multi-dimensionality and opening the path towards the concept of holistic recommender systems for software engineering.
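A minimal sketch of what a typed meta-information model might look like, as opposed to treating an artifact as flat text. The `Fragment` class, the `classify` heuristics, and the fragment kinds are all hypothetical stand-ins for the richer model the thesis develops:

```python
import json
import re
from dataclasses import dataclass

@dataclass
class Fragment:
    kind: str      # one of "json", "stacktrace", "text" in this toy model
    content: str

def classify(fragment: str) -> Fragment:
    """Toy classifier for the heterogeneous content of a development
    artifact: try structured formats first, then pattern-based kinds,
    and fall back to natural-language text."""
    s = fragment.strip()
    try:
        json.loads(s)                      # valid JSON interchange data?
        return Fragment("json", s)
    except ValueError:
        pass
    # Java-style stack frames: indented lines of the form "at pkg.Class.method("
    if re.search(r"^\s+at \w[\w.$]*\(", s, re.MULTILINE):
        return Fragment("stacktrace", s)
    return Fragment("text", s)

# A forum post mixing prose, a config snippet, and a stack trace.
post = [
    "My service crashes when parsing the config below.",
    '{"retries": 3, "timeout": 30}',
    'Exception in thread "main" java.lang.NullPointerException\n'
    "    at com.example.Main.run(Main.java:42)",
]
model = [classify(f) for f in post]
```

Once fragments carry a kind, downstream analyses (e.g., matching a stack trace against known crashes) can operate on the right representation instead of raw text.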

    Omani Undergraduate Students’, Teachers’ and Tutors’ Metalinguistic Understanding of Cohesion and Coherence in EFL Academic Writing and their Perspectives of Teaching Cohesion and Coherence

    My interpretive study aims to explore how EFL university students verbally articulate their understanding of cohesion and coherence, how they perceive the teaching of cohesion and coherence, and how they reflect on the way they have attempted to actualise cohesion and coherence in their EFL academic texts. The study also looks at how their writing teachers and tutors metalinguistically understand cohesion and coherence, and how they perceive issues related to the teaching and tutoring of cohesion and coherence. It researched the situated realities of students, teachers and tutors through semi-structured interviews, and is informed by Halliday and Hasan's (1976) taxonomy of cohesion and coherence. Further, the study employs text analysis of students' essays to find out how well students write with cohesion and coherence. It explores the diversity, density and accuracy of cohesion devices as well as some coherence-related concepts (i.e. text unity, content, logic, the writer's subject and background knowledge, and relationships with the reader). The study is largely qualitative, but it also has a quantitative element. It implements triangulation of different sources of information: three participant groups (students, teachers and tutors) and two research methods (semi-structured interviews and text analysis). The study findings indicate that the students found it hard to verbally articulate what cohesion and coherence are, and defined the two terms by referring more to concepts related to coherence than to cohesion. They also struggled with writing cohesive and coherent texts. There were also some synergies and discrepancies between teachers and tutors in how they metalinguistically understood cohesion and coherence, and how they perceived issues related to the teaching of cohesion and coherence. The study offers a deep discussion of its findings in relation to its context and the prevailing body of research in the area. Its discussion focuses on researching cohesion and coherence and metalinguistic understanding in writing. It also discusses the characteristics of students' writing regarding cohesion and coherence, the influence of Arabic on the cohesion and coherence of their EFL academic writing, and the teaching of cohesion and coherence. The study offers some significant implications that inform practice, decision making and future research.

    Combining granularity-based topic-dependent and topic-independent evidences for opinion detection

    Opinion mining, a sub-discipline of Information Retrieval (IR) and computational linguistics, refers to the computational techniques for extracting, classifying, understanding, and assessing the opinions expressed in various sources such as online news, social media comments, and other user-generated content. It is also known by many other names, such as opinion finding, opinion detection, sentiment analysis, sentiment classification, and polarity detection. Defined in a more specific and simpler context, opinion mining is the task of retrieving opinions matching an information need expressed by the user in the form of a query. There are many problems and challenges associated with opinion mining; in this thesis we focus on several of them. One major challenge is finding opinions that specifically concern the given topic (query). A document may contain information on many topics at once, and it may contain opinionated text about each of them or about only a few; it therefore becomes important to select the segments of the document that are relevant to the topic, together with their corresponding opinions. We address this problem at two levels of granularity, sentences and passages. In our first, sentence-level approach, we use semantic relations from WordNet to find the association between topic and opinion. In our second, passage-level approach, we rely on a more robust IR model, the language model, to tackle the problem. The basic idea behind both contributions to topic-opinion association is that if a document contains more textual segments (sentences or passages) that are both opinionated and relevant to the topic, it is more opinionated than a document with fewer such segments. Most machine-learning-based approaches to opinion mining are domain-dependent, i.e. their performance varies from one domain to another. A domain- or topic-independent approach, on the other hand, is more general and can maintain its effectiveness across domains; however, such approaches generally suffer from poor performance. Developing an approach that is both effective and general is a major challenge in opinion mining, and our contributions include an approach that uses simple heuristic features to find opinionated documents. Entity-based opinion mining, which has become very popular among researchers in the IR community, aims to identify the entities relevant to a given topic and to extract the opinions associated with them from a set of textual documents; however, identifying entities and determining their relevance is itself a difficult task. We propose a system that takes into account both the information in the current news article and that in relevant earlier articles in order to detect the most important entities in the current news. In addition, we present our framework for opinion analysis and related tasks, based on content evidence and social evidence from the blogosphere, covering opinion finding, prediction, and multidimensional opinion ranking; this early contribution lays the groundwork for our future work. Our evaluation of these methods uses the TREC Blog 2006 collection and the TREC 2004 Novelty track collection, and most evaluations were carried out within the framework of the TREC Blog track.
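The topic-opinion association idea above (a document with more segments that are both relevant and opinionated is more opinionated) can be sketched directly. The tiny opinion lexicon and the word-overlap relevance test below are illustrative stand-ins for the WordNet- and language-model-based methods the thesis actually uses:

```python
# Toy opinion lexicon; the thesis' approaches infer opinionatedness
# far more robustly -- this is only to make the scoring idea concrete.
OPINION_WORDS = {"great", "terrible", "love", "hate", "awful", "amazing"}

def segment_counts(segment: str, topic_terms: set) -> bool:
    """A segment counts only if it is BOTH on-topic and opinionated --
    the core criterion of the sentence- and passage-level approaches."""
    words = set(segment.lower().split())
    return bool(words & topic_terms) and bool(words & OPINION_WORDS)

def document_score(segments, topic_terms) -> int:
    """More relevant-and-opinionated segments => more opinionated document."""
    return sum(segment_counts(s, topic_terms) for s in segments)

topic = {"camera", "lens"}
doc_a = ["I love this camera", "the lens is terrible", "shipping was slow"]
doc_b = ["the camera has a lens", "it arrived on Tuesday"]

score_a = document_score(doc_a, topic)   # two qualifying segments
score_b = document_score(doc_b, topic)   # relevant but not opinionated
```

Ranking documents by this count captures why `doc_a` should be retrieved ahead of `doc_b` for an opinion-finding query about cameras.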

    A Supreme Battle in Metaphor: A Critical Metaphor Analysis of the Culture War in Lawrence v. Texas

    This work explores how metaphor, specifically conceptual metaphor, is used to create the argumentative context, carry meaning, supply the enthymematic structure of the arguments, and transform the sexual autonomy controversy within the U.S. Supreme Court's opinion in Lawrence v. Texas. Research questions that guide this work include: Is there evidence that the metaphor of culture war drives the argumentative context in the Lawrence opinion and is carried by other metaphorical constructions? How is the social controversy over sexual autonomy advanced by conceptual metaphors in this legal text? What are the dominant metaphors used to argue for sexual autonomy? What are the dominant metaphors used to resist the advance of sexual autonomy? Does Critical Metaphor Analysis add substantially to our understanding of the argumentative strategy of both sides in the controversy? Theoretical assumptions made in this study are in line with George Lakoff and Mark Johnson's conceptual metaphor theory and the embodied nature of cognition. The methodology selected to analyze the conceptual metaphors used in Lawrence v. Texas to argue about sexual autonomy is a variant of the Critical Metaphor Analysis method practiced by Jonathan Charteris-Black. The textual analysis of Justice Kennedy's majority opinion and Justice Scalia's dissent reveals that liberty functions as the chief metaphor. Although the metaphor culture war is used explicitly by Justice Scalia only twice, and not at all by Justice Kennedy, this metaphor as a descriptor of the nature of the argument of Lawrence v. Texas is indeed supported throughout the arguments by other related conceptual metaphors. Despite the absence of the culture war metaphor in Justice Kennedy's argument, the critical analysis of metaphor makes transparent the way he actually waged a very sophisticated rhetorical battle through metaphor in order to advance sexual autonomy. It also demonstrates that Justice Scalia's charge of the Court's engagement in culture war is not arbitrary, but supportable. This study demonstrates the theoretical and methodological synthesis possible in using Critical Metaphor Analysis on legal texts, and gives ample evidence of the impact cognitive metaphor theory has on advancing understanding of both how a text works and what a text means. Critical Metaphor Analysis facilitates a level of intellectual rigor, as it does not require adopting an a priori ideological stance. Instead the analysis is grounded in the cognitive workings of our shared human minds and bodily experiences as expressed in our use of conceptual metaphor. This work is a synthesizing demonstration of the need for critical rhetorical analysis of important judicial texts that will clarify the ongoing role the courts are playing in the interpreting and shaping of our corporate life as a Nation.