943 research outputs found

    Reformulation of the domain-level semantic pattern of axiological evaluation in the lexicon of English verbs

    The three-level hierarchy of values in Faber and Mairal's work (Constructing a Lexicon of English Verbs, Berlin: Mouton de Gruyter, 1999) is based on the scales of values given by Max Scheler or Józef Tischner, which are deeply rooted in the theory of the Great Chain of Being (employed by Aristotle in his scala naturae). Faber and Mairal also provide an account of the relationship between lexical structure and cognition. A key issue was the introduction of a cognitive axis and a typology of predicate schemas in the lexicon (at lexeme, sub-domain and domain level). Among the four domain-level semantic patterns proposed, axiology is considered to appear in many domains. However, this article claims that the axiological parameter needs further clarification and decomposition. Its structure is multidimensional, internally hierarchical and canonical. Consequently, the three-level hierarchy of values in the lexicon of English verbs is reformulated and the axiological parameter is divided into multilevel categories crossed by two layers of canonical axes. It is also claimed that the axiological formula incorporating this might improve the understanding of this parameter within the lexical architecture of the verbal lexicon.

    Two-Level Text Classification Using Hybrid Machine Learning Techniques

    Nowadays, documents are increasingly being associated with multi-level category hierarchies rather than a flat category scheme. To access these documents in real time, we need fast automatic methods to navigate these hierarchies. Today's vast data repositories such as the web also contain many broad domains of data which are quite distinct from each other, e.g., medicine, education, sports and politics. Each domain constitutes a subspace of the data within which the documents are similar to each other but quite distinct from the documents in another subspace. The data within these domains is frequently further divided into many subcategories. Subspace Learning is a technique popular with non-text domains such as image recognition to increase speed and accuracy. Subspace analysis lends itself naturally to the idea of hybrid classifiers. Each subspace can be processed by a classifier best suited to the characteristics of that particular subspace. Instead of using the complete set of full space feature dimensions, classifier performance can be boosted by using only a subset of the dimensions. This thesis presents a novel hybrid parallel architecture using separate classifiers trained on separate subspaces to improve two-level text classification. The classifier to be used on a particular input and the relevant feature subset to be extracted are determined dynamically by using a novel method based on the maximum significance value. A novel vector representation which enhances the distinction between classes within the subspace is also developed. This novel system, the Hybrid Parallel Classifier, was compared against the baselines of several single classifiers such as the Multilayer Perceptron and was found to be faster and have higher two-level classification accuracies. The improvement in performance achieved was even higher when dealing with more complex category hierarchies.
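
    The routing idea in this abstract can be illustrated with a minimal sketch. It is not the thesis's Hybrid Parallel Classifier: the significance table, the rule of summing per-term scores, and the two stand-in second-level classifiers are all assumptions made up for illustration. The sketch only shows the general shape of choosing a subspace by maximum significance and then classifying with that subspace's own classifier on its own feature subset.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical term significance values per domain (illustrative numbers only).
SIGNIFICANCE: Dict[str, Dict[str, float]] = {
    "medicine": {"patient": 0.9, "dose": 0.8, "match": 0.1},
    "sports": {"match": 0.9, "goal": 0.8, "patient": 0.1},
}


def domain_scores(tokens: List[str]) -> Dict[str, float]:
    """Sum each domain's significance values over the document's tokens."""
    return {
        domain: sum(weights.get(tok, 0.0) for tok in tokens)
        for domain, weights in SIGNIFICANCE.items()
    }


def select_subspace(tokens: List[str]) -> str:
    """First level: pick the domain with the maximum summed significance."""
    scores = domain_scores(tokens)
    return max(scores, key=lambda d: scores[d])


def extract_features(tokens: List[str], domain: str) -> List[float]:
    """Keep only the feature dimensions of the chosen subspace (term counts)."""
    vocab = sorted(SIGNIFICANCE[domain])
    return [float(tokens.count(term)) for term in vocab]


# Stand-in second-level classifiers, one per subspace (assumed, rule-based).
def classify_medicine(features: List[float]) -> str:
    # Feature order is sorted vocab: ["dose", "match", "patient"].
    return "pharmacology" if features[0] > 0 else "clinical"


def classify_sports(features: List[float]) -> str:
    # Feature order is sorted vocab: ["goal", "match", "patient"].
    return "football" if features[0] > 0 else "other"


CLASSIFIERS: Dict[str, Callable[[List[float]], str]] = {
    "medicine": classify_medicine,
    "sports": classify_sports,
}


def classify_two_level(tokens: List[str]) -> Tuple[str, str]:
    """Route to a subspace, then classify within it."""
    domain = select_subspace(tokens)
    features = extract_features(tokens, domain)
    return domain, CLASSIFIERS[domain](features)
```

    In a real system each stand-in rule would be replaced by a trained model, possibly of a different type per subspace, which is the sense in which such an architecture is hybrid.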

    Concepts, Frames and Cascades in Semantics, Cognition and Ontology

    This open access book presents novel theoretical, empirical and experimental work exploring the nature of mental representations that support natural language production and understanding, and other manifestations of cognition. One fundamental question raised in the text is whether requisite knowledge structures can be adequately modeled by means of a uniform representational format, and if so, what exactly is its nature. Frames, a key topic of the book, have had a strong impact on the exploration of knowledge representations in artificial intelligence, psychology and linguistics; cascades are a novel development in frame theory. Other key subject areas explored are concepts and categorization, the experimental investigation of mental representation, and cognitive analysis in semantics. This book is of interest to students, researchers, and professionals working on cognition in the fields of linguistics, philosophy, and psychology.

    The Abstract Language: Symbolic Cognition And Its Relationship To Embodiment

    Embodied theories presume that concepts are modality specific while symbolic theories suggest that all modalities for a given concept are integrated. Symbolic and embodied theories do fairly well with explaining and describing concrete concepts. Specifically, embodied theories seem well suited to describing the actual content of a concept while symbolic theories provide insight into how concepts operate. Conversely, neither symbolic nor embodied theories have been fully sufficient when attempting to describe and explain abstract concepts. Several pluralistic accounts have been put forth to describe how the semantic/lexical system interacts with the conceptual system. In this respect, they attempt to "embody" abstract concepts to the same extent as concrete concepts. Nevertheless, a concise and comprehensive theory for explaining how we learn/understand abstract concepts to the extent that we learn/understand concrete concepts remains elusive. One goal of the present review paper is to consider whether abstract concepts can be defined by a unified theory or whether subsets of abstract concepts will be defined by separate theories. Of particular focus will be Symbolic Interdependency Theory (SIT). It will be argued that SIT is suitable for grounding abstract concepts, as this theory holds that symbols bootstrap meaning from other symbols, highlighting the importance of abstract-to-abstract mapping in the same way that concrete-to-abstract mappings are created. Research will be considered to help outline a cohesive strategy for describing and understanding abstract concepts. Finally, as research has demonstrated efficiencies with concrete concept processing, analogous efficiencies will be explored for developing an understanding of abstract concepts. Such efforts could have both theoretical and practical implications for bolstering our knowledge of concept learning.

    Language Learning and Metacognition: An Intervention to Improve Language Classrooms

    In the USA, foreign language enrollments at the college level, which had been increasing, have been declining since 2009, despite the notion that learning multiple languages is becoming essential for communicating effectively with others from diverse native language backgrounds. This decline may be due in part to inefficient and outdated foreign language courses. The current study examined the effect of how we assess our current knowledge and learning techniques (metacognition) on educational outcomes, in the hope of improving the effectiveness of university classrooms. College students were exposed to new metacognitive strategies that could benefit their language learning throughout the fall 2016 semester. Specifically, students were presented with new information every other week to improve their vocabulary building, listening skills, and writing skills. Hierarchical multiple linear regression provided evidence that teaching students about metacognition and effective metacognitive strategies could benefit university language learners.

    Bayesian nonparametric multilevel modelling and applications

    Our research aims to contribute to multilevel modeling in data analytics. We address the tasks of multilevel clustering, multilevel regression, and classification. We provide state-of-the-art solutions for these problems.

    Statistical analysis of grouped text documents

    The topic of this thesis is statistical models for the analysis of textual data, with an emphasis on contexts in which text samples are grouped. When dealing with text data, the first issue is to process it, making it computationally and methodologically compatible with the mathematical and statistical methods produced and continually developed by the scientific community. The thesis therefore first reviews existing methods for analytically representing and processing textual datasets, including Vector Space Models, distributed representations of words and documents, and contextualized embeddings. This review involves standardizing a notation that, even within the same representation approach, appears highly heterogeneous in the literature. Two domains of application are then explored: social media and cultural tourism. For the former, a study is proposed on self-presentation among diverse groups of individuals on the StockTwits platform, where finance and stock markets are the dominant topics. The proposed methodology integrated various types of data, including textual and categorical data. This study offered insights into how people present themselves online and found recurring patterns within groups of users. For the latter, the thesis delves into a study conducted as part of the "Data Science for Brescia - Arts and Cultural Places" project, in which a language model was trained to classify online reviews written in Italian into four distinct semantic areas related to cultural attractions in the Italian city of Brescia. The proposed model allows attractions to be identified in text documents even when they are not explicitly mentioned in document metadata, thus opening possibilities for expanding the database of these cultural attractions with new sources, such as social media platforms, forums, and other online spaces. Lastly, the thesis presents a methodological study examining the group-specificity of words, analyzing various group-specificity estimators proposed in the literature. The study considered grouped text documents with both outcome and group variables. Its contribution consists of the proposal to model the corpus of documents as a multivariate distribution, enabling the simulation of corpora of text documents with predefined characteristics. The simulation provided valuable insights into the relationship between groups of documents and words. Furthermore, all its results can be freely explored through a web application, whose components are also described in this manuscript. In conclusion, this thesis has been conceived as a collection of papers. It aimed to contribute to the field with both applications and methodological proposals, and each study presented here suggests paths for future research to address the challenges in the analysis of grouped textual data.
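
    To make the notion of group-specificity of words concrete, here is a minimal sketch of one simple, generic estimator: the log-ratio of a word's smoothed relative frequency inside a group to its smoothed relative frequency in the whole corpus. This is an assumption chosen for illustration, not necessarily one of the estimators the thesis analyzes; the function name, the add-one smoothing, and the toy corpus are all hypothetical.

```python
import math
from collections import Counter
from typing import Dict, List


def group_specificity(
    docs: List[List[str]], groups: List[str], smoothing: float = 1.0
) -> Dict[str, Dict[str, float]]:
    """Return scores[group][word] = log(p(word | group) / p(word | corpus)).

    Probabilities are additively smoothed so unseen words get finite scores.
    Positive scores mean the word is over-represented in the group.
    """
    if len(docs) != len(groups):
        raise ValueError("docs and groups must have the same length")

    corpus_counts: Counter = Counter()
    group_counts: Dict[str, Counter] = {}
    for tokens, group in zip(docs, groups):
        corpus_counts.update(tokens)
        group_counts.setdefault(group, Counter()).update(tokens)

    vocab = sorted(corpus_counts)
    vocab_size = len(vocab)
    corpus_total = sum(corpus_counts.values())

    scores: Dict[str, Dict[str, float]] = {}
    for group, counts in group_counts.items():
        group_total = sum(counts.values())
        scores[group] = {}
        for word in vocab:
            p_group = (counts[word] + smoothing) / (
                group_total + smoothing * vocab_size
            )
            p_corpus = (corpus_counts[word] + smoothing) / (
                corpus_total + smoothing * vocab_size
            )
            scores[group][word] = math.log(p_group / p_corpus)
    return scores


# Toy grouped corpus (made up for illustration).
toy_docs = [["stock", "buy"], ["stock", "sell"], ["museum", "visit"]]
toy_groups = ["finance", "finance", "tourism"]
toy_scores = group_specificity(toy_docs, toy_groups)
```

    On this toy corpus, "stock" scores above zero for the finance group and below zero for the tourism group, which is the qualitative behavior any group-specificity estimator is expected to show.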