229 research outputs found

    Towards a belief revision based adaptive and context sensitive information retrieval system

    In an adaptive information retrieval (IR) setting, the information seekers' beliefs about which terms are relevant or nonrelevant will naturally fluctuate. This article investigates how the theory of belief revision can be used to model adaptive IR. More specifically, belief revision logic provides a rich representation scheme to formalize retrieval contexts so as to disambiguate vague user queries. In addition, belief revision theory underpins the development of an effective mechanism to revise user profiles in accordance with information seekers' changing information needs. It is argued that information retrieval contexts can be extracted by means of the information-flow text mining method so as to realize a highly autonomous adaptive IR system. An added benefit of a belief-based IR model is that its retrieval behavior is more predictable and explanatory. Our initial experiments show that the belief-based adaptive IR system is as effective as a classical adaptive IR system. To the best of our knowledge, this is the first successful implementation and evaluation of a logic-based adaptive IR model which can efficiently process large IR collections.

    The Generation of Compound Nominals to Represent the Essence of Text The COMMIX System

    This thesis concerns the COMMIX system, which automatically extracts information on what a text is about, and generates that information in the highly compacted form of compound nominal expressions. The expressions generated are complex and may include novel terms which do not appear themselves in the input text. From the practical point of view, the work is driven by the need for better representations of content: for representations which are shorter and more concise than would appear in an abstract, yet more informative and representative of the actual aboutness than commonly occurs in indexing expressions and key terms. This additional layer of representation is referred to in this work as pertaining to the essence of a particular text. From a theoretical standpoint, the thesis shows how the compound nominal as a construct can be successfully employed in these highly informative representations. It involves an exploration of the claim that there is sufficient semantic information contained within the standard dictionary glosses for individual words to enable the construction of useful and highly representative novel compound nominal expressions, without recourse to standard syntactic and statistical methods. It shows how a shallow semantic approach to content identification which is based on lexical overlap can produce some very encouraging results. The methodology employed, and described herein, is domain-independent, and does not require the specification of templates with which the input text must comply. In these two respects, the methodology developed in this work avoids two of the most common problems associated with information extraction. As regards the evaluation of this type of work, the thesis introduces and utilises the notion of percentage attainment value, which is used in conjunction with subjects' opinions about the degree to which the aboutness terms succeed in indicating the subject matter of the texts for which they were generated

    The use of non-formal information in reverse engineering and software reuse

    This thesis was submitted for the degree of Doctor of Philosophy and awarded by Brunel University. Within the field of software maintenance, both reverse engineering and software reuse have been suggested as ways of salvaging some of the investment made in software that is now out of date. One goal that is shared by both reverse engineering and reuse is a desire to be able to redescribe source code, that is, to produce higher level descriptions of existing code. The fundamental theme of this thesis is that from a maintenance perspective, source code should be considered primarily as a text. This emphasizes its role as a medium for communication between humans rather than as a medium for human-computer communication. Characteristic of this view is the need to incorporate the analysis of non-formal information, such as comments and identifier names, when developing tools to redescribe code. Many existing tools fail to do this. To justify this text-based view of source code, an investigation into the possible use of non-formal information to index pieces of source code was undertaken. This involved attempting to assign descriptors that represent the code's function to pieces of source code from IBM's CICS project. The results of this investigation support the view that the use of non-formal information can be of practical value in redescribing source code. However, the results fail to suggest that using non-formal information will overcome any of the major difficulties associated with developing tools to redescribe code. This is used to suggest future directions for research.

    Validating Topic Modeling as a Method of Analyzing Sujet and Theme

    In Computational Literary Studies (CLS), several procedures for thematic analysis have been adapted from NLP and Computer Science. Among these procedures, topic modeling is the most prominent and popular technique. We maintain, however, that this procedure has so far been used only in the context of exploration, but not in the context of justification. When we seek to prove assumptions concerning the correlation between genres, methods of computational text analysis have to be set up in research environments of justification, i.e. in environments of hypothesis testing. We provide a holistic model of validation and conceptual disambiguation of the notion of aboutness as sujet, fabula, and theme, and discuss essential methodological requirements for hypothesis-based analysis. As we maintain that validation has to be performed for each task individually, we shall perform empirical validation of topic modeling based on a new corpus of German novellas and comprehensive annotations and draw hypothetical generalizations on the applicability of topic modeling for analyzing aboutness in the domain of narrative fiction.

    In good form : arguing for epistemic norms of credence

    The main topic of the book is how to argue for formal epistemic norms of credence. The author advocates formal justificational pluralism, suggesting that it is reasonable to use various formal tools, e.g. different "scoring rules", in arguments for synchronic and diachronic norms. The author first examines various occasions on which modern formal epistemology fails to live up to its "formal" label. Among the topics considered next are: the Dutch Book Theorem and Arguments (which fail according to the author), a novel version of the Principal Principle, and a constructive approach to higher order probabilities. The author then argues that the best method for dealing with various belief update problems is that of minimizing inverse relative entropy, and defends the claim that for evaluating an agent’s credal state at a single moment the Brier Score seems to be the way to go.

    Using Hashtags to Disambiguate Aboutness in Social Media Discourse: A Case Study of #OrlandoStrong

    While the field of writing studies has studied digital writing as a response to multiple calls for more research on digital forms of writing, research on hashtags has yet to build bridges between different disciplines' approaches to studying the uses and effects of hashtags. This dissertation builds that bridge in its interdisciplinary approach to the study of hashtags by focusing on how hashtags can be fully appreciated at the intersection of the fields of information research, linguistics, rhetoric, ethics, writing studies, new media studies, and discourse studies. Hashtags are writing innovations that perform unique digital functions rhetorically while still hearkening back to functions of both print and oral rhetorical traditions. Hashtags function linguistically as indicators of semantic meaning; additionally, hashtags also perform the role of search queries on social media, retrieving texts that include the same hashtag. Information researchers refer to the relationship between a search query and its results using the term aboutness (Kehoe and Gee, 2011). By considering how hashtags have an aboutness, the humanities can call upon information research to better understand the digital aspects of the hashtag's search function. Especially when hashtags are used to organize discourse, aboutness has an effect on how a discourse community's agendas and goals are expressed, as well as framing what is relevant and irrelevant to the discourse. As digital activists increasingly use hashtags to organize and circulate the goals of their discourse communities, knowledge of ethical strategies for hashtag use will help to better preserve a relevant aboutness for their discourse while enabling them to better leverage their hashtag for circulation.
In this dissertation, through a quantitative and qualitative analysis of the Twitter discourse that used #OrlandoStrong over the five-month period before the first anniversary of the Pulse shooting, I trace how the #OrlandoStrong discourse community used innovative rhetorical strategies to keep irrelevant content from ambiguating their discourse space. In Chapter One, I acknowledge the call from scholars to study digital tools and briefly describe the history of the Pulse shooting, reflecting on non-digital texts that employed #OrlandoStrong as memorials in the Orlando area. In Chapter Two, I focus on the literature surrounding hashtags, discourse, aboutness, intertextuality, hashtag activism, and informational compositions. In Chapter Three, I provide an overview of the stages of grounded theory methodology and the implications of critical discourse analysis before I detail how I approached the collection, coding, and analysis of the #OrlandoStrong Tweets I studied. The results of my study are reported in Chapter Four, offering examples of Tweets that were important to understanding how the discourse space became ambiguous through the use of hashtags. In Chapter Five, I reflect on ethical approaches to understanding the consequences of hashtag use, and then I offer an ethical recommendation for hashtag use by hashtag activists. I conclude Chapter Five with an example of a classroom activity that allows students to use hashtags to better understand the relationship between aboutness, (dis)ambiguation, discourse communities, and ethics. This classroom activity is provided with the hope that instructors from different disciplines will be able to provide ethical recommendations to future activists who may benefit from these rhetorical strategies.

    Theoretical evaluation of XML retrieval


    Categorical Ontology of Complex Systems, Meta-Systems and Theory of Levels: The Emergence of Life, Human Consciousness and Society

    Single cell interactomics in simpler organisms, as well as somatic cell interactomics in multicellular organisms, involve biomolecular interactions in complex signalling pathways that were recently represented in modular terms by quantum automata with ‘reversible behavior’ representing normal cell cycling and division. Other implications of such quantum automata, modular modeling of signaling pathways and cell differentiation during development are in the fields of neural plasticity and brain development leading to quantum-weave dynamic patterns and specific molecular processes underlying extensive memory, learning, anticipation mechanisms and the emergence of human consciousness during the early brain development in children. Cell interactomics is here represented for the first time as a mixture of ‘classical’ states that determine molecular dynamics subject to Boltzmann statistics and ‘steady-state’, metabolic (multi-stable) manifolds, together with ‘configuration’ spaces of metastable quantum states emerging from complex quantum dynamics of interacting networks of biomolecules, such as proteins and nucleic acids that are now collectively defined as quantum interactomics. On the other hand, the time dependent evolution over several generations of cancer cells --that are generally known to undergo frequent and extensive genetic mutations and, indeed, suffer genomic transformations at the chromosome level (such as extensive chromosomal aberrations found in many colon cancers)-- cannot be correctly represented in the ‘standard’ terms of quantum automaton modules, as the normal somatic cells can. This significant difference at the cancer cell genomic level is therefore reflected in major changes in cancer cell interactomics often from one cancer cell ‘cycle’ to the next, and thus it requires substantial changes in the modeling strategies, mathematical tools and experimental designs aimed at understanding cancer mechanisms. 
Novel solutions to this important problem in carcinogenesis are proposed and experimental validation procedures are suggested. From a medical research and clinical standpoint, this approach has important consequences for addressing and preventing the development of cancer resistance to medical therapy in ongoing clinical trials involving stage III cancer patients, as well as improving the designs of future clinical trials for cancer treatments.
KEYWORDS: Emergence of Life and Human Consciousness; Proteomics; Artificial Intelligence; Complex Systems Dynamics; Quantum Automata models and Quantum Interactomics; quantum-weave dynamic patterns underlying human consciousness; specific molecular processes underlying extensive memory, learning, anticipation mechanisms and human consciousness; emergence of human consciousness during the early brain development in children; Cancer cell ‘cycling’; interacting networks of proteins and nucleic acids; genetic mutations and chromosomal aberrations in cancers, such as colon cancer; development of cancer resistance to therapy; ongoing clinical trials involving stage III cancer patients’ possible improvements of the designs for future clinical trials and cancer treatments.