6 research outputs found

    Unsupervised learning of relation detection patterns

    Information extraction is the area of natural language processing whose goal is to obtain structured data from the relevant information contained in textual fragments. Information extraction requires a significant amount of linguistic knowledge. The specificity of this knowledge hinders the portability of systems, since a change of language, domain, or style demands costly human effort. Machine learning techniques have been applied for decades to overcome this portability bottleneck, progressively reducing the amount of human supervision involved. However, as the availability of large document collections increases, completely unsupervised approaches become necessary to mine the knowledge they contain. This thesis proposes incorporating clustering techniques into pattern learning for information extraction, in order to further reduce the elements of supervision involved in the process. In particular, the work focuses on the problem of relation detection. Achieving this goal required, first, considering the different strategies by which this combination could be carried out; second, developing or adapting clustering algorithms suited to our needs; and third, devising pattern learning procedures that incorporate clustering information. By the end of this thesis, we had developed and implemented an approach for learning relation detection patterns which, using clustering techniques and minimal human supervision, is competitive with and even outperforms other comparable state-of-the-art approaches. Postprint (published version)

    Crisis Management: A Qualitative Study of Extreme Event Leadership

    Several extreme events are examined in this dissertation to better understand their implications for expanding the existing knowledge of crisis leadership. Through interviews with leaders who had direct leadership roles in extreme events such as the Fukushima nuclear reactor explosions, the Deepwater Horizon oil rig explosion, and Superstorm Sandy, in addition to national leadership (e.g., the White House Situation Room), an in-depth, cross-case analysis of leadership in extreme crises is presented. Previous literature concludes that the abilities of leaders are second only to the cause of the event itself in determining the outcome of a disaster. Due to the rarity of these events, however, there has been limited scholarly consideration of their implications for leadership research and practice. Using an inductive, qualitative approach to analyze the interviews, the results lead to several conclusions. First, there is a need for this and additional research to clarify the meaning or unique challenges that define the characteristics of an extreme event crisis, especially in the most extreme cases. Second, felt emotions, including mortality salience, have a profound effect on the thinking and actions of leaders in these events. Third, classic crisis management and leadership theories are insufficient for explaining the actions needed in responding to extreme events. These conclusions were integrated with prior research to develop a model of crisis leadership based on a continuum of crisis events from routine to extreme. This model is developed around six leadership concepts either identified in prior research or developed from the findings of this study. The model also identifies threshold points where routine crisis events become more extreme. At these threshold points the demands on all actors in the event, especially the leaders, become more non-linear and can result in strong emotional influences on sensemaking and subsequent decision making. This dissertation concludes that leadership in this context can be focused almost exclusively on life-saving and on instinctual or emotional responses. Further, the differences between leadership in dangerous military and non-military domains are examined. The implications of these findings for practitioners and future researchers are also discussed.

    A framework for analyzing changes in health care lexicons and nomenclatures

    Ontologies play a crucial role in current web-based biomedical applications for capturing contextual knowledge in the domain of life sciences. Many of the so-called bio-ontologies and controlled vocabularies are known to be seriously defective from both terminological and ontological perspectives, and do not sufficiently comply with the standards to be considered formal ontologies. Therefore, they are continuously evolving in order to fix the problems and provide valid knowledge. Moreover, many problems in ontology evolution originate from incomplete knowledge about the given domain. As our knowledge improves, the related definitions in the ontologies will be altered. This problem is inadequately addressed by available tools and algorithms, mostly due to the lack of suitable knowledge representation formalisms to deal with temporal abstract notations, and the overreliance on human factors. Also, most current approaches have focused on changes within the internal structure of ontologies, and interactions with other existing ontologies have been widely neglected. In this research, after revealing and classifying some of the common alterations in a number of popular biomedical ontologies, we present a novel agent-based framework, RLR (Represent, Legitimate, and Reproduce), to semi-automatically manage the evolution of bio-ontologies, with emphasis on the FungalWeb Ontology, with minimal human intervention. RLR assists and guides ontology engineers through the change management process in general, and aids in tracking and representing the changes, particularly through the use of category theory. Category theory has been used as a mathematical vehicle for modeling changes in ontologies and representing agents' interactions, independent of any specific choice of ontology language or particular implementation. We have also employed rule-based hierarchical graph transformation techniques to propose a more specific semantics for analyzing ontological changes and transformations between different versions of an ontology, as well as tracking the effects of a change at different levels of abstraction. Thus, the RLR framework enables one to manage changes in ontologies, not as standalone artifacts in isolation, but in contact with other ontologies in an openly distributed semantic web environment. The emphasis on generality and abstractness makes RLR more feasible in the multi-disciplinary domain of biomedical ontology change management.

    Multiversion Divergence Control of Time Fuzziness

    Epsilon Serializability (ESR) has been proposed to manage and control inconsistency in extending the classic transaction processing that has been based on serializability. ESR increases transaction processing system concurrency by tolerating a bounded amount of inconsistency. In this paper, we present multiversion divergence control (mvDC) algorithms that support ESR with not only value but also time fuzziness in multiversion databases. Unlike value fuzziness, accumulating time fuzziness is semantically different. A simple summation of the length of two time intervals may either underestimate the total time fuzziness, resulting in incorrect execution, or overestimate the total time fuzziness, unnecessarily degrading the effectiveness of mvESR. In this paper, we present a new operation, called TimeUnion, to accurately accumulate the total time fuzziness. In addition, we describe two mvDC algorithms that can correctly bound the inconsistency in time and value for mvESR. Because of the ac..

    An evaluation of the challenges of Multilingualism in Data Warehouse development

    In this paper we discuss Business Intelligence and define what is meant by support for multilingualism in a Business Intelligence reporting context. We identify support for multilingualism as a challenging issue which has implications for data warehouse design and reporting performance. Data warehouses are a core component of most Business Intelligence systems, and the star schema is the approach most widely used to develop data warehouses and dimensional data marts. We discuss the way in which multilingualism can be supported in the star schema and find that current approaches have serious limitations, which include data redundancy as well as data manipulation, performance, and maintenance issues. We propose a new approach to enable the optimal application of multilingualism in Business Intelligence. The proposed approach was found to produce satisfactory results when used in a proof-of-concept environment. Future work will include testing the approach in an enterprise environment.