1,205 research outputs found

    mARC: Memory by Association and Reinforcement of Contexts

    This paper introduces Memory by Association and Reinforcement of Contexts (mARC). mARC is a novel data modeling technology rooted in the second quantization formulation of quantum mechanics. It is an all-purpose, incremental, unsupervised data storage and retrieval system that can be applied to all types of signal or data, structured or unstructured, textual or not. mARC can be applied to a wide range of information classification and retrieval problems, such as e-Discovery or contextual navigation. It can also be formulated in the artificial life framework, also known as Conway's "Game of Life" theory. In contrast to Conway's approach, the objects evolve in a massively multidimensional space. To start evaluating the potential of mARC, we have built a mARC-based Internet search engine demonstrator with contextual functionality. We compare the behavior of the mARC demonstrator with Google search in terms of both performance and relevance. In the study we find that the mARC search engine demonstrator outperforms Google search by an order of magnitude in response time while providing more relevant results for some classes of queries.

    Arabic named entity recognition

    This doctoral thesis describes the research carried out to determine the best techniques for building an Arabic Named Entity Recognizer. Such a system would be able to identify and classify the named entities found in open-domain Arabic text. The Named Entity Recognition (NER) task helps other Natural Language Processing tasks (for example, Information Retrieval, Question Answering, Machine Translation, etc.) achieve better results thanks to the enrichment it adds to the text. The literature contains various works that investigate the NER task for a specific language or from a language-independent perspective. However, very few works studying this task for Arabic have been published so far. Arabic has a special orthography and a complex morphology, and these aspects pose new challenges for NER research. A complete investigation of NER for Arabic would not only provide the techniques needed to achieve high performance, but would also provide an error analysis and a discussion of the results that benefit the NER research community. The main objective of this thesis is to meet that need. To that end we have: 1. Conducted a study of the different aspects of Arabic related to this task; 2. Analyzed the state of the art in NER; 3. Carried out a comparison of the results obtained by different machine learning techniques; 4. Developed a method based on combining different classifiers, where each classifier deals with a single class of named entities and uses the feature set and machine learning technique best suited to that class.
Our experiments were evaluated on nine test sets. Benajiba, Y. (2009). Arabic named entity recognition [Unpublished doctoral thesis]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/8318

    Workplace innovation and new product development in Vietnamese manufacturing small and medium-sized enterprises

    Workplace innovation (WI) and new product development (NPD) are essential for organisations to secure their market positioning. Vietnam is at the starting point of innovation. The purpose of this thesis is to gain a better understanding of senior management practices in NPD projects in the Vietnamese manufacturing industry and of the status of the NPD process, strategic planning, resource allocation and success measures in Vietnamese manufacturing small and medium-sized enterprises (SMEs); to identify NPD success factors in Vietnamese manufacturing SMEs at the project level; to investigate the relationship between WI, NPD capability, strategic planning and performance in Vietnamese manufacturing SMEs at the project level; and to determine the moderating effect of two groups (managers and employees) on the relationship between WI, NPD capability and NPD strategic planning on NPD performance in Vietnamese manufacturing SMEs. A total of 795 questionnaires were sent to manufacturing SMEs in Hanoi, with a response rate of 42.77%, yielding 340 usable responses. The research model of the relationship between WI, NPD capability, NPD strategic planning and NPD performance was tested using IBM SPSS AMOS (v.25) software (hereafter AMOS). The findings confirmed the simultaneous relationship between WI, NPD capability, NPD strategic planning and NPD performance in Vietnamese manufacturing SMEs at the project level. This thesis contributes to the field of WI and NPD research from both theoretical and practical perspectives.
Theoretically, this thesis contributes to the existing literature in the field of WI and NPD in organisations by 1) integrating the framework of contingency theory, the dynamic capability view and resource-based view theory in the study of the relationship between WI, NPD capability, NPD strategic planning and NPD performance; 2) developing a validated conceptual framework for examining the relationship between WI, NPD capability, NPD strategic planning and NPD performance in Vietnamese manufacturing SMEs; 3) observing a difference of perspective on the relationship between employees and managers, with the thesis findings confirming for the first time the simultaneous relationship between WI, NPD capability, NPD strategic planning and NPD performance, thereby expanding contingency theory (Miller and Friesen, 1983) to a new environment–capability–strategic planning–performance paradigm; and 4) recognising the moderating effect of managers and employees on WI and NPD capability. Practically, the findings enhance current understanding of senior management practices in NPD projects and NPD success factors within Vietnamese manufacturing SMEs and discuss for the first time the NPD process, strategic planning, resource allocation and success measures in Vietnamese manufacturing SMEs. These results can assist successful NPD in manufacturing SMEs in Vietnam in particular and in other industries and countries in general.

    Enhanced ontology-based text classification algorithm for structurally organized documents

    Text classification (TC) is an important foundation of information retrieval and text mining. The main task of TC is to predict a text's class according to a set of tags given in advance. Most TC algorithms represent a document by its terms, without considering the relations among those terms. These algorithms represent documents in a space where every word is assumed to be a dimension. As a result, such representations have high dimensionality, which has a negative effect on classification performance. The objectives of this thesis are to formulate algorithms for classifying text by creating a suitable feature vector and reducing the dimension of the data, which will enhance classification accuracy. This research combines ontology and text representation for classification by developing five algorithms. The first and second algorithms, namely Concept Feature Vector (CFV) and Structure Feature Vector (SFV), create a feature vector to represent the document. The third algorithm is Ontology Based Text Classification (OBTC), designed to reduce the dimensionality of training sets. The fourth and fifth algorithms, Concept Feature Vector_Text Classification (CFV_TC) and Structure Feature Vector_Text Classification (SFV_TC), classify the document into its related set of classes. The proposed algorithms were tested on five different scientific paper datasets downloaded from different digital libraries and repositories. Experimental results obtained from the proposed algorithms, CFV_TC and SFV_TC, showed better average results in terms of precision, recall, F-measure and accuracy than the SVM and RSS approaches. The work in this study contributes to exploring related documents in information retrieval and text mining research by using ontology in TC.
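
    As a rough illustration of why mapping terms to ontology concepts lowers dimensionality, the sketch below counts concepts instead of words. It is not the thesis's CFV algorithm: the toy ontology, the function name `concept_feature_vector` and the whitespace tokenizer are all assumptions made for this example.

```python
from collections import Counter

# Hypothetical toy ontology (an assumption): maps surface terms to concepts.
TOY_ONTOLOGY = {
    "svm": "classifier",
    "perceptron": "classifier",
    "corpus": "dataset",
    "benchmark": "dataset",
}


def concept_feature_vector(text, ontology, concepts):
    """Count how often each concept is evoked by terms in the text."""
    tokens = text.lower().split()  # naive whitespace tokenizer
    counts = Counter(ontology[t] for t in tokens if t in ontology)
    # One dimension per concept rather than one per distinct word.
    return [counts[c] for c in concepts]


concepts = sorted(set(TOY_ONTOLOGY.values()))
vector = concept_feature_vector(
    "SVM and perceptron on a benchmark corpus", TOY_ONTOLOGY, concepts
)
```

    Here seven distinct words collapse into a two-dimensional vector, one entry per concept, which is the kind of reduction the abstract attributes to ontology-based representations.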

    Journal of Asian Finance, Economics and Business, v. 4, no. 3


    Social capital for industrial development: operationalizing the concept

    The present report on Social capital for industrial development: operationalizing the concept is part of the broader Combating Marginalization and Poverty through Industrial Development (COMPID) research programme of the United Nations Industrial Development Organization (UNIDO), designed to enhance the competitiveness of industrial producers in marginalized countries. The Industrial Development Report 2002/2003 posits that, especially in the least developed countries, building industrial competitiveness "… can involve heavy costs and great risks and uncertainties" (UNIDO [131], p. 9). The main reason for conducting research on operationalizing social capital is that there are grounds for believing that social capital could mitigate some of the risks and uncertainties that exist in low-income and marginalized countries, and thus help to increase their level of competitiveness.

    Data sets for author name disambiguation: an empirical analysis and a new resource

    Data sets of publication metadata with manually disambiguated author names play an important role in current author name disambiguation (AND) research. We review the most important data sets used so far and compare their respective advantages and shortcomings. From the results of this review, we derive a set of general requirements for future AND data sets. These include both trivial requirements, like absence of errors and preservation of author order, and more substantial ones, like full disambiguation and adequate representation of publications with a small number of authors and highly variable author names. On the basis of these requirements, we create and make publicly available a new AND data set, SCAD-zbMATH. Both the quantitative analysis of this data set and the results of our initial AND experiments with a naive baseline algorithm show the SCAD-zbMATH data set to be considerably different from existing ones. We consider it a useful new resource that will challenge the state of the art in AND and benefit the AND research community.
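
    The abstract does not say what its naive baseline is. Purely as an assumption, the sketch below shows one common kind of naive AND baseline: treating all mentions that share a surname and first initial as the same author. The function names and the sample mentions are hypothetical.

```python
from collections import defaultdict


def name_key(name):
    """Normalize 'Surname, Given' to a (surname, first initial) key."""
    surname, _, given = name.partition(",")
    given = given.strip()
    return (surname.strip().lower(), given[:1].lower())


def naive_disambiguate(mentions):
    """Group (publication_id, author_name) mentions that share a key."""
    clusters = defaultdict(list)
    for pub_id, name in mentions:
        clusters[name_key(name)].append(pub_id)
    return dict(clusters)


# Hypothetical author mentions, invented for illustration only.
mentions = [
    ("p1", "Smith, John"),
    ("p2", "Smith, J."),
    ("p3", "Smith, Anna"),
]
clusters = naive_disambiguate(mentions)
```

    A baseline like this merges every "Smith, J." into one person, so it is expected to do poorly on data with highly variable author names, which is one of the properties the abstract asks future data sets to represent.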

    Topic-enhanced Models for Speech Recognition and Retrieval

    This thesis examines ways in which topical information can be used to improve recognition and retrieval of spoken documents. We consider the interrelated concepts of locality, repetition, and 'subject of discourse' in the context of speech processing applications: speech recognition, speech retrieval, and topic identification of speech. This work demonstrates how supervised and unsupervised models of topics, applicable to any language, can improve accuracy in accessing spoken content. It looks at the complementary aspects of topic information in lexical content in terms of local context (locality or repetition of word usage) and broad context (the typical 'subject matter' definition of a topic). By augmenting speech processing language models with topic information, we demonstrate consistent improvements in performance on a number of metrics. We add locality to bag-of-words topic identification models, we quantify the relationship between topic information and keyword retrieval, and we consider word repetition in terms of both keyword-based retrieval and language modeling. We then combine these concepts and develop joint models of local and broad context via latent topic models. We present a latent topic model framework that treats documents as arising from an underlying topic sequence combined with a cache-based repetition model. We analyze our proposed model both for its ability to capture word repetition via the cache and for its suitability as a language model for speech recognition and retrieval. We show that this model, augmented with the cache, captures intuitive repetition behavior across languages and exhibits lower perplexity than regular LDA on held-out data in multiple languages. Lastly, we show that our joint model improves speech retrieval performance beyond N-grams or latent topics alone when applied to a term detection task in all languages considered.
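
    To make the link between a repetition cache and perplexity concrete, the sketch below interpolates a fixed unigram model with a cache of the words already seen in the document. This is a deliberate simplification and not the thesis's latent topic model: the interpolation weight, the toy probabilities and the function name `perplexity` are all assumptions.

```python
import math
from collections import Counter


def perplexity(tokens, base_probs, cache_weight=0.0, floor=1e-6):
    """Perplexity of a unigram model interpolated with a document cache."""
    cache = Counter()
    seen = 0
    log_sum = 0.0
    for tok in tokens:
        p_base = base_probs.get(tok, floor)  # floor for unknown words
        # The cache is empty for the first token, so fall back to the base model.
        lam = cache_weight if seen else 0.0
        p_cache = cache[tok] / seen if seen else 0.0
        p = (1 - lam) * p_base + lam * p_cache
        log_sum += math.log(p)
        cache[tok] += 1
        seen += 1
    return math.exp(-log_sum / len(tokens))


# Toy unigram probabilities, invented for illustration only.
base = {"the": 0.5, "model": 0.25, "topic": 0.25}
doc = ["topic", "topic", "topic", "topic"]
plain = perplexity(doc, base)
cached = perplexity(doc, base, cache_weight=0.5)
```

    On this repetitive toy document the cached variant assigns higher probability to each repeated word, so `cached` comes out lower than `plain`. Text with no repetition would instead be penalized by the cache weight.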