102 research outputs found

    Information content: assessing meso-scale structures in complex networks

    We propose a novel measure to assess the presence of meso-scale structures in complex networks. This measure is based on the identification of regular patterns in the adjacency matrix of the network, and on the calculation of the quantity of information lost when pairs of nodes are iteratively merged. We show how this measure is able to quantify several meso-scale structures, like the presence of modularity, bipartite and core-periphery configurations, or motifs. Results corresponding to a large set of real networks are used to validate its ability to detect non-trivial topological patterns. Comment: Published as: M. Zanin, P. A. Sousa and E. Menasalvas, Information content: assessing meso-scale structures in complex networks, EPL 106 (3), (2014) 3000

    TIDA: A Spanish EHR semantic search engine

    Electronic Health Records (EHR) and the steady adoption of information technologies in healthcare have dramatically increased the amount of unstructured data stored. Extracting key information from this data will lead to better caregivers' decisions and improved patients' treatments. With more than 495 million people speaking Spanish, the need to adapt the algorithms and technologies used for EHR knowledge extraction in English-speaking countries leads to the development of different frameworks. Thus, we present TIDA, a Spanish EHR semantic search engine, to help Spanish-speaking medical centers and hospitals convert raw data into information understandable by cognitive systems. This paper presents the results of TIDA's Spanish EHR free-text treatment component, with the adaptation of negation and context detection algorithms applied in a semantic search engine with a database of more than 30,000 clinical notes
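
    The abstract does not say which negation detection algorithm TIDA adapts. As a rough illustration only, a trigger-and-window approach in the style of rule-based negation detectors could be sketched as follows. The trigger list, the terminator list, the window size, and the function name `negated_terms` are all assumptions made for this example and are not taken from TIDA.

```python
import re

# Hypothetical Spanish trigger and scope-terminating words (illustrative only).
NEGATION_TRIGGERS = {"no", "sin", "niega", "descarta"}
SCOPE_TERMINATORS = {"pero", "aunque", "excepto"}


def negated_terms(sentence, target_terms, window=5):
    """Return the single-word target terms that fall inside a negation scope.

    A scope starts right after a trigger word and covers at most `window`
    tokens, ending early if a terminator word is reached.
    """
    tokens = re.findall(r"\w+", sentence.lower())
    negated = set()
    for i, token in enumerate(tokens):
        if token not in NEGATION_TRIGGERS:
            continue
        for following in tokens[i + 1 : i + 1 + window]:
            if following in SCOPE_TERMINATORS:
                break
            if following in target_terms:
                negated.add(following)
    return negated
```

    For a note such as "Paciente sin fiebre pero con tos", this sketch marks "fiebre" as negated and leaves "tos" affirmed, because "pero" closes the scope opened by "sin". Multi-word clinical concepts would need a phrase matcher, which is left out here.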

    Tracking recurrent concepts using context

    The problem of recurring concepts in data stream classification is a special case of concept drift where concepts may reappear. Although several existing methods are able to learn in the presence of concept drift, few consider contextual information when tracking recurring concepts. Nevertheless, in many real-world scenarios context information is available and can be exploited to improve existing approaches in the detection or even anticipation of recurring concepts. In this work, we propose the extension of existing approaches to deal with the problem of recurring concepts by reusing previously learned decision models in situations where concepts reappear. The different underlying concepts are identified using an existing drift detection method, based on the error rate of the learning process. A method to associate context information with learned decision models is proposed to improve the adaptation to recurring concepts. The method also addresses the challenge of retrieving the most appropriate concept for a particular context. Finally, to deal with situations of memory scarcity, an intelligent strategy to discard models is proposed. The experiments conducted so far, using synthetic and real datasets, show promising results and make it possible to analyze the trade-off between the accuracy gains and the storage cost of the learned models
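
    The abstract names only "an existing drift detection method, based on the error-rate of the learning process". One well-known method of that kind tracks the running error rate and its standard deviation and signals drift when they rise well above the lowest level seen so far. The sketch below follows that idea; the class name, the thresholds (2 and 3 standard deviations), and the 30-instance warm-up are assumptions for illustration, not details confirmed by the paper.

```python
import math


class ErrorRateDriftDetector:
    """Error-rate drift detector (illustrative sketch, assumed parameters)."""

    def __init__(self, min_instances=30, warning_factor=2.0, drift_factor=3.0):
        self.min_instances = min_instances
        self.warning_factor = warning_factor
        self.drift_factor = drift_factor
        self.reset()

    def reset(self):
        """Forget all statistics, e.g. after a drift has been signalled."""
        self.n = 0
        self.errors = 0
        self.p_min = math.inf
        self.s_min = math.inf

    def update(self, misclassified):
        """Feed one prediction outcome; return 'stable', 'warning' or 'drift'."""
        self.n += 1
        self.errors += int(misclassified)
        p = self.errors / self.n                # running error rate
        s = math.sqrt(p * (1.0 - p) / self.n)   # its standard deviation
        if self.n < self.min_instances:
            return "stable"
        if p + s < self.p_min + self.s_min:
            # Remember the best (lowest) error level observed so far.
            self.p_min, self.s_min = p, s
        if p + s > self.p_min + self.drift_factor * self.s_min:
            self.reset()
            return "drift"
        if p + s > self.p_min + self.warning_factor * self.s_min:
            return "warning"
        return "stable"
```

    In the setting the abstract describes, a "drift" signal would be the point at which the learner stores the current model together with its context and looks for a previously learned model that matches the new context. That storage and retrieval step is not shown here.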

    Predicting recurring concepts on data-streams by means of a meta-model and a fuzzy similarity function

    Meta-models can be used to enhance the drift detection mechanisms used by data stream algorithms, by representing and predicting when the change will occur. There are some real-world situations where a concept reappears, as in the case of intrusion detection systems (IDS), where the same incidents or an adaptation of them usually reappear over time. In these environments the early prediction of drift by means of a better knowledge of past models can help to anticipate the change, thus improving the efficiency of the model regarding the training instances needed. In this paper we present MM-PRec, a meta-model for predicting recurring concepts on data-streams whose main goal is to predict when the drift is going to occur together with the best model to be used in case of a recurring concept. To fulfill this goal, MM-PRec trains a Hidden Markov Model (HMM) from the instances that appear during the concept drift. The learning process of the base classification learner feeds the meta-model with all the information needed to predict recurrent or similar situations. Thus, the models predicted together with the associated contextual information are stored. In our approach we also propose to use a fuzzy similarity function to decide which is the best model to represent a particular context when drift is detected. The experiments performed show that MM-PRec outperforms other context-aware algorithms in terms of training instances needed, especially in environments characterized by the presence of gradual drifts

    Query categorization process based on visibility on a mobile device

    Web query categorization is an activity of growing interest to organizations because it enables them to provide their users with value-added services in response to queries they submit to the company's search engine. This represents a challenge in mobile devices not only because of the problems associated with the interpretation of a query taking into account the user's context under mobility assumptions, but also because of the problems derived from resource limitations in this type of device. The need for autonomy required in a situation where the data miner is not present makes the problem even more challenging. In this article we address the problem of query categorization on mobile devices. For this, we first present a model for the visibility of terms. Based on this model we define the process for categorization. The innovative part of this process includes the definition of parameters and tasks responsible for controlling the restrictions derived from resource limitations. Finally we present a metadata model, which is required as a basis for automation

    Automatic extraction and identification of users' responses in Facebook medical quizzes

    In the last few years the use of social media in medicine has grown exponentially, providing a new area of research based on the analysis and use of Web 2.0 capabilities. In addition, the use of social media in medical education is a subject of particular interest which has been addressed in several studies. One example of this application is the medical quizzes of The New England Journal of Medicine (NEJM), which regularly publishes a set of questions through its Facebook timeline

    Collaborative data stream mining in ubiquitous environments using dynamic classifier selection

    In ubiquitous data stream mining applications, different devices often aim to learn concepts that are similar to some extent. In these applications, such as spam filtering or news recommendation, the concept underlying the data stream (e.g., interesting mail/news) is likely to change over time. Therefore, the resultant model must be continuously adapted to such changes. This paper presents a novel Collaborative Data Stream Mining (Coll-Stream) approach that explores the similarities in the knowledge available from other devices to improve local classification accuracy. Coll-Stream integrates the community knowledge using an ensemble method where the classifiers are selected and weighted based on their local accuracy for different partitions of the feature space. We evaluate Coll-Stream classification accuracy in situations with concept drift, noise, partition granularity and concept similarity in relation to the local underlying concept. The experimental results show that the resulting Coll-Stream model achieves stability and accuracy in a variety of situations using both synthetic and real-world datasets
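
    The ensemble idea in the abstract, selecting and weighting classifiers by their local accuracy in each partition of the feature space, can be illustrated with a minimal sketch. This is not the Coll-Stream implementation: the class name, the `partition_of` callback, the `min_accuracy` selection threshold, and the fallback to an unweighted vote are all assumptions made for the example.

```python
from collections import defaultdict


class LocalAccuracyEnsemble:
    """Weighted vote over classifiers, using per-region accuracy as weight."""

    def __init__(self, classifiers, partition_of, min_accuracy=0.5):
        self.classifiers = classifiers    # name -> callable(x) -> label
        self.partition_of = partition_of  # callable(x) -> region id
        self.min_accuracy = min_accuracy  # selection threshold (assumed)
        self.hits = defaultdict(int)      # (name, region) -> correct count
        self.seen = defaultdict(int)      # region -> labelled instances seen

    def local_accuracy(self, name, region):
        """Fraction of labelled instances in `region` that `name` got right."""
        seen = self.seen.get(region, 0)
        return self.hits.get((name, region), 0) / seen if seen else 0.0

    def learn(self, x, y):
        """Update each classifier's accuracy record for the region of x."""
        region = self.partition_of(x)
        self.seen[region] += 1
        for name, clf in self.classifiers.items():
            if clf(x) == y:
                self.hits[(name, region)] += 1

    def predict(self, x):
        """Vote among classifiers that are accurate enough in x's region."""
        region = self.partition_of(x)
        votes = defaultdict(float)
        for name, clf in self.classifiers.items():
            accuracy = self.local_accuracy(name, region)
            if accuracy >= self.min_accuracy:
                votes[clf(x)] += accuracy
        if not votes:
            # No classifier qualifies here: fall back to an unweighted vote.
            for clf in self.classifiers.values():
                votes[clf(x)] += 1.0
        return max(votes, key=votes.get)
```

    Because accuracy is tracked per region, a classifier received from another device can dominate the vote where it matches the local concept and be ignored elsewhere. Handling concept drift would additionally require forgetting old accuracy counts, for instance with a sliding window, which this sketch omits.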

    Proceso de categorización de consultas basado en visibilidad en un dispositivo móvil

    Web query categorization is an activity of growing interest to organizations because it enables them to provide their users with value-added services in response to queries they submit to the company's search engine. This represents a challenge in mobile devices not only because of the problems associated with the interpretation of a query taking into account the user's context under mobility assumptions, but also because of the problems derived from resource limitations in this type of device. The need for autonomy required in a situation where the data miner is not present makes the problem even more challenging. In this article we address the problem of query categorization on mobile devices. For this, we first present a model for the visibility of terms. Based on this model we define the process for categorization. The innovative part of this process includes the definition of parameters and tasks responsible for controlling the restrictions derived from resource limitations. Finally we present a metadata model, which is required as a basis for automation