190 research outputs found

    Selective sampling for combined learning from labelled and unlabelled data

    This paper examines the problem of selecting a suitable subset of data to be labelled when building pattern classifiers from labelled and unlabelled data. The selection of a representative set is guided by clustering information, and various options for allocating a number of samples within clusters and their distributions are investigated. The experimental results show that hybrid methods such as semi-supervised clustering with selective sampling can yield a classifier that requires much less labelled data to achieve classification performance comparable to that of classifiers built only on labelled data.

    Next challenges for adaptive learning systems

    Learning from evolving streaming data has become a 'hot' research topic in the last decade and many adaptive learning algorithms have been developed. This research was stimulated by rapidly growing amounts of industrial, transactional, sensor and other business data that arrive in real time and need to be mined in real time. Under such circumstances, constant manual adjustment of models is inefficient and, with increasing amounts of data, is becoming infeasible. Nevertheless, adaptive learning models are still rarely employed in business applications in practice. In the light of rapidly growing structurally rich 'big data', a new generation of parallel computing solutions and cloud computing services, as well as recent advances in portable computing devices, this article aims to identify the current key research directions to be taken to bring adaptive learning closer to application needs. We identify six forthcoming challenges in designing and building adaptive learning (prediction) systems: making adaptive systems scalable, dealing with realistic data, improving usability and trust, integrating expert knowledge, taking into account various application needs, and moving from adaptive algorithms towards adaptive tools. Those challenges are critical for the evolving stream settings, as the process of model building needs to be fully automated and continuous.

    Change point detection in social networks: Critical review with experiments

    © 2018 Elsevier Inc. Change point detection in social networks is an important element in developing the understanding of dynamic systems. This complex and growing area of research has no clear guidelines on what methods to use or in which circumstances. This paper critically discusses several possible network metrics to be used for a change point detection problem and conducts an experimental, comparative analysis using the Enron and MIT networks. Bayesian change point detection analysis is conducted on different global graph metrics (Size, Density, Average Clustering Coefficient, Average Shortest Path) as well as metrics derived from the Hierarchical and Block models (Entropy, Edge Probability, No. of Communities, Hierarchy Level Membership). The analysis produced the posterior probability of a change point at weekly time intervals, which was compared against ground-truth change points using precision and recall measures. Results suggest that computationally heavy generative models offer only slightly better results than some of the global graph metrics. The simplest metrics used in the experiments, i.e. the numbers of nodes and links, are the recommended choice for detecting overall structural changes.

    Encoding edge type information in graphlets.

    Graph embedding approaches have been attracting increasing attention in recent years, mainly due to their universal applicability. They convert network data into a vector space in which the graph structural information and properties are maximally preserved. Most existing approaches, however, ignore the rich information about interactions between nodes, i.e., edge attribute or edge type. Moreover, the learned embeddings suffer from a lack of explainability and cannot be used to study the effects of typed structures in edge-attributed networks. In this paper, we introduce a framework to embed edge type information in graphlets and generate a Typed-Edge Graphlets Degree Vector (TyE-GDV). Additionally, we extend two combinatorial approaches, i.e., the colored graphlets and heterogeneous graphlets approaches, to edge-attributed networks. By applying the proposed method to a case study of chronic pain patients, we find not only that the network structure of a patient could indicate his/her perceived pain grade, but also that certain social ties, such as those with friends, colleagues, and healthcare professionals, are more crucial in understanding the impact of chronic pain. Further, we demonstrate that in a node classification task, the edge-type encoded graphlets approaches outperform the traditional graphlet degree vector approach by a significant margin, and that TyE-GDV can achieve performance competitive with the combinatorial approaches while being far more efficient in space requirements.

    Area-level and individual correlates of active transportation among adults in Germany: A population-based multilevel study

    This study aimed to estimate the prevalence among adults of complying with the aerobic physical activity (PA) recommendation through transportation-related walking and cycling. Furthermore, potential determinants of transportation-related PA recommendation compliance were investigated. 10,872 men and 13,144 women aged 18 years or older participated in the cross-sectional 'German Health Update 2014/15 - EHIS' in Germany. Transportation-related walking and cycling were assessed using the European Health Interview Survey-Physical Activity Questionnaire. Three outcome indicators were constructed: walking, cycling, and total active transportation (>= 600 metabolic equivalent, MET-min/week). Associations were analyzed using multilevel regression analysis. Forty-two percent of men and 39% of women achieved >= 600 MET-min/week with total active transportation. The corresponding percentages for walking were 27% and 28% and for cycling 17% and 13%, respectively. Higher population density, older age, lower income, higher work-related and leisure-time PA, not being obese, and better self-perceived health were positively associated with transportation-related walking and cycling and total active transportation among both men and women. The promotion of walking and cycling among inactive people has great potential to increase PA in the general adult population and to comply with PA recommendations. Several correlates of active transportation were identified which should be considered when planning public health policies and interventions.

    Understanding the Relation Between Early-Life Adversity and Depression Symptoms: The Moderating Role of Sex and an Interleukin-1β Gene Variant

    Pro-inflammatory cytokines, such as interleukin (IL)-6 and tumor necrosis factor-α (TNF-α), are thought to play a fundamental role in the pathogenesis of depression within a subset of individuals. However, the involvement of IL-1β has not been as consistently linked to depression, possibly owing to difficulties in detecting this cytokine in blood samples or that changes in circulating levels might only be apparent in a subgroup of patients who have experienced early-life adversity. From this perspective, the association between early-life adversity and depressive illness might depend on genetic variants regulating IL-1β activity. Considering the inflammatory-depression link, and that women are twice as likely to experience depression compared to men, the current study (N = 475 university students) examined the moderating role of three independent cytokine single nucleotide polymorphisms (SNPs; IL-1β rs16944, IL-6 rs1800795 SNP, TNF-α rs1800629) in the relationship between early-life adversity and depressive symptoms, and whether these relations differed between males and females. The relation between childhood adversity and depressive symptoms was moderated by the IL-1β SNP, and further varied according to sex. Specifically, among females, higher childhood maltreatment was accompanied by elevated depressive symptoms irrespective of the IL-1β SNP, but among males, this relationship was particularly pronounced for those carrying the GG genotype of the IL-1β SNP. These findings suggest that, in the context of early life adversity, genetic variations of IL-1β functioning are related to depressive symptomatology and this may vary among males and females. The present study also, more broadly, highlights the importance of considering the confluence of experiential factors (e.g., early life adversity) and personal characteristics (e.g., sex and genetics) in understanding depressive disorders, an approach increasingly recognized in developing personalized treatment approaches to this illness.

    Outlier Resistant PCA Ensembles

    Statistical re-sampling techniques have been used extensively and successfully in machine learning approaches for generating classifier and predictor ensembles. It has frequently been shown that combining so-called unstable predictors has a stabilizing effect on, and improves the performance of, the prediction system generated in this way. In this paper we use re-sampling techniques in the context of Principal Component Analysis (PCA). We show that the proposed PCA ensembles exhibit much more robust behaviour in the presence of outliers, which can seriously affect the performance of an individual PCA algorithm. The performance and characteristics of the proposed approaches are illustrated in a number of experimental studies where an individual PCA is compared to the introduced PCA ensemble.