
35 research outputs found

Annual Report for Fiscal Year Heisei 21 (2009): Activities and Challenges

Publication venue: Public Relations Committee, Institute of Scientific and Industrial Research, Osaka University

A survey on feature drift adaptation: Definition, benchmark, challenges and future directions

Authors: Barddal Jean Paul; Enembreck Fabrício; Gomes Heitor Murilo; Pfahringer Bernhard
Publication venue: Elsevier BV
Publication date: 01/01/2017

Data stream mining is a fast-growing research topic due to the ubiquity of data in several real-world problems. Given their ephemeral nature, data stream sources are expected to undergo changes in data distribution, a phenomenon called concept drift. This paper focuses on one specific type of drift that has not yet been thoroughly studied, namely feature drift. Feature drift occurs whenever a subset of features becomes, or ceases to be, relevant to the learning task; thus, learners must detect and adapt to these changes accordingly. We survey existing work on feature drift adaptation with both explicit and implicit approaches. Additionally, we benchmark several algorithms and a naive feature drift detection approach using synthetic and real-world datasets. The results from our experiments indicate the need for future research in this area, as even naive approaches produced gains in accuracy while reducing resource usage. Finally, we state current research topics, challenges and future directions for feature drift adaptation.

International Evaluation of Research and Doctoral Training at the University of Helsinki 2005-2010 : RC-Specific Evaluation of ALKO - Algorithms and Data Analysis

Publication date: 01/01/2012

Helsingin yliopiston digitaalinen arkisto

Computational Methods for Medical and Cyber Security

Publication venue: MDPI AG
Publication date: 16/09/2022

Over the past decade, computational methods, including machine learning (ML) and deep learning (DL), have grown rapidly as a source of solutions in various domains, especially medicine, cybersecurity, finance, and education. While these applications of machine learning algorithms have proven beneficial in various fields, many shortcomings have also been highlighted, such as the lack of benchmark datasets, the inability to learn from small datasets, the cost of architecture, adversarial attacks, and imbalanced datasets. On the other hand, new and emerging algorithms, such as deep learning, one-shot learning, continuous learning, and generative adversarial networks, have successfully solved various tasks in these fields. Therefore, applying these new methods to life-critical missions is crucial, as is measuring the success of these less-traditional algorithms when they are used in these fields.

Assessment of Renewable Energy Resources with Remote Sensing

Author: Martins Fernando Ramos
Publication venue: MDPI AG
Publication date: 01/01/2021

The development of renewable energy sources plays a fundamental role in the transition towards a low carbon economy. Considering that renewable energy resources have an intrinsic relationship with meteorological conditions and climate patterns, methodologies based on the remote sensing of the atmosphere are fundamental sources of information to support the energy sector in planning and operation procedures. This Special Issue is intended to provide a highly recognized international forum to present recent advances in remote sensing to data acquisition required by the energy sector. After a review, a total of eleven papers were accepted for publication. The contributions focus on solar, wind, and geothermal energy resource. This editorial presents a brief overview of each contribution.

Fernando Ramos Martins. Editorial for the Special Issue: Assessment of Renewable Energy Resources with Remote Sensing. Reprinted from: Remote Sens. 2020, 12, 3748, doi:10.3390/rs12223748
André R. Gonçalves, Arcilan T. Assireu, Fernando R. Martins, Madeleine S. G. Casagrande, Enrique V. Mattos, Rodrigo S. Costa, Robson B. Passos, Silvia V. Pereira, Marcelo P. Pes, Francisco J. L. Lima and Enio B. Pereira. Enhancement of Cloudless Skies Frequency over a Large Tropical Reservoir in Brazil. Reprinted from: Remote Sens. 2020, 12, 2793, doi:10.3390/rs12172793
Anders V. Lindfors, Axel Hertsberg, Aku Riihelä, Thomas Carlund, Jörg Trentmann and Richard Müller. On the Land-Sea Contrast in the Surface Solar Radiation (SSR) in the Baltic Region. Reprinted from: Remote Sens. 2020, 12, 3509, doi:10.3390/rs12213509
Joaquín Alonso-Montesinos. Real-Time Automatic Cloud Detection Using a Low-Cost Sky Camera. Reprinted from: Remote Sens. 2020, 12, 1382, doi:10.3390/rs12091382
Román Mondragón, Joaquín Alonso-Montesinos, David Riveros-Rosas, Mauro Valdés, Héctor Estévez, Adriana E. González-Cabrera and Wolfgang Stremme. Attenuation Factor Estimation of Direct Normal Irradiance Combining Sky Camera Images and Mathematical Models in an Inter-Tropical Area. Reprinted from: Remote Sens. 2020, 12, 1212, doi:10.3390/rs12071212
Jinwoong Park, Jihoon Moon, Seungmin Jung and Eenjun Hwang. Multistep-Ahead Solar Radiation Forecasting Scheme Based on the Light Gradient Boosting Machine: A Case Study of Jeju Island. Reprinted from: Remote Sens. 2020, 12, 2271, doi:10.3390/rs12142271
Guojiang Xiong, Jing Zhang, Dongyuan Shi, Lin Zhu, Xufeng Yuan and Gang Yao. Modified Search Strategies Assisted Crossover Whale Optimization Algorithm with Selection Operator for Parameter Extraction of Solar Photovoltaic Models. Reprinted from: Remote Sens. 2019, 11, 2795, doi:10.3390/rs11232795
Alexandra I. Khalyasmaa, Stanislav A. Eroshenko, Valeriy A. Tashchilin, Hariprakash Ramachandran, Teja Piepur Chakravarthi and Denis N. Butusov. Industry Experience of Developing Day-Ahead Photovoltaic Plant Forecasting System Based on Machine Learning. Reprinted from: Remote Sens. 2020, 12, 3420, doi:10.3390/rs12203420
Ian R. Young, Ebru Kirezci and Agustinus Ribal. The Global Wind Resource Observed by Scatterometer. Reprinted from: Remote Sens. 2020, 12, 2920, doi:10.3390/rs12182920
Susumu Shimada, Jay Prakash Goit, Teruo Ohsawa, Tetsuya Kogaki and Satoshi Nakamura. Coastal Wind Measurements Using a Single Scanning LiDAR. Reprinted from: Remote Sens. 2020, 12, 1347, doi:10.3390/rs12081347
Cristina Sáez Blázquez, Pedro Carrasco García, Ignacio Martín Nieto, Miguel Ángel Maté-González, Arturo Farfán Martín and Diego González-Aguilera. Characterizing Geological Heterogeneities for Geothermal Purposes through Combined Geophysical Prospecting Methods. Reprinted from: Remote Sens. 2020, 12, 1948, doi:10.3390/rs12121948
Miktha Farid Alkadri, Francesco De Luca, Michela Turrin and Sevil Sariyildiz. A Computational Workflow for Generating A Voxel-Based Design Approach Based on Subtractive Shading Envelopes and Attribute Information of Point Cloud Data. Reprinted from: Remote Sens. 2020, 12, 2561, doi:10.3390/rs12162561

Instituto do Ma

Repositório Institucional UNIFESP

Sparsity-aware neural user behavior modeling in online interaction platforms

Author: Sankar Aravind
Publication date: 01/12/2021

Modern online platforms offer users an opportunity to participate in a variety of content-creation, social networking, and shopping activities. With the rapid proliferation of such online services, learning data-driven user behavior models is indispensable to enable personalized user experiences. Recently, representation learning has emerged as an effective strategy for user modeling, powered by neural networks trained over large volumes of interaction data. Despite their enormous potential, we encounter the unique challenge of data sparsity for a vast majority of entities, e.g., sparsity in ground-truth labels for entities and in entity-level interactions (cold-start users, items in the long-tail, and ephemeral groups). In this dissertation, we develop generalizable neural representation learning frameworks for user behavior modeling designed to address different sparsity challenges across applications. Our problem settings span transductive and inductive learning scenarios, where transductive learning models entities seen during training and inductive learning targets entities that are only observed during inference. We leverage different facets of information reflecting user behavior (e.g., interconnectivity in social networks, temporal and attributed interaction information) to enable personalized inference at scale. Our proposed models are complementary to concurrent advances in neural architectural choices and are adaptive to the rapid addition of new applications in online platforms. First, we examine two transductive learning settings: inference and recommendation in graph-structured and bipartite user-item interactions. In chapter 3, we formulate user profiling in social platforms as semi-supervised learning over graphs given sparse ground-truth labels for node attributes. 
We present a graph neural network framework that exploits higher-order connectivity structures (network motifs) to learn attributed structural roles of nodes that identify structurally similar nodes with co-varying local attributes. In chapter 4, we design neural collaborative filtering models for few-shot recommendations over user-item interactions. To address item interaction sparsity due to heavy-tailed distributions, our proposed meta-learning framework learns-to-recommend few-shot items by knowledge transfer from arbitrary base recommenders. We show that our framework consistently outperforms state-of-the-art approaches on overall recommendation (by 5% Recall) while achieving significant gains (of 60-80% Recall) for tail items with fewer than 20 interactions. Next, we explore three inductive learning settings: modeling spread of user-generated content in social networks; item recommendations for ephemeral groups; and friend ranking in large-scale social platforms. In chapter 5, we focus on diffusion prediction in social networks where a vast population of users rarely post content. We introduce a deep generative modeling framework that models users as probability distributions in the latent space with variational priors parameterized by graph neural networks. Our approach enables massive performance gains (over 150% recall) for users with sparse activities while being faster than state-of-the-art neural models by an order of magnitude. In chapter 6, we examine item recommendations for ephemeral groups with limited or no historical interactions together. To overcome group interaction sparsity, we present self-supervised learning strategies that exploit the preference co-variance in observed group memberships for group recommender training. Our framework achieves significant performance gains (over 30% NDCG) over prior state-of-the-art group recommendation models. 
In chapter 7, we introduce multi-modal inference with graph neural networks that captures knowledge from multiple feature modalities and user interactions for multi-faceted friend ranking. Our approach achieves notably higher performance gains for critical populations of less-active and low-degree users.

Learning in Dynamic Data-Streams with a Scarcity of Labels

Author: Fahy Conor
Publication venue: Faculty of Computing, Engineering and Media
Publication date: 01/01/2019

Analysing data in real-time is a natural and necessary progression from traditional data mining. However, real-time analysis presents additional challenges to batch-analysis; along with strict time and memory constraints, change is a major consideration. In a dynamic stream there is an assumption that the underlying process generating the stream is non-stationary and that concepts within the stream will drift and change over time. Adopting a false assumption that a stream is stationary will result in non-adaptive models degrading and eventually becoming obsolete. The challenge of recognising and reacting to change in a stream is compounded by the scarcity of labels problem. This refers to the very realistic situation in which the true class label of an incoming point is not immediately available (or will never be available) or in situations where manually labelling incoming points is prohibitively expensive. The goal of this thesis is to evaluate unsupervised learning as the basis for online classification in dynamic data-streams with a scarcity of labels. To realise this goal, a novel stream clustering algorithm based on the collective behaviour of ants (Ant Colony Stream Clustering (ACSC)) is proposed. This algorithm is shown to be faster and more accurate than comparative, peer stream-clustering algorithms while requiring fewer sensitive parameters. The principles of ACSC are extended in a second stream-clustering algorithm named Multi-Density Stream Clustering (MDSC). This algorithm has adaptive parameters and crucially, can track clusters and monitor their dynamic behaviour over time. A novel technique called a Dynamic Feature Mask (DFM) is proposed to "sit on top" of these stream-clustering algorithms and can be used to observe and track change at the feature level in a data stream. This Feature Mask acts as an unsupervised feature selection method allowing high-dimensional streams to be clustered. 
Finally, data-stream clustering is evaluated as an approach to one-class classification, and a novel framework (named COCEL: Clustering and One class Classification Ensemble Learning) for classification in dynamic streams with a scarcity of labels is described. The proposed framework can identify and react to change in a stream and hugely reduces the number of required labels (typically less than 0.05% of the entire stream).

De Montfort University Open Research Archive

On the Principles of Evaluation for Natural Language Generation

Author: Zhao Wei
Publication date: 01/01/2023

Natural language processing is concerned with the ability of computers to understand natural language texts, which is, arguably, one of the major bottlenecks in the course of chasing the holy grail of general Artificial Intelligence. Given the unprecedented success of deep learning technology, the natural language processing community has been almost entirely in favor of practical applications with state-of-the-art systems emerging and competing for human-parity performance at an ever-increasing pace. For that reason, fair and adequate evaluation and comparison, responsible for ensuring trustworthy, reproducible and unbiased results, have long fascinated the scientific community, not only in natural language but also in other fields. A popular example is the ISO-9126 evaluation standard for software products, which outlines a wide range of evaluation concerns, such as cost, reliability, scalability, security, and so forth. The European project EAGLES-1996, being the acclaimed extension to ISO-9126, depicted the fundamental principles specifically for evaluating natural language technologies, which underpin succeeding methodologies in the evaluation of natural language. Natural language processing encompasses an enormous range of applications, each with its own evaluation concerns, criteria and measures. This thesis cannot hope to be comprehensive but particularly addresses evaluation in natural language generation (NLG), which touches on, arguably, one of the most human-like natural language applications. In this context, research on quantifying day-to-day progress with evaluation metrics lays the foundation of the fast-growing NLG community. However, previous works have failed to address high-quality metrics in multiple scenarios such as evaluating long texts and when human references are not available, and, more prominently, these studies are limited in scope, given the lack of a holistic view sketched for principled NLG evaluation. 
In this thesis, we aim for a holistic view of NLG evaluation from three complementary perspectives, driven by the evaluation principles in EAGLES-1996: (i) high-quality evaluation metrics, (ii) rigorous comparison of NLG systems to properly track progress, and (iii) understanding evaluation metrics. To this end, we identify the current state of challenges derived from the inherent characteristics of these perspectives, and then present novel metrics, rigorous comparison approaches, and explainability techniques for metrics to address the identified issues. We hope that our work on evaluation metrics, system comparison and explainability for metrics inspires more research towards principled NLG evaluation, and contributes to fair and adequate evaluation and comparison in natural language processing.