2,316 research outputs found

    A Comprehensive Survey on Rare Event Prediction

    Rare event prediction involves identifying and forecasting low-probability events using machine learning and data analysis. Because the data distributions are imbalanced, with common events vastly outnumbering rare ones, it requires specialized methods at each step of the machine learning pipeline, from data processing to algorithms to evaluation protocols. Predicting the occurrence of rare events is important for real-world applications, such as Industry 4.0, and is an active research area in statistics and machine learning. This paper comprehensively reviews the current approaches to rare event prediction along four dimensions: rare event data, data processing, algorithmic approaches, and evaluation approaches. Specifically, we consider 73 datasets from different modalities (i.e., numerical, image, text, and audio), four major categories of data processing, five major algorithmic groupings, and two broader evaluation approaches. This paper aims to identify gaps in the current literature and highlight the challenges of predicting rare events. It also suggests potential research directions, which can help guide practitioners and researchers. Comment: 44 pages
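    The abstract's point that imbalanced data calls for specialized evaluation can be illustrated with a minimal sketch. The function names and the toy data below are assumptions made for illustration and are not taken from the survey; the sketch only shows why plain accuracy can mislead when positives are rare.

```python
# Minimal sketch (hypothetical helper names, toy data): accuracy versus
# precision/recall/F1 when the positive class is rare.

def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels where 1 marks the rare event."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn


def evaluate(y_true, y_pred):
    """Compute accuracy, precision, recall, and F1; undefined ratios become 0.0."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Toy data: 10 rare events among 1000 samples, and a trivial model
# that always predicts "no event".
y_true = [1] * 10 + [0] * 990
y_pred = [0] * 1000
scores = evaluate(y_true, y_pred)
print(scores)
```

    On this toy data the always-negative model scores high on accuracy while detecting none of the rare events, which is the kind of gap that class-aware metrics are meant to expose.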

    Ensemble Multi-Quantiles: Adaptively Flexible Distribution Prediction for Uncertainty Quantification

    We propose a novel, succinct, and effective approach to quantifying uncertainty in machine learning. It incorporates adaptively flexible distribution prediction for $\mathbb{P}(\mathbf{y}|\mathbf{X}=x)$ in regression tasks. To predict this conditional distribution, its quantiles at probability levels spread over the interval $(0,1)$ are boosted by additive models that we design with intuition and interpretability in mind. We seek an adaptive balance between structural integrity and flexibility for $\mathbb{P}(\mathbf{y}|\mathbf{X}=x)$: a Gaussian assumption lacks flexibility for real data, while highly flexible approaches (e.g., estimating the quantiles separately without a distribution structure) inevitably have drawbacks and may not generalize well. The proposed ensemble multi-quantiles approach, called EMQ, is entirely data-driven and can gradually depart from the Gaussian and discover the optimal conditional distribution during boosting. On extensive regression tasks from UCI datasets, we show that EMQ achieves state-of-the-art performance compared with many recent uncertainty quantification methods. Visualization results further illustrate the necessity and the merits of such an ensemble model.
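    As background for the quantile-based prediction described above, the sketch below shows the standard pinball (quantile) loss that quantile estimators commonly minimize. This is an assumption about the general setting, not the EMQ method itself: the abstract does not state which loss EMQ uses, and the function names and toy data are hypothetical.

```python
# Minimal sketch (hypothetical names, toy data): the pinball loss and a
# brute-force search for the best constant quantile prediction.

def pinball_loss(y, q_pred, tau):
    """Pinball loss of predicting q_pred for target y at probability level tau."""
    diff = y - q_pred
    # Under-prediction is weighted by tau, over-prediction by (1 - tau).
    return max(tau * diff, (tau - 1) * diff)


def mean_pinball_loss(ys, q_pred, tau):
    """Average pinball loss of a single constant prediction over a sample."""
    return sum(pinball_loss(y, q_pred, tau) for y in ys) / len(ys)


def best_constant_quantile(ys, tau):
    """Return the sample value that minimizes the mean pinball loss at level tau."""
    return min(ys, key=lambda q: mean_pinball_loss(ys, q, tau))


data = [1, 2, 3, 4, 100]
median_like = best_constant_quantile(data, 0.5)  # minimizer at level 0.5
upper_like = best_constant_quantile(data, 0.9)   # minimizer at level 0.9
print(median_like, upper_like)
```

    Minimizing this loss at several levels of tau yields several quantiles of the same distribution, which is the sense in which a set of quantiles can describe a conditional distribution without a Gaussian assumption.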

    Neural network methods for one-to-many multi-valued mapping problems

    This work investigates the applicability of neural network-based methods to predicting the values of multiple parameters given the value of a single parameter within a particular problem domain. In this context, the input parameter may be an important source of variation that is related, through a complex mapping function, to the remaining sources of variation within a multivariate distribution. Defining the relationship between the variables of a multivariate distribution and a single source of variation allows the values of multiple variables to be estimated from the value of the single variable, thereby addressing an ill-conditioned one-to-many mapping problem. As part of our investigation, two problem domains are considered: predicting the values of individual stock shares given the value of the general index, and predicting the grades received by high school pupils given the grade for a single course or the average grade. We compare the performance of standard neural network-based methods, in particular multilayer perceptrons (MLPs), radial basis functions (RBFs), and mixture density networks (MDNs), and of a latent variable method, the generative topographic mapping (GTM). According to the results, MLPs and RBFs outperform MDNs and the GTM on these one-to-many mapping problems.

    Cybersecurity in Educational Networks

    The paper discusses the possible impact of the digital space on humans, as well as human-related directions in cybersecurity analysis in education: levels of cybersecurity, the role of social engineering in the cybersecurity of education, and "cognitive vaccination". "A human" is considered in the general sense, mainly as a learner. The analysis draws on the experience of the hybrid war in Ukraine, which has demonstrated a shift in the target of military operations from military personnel and critical infrastructure to humans in general. Young people are a vulnerable group that may be the main target of cognitive operations in the long term, and they are the weakest link of the system.