A survey on machine learning for recurring concept drifting data streams
The problem of concept drift has gained a lot of attention in recent years. This aspect is key in many domains exhibiting non-stationary as well as cyclic patterns and structural breaks affecting their generative processes. In this survey, we review the relevant literature to deal with regime changes in the behaviour of continuous data streams. The study starts with a general introduction to the field of data stream learning, describing recent works on passive or active mechanisms to adapt or detect concept drifts, frequent challenges in this area, and related performance metrics. Then, different supervised and unsupervised approaches, such as online ensembles, meta-learning and model-based clustering, that can be used to deal with seasonalities in a data stream are covered. The aim is to point out new research trends and give future research directions on the usage of machine learning techniques for data streams, which can help in the event of shifts and recurrences in continuous learning scenarios in near real-time.
Outsourcing Back Office Services in Small Nonprofits: Pitfalls and Possibilities
Presents findings on small nonprofits' administrative, finance, and other office support needs; reasons and conditions for outsourcing as well as barriers; methods for evaluating options; and guiding principles. Examines three business models
A survey on online active learning
Online active learning is a paradigm in machine learning that aims to select
the most informative data points to label from a data stream. The problem of
minimizing the cost associated with collecting labeled observations has gained
a lot of attention in recent years, particularly in real-world applications
where data is only available in an unlabeled form. Annotating each observation
can be time-consuming and costly, making it difficult to obtain large amounts
of labeled data. To overcome this issue, many active learning strategies have
been proposed in the last decades, aiming to select the most informative
observations for labeling in order to improve the performance of machine
learning models. These approaches can be broadly divided into two categories:
static pool-based and stream-based active learning. Pool-based active learning
involves selecting a subset of observations from a closed pool of unlabeled
data, and it has been the focus of many surveys and literature reviews.
However, the growing availability of data streams has led to an increase in the
number of approaches that focus on online active learning, which involves
continuously selecting and labeling observations as they arrive in a stream.
This work aims to provide an overview of the most recently proposed approaches
for selecting the most informative observations from data streams in the
context of online active learning. We review the various techniques that have
been proposed and discuss their strengths and limitations, as well as the
challenges and opportunities that exist in this area of research. Our review
aims to provide a comprehensive and up-to-date overview of the field and to
highlight directions for future work.
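
The stream-based setting described in this abstract can be illustrated with a minimal sketch. It assumes a simple uncertainty rule that is not taken from the survey: a label is requested only when a logistic model's predicted probability lies within a margin of 0.5. The names `run_stream`, `oracle` and `margin`, and the synthetic stream, are hypothetical and exist only for this example.

```python
import math
import random

def predict_proba(weights, x):
    """Logistic model: probability that x belongs to class 1."""
    z = sum(w * xi for w, xi in zip(weights, x))
    return 1.0 / (1.0 + math.exp(-z))

def sgd_update(weights, x, y, lr=0.5):
    """One stochastic-gradient step on the logistic loss."""
    error = predict_proba(weights, x) - y
    return [w - lr * error * xi for w, xi in zip(weights, x)]

def run_stream(stream, oracle, margin=0.2, n_features=2):
    """Process observations one at a time and query a label only
    when the prediction is within `margin` of 0.5 (assumed rule)."""
    weights = [0.0] * (n_features + 1)  # last weight acts as the bias
    queried = 0
    for x in stream:
        xb = list(x) + [1.0]            # append constant bias input
        p = predict_proba(weights, xb)
        if abs(p - 0.5) < margin:       # uncertain: pay for a label
            weights = sgd_update(weights, xb, oracle(x))
            queried += 1
        # confident observations are discarded without labelling
    return weights, queried

# Synthetic stream and a hypothetical labeller standing in for a human annotator.
rng = random.Random(0)
stream = [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(500)]
oracle = lambda x: 1 if x[0] + x[1] > 0 else 0
weights, queried = run_stream(stream, oracle)
print(f"labels requested: {queried} of {len(stream)}")
```

A fixed margin is the simplest possible choice; a wider margin requests more labels, and a narrower one requests fewer at the risk of missing informative observations.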
The sustainable delivery of sexual violence prevention education in schools
Sexual violence is a crime that cannot be ignored: it causes our communities significant consequences including heavy economic costs, and evidence of its effects can be seen in our criminal justice system, public health system, Accident Compensation Corporation (ACC), and education system, particularly in our schools. Many agencies throughout New Zealand work to end sexual violence. Auckland-based Rape Prevention Education: Whakatu Mauri (RPE) is one such agency, and is committed to preventing sexual violence by providing a range of programmes and initiatives, information, education, and advocacy to a broad range of audiences. Up until early 2014 RPE employed one or two full-time staff dedicated to co-ordinating and training a large pool (up to 15) of educators on casual contracts to deliver its main school-based programme, BodySafe (approximately 450 modules per year, delivered to some 20 high schools). Each year several of the contract educators, many of whom were tertiary students, found secure full-time employment elsewhere. Retaining sufficient contract educators to deliver its BodySafe contract meant that RPE had to recruit, induct and train new educators two to three times every year. This model was expensive, resource-intensive, and ultimately untenable. The Executive Director and core staff at RPE wanted to develop a more efficient and stable model of delivery that fitted its scarce resources. To enable RPE to know what the most efficient model was nationally and internationally, with Ministry of Justice funding, RPE commissioned Massey University to undertake this report reviewing national and international research on sexual violence prevention education (SVPE).
[Background from Executive Summary.] Rape Prevention Education: Whakatu Mauri
Online semi-supervised learning in non-stationary environments
Existing Data Stream Mining (DSM) algorithms assume the availability of labelled and
balanced data, immediately or after some delay, to extract worthwhile knowledge from the
continuous and rapid data streams. However, in many real-world applications such as
Robotics, Weather Monitoring, Fraud Detection Systems, Cyber Security, and Computer
Network Traffic Flow, an enormous amount of high-speed data is generated by Internet of
Things sensors and real-time data on the Internet. Manual labelling of these data streams
is not practical due to time consumption and the need for domain expertise. Another
challenge is learning under Non-Stationary Environments (NSEs), which occurs due to
changes in the data distributions in a set of input variables and/or class labels. The problem
of Extreme Verification Latency (EVL) under NSEs is referred to as Initially Labelled Non-Stationary Environment (ILNSE). This is a challenging task because the learning algorithms
have no access to the true class labels directly when the concept evolves. Several approaches
exist that deal with NSE and EVL in isolation. However, few algorithms address both issues
simultaneously. This research responds directly to the ILNSE challenge by proposing two
novel algorithms: the "Predictor for Streaming Data with Scarce Labels" (PSDSL) and the
Heterogeneous Dynamic Weighted Majority (HDWM) classifier. PSDSL is an Online Semi-Supervised Learning (OSSL) method for real-time DSM and is closely related to label
scarcity issues in online machine learning.
The key capabilities of PSDSL include learning from a small amount of labelled data in an
incremental or online manner and being available to predict at any time. To achieve this,
PSDSL utilises both labelled and unlabelled data to train the prediction models, meaning it
continuously learns from incoming data and updates the model as new labelled or
unlabelled data becomes available over time. Furthermore, it can predict under NSE
conditions even when class labels are scarce. PSDSL is built on top of the HDWM classifier,
which preserves the diversity of the classifiers. PSDSL and HDWM can intelligently switch
and adapt to the conditions. PSDSL moves between the learning states of self-learning,
micro-clustering and CGC, choosing whichever approach is beneficial given the characteristics of
the data stream. HDWM makes use of "seed" learners of different types in an ensemble to
maintain its diversity. An ensemble is simply a combination of predictive models
grouped to improve the predictive performance of a single classifier.
PSDSL is empirically evaluated against COMPOSE, LEVELIW, SCARGC and MClassification
on benchmarks, NSE datasets as well as Massive Online Analysis (MOA) data streams and real-world datasets. The results showed that PSDSL performed significantly better than
existing approaches on most real-time data streams including randomised data instances.
PSDSL performed significantly better than "Static", i.e. a classifier that is not updated after it is
trained with the first examples in the data stream. When applied to MOA-generated data
streams, PSDSL ranked highest (1.5) and thus performed significantly better than SCARGC,
while SCARGC performed the same as the Static. PSDSL achieved better average prediction
accuracies in a short time than SCARGC.
The HDWM algorithm is evaluated on artificial and real-world data streams against existing
well-known approaches such as the heterogeneous WMA and the homogeneous
DWM algorithm. The results showed that HDWM performed significantly better than WMA
and DWM. Also, when recurring concept drifts were present, the predictive performance of
HDWM showed an improvement over DWM. In both drift and real-world streams,
significance tests and post hoc comparisons found significant differences between
algorithms: HDWM performed significantly better than DWM and WMA when applied to
MOA data streams and four real-world datasets (Electric, Spam, Sensor and Forest Cover). The
seeding mechanism and dynamic inclusion of new base learners in the HDWM algorithms
benefit from the use of both forgetting and retaining the models. The algorithm also
provides the independence of selecting the optimal base classifier in its ensemble depending
on the problem.
A new approach, Envelope-Clustering, is introduced to resolve cluster overlap conflicts
during the cluster labelling process. In this process, PSDSL transforms the centroids'
information of micro-clusters into micro-instances and generates new clusters called
Envelopes. The nearest envelope clusters assist the conflicted micro-clusters and
successfully guide the cluster labelling process after the concept drifts in the absence of true
class labels. PSDSL has been evaluated on the real-world problem of "keystroke dynamics", and
the results show that PSDSL achieved higher prediction accuracy (85.3%) than SCARGC
(81.6%), while the performance of the Static classifier (49.0%) degrades significantly due to changes in
the users' typing patterns. Furthermore, the predictive accuracy of SCARGC was found to
fluctuate widely (from 41.1% to 81.6%) depending on the value of the parameter "k"
(number of clusters), while PSDSL automatically determines the best value for this
parameter.
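
The weighted-majority voting that underlies ensembles such as WMA and DWM can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the HDWM algorithm from the thesis: the pool of experts is fixed (no learners are added or removed), the experts are hypothetical hand-written rules rather than trained "seed" learners, and `beta` is an assumed penalty factor.

```python
def weighted_majority(experts, stream, beta=0.5):
    """Combine binary experts by weighted vote; multiply the weight of
    every expert that errs by `beta`, then renormalise the weights."""
    weights = [1.0] * len(experts)
    mistakes = 0
    for x, y in stream:
        votes = [expert(x) for expert in experts]
        score = {0: 0.0, 1: 0.0}
        for w, v in zip(weights, votes):
            score[v] += w
        prediction = 1 if score[1] >= score[0] else 0
        mistakes += int(prediction != y)
        # Penalise experts that voted for the wrong label.
        weights = [w * beta if v != y else w for w, v in zip(weights, votes)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return weights, mistakes

# Three hypothetical experts of different kinds (heterogeneous by construction).
experts = [
    lambda x: 1 if x > 0 else 0,       # threshold rule
    lambda x: 0,                       # constant rule
    lambda x: 1 if x % 2 == 0 else 0,  # parity rule
]
# Toy labelled stream: the true label is 1 exactly when x is positive.
stream = [(x, 1 if x > 0 else 0) for x in [3, -2, 5, -4, 1, -6, 2, -1]]
weights, mistakes = weighted_majority(experts, stream)
print(weights, mistakes)
```

On this toy stream the threshold rule matches the labelling function, so its weight comes to dominate the vote. A dynamic variant would additionally create new experts after ensemble errors and drop experts whose weight falls below a threshold.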
Holobiont Evolution: Mathematical Model with Vertical vs. Horizontal Microbiome Transmission
A holobiont is a composite organism consisting of a host together with its microbiome, such as a coral with its zooxanthellae. To explain the often intimate integration between hosts and their microbiomes, some investigators contend that selection operates on holobionts as a unit and view the microbiome's genes as extending the host's nuclear genome to jointly comprise a hologenome. Because vertical transmission of microbiomes is uncommon, other investigators contend that holobiont selection cannot be effective because a holobiont's microbiome is an acquired condition rather than an inherited trait. This disagreement invites a simple mathematical model to see how holobiont selection might operate and to assess its plausibility as an evolutionary force. This paper presents two variants of such a model. In one variant, juvenile hosts obtain microbiomes from their parents (vertical transmission). In the other variant, microbiomes of juvenile hosts are assembled from source pools containing the combined microbiomes of all parents (horizontal transmission). According to both variants, holobiont selection indeed causes evolutionary change in holobiont traits. Therefore, holobiont selection is plausibly an effective evolutionary force with either mode of microbiome transmission. The modeling employs two distinct concepts of inheritance, depending on the mode of microbiome transmission: collective inheritance whereby juveniles inherit a sample of the collected genomes from all parents, as contrasted with lineal inheritance whereby juveniles inherit the genomes from only their own parents. A distinction between collective and lineal inheritance also features in theories of multilevel selection.