Handling Concept Drift for Predictions in Business Process Mining
Predictive services nowadays play an important role across all business sectors. However, deployed machine learning models are challenged by data streams that change over time, a phenomenon described as concept drift. Concept drift can strongly affect the prediction quality of models. Therefore, concept drift is usually handled by retraining the model. However, current research lacks a recommendation on which data should be selected for retraining the machine learning model. Therefore, we systematically analyze different data selection strategies in this work. Subsequently, we instantiate our findings on a use case in process mining that is strongly affected by concept drift. We show that concept drift handling improves accuracy from 0.5400 to 0.7010. Furthermore, we depict the effects of the different data selection strategies.
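The abstract does not name the data selection strategies it compares. As a rough illustration only, the sketch below shows three commonly discussed options for choosing retraining data once drift has been detected. The function names, the `drift_index` parameter, and the sample layout are assumptions made for this example and are not taken from the paper.

```python
from typing import List, Sequence, Tuple

# Hypothetical sample layout: (feature value, label).
Sample = Tuple[float, int]


def select_all(history: Sequence[Sample], drift_index: int) -> List[Sample]:
    """Retrain on the full history, ignoring where the drift occurred."""
    return list(history)


def select_after_drift(history: Sequence[Sample], drift_index: int) -> List[Sample]:
    """Retrain only on samples observed at or after the detected drift point."""
    return list(history[drift_index:])


def select_last_n(history: Sequence[Sample], drift_index: int, n: int = 3) -> List[Sample]:
    """Retrain on the n most recent samples (a fixed-size sliding window)."""
    return list(history[-n:]) if n > 0 else []


if __name__ == "__main__":
    # Toy stream: the labelling rule flips at index 4 (assumed drift point).
    stream = [(0.1, 0), (0.2, 0), (0.8, 1), (0.9, 1), (0.1, 1), (0.9, 0)]
    for strategy in (select_all, select_after_drift, select_last_n):
        print(strategy.__name__, len(strategy(stream, 4)))
```

Which strategy works best depends on how abrupt the drift is and how much post-drift data is available; the sketch only makes the choice explicit.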
Types of cost in inductive concept learning
Inductive concept learning is the task of learning to assign cases to a discrete set of classes. In real-world applications of concept learning, there are many different types of cost involved. The majority of the machine learning literature ignores all types of cost (unless accuracy is interpreted as a type of cost measure). A few papers have investigated the cost of misclassification errors. Very few papers have examined the many other types of cost. In this paper, we attempt to create a taxonomy of the different types of cost that are involved in inductive concept learning. This taxonomy may help to organize the literature on cost-sensitive learning. We hope that it will inspire researchers to investigate all types of cost in inductive concept learning in more depth.
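To make the distinction between accuracy and misclassification cost concrete, here is a minimal sketch that scores predictions against a cost matrix. The function name and the example costs are assumptions chosen for illustration; the paper's taxonomy covers many cost types beyond this one.

```python
from typing import Sequence


def total_misclassification_cost(
    y_true: Sequence[int],
    y_pred: Sequence[int],
    cost: Sequence[Sequence[float]],
) -> float:
    """Sum cost[t][p] over all cases, where t is the true class and p the prediction."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    return sum(cost[t][p] for t, p in zip(y_true, y_pred))


if __name__ == "__main__":
    # Assumed costs: predicting class 0 for a true class-1 case costs 5,
    # the opposite mistake costs 1, and correct predictions cost nothing.
    cost_matrix = [[0, 1], [5, 0]]
    y_true = [0, 1, 1, 0]
    y_pred = [0, 0, 1, 1]
    print(total_misclassification_cost(y_true, y_pred, cost_matrix))
```

Both mistakes in the toy data count equally toward accuracy, yet they contribute very different amounts to the total cost.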
Automated user modeling for personalized digital libraries
Digital libraries (DL) have become one of the most typical ways of accessing any kind of digitized information. Due to this key role, users welcome any improvements to the services they receive from digital libraries. One trend used to improve digital services is personalization. Up to now, the most common approach to personalization in digital libraries has been user-driven. Nevertheless, the design of efficient personalized services has to be done, at least in part, automatically. In this context, machine learning techniques automate the process of constructing user models. This paper proposes a new approach to constructing digital libraries that satisfy users' need for information: Adaptive Digital Libraries, libraries that automatically learn user preferences and goals and personalize their interaction using this information.
Classification of information systems research revisited: A keyword analysis approach
A number of studies have previously been conducted on keyword analysis in order to provide a comprehensive scheme to classify information systems (IS) research. However, these studies appeared prior to 1994, and IS research has clearly developed substantially since then with the emergence of areas such as electronic commerce, electronic government, electronic health and numerous others. Furthermore, the majority of European IS outlets - such as the European Journal of Information Systems and Information Systems Journal - were founded in the early 1990s, and keywords from these journals were not included in any previous work. Given that a number of studies have raised the issue of differences in European and North American IS research topics and approaches, it is arguable that any such analysis must consider sources from both locations to provide a representative and balanced view of IS classification. Moreover, it has also been argued that there is a need for further work in order to create a comprehensive keyword classification scheme reflecting the current state of the art. Consequently, the aim of this paper is to present the results of a keyword analysis utilizing keywords appearing in major peer-reviewed IS publications after 1990 through to 2007. This aim is realized by means of the following two objectives: (1) collect all keywords appearing in 24 peer-reviewed IS journals after 1990; and (2) identify keywords not included in the previous IS keyword classification scheme. This paper also describes further research required in order to place new keywords in appropriate IS research categories. The paper makes an incremental contribution toward a contemporary means of classifying IS research. This work is important and useful for researchers in understanding the area and evolution of the IS field and also has implications for improving information search and retrieval activities.
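The second objective, identifying keywords absent from the earlier classification scheme, amounts to a set difference after normalization. The sketch below is one simple way to express that step. The normalization rule (lowercasing and collapsing whitespace) and the sample keywords are assumptions for illustration, not the authors' procedure or data.

```python
from typing import Iterable, List


def normalize(keyword: str) -> str:
    """Lowercase a keyword and collapse runs of whitespace."""
    return " ".join(keyword.lower().split())


def new_keywords(collected: Iterable[str], existing_scheme: Iterable[str]) -> List[str]:
    """Return the sorted, normalized keywords that do not appear in the existing scheme."""
    known = {normalize(k) for k in existing_scheme}
    return sorted({normalize(k) for k in collected} - known)


if __name__ == "__main__":
    # Hypothetical inputs, used only to exercise the function.
    collected = ["Electronic Commerce", "decision  support", "e-government"]
    scheme = ["Decision Support"]
    print(new_keywords(collected, scheme))
```

A real analysis would likely also need to handle synonyms, plurals, and spelling variants, which exact matching does not capture.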
Survey of data mining approaches to user modeling for adaptive hypermedia
The ability of an adaptive hypermedia system to create tailored environments depends mainly on the amount and accuracy of information stored in each user model. Some of the difficulties that user modeling faces are the amount of data available to create user models, the adequacy of the data, the noise within that data, and the necessity of capturing the imprecise nature of human behavior. Data mining and machine learning techniques have the ability to handle large amounts of data and to process uncertainty. These characteristics make these techniques suitable for automatic generation of user models that simulate human decision making. This paper surveys different data mining techniques that can be used to efficiently and accurately capture user behavior. The paper also presents guidelines that show which techniques may be used more efficiently according to the task implemented by the application.
The category proliferation problem in ART neural networks
This article describes the design of a new model, IKMART, for classification of documents and their incorporation into categories based on the KMART architecture. The architecture consists of two networks that mutually cooperate through the interconnection of weights and the output matrix of the coded documents. The architecture retains required network features such as incremental learning without the need of descriptive and input/output fuzzy data, learning acceleration and classification of documents and a minimal number of user-defined parameters. The conducted experiments with real documents showed a more precise categorization of documents and higher classification performance in comparison to the classic KMART algorithm.