Automated Model Selection with AMSF in a production process of the automotive industry
Machine learning, statistics and knowledge engineering provide a broad variety of supervised learning algorithms for classification. In this paper we introduce the Automated Model Selection Framework (AMSF), which presents automatic and semi-automatic methods to select classifiers. To achieve this we split the selection process into three distinct phases. Two of those select algorithms by static rules which are derived from a manually created knowledge base. At this stage of AMSF the user can choose between different rankers in the third phase. Currently, we use instance-based learning and a scoring scheme for ranking the classifiers. After evaluation of different rankers we will recommend the most successful one to the user by default. Besides describing the architecture and design issues, we additionally point out the versatile ways AMSF is applied in a production process of the automotive industry.
Knowledge Search within a Company-WIKI
The use of Wikis for knowledge management within a company is only of value if the stored information can be found easily. The fundamental characteristic of a Wiki, its easy and informal usage, results in large amounts of steadily changing, unstructured documents. The widely used full-text search often provides search results of insufficient accuracy. In this paper, we present an approach likely to improve search quality through the use of Semantic Web, Text Mining, and Case-Based Reasoning (CBR) technologies. Search results are more precise and complete because, in contrast to full-text search, the proposed knowledge-based search operates on the semantic layer.
Constraining the Search Space in Temporal Pattern Mining
Agents in dynamic environments have to deal with complex situations including various temporal interrelations of actions and events. Discovering frequent patterns in such scenes can be useful for creating rules that predict future activities or situations. We present the algorithm MiTemP, which learns frequent patterns based on a time-interval-based relational representation. Additionally, the problem has been transferred to a pure relational association rule mining task which can be handled by WARMR. The two approaches are compared in a number of experiments. The experiments show the advantage of avoiding the creation of impossible or redundant patterns with MiTemP. While fewer patterns have to be explored on average with MiTemP, more frequent patterns are found at an earlier refinement level.
From Personal Memories to Sharable Memories
The exchange of personal experiences is a way of supporting decision making and interpersonal communication. In this article, we discuss how augmented personal memories could be exploited in order to support such a sharing. We start with a brief summary of a system implementing an augmented memory for a single user. Then, we exploit results from interviews to define an example scenario involving sharable memories. This scenario serves as background for a discussion of various questions related to sharing memories and potential approaches to their solution. We especially focus on the selection of relevant experiences and sharing partners, sharing methods, and the configuration of those sharing methods by means of reflection.
Crime Pattern Detection Using Data Mining
Can crimes be modeled as data mining problems? We try to answer this question in this paper. Crimes are a social nuisance and cost our society dearly in several ways. Any research that can help in solving crimes faster will pay for itself. Here we look at the use of a clustering algorithm in a data mining approach to help detect crime patterns and speed up the process of solving crime. We look at k-means clustering with some enhancements to aid in the identification of crime patterns. We apply these techniques to real crime data from a sheriff’s office and validate our results. We also use a semi-supervised learning technique for knowledge discovery from the crime records and to help increase the predictive accuracy. We also developed a weighting scheme for attributes to deal with limitations of various out-of-the-box clustering tools and techniques. This easy-to-implement machine learning framework works with the geo-spatial plot of crime and helps to improve the productivity of detectives and other law enforcement officers. It can also be applied to counter-terrorism for homeland security.
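The abstract mentions k-means combined with an attribute weighting scheme but gives no details. As a minimal sketch of that general idea (not the authors' actual method), the following pure-Python code runs ordinary k-means with a weighted Euclidean distance. The function names, the weight values, and the choice of the first k records as initial centroids are all assumptions made for illustration.

```python
import math

def weighted_distance(a, b, weights):
    """Euclidean distance in which each attribute is scaled by a weight.
    The weights are hypothetical; the abstract does not say how they are chosen."""
    return math.sqrt(sum(w * (x - y) ** 2 for x, y, w in zip(a, b, weights)))

def weighted_kmeans(points, k, weights, iterations=20):
    """Standard k-means loop, except that assignment uses weighted_distance.
    Assumption: the first k points serve as initial centroids (keeps runs deterministic)."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each record joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: weighted_distance(p, centroids[c], weights))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids
```

Setting a weight to zero makes the clustering ignore that attribute, so the same records can be grouped differently depending on which attributes an analyst considers relevant.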
The effects of topic familiarity on user search behavior in question answering systems
This paper reports on experiments that attempt to characterize the relationship between users and their knowledge of the search topic in a Question Answering (QA) system. It also investigates user search behavior with respect to the length of answers presented by a QA system. Two lengths of answers were compared: snippets (one to two sentences of text) and exact answers. In a user test, 44 participants judged 92 factoid questions to explore the participants’ preferences, feelings and opinions about QA system tasks. The conclusions drawn from the results were that participants preferred the snippets set and obtained higher accuracy in finding answers from it. However, accuracy varied according to users’ topic familiarity: users were only substantially helped by the wider context of a snippet if they were already familiar with the topic of the question. Without such familiarity, users were about as accurate at locating answers from the snippets as they were from the exact-answer set.
Designing Semantic Kernels as Implicit Superconcept Expansions
Recently, there has been an increased interest in the exploitation of background knowledge in the context of text mining tasks, especially text classification. At the same time, kernel-based learning algorithms like Support Vector Machines have become a dominant paradigm in the text mining community. Amongst other reasons, this is due to their capability to achieve more accurate learning results by replacing the standard linear kernel (bag-of-words) with customized kernel functions which incorporate additional a priori knowledge. In this paper we propose a new approach to the design of ‘semantic smoothing kernels’ by means of an implicit superconcept expansion using well-known measures of term similarity. The experimental evaluation on two different datasets indicates that our approach consistently improves performance in situations where (i) training data is scarce or (ii) the bag-of-words representation is too sparse to build stable models when using the linear kernel.
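To illustrate what a semantic smoothing kernel does in general, the sketch below compares the plain linear kernel with a kernel of the form k(x, y) = xᵀQy, where Q holds pairwise term similarities. This is a generic textbook form, not the superconcept construction proposed in the paper; the three-word vocabulary and the similarity values are invented for the example, and Q is assumed to be symmetric positive semidefinite so that k remains a valid kernel.

```python
def linear_kernel(x, y):
    """Standard bag-of-words kernel: the dot product of two term vectors."""
    return sum(a * b for a, b in zip(x, y))

def smoothed_kernel(x, y, similarity):
    """Semantic smoothing kernel k(x, y) = x^T Q y.
    `similarity` is the term-by-term matrix Q (hypothetical values here)."""
    return sum(
        x[i] * similarity[i][j] * y[j]
        for i in range(len(x))
        for j in range(len(y))
    )

# Hypothetical vocabulary: ["car", "automobile", "banana"]
Q = [
    [1.0, 0.8, 0.0],
    [0.8, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]
doc_a = [1, 0, 0]  # mentions only "car"
doc_b = [0, 1, 0]  # mentions only "automobile"

print(linear_kernel(doc_a, doc_b))       # no shared terms
print(smoothed_kernel(doc_a, doc_b, Q))  # related terms now contribute
```

The two documents share no terms, so the linear kernel sees them as unrelated, while the smoothed kernel credits the similarity between "car" and "automobile". This is the effect that helps when bag-of-words vectors are sparse.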
Users' effectiveness and satisfaction for image retrieval
This paper presents results from an initial user study exploring the relationship between system effectiveness, as quantified by traditional measures such as precision and recall, and users’ effectiveness and satisfaction with the results. The study uses recall-based image-finding tasks. It was concluded that no direct relationship between system effectiveness and users’ performance could be proven (as shown by previous research). People learn to adapt to a system regardless of its effectiveness. This study recommends that a combination of attributes (e.g. system effectiveness, user performance and satisfaction) is a more effective way to evaluate interactive retrieval systems. Results of this study also reveal that users are more concerned with accuracy than coverage of the search results.
FolkRank: A Ranking Algorithm for Folksonomies
In social bookmark tools users are setting up lightweight conceptual structures called folksonomies. Currently, the information retrieval support is limited. We present a formal model and a new search algorithm for folksonomies, called FolkRank, that exploits the structure of the folksonomy. The proposed algorithm is also applied to find communities within the folksonomy and is used to structure search results. All findings are demonstrated on a large-scale dataset. A long version of this paper has been published at the European Semantic Web Conference 2006.
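The abstract does not spell out how FolkRank uses the folksonomy structure. One plausible reading, sketched here as an assumption rather than as the published algorithm, is a PageRank-style weight spreading over an undirected graph whose nodes are users, tags and resources, with a preference vector that biases the ranking toward a query. The graph construction, the damping value and all function names below are illustrative choices.

```python
from collections import defaultdict

def build_graph(assignments):
    """Turn (user, tag, resource) triples into an undirected weighted graph.
    Each triple links its three elements pairwise (an assumed construction)."""
    graph = defaultdict(lambda: defaultdict(float))
    for user, tag, resource in assignments:
        nodes = (("u", user), ("t", tag), ("r", resource))
        for a in nodes:
            for b in nodes:
                if a != b:
                    graph[a][b] += 1.0
    return graph

def spread_weights(graph, preference, damping=0.7, iterations=50):
    """Iterate rank = damping * (spread along edges) + (1 - damping) * preference.
    `preference` maps nodes to non-negative weights; it is normalised to sum to 1."""
    nodes = list(graph)
    total = sum(preference.get(n, 0.0) for n in nodes)
    if total == 0:
        pref = {n: 1.0 / len(nodes) for n in nodes}
    else:
        pref = {n: preference.get(n, 0.0) / total for n in nodes}
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        new_rank = {n: (1.0 - damping) * pref[n] for n in nodes}
        for n in nodes:
            out_weight = sum(graph[n].values())
            for neighbour, weight in graph[n].items():
                new_rank[neighbour] += damping * rank[n] * weight / out_weight
        rank = new_rank
    return rank
```

Putting all preference weight on one tag node ranks the users and resources connected to that tag above unrelated ones, which is the kind of topic-specific ranking the abstract describes for structuring search results.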
Integration of Quality Data for Production Plants
Automated production plants increasingly use sensor systems to monitor product quality and to ensure it on the basis of collected process data. The heterogeneity of the sensors integrated at different points in the process calls for an approach that makes integration simple. The goal of the integration is a view of quality prepared for different roles, which also includes feedback for fault deduction. This experience report presents the approach to the syntactic and semantic integration of quality data developed in the BridgeIT project. In particular, the approach allows new sensor systems to be connected easily.