Using Big Data to Enhance the Bosch Production Line Performance: A Kaggle Challenge
This paper describes our approach to the Bosch production line performance
challenge run by Kaggle.com. Maximizing the production yield is at the heart of
the manufacturing industry. At the Bosch assembly line, data is recorded for
products as they progress through each stage. Data science methods are applied
to this huge data repository consisting of records of tests and measurements made
for each component along the assembly line to predict internal failures. We
found that it is possible to train a model that predicts which parts are most
likely to fail. Thus a smarter failure detection system can be built and the
parts tagged likely to fail can be salvaged to decrease operating costs and
increase profit margins.
Comment: IEEE Big Data 2016 Conference
Click-through rate prediction: a comparative study of ensemble techniques in real-time bidding
Dissertation presented as a partial requirement for obtaining the Master’s degree in Information Management, with a specialization in Business Intelligence and Knowledge Management.
Real-Time Bidding is an automated mechanism for buying and selling ads in real time that uses data collected from internet users to deliver the right audience to the best-matched advertisers. It goes beyond contextual advertising by basing bids on user data, and it differs from the sponsored search auction, where the bid price is associated with keywords. There is extensive literature on the classification and prediction of performance metrics such as click-through rate, impression rate and bidding price. However, there is limited research on applying advanced machine learning techniques, such as ensemble methods, to predicting the click-through rate of real-time bidding campaigns. This paper presents an in-depth analysis of click-through rate prediction in real-time bidding campaigns, comparing the classification results of six traditional classification models (Linear Discriminant Analysis, Logistic Regression, Regularized Regression, Decision Trees, k-Nearest Neighbors and Support Vector Machines) with two popular ensemble learning techniques (Voting and Bootstrap Aggregation). The goal of our research is to determine whether ensemble methods can accurately predict click-through rate and how they compare to standard classifiers. The results show that ensemble techniques outperformed the simple classifiers. They also highlight the excellent performance of the linear algorithms (Linear Discriminant Analysis and Regularized Regression).
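The two ensemble techniques named in the abstract, Voting and Bootstrap Aggregation, can be illustrated with a minimal sketch. This is not the dissertation's implementation. The one-feature threshold "stump", the toy rows and every function name below are assumptions made purely for illustration, and only the Python standard library is used.

```python
import random
from collections import Counter


def majority_vote(predictions):
    """Hard voting: return the most common label among base predictions.

    On a tie, the label seen first wins, so an odd number of voters is safer.
    """
    return Counter(predictions).most_common(1)[0][0]


def bootstrap_sample(rows, rng):
    """Draw len(rows) rows with replacement (one bootstrap replicate)."""
    return [rng.choice(rows) for _ in rows]


def fit_stump(rows):
    """Toy base learner (hypothetical): threshold at the midpoint of class means."""
    pos = [x for x, y in rows if y == 1]
    neg = [x for x, y in rows if y == 0]
    if not pos or not neg:
        # Degenerate replicate with a single class: predict that class always.
        constant = 1 if pos else 0
        return lambda x: constant
    pos_mean = sum(pos) / len(pos)
    neg_mean = sum(neg) / len(neg)
    threshold = (pos_mean + neg_mean) / 2
    if pos_mean >= neg_mean:
        return lambda x: 1 if x > threshold else 0
    return lambda x: 1 if x < threshold else 0


def bagging_fit(rows, n_models, seed=0):
    """Bootstrap aggregation: fit one base learner per bootstrap replicate."""
    rng = random.Random(seed)
    return [fit_stump(bootstrap_sample(rows, rng)) for _ in range(n_models)]


def bagging_predict(models, x):
    """Aggregate the base learners' predictions by majority vote."""
    return majority_vote([model(x) for model in models])


# Made-up rows of (feature value, clicked?) for demonstration only.
toy_rows = [(0.1, 0), (0.2, 0), (0.3, 0), (0.4, 0),
            (0.6, 1), (0.7, 1), (0.8, 1), (0.9, 1)]
toy_models = bagging_fit(toy_rows, n_models=25)
print(bagging_predict(toy_models, 0.05), bagging_predict(toy_models, 0.95))
```

In practice the base learners would be the classifiers listed in the abstract rather than a stump. The sketch only shows how resampling and vote aggregation fit together.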