1,020 research outputs found
Modeling wine preferences by data mining from physicochemical properties
We propose a data mining approach to predict human wine taste preferences that
is based on easily available analytical tests at the certification step. A large dataset
(when compared to other studies in this domain) is considered, with white and red
vinho verde samples (from Portugal). Three regression techniques were applied, under
a computationally efficient procedure that performs simultaneous variable and
model selection. The support vector machine achieved promising results, outperforming
the multiple regression and neural network methods. Such a model is useful
to support the oenologist's wine tasting evaluations and improve wine production.
Furthermore, similar techniques can help in target marketing by modeling consumer
tastes from niche markets.
We would like to thank Cristina Lagido and the anonymous reviewers for their helpful comments. The work of P. Cortez is supported by the FCT project PTDC/EIA/64541/2006.
Moment-based Estimation of Mixtures of Regression Models
Finite mixtures of regression models provide a flexible modeling framework
for many phenomena. Using moment-based estimation of the regression parameters,
we develop unbiased estimators with a minimum of assumptions on the mixture
components. In particular, only the average regression model for one of the
components in the mixture model is needed, with no requirements on the
distributions. The consistency and asymptotic distribution of the estimators are
derived, and the proposed method is validated through a series of simulation
studies and shown to be highly accurate. We illustrate the use of the
moment-based mixture of regression models with an application to wine quality
data. Comment: 17 pages, 3 figures
Application of machine learning to predict quality of Portuguese wine based on sensory preferences
Dissertation presented as the partial requirement for obtaining a Master's degree in Information Management, specialization in Knowledge Management and Business Intelligence.
Technology has been broadly used in the wine industry, from vineyards to purchases, whether to improve processes or to understand customers' preferences. Numerous companies are using machine learning solutions to leverage their business. The sensory properties of wines are a significant element in determining wine quality and, combined with the accuracy of predictive models attained by classification methods, could help winemakers improve their outcomes. This research proposes a supervised machine learning approach to predict the quality of Portuguese wines based on sensory characteristics such as acidity, intensity, sweetness, and tannin. Additionally, this study includes red and white wines, and implements and compares the effectiveness of three classification algorithms. The conclusions promote an understanding of the sensory characteristics that influence wine quality as perceived by customers.
Technology has been widely employed in the wine industry, from improving cultivation processes to understanding the market through the analysis of consumer preferences. Given current market dynamics, companies are gradually considering solutions that implement machine learning concepts and bring a competitive advantage to strengthen the business. Sensory properties are important elements in determining wine quality and, combined with the accuracy obtained by predictive models, can help wine producers improve their products and results.
The present study proposes building supervised learning models, based on classification algorithms, to predict the quality of Portuguese wines from sensory data perceived by consumers, such as acidity, intensity, sugar, and tannins. The research includes red and white wines, and implements and compares the effectiveness of three classification algorithms. In addition, the study makes it possible to understand how sensory data provided by consumers can determine wine quality, and to identify which characteristics contribute to the evaluation process.
Chapter Prediction of wine sensorial quality: a classification problem
When dealing with a wine, it is of interest to be able to predict its quality based on chemical and/or sensory variables. There is no agreement on what wine quality means or how it should be assessed, and it is often viewed in intrinsic (physicochemical, sensory) or extrinsic (price, prestige, context) terms (Jackson, 2017). In this paper, wine quality was evaluated by experienced judges who scored the wine on a 0-10 scale, with 0 meaning very bad and 10 excellent, so the resulting variable was categorical. The models applied to predict this variable provide the occurrence probabilities of each of its categories. Nevertheless, along with this record of probabilities, practitioners need the predicted value (category) of the variable, so the statistical problem to be addressed is how this record of probabilities is transformed into a single value. In this paper we compare the predictive performance of the default method (Bayes Classifier - BC), which assigns a unit to the most likely category, and two other methods (Maximum Difference Classifier and Maximum Ratio Classifier). The BC is the optimal criterion if one is interested in the accuracy of the classification but, given that it favors the most prevalent category, it may not be the best choice when there is no category of interest. The data under study concern the quality of the red variant of the Portuguese "Vinho Verde" wine (Cortez et al., 2009), measured on a 0-10 scale. Nevertheless, only 6 scores were used, 2 of which had very few observations, so this is the right context for predictive performance comparisons. In the study, we investigated different mergings of categories and used 11 explanatory variables to estimate the record of probabilities of the wine quality variable.
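The step of turning predicted category probabilities into a single category can be illustrated with a minimal sketch. The abstract does not define the Maximum Difference and Maximum Ratio classifiers, so the code below assumes, as their names suggest, that they compare each predicted probability with a reference probability (here the category prevalence) by difference and by ratio respectively. The function names, the three categories, and all numbers are hypothetical and are not taken from the paper.

```python
# Sketch only: the MDC/MRC rules below are ASSUMED definitions, and the
# probabilities and prevalences are made-up illustrative values.

def bayes_classifier(probs):
    """Assign the unit to the most likely category (index of the largest probability)."""
    return max(range(len(probs)), key=lambda k: probs[k])

def max_difference_classifier(probs, reference):
    """ASSUMED rule: pick the category whose probability exceeds its reference the most."""
    return max(range(len(probs)), key=lambda k: probs[k] - reference[k])

def max_ratio_classifier(probs, reference):
    """ASSUMED rule: pick the category with the largest probability-to-reference ratio."""
    return max(range(len(probs)), key=lambda k: probs[k] / reference[k])

categories = ["low", "medium", "high"]   # hypothetical merged quality classes
prevalence = [0.10, 0.80, 0.10]          # hypothetical share of each class in the data
probs = [0.15, 0.55, 0.30]               # hypothetical model output for one wine

bc = categories[bayes_classifier(probs)]
mdc = categories[max_difference_classifier(probs, prevalence)]
mrc = categories[max_ratio_classifier(probs, prevalence)]
print(bc, mdc, mrc)
```

With these made-up numbers the Bayes Classifier picks the prevalent category, while the two reference-based rules pick a rarer category whose probability is well above its prevalence, which is the kind of behavior the abstract contrasts.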
Quantifying Model Complexity via Functional Decomposition for Better Post-Hoc Interpretability
Post-hoc model-agnostic interpretation methods such as partial dependence
plots can be employed to interpret complex machine learning models. While these
interpretation methods can be applied regardless of model complexity, they can
produce misleading and verbose results if the model is too complex, especially
w.r.t. feature interactions. To quantify the complexity of arbitrary machine
learning models, we propose model-agnostic complexity measures based on
functional decomposition: number of features used, interaction strength and
main effect complexity. We show that post-hoc interpretation of models that
minimize the three measures is more reliable and compact. Furthermore, we
demonstrate the application of these measures in a multi-objective optimization
approach which simultaneously minimizes loss and complexity.
Ensemble Deep Learning
Machine learning has become a common tool within the tech industry due to its high versatility and efficiency with large datasets. Partnering with the Nevada National Security Site, our goal is to improve the accuracy of machine predictions by utilizing deep learning, which will enable the power and accuracy of a prediction to grow from the model. To build a deep learning model, multiple neural network architectures were developed and combined to create an ensemble neural network. The project's objective is to determine the comparative differences between the efficiency of the ensemble neural network and that of each individual neural network. The data set used to test, validate, and train the networks is 1D regressive. After testing the architectures and determining the accuracy of certain networks, the model will be updated and tested again to compare accuracies. Accuracy is the number of correct predictions over the total number of predictions. As model precision is a key aspect of machine learning, emphasis is placed on the efficiency of ensemble neural networks.
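The accuracy definition quoted in the abstract (correct predictions over total predictions) can be written as a short sketch. The function name and the example labels are hypothetical. The abstract does not say how a prediction on its regression data is judged "correct", so exact equality of labels is assumed here purely for illustration.

```python
# Sketch only: `accuracy` and the sample values are illustrative assumptions,
# not the project's actual code or data.

def accuracy(predictions, targets):
    """Return the number of correct predictions divided by the total number of predictions."""
    if len(predictions) != len(targets):
        raise ValueError("predictions and targets must have the same length")
    if not targets:
        raise ValueError("at least one prediction is required")
    correct = sum(p == t for p, t in zip(predictions, targets))
    return correct / len(targets)

targets = [1, 0, 1, 1, 0]              # hypothetical true labels
single_network = [1, 0, 0, 1, 1]       # hypothetical output of one network
ensemble_network = [1, 0, 0, 1, 0]     # hypothetical output of the ensemble

print(accuracy(single_network, targets))
print(accuracy(ensemble_network, targets))
```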