
    Modeling wine preferences by data mining from physicochemical properties

    We propose a data mining approach to predict human wine taste preferences that is based on easily available analytical tests at the certification step. A large dataset (when compared to other studies in this domain) is considered, with white and red vinho verde samples (from Portugal). Three regression techniques were applied, under a computationally efficient procedure that performs simultaneous variable and model selection. The support vector machine achieved promising results, outperforming the multiple regression and neural network methods. Such a model is useful to support the oenologist's wine tasting evaluations and improve wine production. Furthermore, similar techniques can help in target marketing by modeling consumer tastes from niche markets.
    We would like to thank Cristina Lagido and the anonymous reviewers for their helpful comments. The work of P. Cortez is supported by the FCT project PTDC/EIA/64541/2006.

    Moment-based Estimation of Mixtures of Regression Models

    Finite mixtures of regression models provide a flexible modeling framework for many phenomena. Using moment-based estimation of the regression parameters, we develop unbiased estimators with a minimum of assumptions on the mixture components. In particular, only the average regression model for one of the mixture components is needed, with no requirements on the distributions. The consistency and asymptotic distribution of the estimators are derived, and the proposed method is validated through a series of simulation studies and is shown to be highly accurate. We illustrate the use of the moment-based mixture of regression models with an application to wine quality data.
    Comment: 17 pages, 3 figures

    Application of machine learning to predict quality of Portuguese wine based on sensory preferences

    Dissertation presented as the partial requirement for obtaining a Master's degree in Information Management, specialization in Knowledge Management and Business Intelligence.
    Technology has been broadly used in the wine industry, from vineyards to purchases, improving processes and helping to understand customers' preferences. Numerous companies are using machine learning solutions to strengthen their business. The sensory properties of wines are a significant element in determining wine quality and, combined with the accuracy of predictive models attained by classification methods, could help winemakers improve their outcomes. This research proposes a supervised machine learning approach to predict the quality of Portuguese wines based on sensory characteristics such as acidity, intensity, sweetness, and tannin. The study includes red and white wines, and implements and compares the effectiveness of three classification algorithms. The conclusions promote an understanding of the sensory characteristics that influence wine quality as perceived by customers.
    Portuguese abstract (translated): Technology has been widely employed in the wine industry, from improvements in cultivation processes to understanding the market through the analysis of consumer preferences. Given current market dynamics, companies are gradually considering solutions that implement machine learning concepts and bring a competitive advantage to strengthen the business. Sensory properties are therefore important elements in determining wine quality which, combined with the accuracy obtained by predictive models, can help wine producers improve products and results. This study proposes the development of supervised learning models, based on classification algorithms, to predict the quality of Portuguese wines from sensory data perceived by consumers, such as acidity, intensity, sugar, and tannins. The research includes red and white wines, and implements and compares the effectiveness of three classification algorithms. In addition, the study makes it possible to understand how sensory data provided by consumers can determine wine quality, and which characteristics contribute to the evaluation process.

    Chapter Prediction of wine sensorial quality: a classification problem

    When dealing with a wine, it is of interest to be able to predict its quality based on chemical and/or sensory variables. There is no agreement on what wine quality means or how it should be assessed, and it is often viewed in intrinsic (physicochemical, sensory) or extrinsic (price, prestige, context) terms (Jackson, 2017). In this paper, the wine quality was evaluated by experienced judges who scored the wine on a 0-10 scale, with 0 meaning very bad and 10 excellent, so the resulting variable was categorical. The models applied to predict this variable provide the occurrence probabilities of each of its categories. Nevertheless, alongside this record of probabilities, practitioners need the predicted value (category) of the variable, so the statistical problem to be addressed is how this record of probabilities is transformed into a single value. In this paper we compare the predictive performance of the default method (Bayes Classifier - BC), which assigns a unit to the most likely category, with that of two other methods (Maximum Difference Classifier and Maximum Ratio Classifier). The BC is the optimal criterion if one is interested in the accuracy of the classification but, given that it favors the prevalent category most, it may not be the best choice when there is no single category of interest. The data under study concern the quality of the red variant of the Portuguese "Vinho Verde" wine (Cortez et al., 2009), measured on a 0-10 scale. Nevertheless, only 6 scores were used, 2 of them with very few observations, so this is the right context for predictive performance comparisons. In the study, we investigated different mergings of categories and used 11 explanatory variables to estimate the record of probabilities of the wine quality variable.
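    The difference between the three assignment rules can be illustrated with a minimal Python sketch. The abstract defines only the Bayes Classifier; the sketch assumes that the Maximum Difference and Maximum Ratio classifiers compare each predicted probability with the category's prevalence (by difference and by ratio, respectively). The function names, prevalences, and probabilities are hypothetical and are not taken from the paper.

```python
def bayes_classifier(probs):
    """Assign the category with the highest predicted probability."""
    return max(range(len(probs)), key=lambda k: probs[k])


def max_difference_classifier(probs, prevalence):
    """Assumed rule: largest gap between predicted probability and prevalence."""
    return max(range(len(probs)), key=lambda k: probs[k] - prevalence[k])


def max_ratio_classifier(probs, prevalence):
    """Assumed rule: largest ratio of predicted probability to prevalence."""
    return max(range(len(probs)), key=lambda k: probs[k] / prevalence[k])


# Hypothetical three-category problem with one dominant category (index 1).
prevalence = [0.05, 0.80, 0.15]
# Hypothetical predicted probabilities for a single wine.
probs = [0.15, 0.50, 0.35]

print(bayes_classifier(probs))                       # 1: the prevalent category
print(max_difference_classifier(probs, prevalence))  # 2: 0.35 - 0.15 is the largest gap
print(max_ratio_classifier(probs, prevalence))       # 0: 0.15 / 0.05 is the largest ratio
```

    In this toy case the Bayes rule picks the prevalent category, while the two prevalence-adjusted rules each favor a rarer category, which is the behavior the abstract contrasts.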

    Quantifying Model Complexity via Functional Decomposition for Better Post-Hoc Interpretability

    Post-hoc model-agnostic interpretation methods such as partial dependence plots can be employed to interpret complex machine learning models. While these interpretation methods can be applied regardless of model complexity, they can produce misleading and verbose results if the model is too complex, especially w.r.t. feature interactions. To quantify the complexity of arbitrary machine learning models, we propose model-agnostic complexity measures based on functional decomposition: number of features used, interaction strength and main effect complexity. We show that post-hoc interpretation of models that minimize the three measures is more reliable and compact. Furthermore, we demonstrate the application of these measures in a multi-objective optimization approach which simultaneously minimizes loss and complexity
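    Of the three measures, the number of features used is the simplest to sketch. The Python example below is one possible model-agnostic estimate, assuming a feature counts as "used" if replacing its value with a value from another data row ever changes the prediction. The helper name `count_features_used`, the toy model, and the data are hypothetical, and the procedure is not claimed to match the paper's exact estimator.

```python
import random


def count_features_used(predict, rows, n_trials=50, seed=0):
    """Count features whose replacement changes the prediction at least once."""
    rng = random.Random(seed)
    n_features = len(rows[0])
    used = 0
    for j in range(n_features):
        for _ in range(n_trials):
            base = list(rng.choice(rows))
            donor = rng.choice(rows)
            changed = list(base)
            changed[j] = donor[j]  # swap in feature j from another row
            if predict(changed) != predict(base):
                used += 1
                break  # one observed change is enough for this feature
    return used


def toy_model(x):
    """Hypothetical model: a main effect, an interaction, and an ignored x[3]."""
    return 2.0 * x[0] + x[1] * x[2]


rows = [[float(i), float(i % 3), float(i % 5), float(i % 7)] for i in range(20)]
print(count_features_used(toy_model, rows))  # 3: the fourth feature is never used
```

    The estimate treats the model purely as a prediction function, so the same check applies to any learner.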

    Ensemble Deep Learning

    Machine learning has become a common tool within the tech industry due to its high versatility and efficiency with large datasets. Partnering with the Nevada National Security Site, our goal is to improve accuracy of machine predictions by utilizing deep learning, which will enable the power and accuracy of a prediction to grow from the model. To build a deep learning model, multiple neural network architectures were developed and combined to create an ensemble neural network. The project’s objective is to determine the comparative differences between the efficiency of the ensemble neural network versus each individual neural network. The data set used to test, validate, and train the networks is 1D regressive. After testing architecture and determining accuracy of certain networks, the model will be updated and tested again to compare accuracies. Accuracy is the number of correct predictions over the total number of predictions. As model precision is a key aspect of machine learning, emphasis is placed on the efficiency of ensemble neural networks
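    The comparison between an ensemble and its individual members can be sketched in Python as follows. The abstract does not say how the networks are combined or how a regression output is judged "correct", so the sketch assumes simple averaging of member outputs and a hypothetical tolerance `tol`; the member models are stand-in functions rather than neural networks.

```python
def ensemble_predict(models, x):
    """Combine member models by averaging their outputs (assumed scheme)."""
    return sum(m(x) for m in models) / len(models)


def accuracy(predictions, targets, tol=0.5):
    """Correct predictions over total predictions; 'correct' means within tol."""
    correct = sum(1 for p, t in zip(predictions, targets) if abs(p - t) <= tol)
    return correct / len(targets)


# Hypothetical members: each approximates y = 2x with a different bias.
members = [
    lambda x: 2.0 * x + 1.0,
    lambda x: 2.0 * x - 1.0,
    lambda x: 2.0 * x + 0.3,
]

xs = [0.0, 1.0, 2.0, 3.0]
targets = [2.0 * x for x in xs]

ensemble_preds = [ensemble_predict(members, x) for x in xs]
first_member_preds = [members[0](x) for x in xs]

print(accuracy(ensemble_preds, targets))      # 1.0: the biases largely cancel
print(accuracy(first_member_preds, targets))  # 0.0: a bias of 1.0 exceeds tol
```

    Averaging helps here only because the hypothetical biases point in different directions; members that share the same error would not benefit in this way.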