    Application of neural networks and sensitivity analysis to improved prediction of trauma survival

    Improving customer engagement through the determinants of employee engagement

    In the field of marketing, an engagement orientation urges companies to co-create a broader range of activities with their customers and requires all employees to commit to the company and understand the value of customer engagement (Harmeling et al., 2017). Several previous studies have focused on how companies can develop customer engagement through marketing and employee engagement (Venkatesan, 2017). However, little research has examined the factors that determine the level of employee engagement in this relationship with customers. The objective of this research is to identify which organizational variables associated with employee engagement are related to, or influence, customer engagement. To achieve this goal, the present study uses Radial Basis Function Neural Networks to model the employee engagement of the selected sample. The results provide significant information on how relationships with stakeholders, and especially customer engagement, can be improved. Universidad de Málaga. Campus de Excelencia Internacional Andalucía Tech.

    Machine Learning with Sensitivity Analysis to Determine Key Factors Contributing to Energy Consumption in Cloud Data Centers

    The machine learning (ML) approach to modeling and predicting the behaviour of real-world dynamic systems has received widespread research interest. While the capability of ML to approximate any nonlinear or complex system is promising, it is often a black-box approach that conveys no physical meaning about the actual system structure and its parameters, or about their impact on the system. This paper establishes a model that explains how system parameters affect the system output(s), as such knowledge could lead to useful, interesting and novel information. The paper builds on our previous work in machine learning and combines an evolutionary artificial neural network with sensitivity analysis to extract and validate the key factors affecting cloud data center energy performance. This provides an opportunity for software analysts to design and develop energy-aware applications and for Hadoop administrators to optimize the Hadoop infrastructure by partitioning Big Data into bigger chunks and shortening the time to complete MapReduce jobs.
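
The idea of probing a black-box model with sensitivity analysis can be illustrated with a minimal sketch. It is not the paper's method: `toy_model` is a hypothetical stand-in for a trained network, the sample grid is invented, and the finite-difference score is only one common way to rank inputs by their effect on the output.

```python
import math

def toy_model(x):
    """Hypothetical stand-in for a trained network with three inputs."""
    # Input 0 has a strong effect, input 1 a weak one, input 2 none at all.
    return math.tanh(2.0 * x[0] + 0.2 * x[1] + 0.0 * x[2])

def sensitivity(model, samples, delta=1e-4):
    """Mean absolute finite-difference slope of the output for each input."""
    n_inputs = len(samples[0])
    totals = [0.0] * n_inputs
    for x in samples:
        for i in range(n_inputs):
            up = list(x)
            down = list(x)
            up[i] += delta
            down[i] -= delta
            # Central difference approximates d(output)/d(x_i) at this sample.
            totals[i] += abs(model(up) - model(down)) / (2.0 * delta)
    return [t / len(samples) for t in totals]

# A small grid of input points standing in for observed data (illustrative only).
grid = (-0.5, 0.0, 0.5)
samples = [(a, b, c) for a in grid for b in grid for c in grid]

scores = sensitivity(toy_model, samples)
# Indices of the inputs, ordered from most to least influential.
ranking = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
print(scores, ranking)
```

Averaging over many sample points matters because the slope of a nonlinear model changes across the input space; a single-point estimate could rank the inputs differently.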

    Parameter selection for and implementation of a web-based decision-support tool to predict extubation outcome in premature infants

    BACKGROUND: Approximately 30% of intubated preterm infants with respiratory distress syndrome (RDS) will fail attempted extubation, requiring reintubation and mechanical ventilation. Although ventilator technology and monitoring of premature infants have improved over time, optimal extubation remains challenging. Furthermore, extubation decisions for premature infants require complex information processing, a skill learned implicitly through clinical practice. Computer-aided decision-support tools would benefit inexperienced clinicians, especially during peak neonatal intensive care unit (NICU) census. METHODS: A five-step procedure was developed to identify predictive variables. Clinical expert (CE) thought processes comprised one model. Variables from that model were used to develop two mathematical models for the decision-support tool: an artificial neural network (ANN) and a multivariate logistic regression model (MLR). The ranking of the variables in the three models was compared using the Wilcoxon Signed Rank Test. The best performing model was used in a web-based decision-support tool, with a user interface implemented in Hypertext Markup Language (HTML) and the ANN as the underlying mathematical model. RESULTS: CEs identified 51 potentially predictive variables for extubation decisions for an infant on mechanical ventilation. Comparisons of the three models showed a significant difference between the ANN and the CE (p = 0.0006). Of the original 51 potentially predictive variables, the 13 most predictive were used to develop an ANN as a web-based decision tool. The ANN processes user-provided data and returns a prediction score between 0 and 1 together with a novelty index. The user then selects the most appropriate threshold for categorizing the prediction as a success or failure.
Furthermore, the novelty index, which indicates the similarity of the test case to the training cases, allows the user to assess the confidence level of the prediction according to how much the new data differ from the data originally used to develop the prediction tool. CONCLUSION: State-of-the-art machine-learning methods can be employed to develop sophisticated tools that aid clinicians' decisions. We identified numerous variables considered relevant for extubation decisions for mechanically ventilated premature infants with RDS. We then developed a web-based decision-support tool for clinicians that can be made widely available and could improve patient care worldwide.
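
How a thresholded score and a novelty index might work together can be sketched as follows. The abstract does not say how the tool computes its novelty index, so the nearest-training-case Euclidean distance used here is an assumption, and the function names and data values are hypothetical.

```python
import math

def novelty_index(case, training_cases):
    """Distance from `case` to its nearest training case (assumed definition)."""
    return min(math.dist(case, t) for t in training_cases)

def categorize(score, threshold):
    """Map a 0-1 prediction score to a label using a user-selected threshold."""
    return "success" if score >= threshold else "failure"

# Hypothetical, already-normalized training cases with two variables each.
training_cases = [(0.2, 0.4), (0.3, 0.5), (0.25, 0.45)]

near = novelty_index((0.25, 0.45), training_cases)  # identical to a training case
far = novelty_index((0.9, 0.1), training_cases)     # unlike anything seen in training

# The same score falls on either side of the line depending on the threshold.
lenient = categorize(0.72, threshold=0.5)
strict = categorize(0.72, threshold=0.8)
print(near, far, lenient, strict)
```

Under this assumed definition, a larger index means the new case lies further from the training data, so the corresponding prediction deserves less confidence.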

    Video Analysis and Indexing

    Non-linear dimensionality reduction of signaling networks

    Background: Systems-wide modeling and analysis of signaling networks is essential for understanding complex cellular behaviors, such as the biphasic responses to different combinations of cytokines and growth factors. For example, tumor necrosis factor (TNF) can act as a proapoptotic or prosurvival factor depending on its concentration, the current state of the signaling network and the presence of other cytokines. To understand combinatorial regulation in such systems, new computational approaches are required that can take into account non-linear interactions in signaling networks and provide tools for clustering, visualization and predictive modeling. Results: Here we extended and applied an unsupervised non-linear dimensionality reduction approach, Isomap, to find clusters of similar treatment conditions in two cell signaling networks: (I) the apoptosis signaling network in human epithelial cancer cells treated with different combinations of TNF, epidermal growth factor (EGF) and insulin and (II) a combination of signal transduction pathways stimulated by 21 different ligands, based on AfCS double ligand screen data. For the analysis of the apoptosis signaling network we used the Cytokine compendium dataset, in which the activity and concentration of 19 intracellular signaling molecules were measured to characterise the apoptotic response to TNF, EGF and insulin. By projecting the original 19-dimensional space of intracellular signals into a low-dimensional space, Isomap was able to reconstruct clusters corresponding to different cytokine treatments, which were identified with graph-based clustering. In comparison, Principal Component Analysis (PCA) and Partial Least Squares Discriminant Analysis (PLS-DA) were unable to find biologically meaningful clusters.
We also showed that by using Isomap components for supervised classification with k-nearest neighbor (k-NN) and quadratic discriminant analysis (QDA), apoptosis intensity can be predicted for different combinations of TNF, EGF and insulin. Prediction accuracy was highest when early activation time points in the apoptosis signaling network were used to predict apoptosis rates at later time points. Extended Isomap also outperformed PCA on the AfCS double ligand screen data. Isomap identified more functionally coherent clusters than PCA and captured more information in the first two components. The Isomap projection performs slightly worse when more signaling networks are analyzed, suggesting that the mapping function between cues and responses becomes increasingly non-linear when large signaling pathways are considered. Conclusion: We developed and applied an extended Isomap approach for the analysis of cell signaling networks. Potential biological applications of this method include characterization, visualization and clustering of different treatment conditions (e.g., low and high doses of TNF) in terms of the changes in intracellular signaling they induce.
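
What distinguishes Isomap from a linear method such as PCA is that it measures distances along a neighborhood graph rather than in a straight line. The sketch below covers only that geodesic-distance step of standard Isomap. It omits the final embedding step and is not the authors' extended variant; the half-circle data and the choice `k=2` are illustrative assumptions.

```python
import math

def geodesic_distances(points, k):
    """All-pairs shortest-path distances over a symmetric k-nearest-neighbor graph."""
    n = len(points)
    inf = float("inf")
    dist = [[inf] * n for _ in range(n)]
    for i in range(n):
        dist[i][i] = 0.0
        # Connect each point to its k nearest neighbors by Euclidean distance.
        neighbors = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: math.dist(points[i], points[j]),
        )[:k]
        for j in neighbors:
            d = math.dist(points[i], points[j])
            dist[i][j] = min(dist[i][j], d)
            dist[j][i] = min(dist[j][i], d)
    # Floyd-Warshall: relax every pair through every intermediate node.
    for m in range(n):
        for i in range(n):
            for j in range(n):
                if dist[i][m] + dist[m][j] < dist[i][j]:
                    dist[i][j] = dist[i][m] + dist[m][j]
    return dist

# Illustrative data: 21 points along a half circle of radius 1,
# i.e. a curved one-dimensional manifold embedded in two dimensions.
points = [(math.cos(math.pi * t / 20), math.sin(math.pi * t / 20)) for t in range(21)]
geo = geodesic_distances(points, k=2)

straight = math.dist(points[0], points[-1])  # chord across the half circle
along = geo[0][-1]                           # path that follows the curve
print(straight, along)
```

The two end points are about 2 apart in a straight line but roughly pi apart along the curve. Full Isomap would pass the geodesic matrix to classical multidimensional scaling to obtain low-dimensional coordinates.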

    Artificial Neural Networks Applied to Data Analysis

    This work describes three lines of research developed over the last five years on the application of Artificial Neural Networks (ANN) to data analysis. The fields of application are data analysis applied to addictive behaviors, survival analysis, and the study of the effect of the input variables in a neural network.
The results show, first, that ANNs can predict ecstasy consumption with a small margin of error from the answers given to a questionnaire. From an explanatory perspective, sensitivity analysis applied to the network model identified the factors associated with consumption of this substance. Second, hierarchical and sequential network models can handle survival data and in some respects outperform the model traditionally used for this purpose, the Cox regression model. Lastly, the numeric sensitivity analysis that we propose is the procedure that most accurately evaluates the importance or effect of the input variables in a Multilayer Perceptron network. In addition, the computer program Sensitivity Neural Network 1.0, developed by our team, simulates the behavior of a Multilayer Perceptron and incorporates a set of numeric and graphical procedures that have proved useful in analyzing the effect of the input variables in an ANN.

    Three essays on the use of neural networks for financial prediction

    The number of studies trying to explain the causes and consequences of economic and financial crises usually rises considerably after a banking crisis occurs. The dramatic effects of the most recent financial crisis on the real economy around the world call for a better understanding of previous crises as a way to anticipate future crisis episodes. It is precisely this objective, preventing future crises, that is the main motivation of this PhD dissertation. We identify two important mechanisms that have failed in recent years and that are closely related to the onset of the financial crisis: the assessment of the solvency of banks together with systemic risk over time, and the detection of macroeconomic imbalances in some countries, especially in Europe, which made the financial crisis evolve into a sovereign crisis. Our dissertation is made up of three different essays, each trying to go a step further in the knowledge of these mechanisms. Departamento de Economía Financiera y Contabilidad. Doctorado en Economía de la Empres

    Feature selection strategies for improving data-driven decision support in bank telemarketing

    Data mining techniques for unveiling previously undiscovered knowledge have been applied in recent years to a wide range of domains, including banking and marketing. Raw data is the basic ingredient for successfully detecting interesting patterns. A key aspect of raw data manipulation is feature engineering, which concerns the correct characterization or selection of relevant features (or variables) that are related to the target goal. This study focuses on feature engineering, aiming to uncover the features that best characterize the problem of selling long-term bank deposits through telemarketing campaigns. For the experimental setup, a case study from a Portuguese bank was addressed, spanning the period 2008-2013 and encompassing the recent global financial crisis. To assess the relevance of this problem, a novel literature analysis using text mining and the latent Dirichlet allocation algorithm was conducted, confirming the existence of a research gap for bank telemarketing. Starting from a dataset containing typical telemarketing contacts and client information, the research followed three different and complementary strategies: first, enriching the dataset with social and economic context features; then, including features related to customer lifetime value; finally, applying a divide-and-conquer strategy to split the problem into smaller, more specific sub-problems, each of which could be optimized separately. Each of the three approaches improved on previous results in terms of model metrics related to prediction performance. The relevance of the proposed features was evaluated, confirming the obtained models as credible and valuable for telemarketing campaign managers.