
    A review of homogenous ensemble methods on the classification of breast cancer data

    In recent decades, data mining technology has emerged to assist humankind in making relevant decisions. Data mining comprises methods established by computer scientists for secure and reliable classification and inference from data. In the medical field, data mining methods can assist in various diagnoses, including breast cancer. As the field has evolved, ensemble methods have been proposed to achieve better classification performance. This technique combines multiple classifiers within a single model. This review of homogeneous ensemble methods for breast cancer classification was carried out to assess their overall performance. The results of the reviewed ensemble techniques, such as Random Forest and XGBoost, show that ensemble methods can outperform single-classifier methods. The reviewed ensemble methods have pros and cons and are useful for solving breast cancer classification problems. The methods are discussed thoroughly to examine their overall classification performance.
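
    As a concrete illustration of the single-versus-ensemble comparison described above, the sketch below contrasts a lone decision tree with a Random Forest on the Wisconsin breast cancer dataset bundled with scikit-learn. The dataset, split, and hyperparameters are illustrative assumptions, not taken from the reviewed studies.

```python
# Minimal sketch: a single classifier vs. a homogeneous ensemble on the
# Wisconsin breast cancer dataset shipped with scikit-learn. Illustrative
# only; not the data or settings of the reviewed papers.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42, stratify=y)

single = DecisionTreeClassifier(random_state=42).fit(X_train, y_train)
forest = RandomForestClassifier(n_estimators=200, random_state=42).fit(X_train, y_train)

print(f"single decision tree accuracy: {single.score(X_test, y_test):.3f}")
print(f"random forest accuracy:        {forest.score(X_test, y_test):.3f}")
```

    On most random splits the homogeneous ensemble edges out the single tree, which is the pattern the review reports.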

    More-than-words: Reconceptualising Two-year-old Children’s Onto-epistemologies Through Improvisation and the Temporal Arts

    This thesis project takes place at a time of increasing focus upon two-year-old children and the words they speak. On the one hand there is a mounting pressure, driven by the school readiness agenda, to make children talk as early as possible. On the other hand, there is an increased interest in understanding children’s communication in order to create effective pedagogies. More-than-words (MTW) examines an improvised art-education practice that combines heterogeneous elements: sound, movement and materials (such as silk, string, light) to create encounters for young children, educators and practitioners from diverse backgrounds. During these encounters, adults adopt a practice of stripping back their words in order to tune into the polyphonic ways that children are becoming-with the world. For this research-creation, two MTW sessions for two-year-old children and their carers took place in a specially created installation. These sessions were filmed on a 360˚ camera, a nursery school iPad and a specially made child-friendly Toddler-cam (Tcam) that rolled around in the installation-event with the children. Through using the frameless technology of 360˚ film, I hoped to make tangible the relation and movement of an emergent and improvised happening and the way in which young children operate fluidly through multiple modes. Travelling with posthuman, Deleuzio-Guattarian and feminist vital material philosophy, I wander and wonder speculatively through practice, memory, and film data as a bag lady, a Haraway-ian writer/artist/researcher-creator who resists the story of the wordless child as lacking and tragic; the story that positions the word as heroic. Instead, through returning to the uncertainty of improvisation, I attempt to tune into the savage, untamed and wild music of young children’s animistic onto-epistemologies.

    Protecting Privacy in Indian Schools: Regulating AI-based Technologies' Design, Development and Deployment

    Education is a priority area for the Indian government, where Artificial Intelligence (AI) technologies are touted to bring digital transformation. Several Indian states have started deploying facial recognition-enabled CCTV cameras, emotion recognition technologies, fingerprint scanners, and radio frequency identification (RFID) tags in their schools to provide personalised recommendations, ensure student security, and predict student drop-out rates, but also to compile 360-degree information on each student. Further, integrating Aadhaar (a digital identity card that works on biometric data) across AI technologies and learning management systems (LMS) renders schools a ‘panopticon’. Certain technologies or systems, like Aadhaar, CCTV cameras, GPS systems, RFID tags, and learning management systems, are used primarily for continuous data collection, storage, and retention. Though they cannot be termed AI technologies per se, they are fundamental to designing and developing AI systems like facial, fingerprint, and emotion recognition technologies. The large amount of student data collected speedily through the former technologies is used to create algorithms for the latter AI systems. Once those algorithms are trained using machine learning (ML) techniques, they learn correlations across multiple datasets, predicting each student’s identity, decisions, grades, learning growth, tendency to drop out, and other behavioural characteristics. Such autonomous and repetitive collection, processing, storage, and retention of student data without effective data protection legislation endangers student privacy. The algorithmic predictions of AI technologies are an avatar of the data fed into the system: an AI technology is only as good as the people collecting the data, processing it into relevant and useful output, and regularly evaluating the inputs going into the model. An AI model can produce inaccurate predictions if any relevant data is overlooked. However, the belief of the state, school administrations, and parents in AI technologies as a panacea for student security and educational development overlooks the context in which ‘data practices’ are conducted. A right to privacy in an AI age is inextricably connected to the data practices through which data gets ‘cooked’. Thus, data protection legislation operating without understanding and regulating such data practices will remain ineffective in safeguarding privacy. The thesis undertakes interdisciplinary research to better understand the interplay between the data practices of AI technologies and the social practices of an Indian school, which present Indian data protection legislation overlooks, endangering students’ privacy from the design and development stages of an AI model through to deployment. The thesis recommends that the Indian legislature frame legislation better equipped for the AI/ML age, and offers the Indian judiciary guidance on evaluating the legality and reasonableness of designing, developing, and deploying such technologies in schools.

    Machine Learning Approaches for the Prioritisation of Cardiovascular Disease Genes Following Genome-wide Association Study

    Genome-wide association studies (GWAS) have revealed thousands of genetic loci, establishing the approach as a valuable method for unravelling the complex biology of many diseases. Yet as GWAS have grown in size and improved in study design to detect effects, identifying real causal signals and disentangling them from highly correlated markers associated through linkage disequilibrium (LD) remains challenging. This has severely limited GWAS findings and brought the method’s value into question. Although thousands of disease susceptibility loci have been reported, the causal variants and genes at these loci remain elusive. Post-GWAS analysis aims to dissect the heterogeneity of variant and gene signals. In recent years, machine learning (ML) models have been developed for post-GWAS prioritisation. These models have ranged from logistic regression to more complex ensemble models, such as random forests and gradient boosting, as well as deep learning models (i.e., neural networks). When combined with functional validation, these methods have yielded important translational insights, providing a strong evidence-based approach to direct post-GWAS research. However, ML approaches are in their infancy across biological applications, and as they continue to evolve, an evaluation of their robustness for GWAS prioritisation is needed. Here, I investigate the ML landscape across selected models, input features, bias risk, and model performance, with a focus on building a prioritisation framework that is applied to blood pressure GWAS results and tested by re-application to blood lipid traits.
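
    A minimal sketch of the prioritisation step this thesis investigates: an ensemble classifier is trained on labelled genes and used to rank unlabelled candidate genes at GWAS loci by predicted probability. Every feature, label, and number here is a synthetic stand-in, not the thesis’s actual framework.

```python
# Hedged sketch of ML-based post-GWAS gene prioritisation. A gradient
# boosting model learns from labelled genes (known causal vs. background)
# described by functional features, then ranks candidate genes by score.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

rng = np.random.default_rng(0)
# Hypothetical per-gene features, e.g. tissue expression, distance to the
# lead SNP, coding-variant burden (names are assumptions for illustration).
n_train, n_candidates, n_features = 500, 20, 3
X_train = rng.normal(size=(n_train, n_features))
y_train = (X_train[:, 0] + 0.5 * X_train[:, 1]
           + rng.normal(size=n_train) > 0).astype(int)

model = GradientBoostingClassifier(random_state=0).fit(X_train, y_train)

X_cand = rng.normal(size=(n_candidates, n_features))
scores = model.predict_proba(X_cand)[:, 1]
for rank, idx in enumerate(np.argsort(scores)[::-1][:5], start=1):
    print(f"rank {rank}: candidate gene {idx}, score {scores[idx]:.3f}")
```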

    Introduction to Facial Micro Expressions Analysis Using Color and Depth Images: A Matlab Coding Approach (Second Edition, 2023)

    This book provides a gentle introduction to the field of Facial Micro Expressions Recognition (FMER) using color and depth images, with the aid of the MATLAB programming environment. FMER is a subset of image processing, and its analysis is multidisciplinary, requiring familiarity with other topics in Artificial Intelligence (AI) such as machine learning, digital image processing, and psychology. This makes it a great opportunity for a book that covers all of these topics for readers in the field of AI, from beginner to professional, and even those without an AI background. Our goal is to provide a standalone introduction to FMER analysis in the form of theoretical descriptions for readers with no background in image processing, together with reproducible MATLAB practical examples. We also describe the basic definitions for FMER analysis and the MATLAB libraries used in the text, helping the reader apply the experiments to real-world applications. We believe this book is suitable for students, researchers, and professionals alike who need to develop practical skills along with a basic understanding of the field. We expect that, after reading this book, the reader will feel comfortable with key stages such as color and depth image processing, color and depth image representation, classification, machine learning, facial micro-expression recognition, feature extraction, and dimensionality reduction.
    Comment: This is the second edition of the book
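
    The book’s worked examples are in MATLAB; purely as an illustration of the key stages it lists (representation, feature extraction, dimensionality reduction, classification), the following Python sketch runs the same pipeline shape on synthetic colour and depth frames.

```python
# Illustrative FMER-style pipeline on synthetic data (the book itself uses
# MATLAB): flatten paired colour and depth frames into feature vectors,
# reduce dimensionality with PCA, and classify with an SVM.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

rng = np.random.default_rng(1)
n_samples, h, w = 200, 32, 32
color = rng.random((n_samples, h, w, 3))      # stand-in colour frames
depth = rng.random((n_samples, h, w, 1))      # stand-in depth frames
labels = rng.integers(0, 3, size=n_samples)   # e.g. 3 micro-expression classes

# One feature vector per frame: colour and depth channels concatenated.
X = np.concatenate([color, depth], axis=-1).reshape(n_samples, -1)

clf = make_pipeline(PCA(n_components=50), SVC(kernel="rbf"))
clf.fit(X[:150], labels[:150])
print("held-out accuracy:", clf.score(X[150:], labels[150:]))
```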

    Hospital length of stay prediction tools for all hospital admissions and general medicine populations: systematic review and meta-analysis

    Background: Unwarranted extended length of stay (LOS) increases the risk of hospital-acquired complications, morbidity, and all-cause mortality and needs to be recognized and addressed proactively. Objective: This systematic review aimed to identify validated prediction variables and methods used in tools that predict the risk of prolonged LOS in all hospital admissions and specifically General Medicine (GenMed) admissions. Method: LOS prediction tools published since 2010 were identified in five major research databases. The main outcomes were model performance metrics, prediction variables, and level of validation. Meta-analysis was completed for validated models. The risk of bias was assessed using the PROBAST checklist. Results: Overall, 25 all-admission studies and 14 GenMed studies were identified. Statistical and machine learning methods were used almost equally in both groups. Calibration metrics were reported infrequently, with only 2 of 39 studies performing external validation. Meta-analysis of the all-admissions validation studies revealed a 95% prediction interval for theta of 0.596 to 0.798 for the area under the curve. Important predictor categories were co-morbidity diagnoses and illness severity risk scores, demographics, and admission characteristics. Overall study quality was deemed low due to poor data processing and analysis reporting. Conclusion: To the best of our knowledge, this is the first systematic review assessing the quality of risk prediction models for hospital LOS in GenMed and all-admissions groups. Notably, both machine learning and statistical modeling demonstrated good predictive performance, but models were infrequently externally validated and had poor overall study quality. Moving forward, a focus on quality methods through the adoption of existing guidelines and external validation is needed before clinical application. Systematic review registration: https://www.crd.york.ac.uk/PROSPERO/, identifier: CRD42021272198.
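
    For readers unfamiliar with the prediction-interval statistic quoted above, the sketch below shows how a 95% prediction interval for a pooled AUC can be computed from a random-effects meta-analysis. The per-study AUCs and standard errors are invented for illustration; they are not the review’s data.

```python
# DerSimonian-Laird random-effects pooling of per-study AUCs, followed by a
# Higgins-style 95% prediction interval for the AUC of a new study.
import numpy as np
from scipy import stats

auc = np.array([0.71, 0.68, 0.75, 0.66, 0.73, 0.70])  # invented study AUCs
se = np.array([0.03, 0.04, 0.02, 0.05, 0.03, 0.04])   # invented standard errors

w = 1 / se**2                                  # fixed-effect weights
theta_fe = np.sum(w * auc) / np.sum(w)
Q = np.sum(w * (auc - theta_fe) ** 2)          # Cochran's Q heterogeneity
k = len(auc)
tau2 = max(0.0, (Q - (k - 1)) / (np.sum(w) - np.sum(w**2) / np.sum(w)))

w_re = 1 / (se**2 + tau2)                      # random-effects weights
theta_re = np.sum(w_re * auc) / np.sum(w_re)
se_re = np.sqrt(1 / np.sum(w_re))

half = stats.t.ppf(0.975, df=k - 2) * np.sqrt(tau2 + se_re**2)
print(f"pooled AUC {theta_re:.3f}, "
      f"95% prediction interval ({theta_re - half:.3f}, {theta_re + half:.3f})")
```

    Unlike a confidence interval for the pooled mean, the prediction interval also carries the between-study variance, which is why it is the wider and more honest summary for a heterogeneous literature.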

    Mixed-Integer Projections for Automated Data Correction of EMRs Improve Predictions of Sepsis among Hospitalized Patients

    Machine learning (ML) models are increasingly pivotal in automating clinical decisions. Yet a glaring oversight in prior research has been the lack of proper processing of Electronic Medical Record (EMR) data for errors and outliers in the clinical context. Addressing this oversight, we introduce an innovative projections-based method that seamlessly integrates clinical expertise as domain constraints, generating important metadata that can be used in ML workflows. In particular, by using high-dimensional mixed-integer programs that capture physiological and biological constraints on patient vitals and lab values, we can harness the power of mathematical "projections" to correct EMR patient data. We then measure the distance of the corrected data from the constraints defining a healthy range of patient data, resulting in a unique predictive metric we term "trust scores". These scores provide insight into the patient's health status and significantly boost the performance of ML classifiers in real-life clinical settings. We validate the impact of our framework in the context of early detection of sepsis using ML, showing an AUROC of 0.865 and a precision of 0.922, surpassing conventional ML models without such projections.
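
    A greatly simplified sketch of that idea: the paper uses high-dimensional mixed-integer programs, but when the clinical constraints reduce to a plausible range per vital, the projection is elementwise clipping, and the distance from the constraint set gives a trust-score-like feature. The ranges below are illustrative assumptions, not the paper's constraints.

```python
# Box-constraint special case of the projections idea: clip vitals into
# assumed physiological ranges and use the distance moved as a trust score.
import numpy as np

# Hypothetical plausible ranges: heart rate (bpm), systolic BP (mmHg),
# body temperature (deg C). Illustrative values only.
lower = np.array([30.0, 60.0, 34.0])
upper = np.array([200.0, 220.0, 42.0])

def project_and_score(vitals: np.ndarray) -> tuple[np.ndarray, float]:
    """Project a vitals vector onto the box of plausible values; the
    Euclidean distance moved serves as a trust-score-like metric."""
    corrected = np.clip(vitals, lower, upper)
    return corrected, float(np.linalg.norm(vitals - corrected))

raw = np.array([400.0, 115.0, 36.8])   # 400 bpm is almost surely an EMR error
corrected, score = project_and_score(raw)
print(corrected, score)   # corrected vitals, distance-based trust score
```

    The corrected vector and the score can then both be fed to a downstream classifier, mirroring the way the paper's projections generate metadata for ML workflows.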

    Prognosis of symptomatic patients with Brugada Syndrome through electrocardiogram biomarkers and machine learning

    Brugada Syndrome (BrS) is a rare but serious cardiovascular disorder that can cause dangerously fast heartbeats and is characterized by a particular set of electrocardiogram (ECG) patterns. It is a very unpredictable condition: many people present no symptoms at all, while for others, unfortunately, the first symptom is death. For high-risk patients, placement of an implantable cardioverter-defibrillator is recommended. Unfortunately, that carries severe associated risks, such as infections and inappropriate shocks, so correctly identifying those high-risk patients is key. The objective of this project was to develop machine learning based tools able to tell symptomatic Brugada Syndrome patients apart from those who are not. Patients were considered symptomatic if they had recovered from cardiac death or suffered an arrhythmogenic syncope or sustained tachycardia. To do so, after an investigation of the state of the art of the relevant subjects, several biomarkers related to Brugada ECG patterns were extracted from 24-hour ECG recordings of 45 different patients, after the recordings had been processed by signal averaging to reduce noise. Those biomarkers, alongside some clinical data, were then split in different ways to train and test several machine learning based automated classifier models. The performance of these models was very poor: none of them was able to reliably classify BrS patients as desired. Nevertheless, valuable conclusions can be drawn from this first approach to pursue the intended goal further, and useful tools were developed that will allow faster processing of the database used.
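
    To make the signal-averaging step concrete, the sketch below averages R-peak-aligned beats so that uncorrelated noise cancels while the repeating ECG morphology is preserved. The data are synthetic; a real 24-hour recording would first need QRS detection, alignment, and quality control.

```python
# Signal averaging of ECG beats: averaging N aligned noisy copies of the
# same waveform shrinks uncorrelated noise roughly by a factor of sqrt(N).
import numpy as np

rng = np.random.default_rng(2)
fs = 500                              # assumed sampling rate, Hz
beat_len = fs                         # one-second window around each R-peak
t = np.arange(beat_len)
template = np.exp(-((t - beat_len // 2) ** 2) / (2 * 15.0**2))  # toy QRS shape

# 200 noisy, already-aligned copies stand in for detected beats.
beats = template + 0.5 * rng.normal(size=(200, beat_len))
averaged = beats.mean(axis=0)

print(f"noise std, single beat: {(beats[0] - template).std():.3f}")
print(f"noise std, averaged:    {(averaged - template).std():.3f}")
```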

    Design and Evolution of Deep Convolutional Neural Networks in Image Classification – A Review

    The Convolutional Neural Network (CNN) is a well-known computer vision approach successfully applied to various classification and recognition problems. It has an outstanding capacity to identify patterns in 1D and 2D data. Though invented in the 1980s, it became hugely successful after LeCun's work on digit identification. Several CNN-based models have since recorded splendid performance on ImageNet and other databases. The CNN's ability to learn complex features from data at different levels of hierarchy has made it the most successful of the deep learning algorithms. Innovative architectural designs and hyperparameter optimization have greatly improved the efficiency of CNNs in pattern recognition. This review focuses primarily on the evolution and history of CNN models. Landmark CNN architectures are discussed and categorized according to various parameters. In addition, the review explores the architectural details of different layers, activation functions, optimizers, and other hyperparameters used by CNNs. It concludes by shedding light on applications and on the observations to be considered while designing a network.
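
    As a minimal illustration of the architecture family this review surveys, here is a small LeNet-style CNN in PyTorch; the layer sizes echo LeCun's digit-recognition lineage but are otherwise illustrative.

```python
# A small LeNet-style CNN: stacked convolution + pooling stages learn a
# feature hierarchy, and fully connected layers classify the result.
import torch
import torch.nn as nn

class SmallCNN(nn.Module):
    def __init__(self, n_classes: int = 10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 6, kernel_size=5), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(6, 16, kernel_size=5), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(16 * 4 * 4, 120), nn.ReLU(),   # 28x28 input -> 16x4x4
            nn.Linear(120, n_classes),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.classifier(self.features(x))

model = SmallCNN()
logits = model(torch.randn(8, 1, 28, 28))  # a batch of 8 MNIST-sized images
print(logits.shape)                        # torch.Size([8, 10])
```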

    WEIGH-IN-MOTION DATA-DRIVEN PAVEMENT PERFORMANCE PREDICTION MODELS

    The effective functioning of pavements as a critical component of the transportation system necessitates ongoing maintenance programs to safeguard this significant and valuable infrastructure and guarantee its optimal performance. The maintenance, rehabilitation, and reconstruction (MRR) program for the pavement structure depends on a multidimensional decision-making process that considers the existing structural condition of the pavement and its anticipated future performance. Pavement Performance Prediction Models (PPPMs) have become indispensable tools for implementing the MRR program efficiently and minimizing its associated costs, by providing precise predictions of distress and roughness based on inventory and monitoring data concerning the pavement structure's state, traffic load, and climatic conditions. The integration of PPPMs has become a vital component of Pavement Management Systems (PMSs), facilitating the optimization, prioritization, scheduling, and selection of maintenance strategies. Researchers have developed several PPPMs with differing objectives, and each PPPM has distinct strengths and weaknesses regarding its applicability, implementation process, and the data required for its development. Traditional statistical models, such as linear regression, are inadequate for handling complex nonlinear relationships between variables and often generate less precise results. Machine Learning (ML)-based models have become increasingly popular due to their ability to manage vast amounts of data and identify meaningful relationships within them, generating informative insights for better predictions. Creating ML models for pavement performance prediction requires a significant amount of historical data on pavement and traffic loading conditions. The Long-Term Pavement Performance Program (LTPP), initiated by the Federal Highway Administration (FHWA), offers a comprehensive repository of data on the environment, traffic, inventory, monitoring, maintenance, and rehabilitation works that can be utilized to develop PPPMs. The LTPP also includes Weigh-In-Motion (WIM) data that provide information on traffic, such as truck traffic, total traffic, directional distribution, and the number of different axle types of vehicles. High-quality traffic loading data can play an essential role in improving the performance of PPPMs, as the Mechanistic-Empirical Pavement Design Guide (MEPDG) considers vehicle types and axle load characteristics to be critical inputs for pavement design. Collecting high-quality traffic loading data has nonetheless been a challenge in developing PPPMs. The WIM system, which comprises WIM scales, has emerged as an innovative solution to this issue: by leveraging computer vision and machine learning techniques, WIM systems can collect accurate data on vehicle type and axle load characteristics, critical factors affecting the performance of flexible pavements, since excessive dynamic loading caused by heavy vehicles can result in the early disintegration of the pavement structure. The extensive repository of WIM data provided by the LTPP can thus be utilized to develop accurate PPPMs for predicting future pavement behavior and tolerance, and its incorporation has the potential to significantly improve the accuracy and effectiveness of PPPMs.
To develop artificial neural network (ANN) based PPPMs for seven distinct performance indicators, including IRI, longitudinal cracking, transverse cracking, fatigue cracking, potholes, polished aggregate, and patch failure, a total of 300 pavement sections with WIM data were selected from the United States of America. Data collection spanned 20 years, from 2001 to 2020, and included information on pavement age, material properties, climatic properties, structural properties, and traffic-related characteristics. The primary dataset was then divided into two distinct subsets: one including WIM-generated traffic data and another excluding it. Data cleaning and normalization were performed meticulously using the Z-score normalization method. Each subset was further divided into two groups: the first containing 15 years of data for model training and the second containing 5 years of data for testing. Principal Component Analysis (PCA) was then employed to reduce the number of input variables for the model; based on a cumulative Proportion of Variation (PoV) of 96%, 12 input variables were selected. Subsequently, a single-hidden-layer ANN model with 12 neurons was generated for each performance indicator. The study's results indicate that incorporating WIM-generated traffic loading data can significantly enhance the accuracy and efficacy of PPPMs. This improvement further supports optimized pavement maintenance scheduling at minimal cost, while ensuring timely repairs that maintain acceptable serviceability and structural stability of the pavement. The contributions of this research are twofold: first, it provides an enhanced understanding of the positive impact that high-quality traffic loading data has on pavement conditions; and second, it explores potential applications of WIM data within the Pavement Management System (PMS).
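
    A compact sketch of the modelling pipeline described above: Z-score normalization, PCA retaining roughly 96% of the cumulative variance, and a single-hidden-layer ANN with 12 neurons. The data below are synthetic stand-ins for the LTPP/WIM variables, and the train/test split only mimics the 15-year/5-year division.

```python
# Pipeline shape used in the study, on synthetic data: standardize, reduce
# with PCA at ~96% explained variance, then fit a 12-neuron, single-hidden-
# layer ANN to predict a performance indicator such as IRI.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(3)
n_sections, n_raw = 300, 40      # stand-ins for age, climate, WIM traffic, ...
X = rng.normal(size=(n_sections, n_raw))
y = X[:, :5].sum(axis=1) + 0.1 * rng.normal(size=n_sections)  # toy "IRI"

model = make_pipeline(
    StandardScaler(),                        # Z-score normalization
    PCA(n_components=0.96),                  # keep ~96% cumulative PoV
    MLPRegressor(hidden_layer_sizes=(12,), max_iter=2000, random_state=3),
)
model.fit(X[:240], y[:240])                  # earlier years for training
print("held-out R^2:", round(model.score(X[240:], y[240:]), 3))
```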