
    Modeling Stroke Diagnosis with the Use of Intelligent Techniques

    The purpose of this work is to test the efficiency of specific intelligent classification algorithms in the domain of stroke medical diagnosis. The dataset consists of patient records from the "Acute Stroke Unit", Alexandra Hospital, Athens, Greece, describing patients suffering from one of 5 different stroke types, diagnosed using 127 diagnostic attributes/symptoms collected during the first hours of the emergency stroke situation as well as during the hospitalization and recovery phase of the patients. Prior to the application of the intelligent classifier, the dimensionality of the dataset is further reduced using a variety of classic and state-of-the-art dimensionality reduction techniques so as to capture the intrinsic dimensionality of the data. The results obtained indicate that the proposed methodology achieves prediction accuracy levels comparable to those obtained by intelligent classifiers trained on the original feature space.

    The Minimum Wiener Connector

    The Wiener index of a graph is the sum of all pairwise shortest-path distances between its vertices. In this paper we study the novel problem of finding a minimum Wiener connector: given a connected graph $G=(V,E)$ and a set $Q\subseteq V$ of query vertices, find a subgraph of $G$ that connects all query vertices and has minimum Wiener index. We show that the Minimum Wiener Connector problem admits a polynomial-time (albeit impractical) exact algorithm for the special case where the number of query vertices is bounded. We show that in general the problem is NP-hard, and has no PTAS unless $\mathbf{P} = \mathbf{NP}$. Our main contribution is a constant-factor approximation algorithm running in time $\widetilde{O}(|Q||E|)$. Thorough experimentation on a large variety of real-world graphs confirms that our method returns smaller and denser solutions than other methods, and does so by adding to the query set $Q$ a small number of important vertices (i.e., vertices with high centrality). Comment: Published in Proceedings of the 2015 ACM SIGMOD International Conference on Management of Data.
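
To make the definition of the Wiener index concrete, here is a minimal sketch that computes it for a small unweighted, connected graph using breadth-first search. It is not the paper's approximation algorithm; the function names `bfs_distances` and `wiener_index` and the two example graphs are illustrative assumptions.

```python
from collections import deque


def bfs_distances(adj, source):
    """Hop distances from `source` to every reachable vertex (unweighted graph)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def wiener_index(adj):
    """Sum of shortest-path distances over all unordered vertex pairs."""
    total = 0
    for u in adj:
        dist = bfs_distances(adj, u)
        if len(dist) != len(adj):
            raise ValueError("graph must be connected")
        total += sum(dist.values())
    return total // 2  # every unordered pair was counted twice


# Two hypothetical 4-vertex graphs given as adjacency lists.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}   # 0 - 1 - 2 - 3
star = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0]}   # vertex 0 is the hub

print(wiener_index(path))
print(wiener_index(star))
```

With the same number of vertices, the star has a lower Wiener index than the path, which matches the abstract's observation that good connectors tend to route through a few central vertices.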

    Symbolic Regression as Feature Engineering Method for Machine and Deep Learning Regression Tasks

    In machine and deep learning regression tasks, effective feature engineering (FE) is pivotal in enhancing model performance. Traditional FE approaches often rely on domain expertise to manually design features for machine learning models. In deep learning models, the FE is embedded in the neural network's architecture, making it hard to interpret. In this study, we propose to integrate symbolic regression (SR) as an FE process before a machine learning model to improve its performance. We show, through extensive experimentation on synthetic and real-world physics-related datasets, that the incorporation of SR-derived features significantly enhances the predictive capabilities of both machine and deep learning regression models, with a 34-86% root mean square error (RMSE) improvement on synthetic datasets and a 4-11.5% improvement on real-world datasets. In addition, as a realistic use case, we show that the proposed method improves machine learning performance in predicting superconducting critical temperatures based on Eliashberg theory by more than 20% in terms of RMSE. These results outline the potential of SR as an FE component in data-driven models.
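
The idea of feeding an engineered feature to a downstream regressor can be illustrated with a toy sketch. It does not run symbolic regression: the product feature `x1 * x2` is hard-coded as a stand-in for an expression an SR tool might discover, the data are synthetic, and the helper names `fit_line` and `rmse` are assumptions for this example only.

```python
import math


def fit_line(xs, ys):
    """Closed-form least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def rmse(predictions, targets):
    """Root mean square error between two equal-length sequences."""
    n = len(targets)
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(predictions, targets)) / n)


# Synthetic data whose target is exactly the product of the two raw inputs.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x2 = [2.0, 0.5, 3.0, 1.0, 2.5, 0.5]
y = [a * b for a, b in zip(x1, x2)]

# Baseline: a linear model on one raw feature.
slope, intercept = fit_line(x1, y)
rmse_raw = rmse([slope * a + intercept for a in x1], y)

# Same linear model on the engineered feature (stand-in for an SR-derived term).
engineered = [a * b for a, b in zip(x1, x2)]
slope_e, intercept_e = fit_line(engineered, y)
rmse_engineered = rmse([slope_e * f + intercept_e for f in engineered], y)

print(rmse_raw, rmse_engineered)
```

The errors here are measured in-sample on noise-free data, so the gap is far larger than the improvements reported in the abstract; the sketch only shows the mechanism.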

    Distributed texture-based terrain synthesis

    Terrain synthesis is an important field of Computer Graphics that deals with the generation of 3D landscape models for use in virtual environments. The field has evolved to a stage where large and even infinite landscapes can be generated in real time. However, user control of the generation process remains minimal, as does support for creating virtual landscapes that mimic real terrain. This thesis investigates the use of texture synthesis techniques on real landscapes to improve realism and the use of sketch-based interfaces to enable intuitive user control.

    A genetic algorithm coupled with tree-based pruning for mining closed association rules

    Due to the large number of itemsets that are generated, the association rules extracted from these itemsets contain redundancy, and designing an effective approach to address this issue is of paramount importance. Although multiple algorithms have been proposed in recent years for mining closed association rules, most of them underperform in terms of run time or memory. Another issue that remains challenging is the nature of the dataset: some existing algorithms perform well on dense datasets, while others perform well on sparse datasets. This paper aims to handle these drawbacks by using a genetic algorithm for mining closed association rules. Recent studies have shown that genetic algorithms perform better than conventional algorithms due to their bitwise operations of crossover and mutation. Bitwise operations are generally faster than conventional approaches, and bits consume less memory, thereby improving the overall performance of the algorithm. To address the redundancy in the mined association rules, a tree-based pruning algorithm has been designed. It works on the principle of minimal antecedent and maximal consequent. Experiments have shown that the proposed approach works well on both dense and sparse datasets while surpassing existing techniques with regard to run time and memory.
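
The bitwise crossover and mutation mentioned above can be sketched with itemsets encoded as integer bitmasks, where bit `i` set means item `i` is present. This is a generic illustration, not the paper's algorithm; the function names, the single-point crossover scheme, and the example transactions are assumptions.

```python
def support(itemset_mask, transactions):
    """Count the transactions (bitmasks) that contain every item in itemset_mask."""
    return sum(1 for t in transactions if t & itemset_mask == itemset_mask)


def crossover(a, b, point, n_items):
    """Single-point crossover: the low `point` bits come from a, the rest from b."""
    low = (1 << point) - 1
    full = (1 << n_items) - 1
    return (a & low) | (b & ~low & full)


def mutate(mask, bit):
    """Flip one bit, i.e. add or remove a single item."""
    return mask ^ (1 << bit)


# Four hypothetical transactions over four items (bit 0 = item 0, and so on).
transactions = [0b0111, 0b0011, 0b1010, 0b0110]

child = crossover(0b0101, 0b1010, point=2, n_items=4)
mutated = mutate(child, bit=1)

print(bin(child), bin(mutated), support(0b0011, transactions))
```

Each operation is a handful of integer instructions and each candidate itemset occupies a single machine word for small item universes, which is the efficiency argument the abstract makes.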

    Graph-based Regularization in Machine Learning: Discovering Driver Modules in Biological Networks

    The curiosity of human nature drives us to explore the origins of what makes each of us different. From ancient legends and mythology, Mendel's law, and the Punnett square to modern genetic research, we carry on this old but eternal question. Thanks to the technological revolution, today's scientists try to answer this question using easily measurable gene expression and other profiling data. However, the exploration can easily get lost in data of growing volume, dimension, noise and complexity. This dissertation is aimed at developing new machine learning methods that take data from different classes as input, augment them with knowledge of feature relationships, and train classification models that serve two goals: 1) class prediction for previously unseen samples; 2) knowledge discovery of the underlying causes of class differences. Application of our methods in genetic studies can help scientists take advantage of existing biological networks, generate diagnoses with higher accuracy, and discover the driver networks behind the differences. We propose three new graph-based regularization algorithms. The Graph Connectivity Constrained AdaBoost algorithm combines a connectivity module, a deletion function, and a model retraining procedure with the AdaBoost classifier. The Graph-regularized Linear Programming Support Vector Machine integrates a penalty term based on a submodular graph cut function into the linear classifier's objective function. Proximal Graph LogisticBoost adds lasso and graph-based penalties to the logistic risk function of an ensemble classifier. Results of tests of our models on simulated biological datasets show that the proposed methods are able to produce accurate, sparse classifiers, and can help discover true genetic differences between phenotypes.

    Genomic evidence of past and future climate-linked loss in a migratory Arctic fish

    Acknowledgements: We thank staff of the Newfoundland DFO Salmonids section, Parks Canada, the Nunatsiavut Government, the NunatuKavut Community Council, the Sivunivut Inuit Community Corporation, the Innu Nation, the Labrador Hunting and Fishing Association and fishers for their support, participation and tissue collections and the staff of the Aquatic Biotechnology Lab at the Bedford Institute of Oceanography for DNA extractions. This study was supported by the Ocean Frontier Institute, a Genomics Research and Development Initiative (GRDI) Grant, a Natural Sciences and Engineering Research Council (NSERC) Discovery Grant and Strategic Project Grant to I.R.B., the Weston Family Award for research at the Torngat Mountains Base Camp and an Atlantic Canada Opportunities Agency and Department of Tourism, Culture, Industry and Innovation grant allocated to the Labrador Institute. Author Correction: Layton, K.K.S., Snelgrove, P.V.R., Dempson, J.B. et al. Author Correction: Genomic evidence of past and future climate-linked loss in a migratory Arctic fish. Nat. Clim. Chang. 11, 551 (2021). https://doi.org/10.1038/s41558-021-01023-8. Peer reviewed. Postprint.

    Simulating The Impact of Emissions Control on Economic Productivity Using Particle Systems and Puff Dispersion Model

    A simulation platform is developed for quantifying the change in productivity of an economy under passive and active emission control mechanisms. The program uses object-oriented programming to code a collection of objects resembling typical stakeholders in an economy. These objects include firms, markets, transportation hubs, and boids, which are distributed over a 2D surface. Firms are connected using a modified Prim's minimum spanning tree algorithm, followed by an all-pairs shortest-path Floyd-Warshall algorithm for navigation purposes. Firms use a non-linear production function to transform land, labor, and capital inputs into finished product. A GA-Vehicle Routing Problem with multiple pickups and drop-offs is implemented for efficient delivery of commodities across multiple nodes in the economy. Boids are autonomous agents that perform several functions in the economy, including labor, consumption, renting, saving, and investing. Each boid is programmed with several microeconomic functions, including intertemporal choice models, Hicksian and Marshallian demand functions, and a labor-leisure model. The simulation uses a Puff Dispersion model to simulate the advection and diffusion of emissions from point and mobile sources in the economy. A dose-response function is implemented to quantify the depreciation of a boid's health upon contact with these emissions. The impact of emissions control on productivity and air quality is examined through a series of passive and active emission control scenarios. Passive control examines the impact of various shutdown times on economic productivity and the rate of emissions exposure experienced by boids. The active control strategy examines the effects of acceptable levels of emissions exposure on economic productivity.
    The key findings from 7 different scenarios of passive and active emissions controls indicate that the rate of productivity and consumption in an economy declines with increased scrutiny of emissions from point sources. In terms of exposure rates, point sources may not be the primary contributor to average exposure rates; however, they significantly impact the maximum exposure rate experienced by a boid. Tightening of emissions control also negatively impacts the transportation sector by reducing the asset utilization rate as well as the total volume of goods transported across the economy.
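
The network-building step described in this abstract (a spanning tree over firms, then all-pairs shortest paths for navigation) can be sketched as follows. This uses the standard Prim's algorithm rather than the modified variant the abstract mentions, whose details are not given; the firm coordinates and function names are hypothetical.

```python
import heapq
import math


def prim_mst(points):
    """Standard Prim's algorithm on the complete Euclidean graph over `points`.

    Returns the tree as a list of (u, v, weight) edges.
    """
    n = len(points)
    if n == 0:
        return []
    in_tree = [False] * n
    in_tree[0] = True
    heap = [(math.dist(points[0], points[v]), 0, v) for v in range(1, n)]
    heapq.heapify(heap)
    edges = []
    while heap and len(edges) < n - 1:
        weight, u, v = heapq.heappop(heap)
        if in_tree[v]:
            continue  # stale entry: v was already attached via a cheaper edge
        in_tree[v] = True
        edges.append((u, v, weight))
        for x in range(n):
            if not in_tree[x]:
                heapq.heappush(heap, (math.dist(points[v], points[x]), v, x))
    return edges


def floyd_warshall(n, edges):
    """All-pairs shortest-path distances over an undirected weighted edge list."""
    dist = [[math.inf] * n for _ in range(n)]
    for i in range(n):
        dist[i][i] = 0.0
    for u, v, weight in edges:
        dist[u][v] = min(dist[u][v], weight)
        dist[v][u] = min(dist[v][u], weight)
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
    return dist


# Hypothetical firm locations on a 2D surface.
firms = [(0, 0), (1, 0), (1, 2), (5, 2)]
tree = prim_mst(firms)
distances = floyd_warshall(len(firms), tree)

print(tree)
print(distances[0][3])
```

Running Floyd-Warshall on the tree edges only means every route follows the spanning tree, so travel distances can exceed the straight-line distance between two firms.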