
    Integrality gaps of semidefinite programs for Vertex Cover and relations to ℓ1 embeddability of negative type metrics

    We study various SDP formulations for Vertex Cover obtained by adding different constraints to the standard formulation. We rule out approximations better than 2 − o(1) even when we add the so-called pentagonal inequality constraints to the standard SDP formulation, and thus almost match the best known upper bound, due to Karakostas, of 2 − Θ(1/√log n). We further show the surprising fact that by strengthening the SDP with the (intractable) requirement that the metric interpretation of the solution embeds into ℓ1 with no distortion, we get an exact relaxation (integrality gap 1), whereas if the solution is merely arbitrarily close to being ℓ1 embeddable, the integrality gap is 2 − o(1). Finally, inspired by these findings, we use ideas from the integrality gap construction of Charikar to provide a family of simple examples of negative type metrics that cannot be embedded into ℓ1 with distortion better than 8/7 − ε. To this end we prove a new isoperimetric inequality for the hypercube.
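For reference, the two metric notions the abstract relies on; these are the standard textbook definitions, supplied here for the reader rather than quoted from the paper:

```latex
% A metric d on X embeds into \ell_1 with distortion c >= 1 if there is
% a map f : X -> \ell_1 such that, for all x, y in X,
d(x,y) \;\le\; \lVert f(x) - f(y) \rVert_1 \;\le\; c \, d(x,y).
% A metric d is of negative type if \sqrt{d} embeds isometrically into \ell_2.
```

An exact (distortion-1) embedding is the c = 1 case; the paper's results concern how the integrality gap degrades as this requirement is relaxed.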

    Experimentation and Analysis of Ensemble Deep Learning in IoT Applications

    This paper presents an experimental study of ensemble Deep Learning (DL) techniques for the analysis of time series data on IoT devices. We have shown in our earlier work that DL outperforms traditional machine learning techniques on fall detection applications, because important features in time series data can be learned and need not be determined manually by a domain expert. However, DL networks generally require large datasets for training. In the health care domain, such as real-time smartwatch-based fall detection, there are no publicly available large annotated datasets that can be used for training, due to the nature of the problem (a fall is not a common event). Moreover, fall data is inherently noisy, since motions generated by the wrist-worn smartwatch can be mistaken for a fall. This paper explores combining DL (Recurrent Neural Networks) with ensemble techniques (stacking and AdaBoost) using a fall detection application as a case study. We conducted a series of experiments using two different datasets of simulated falls to train various ensemble models. Our results show that an ensemble of deep learning models combined by the stacking technique outperforms a single deep learning model trained on the same data samples, and may thus be better suited to small datasets.
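The stacking idea can be sketched in a few lines. This toy uses a least-squares meta-learner over base-model outputs; the base models are stand-ins (the paper's base learners are RNNs, which are not reproduced here), so treat this purely as an illustration of how stacking combines predictions:

```python
import numpy as np

def fit_meta(base_preds, y):
    """Stacking meta-learner: least-squares weights over base-model outputs."""
    w, *_ = np.linalg.lstsq(base_preds, y, rcond=None)
    return w

def stack_predict(base_preds, w):
    """Combine base-model predictions with the learned weights."""
    return base_preds @ w

# Toy out-of-fold predictions from two hypothetical base models
# on four samples, plus the true labels.
base_preds = np.array([[1., 0.], [0., 1.], [1., 1.], [0., 0.]])
y = np.array([1., 0., 1., 0.])
w = fit_meta(base_preds, y)
ensemble = stack_predict(base_preds, w)
```

In practice the meta-learner is fit on out-of-fold predictions so that it learns which base model to trust, rather than memorizing training outputs.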

    DoctorEye: A clinically driven multifunctional platform for accurate processing of tumors in medical images

    Copyright @ Skounakis et al. This paper presents a novel, open-access interactive platform for 3D medical image analysis, simulation and visualization, focusing on oncology images. The platform was developed through constant interaction with, and feedback from, expert clinicians, integrating a thorough analysis of their requirements, with the ultimate goal of assisting in accurately delineating tumors. It allows clinicians not only to work with a large number of 3D tomographic datasets but also to efficiently annotate multiple regions of interest in the same session. Manual and semi-automatic segmentation techniques combined with integrated correction tools assist in the quick and refined delineation of tumors, while different users can add components related to oncology, such as tumor growth and simulation algorithms, for improving therapy planning. The platform has been tested by different users and over a large number of heterogeneous tomographic datasets to ensure stability, usability, extensibility and robustness, with promising results. Availability: the platform, a manual and tutorial videos are available at http://biomodeling.ics.forth.gr. It is free to use under the GNU General Public License.

    Using Synchronic and Diachronic Relations for Summarizing Multiple Documents Describing Evolving Events

    In this paper we present a fresh look at the problem of summarizing evolving events from multiple sources. After discussing the nature of evolving events, we introduce a distinction between linearly and non-linearly evolving events. We then present a general methodology for the automatic creation of summaries of evolving events. At its heart lie the notions of Synchronic and Diachronic cross-document Relations (SDRs), whose aim is the identification of similarities and differences between sources from a synchronic and a diachronic perspective. SDRs do not connect documents or textual elements found therein, but rather structures one might call messages. Applying this methodology yields a set of messages and a set of SDRs connecting them, that is, a graph which we call a grid. We show how such a grid can be considered the starting point of a natural language generation system. The methodology is evaluated in two case studies, one on linearly evolving events (descriptions of football matches) and one on non-linearly evolving events (terrorist incidents involving hostages). In both cases we evaluate the results produced by our computational systems. Comment: 45 pages, 6 figures. To appear in the Journal of Intelligent Information Systems.
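A grid, as described above, is just a graph over messages with typed edges. The following toy encoding is hypothetical (the message names, fields and contents are invented for illustration; the paper's actual message structures are richer), but it shows the synchronic/diachronic distinction concretely:

```python
# Hypothetical toy "grid": messages extracted from two sources over time,
# connected by synchronic relations (same time, different sources) and
# diachronic relations (same source, successive times).
messages = {
    "m1": {"source": "A", "time": 1, "content": "score 0-0"},
    "m2": {"source": "B", "time": 1, "content": "match underway"},
    "m3": {"source": "A", "time": 2, "content": "score 1-0"},
}
relations = [
    ("m1", "m2", "synchronic"),   # same time slot, different sources
    ("m1", "m3", "diachronic"),   # same source, successive time slots
]

def neighbours(msg_id, kind):
    """Messages related to msg_id by relations of the given kind."""
    out = []
    for a, b, k in relations:
        if k == kind and msg_id in (a, b):
            out.append(b if a == msg_id else a)
    return out
```

A generation system would then walk this graph, verbalizing synchronic edges as agreement/contradiction between sources and diachronic edges as progression over time.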

    Development of a module for acquiring demographic and clinical patient data for an expert system for assessing cardiovascular disease risk in patients with arterial hypertension

    Signaling data from cellular networks can provide a means of analyzing the efficiency of a deployed transportation system and can assist in the formulation of transport models that predict its future use. An approach based on this type of data is especially appealing for transportation systems that need massive expansions, since it has the added benefit that no specialized equipment or installations are required, and it can therefore be very cost-efficient. Within this context, in this paper we describe how such data can be processed and used as enablers for traditional transportation analysis models. We outline a layered, modular architectural framework that encompasses the entire process, and present results from an initial analysis of mobile phone call data in the context of mobility, transport and transport infrastructure. Finally, we introduce the Mobility Analytics Platform, developed by Ericsson Research and tailored for mobility analysis, discuss techniques for analyzing transport supply and demand, and indicate how cell phone use data can be used directly to analyze the status and use of the current transport infrastructure.
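The basic processing step behind such analyses can be sketched as counting origin-destination transitions between cells from time-ordered signaling events. This is a generic illustration, not the Mobility Analytics Platform's actual pipeline; field names and event format are assumptions:

```python
from collections import Counter

def od_flows(events):
    """Count origin-destination transitions per user from time-ordered
    (user, timestamp, cell_id) signaling events -- a crude proxy for
    trips between cell coverage areas."""
    by_user = {}
    for user, ts, cell in sorted(events, key=lambda e: (e[0], e[1])):
        by_user.setdefault(user, []).append(cell)
    flows = Counter()
    for cells in by_user.values():
        for a, b in zip(cells, cells[1:]):
            if a != b:                 # skip consecutive events in the same cell
                flows[(a, b)] += 1
    return flows

# Two users both observed moving from cell C1 to cell C2.
events = [("u1", 1, "C1"), ("u1", 2, "C2"),
          ("u2", 1, "C1"), ("u2", 3, "C2"), ("u2", 4, "C2")]
flows = od_flows(events)
```

Real pipelines additionally handle cell-tower oscillation, dwell-time thresholds and anonymization, which this sketch omits.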

    Adaptation strategy to mitigate the impact of climate change on water resources in arid and semi-arid regions : a case study

    The impacts of climate change and drought phenomena have become a growing concern for water resources engineers and policy makers, mainly in arid and semi-arid areas. This study aims to contribute to the development of a decision support tool to prepare water resources managers and planners for climate change adaptation. The Hydrologiska Byråns Vattenbalansavdelning (Water Balance Department of the Hydrological Bureau) hydrologic model was used to define the boundary conditions for the reservoir capacity yield model, comprising daily reservoir inflow from a representative example watershed of 14,924 km2 into a reservoir with a capacity of 6.80 Gm3. The reservoir capacity yield model was used to simulate the variability in climate change-induced differences in reservoir capacity needs and performance (operational probability of failure, resilience, and vulnerability). Owing to the projected precipitation reduction and potential evapotranspiration increase under the worst-case scenario (−40% precipitation and +30% potential evapotranspiration), substantial reductions in streamflow of −56% and −58% are anticipated for the dry and wet seasons, respectively. Furthermore, the model simulations suggest that, as a result of future climatic conditions, the reservoir's operational probability of failure would generally increase due to declining reservoir inflow. The study developed preparedness plans to combat the consequences of climate change and drought.

    Complaint-driven Training Data Debugging for Query 2.0

    As the need for machine learning (ML) increases rapidly across all industry sectors, there is significant interest among commercial database providers in supporting "Query 2.0", which integrates model inference into SQL queries. Debugging Query 2.0 is very challenging, since an unexpected query result may be caused by bugs in the training data (e.g., wrong labels, corrupted features). In response, we propose Rain, a complaint-driven training data debugging system. Rain allows users to specify complaints over the query's intermediate or final output, and aims to return a minimum set of training examples such that, if they were removed, the complaints would be resolved. To the best of our knowledge, we are the first to study this problem. A naive solution requires retraining an exponential number of ML models. We propose two novel heuristic approaches based on influence functions, both of which require only a linear number of retraining steps. We provide an in-depth analytical and empirical analysis of the two approaches and conduct extensive experiments to evaluate their effectiveness using four real-world datasets. Results show that Rain achieves the highest recall@k among all the baselines while still returning results interactively. Comment: Proceedings of the 2020 ACM SIGMOD International Conference on Management of Data.
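The influence-function idea behind such systems can be illustrated on a model where the Hessian is explicit. This is a generic sketch of the classic influence approximation for least-squares regression, not Rain's actual heuristics: each score estimates how the test loss would change if one training point were removed, so the most negative scores identify the removal candidates:

```python
import numpy as np

def removal_influence(X, y, x_test, y_test):
    """Influence-function estimate of the change in squared test loss
    when each training point is removed (linear least-squares model)."""
    n, d = X.shape
    theta, *_ = np.linalg.lstsq(X, y, rcond=None)   # fitted parameters
    H = X.T @ X / n                                  # Hessian of mean 0.5*residual^2 loss
    g_test = (x_test @ theta - y_test) * x_test      # gradient of the test loss
    scores = []
    for i in range(n):
        g_i = (X[i] @ theta - y[i]) * X[i]           # gradient of training loss i
        # Estimated change in test loss if point i is removed:
        # negative => removal is predicted to lower the test loss.
        scores.append(g_test @ np.linalg.solve(H, g_i) / n)
    return np.array(scores)

# Three clean points and one outlier label; the test point agrees with
# the clean points, so removing the outlier should help.
X = np.ones((4, 1))
y = np.array([1., 1., 1., 5.])
scores = removal_influence(X, y, np.array([1.0]), 1.0)
```

Ranking training points by these scores avoids retraining once per candidate subset, which is what makes the linear-retraining heuristics tractable.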

    A global view of the OCA2-HERC2 region and pigmentation

    Mutations in the gene OCA2 are responsible for oculocutaneous albinism type 2, but polymorphisms in and around OCA2 have also been associated with normal pigment variation. In Europeans, three haplotypes in the region have been shown to be associated with eye pigmentation, and a missense SNP (rs1800407) has been associated with green/hazel eyes (Branicki et al. in Ann Hum Genet 73:160–170, 2009). In addition, a missense mutation (rs1800414) is a candidate for light skin pigmentation in East Asia (Yuasa et al. in Biochem Genet 45:535–542, 2007; Anno et al. in Int J Biol Sci 4, 2008). We have genotyped 3,432 individuals from 72 populations for 21 SNPs in the OCA2-HERC2 region, including those previously associated with eye or skin pigmentation. We report that the blue-eye-associated alleles at all three haplotypes were found at high frequencies in Europe; however, one is restricted to Europe and surrounding regions, while the other two are found at moderate to high frequencies throughout the world. We also observed that the derived allele of rs1800414 is essentially limited to East Asia, where it is found at high frequencies. Long-range haplotype tests provide evidence of selection for the blue-eye allele at the three haplotype systems, but not for the green/hazel eye SNP allele. We also saw evidence of selection at the derived allele of rs1800414 in East Asia. Our data suggest that the haplotype restricted to Europe is the strongest marker for blue eyes globally, and add further inferential evidence that the derived allele of rs1800414 is an East Asian skin pigmentation allele.

    The problems of selecting problems

    We face several teaching problems where a set of exercises has to be selected based on its capability either to make students discover typical misconceptions or to evaluate the knowledge of the students. We consider four different optimization problems, developed from two basic decision problems. The first two optimization problems consist of selecting a set of exercises that reaches some required level of coverage for each topic. In the first problem we minimize the total time required to present the selected exercises, whereas in the second problem the surplus coverage of topics is maximized. The other two optimization problems consist of composing an exam in such a way that each student misconception reduces the overall mark of the exam to some specified extent. In particular, we consider the problem of minimizing the size of an exam fulfilling these mark-reduction constraints, and the problem of minimizing the differences between the required mark losses due to each misconception and the actual ones in the composed exam. For each optimization problem, we formally establish its approximation hardness and solve it heuristically using a genetic algorithm. We report experimental results for a case study based on a set of real exercises from Discrete Mathematics, a Computer Science degree subject.
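The first optimization problem (minimum total presentation time subject to per-topic coverage) can be made concrete with a brute-force solver on a toy instance. The exercise names, times and coverage values below are invented for illustration, and brute force is only viable at this scale; the paper attacks the general, approximation-hard case with a genetic algorithm:

```python
from itertools import combinations

def min_time_selection(exercises, required):
    """Smallest-total-time subset of exercises meeting every topic's
    required coverage (exhaustive search; illustration only)."""
    best = None
    ids = list(exercises)
    for r in range(1, len(ids) + 1):
        for combo in combinations(ids, r):
            cover, time = {}, 0
            for e in combo:
                time += exercises[e]["time"]
                for topic, c in exercises[e]["cover"].items():
                    cover[topic] = cover.get(topic, 0) + c
            if all(cover.get(t, 0) >= need for t, need in required.items()):
                if best is None or time < best[0]:
                    best = (time, set(combo))
    return best

# Hypothetical instance: e3 alone covers both topics faster than e1 + e2.
exercises = {
    "e1": {"time": 10, "cover": {"sets": 2}},
    "e2": {"time": 5,  "cover": {"logic": 2}},
    "e3": {"time": 12, "cover": {"sets": 2, "logic": 2}},
}
best_time, best_set = min_time_selection(exercises, {"sets": 2, "logic": 2})
```

A genetic algorithm replaces the exhaustive loop with a population of candidate subsets evolved under a fitness function that penalizes uncovered topics and rewards low total time.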