Integrality gaps of semidefinite programs for Vertex Cover and relations to ℓ1 embeddability of negative type metrics
We study various SDP formulations for Vertex Cover by adding different constraints to the standard formulation. We rule out approximations better than 2 − o(1) even when we add the so-called pentagonal inequality constraints to the standard SDP formulation, and thus almost meet the best upper bound known, due to Karakostas, of 2 − Θ(1/√log n). We further show the surprising fact that by strengthening the SDP with the (intractable) requirement that the metric interpretation
of the solution embeds into ℓ1 with no distortion, we get an exact relaxation (integrality gap is 1), and on the other hand if the solution is arbitrarily
close to being ℓ1 embeddable, the integrality gap is 2 − o(1). Finally, inspired by the above findings, we use ideas from the integrality gap construction of Charikar to provide a
family of simple examples for negative type metrics that cannot be embedded into ℓ1 with distortion better than 8/7 − ε. To this end we prove a new isoperimetric inequality for the hypercube.
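For context, the standard SDP relaxation that such strengthenings build on can be written, in the usual Kleinberg–Goemans style (a sketch; the paper's exact formulation may differ in details):

\begin{align*}
\text{minimize} \quad & \sum_{i \in V} \frac{1 + v_0 \cdot v_i}{2} \\
\text{subject to} \quad & (v_0 - v_i) \cdot (v_0 - v_j) = 0 \quad \forall \{i,j\} \in E, \\
& \|v_i\| = 1 \quad \forall i \in V \cup \{0\},
\end{align*}

where vertex i is interpreted as belonging to the cover when v_i is aligned with the reference vector v_0. The pentagonal inequalities mentioned above are additional valid constraints imposed on the negative type (squared-Euclidean) metric induced by the vectors.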
Experimentation and Analysis of Ensemble Deep Learning in IoT Applications
This paper presents an experimental study of Ensemble Deep Learning (DL) techniques for the analysis of time series data on IoT devices. We have shown in our earlier work that DL outperforms traditional machine learning techniques on fall detection applications because the important features in time series data can be learned automatically rather than hand-crafted by a domain expert. However, DL networks generally require large datasets for training. In the health care domain, such as real-time smartwatch-based fall detection, there are no publicly available large annotated datasets that can be used for training, due to the nature of the problem (i.e. a fall is not a common event). Moreover, fall data is inherently noisy, since motions recorded by a wrist-worn smartwatch can be mistaken for a fall. This paper explores combining DL (Recurrent Neural Networks) with ensemble techniques (Stacking and AdaBoost) using a fall detection application as a case study. We conducted a series of experiments using two different datasets of simulated falls for training various ensemble models. Our results show that an ensemble of deep learning models combined by the stacking technique outperforms a single deep learning model trained on the same data samples, and thus may be better suited for small datasets.
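As a rough illustration of the stacking idea explored in this paper (not the authors' exact architecture), the sketch below trains several base classifiers and fits a meta-learner on their cross-validated predictions; scikit-learn models stand in for the recurrent networks used in the paper, and all names, parameters and data are illustrative:

# Illustrative stacking sketch; the base learners stand in for the paper's RNNs.
import numpy as np
from sklearn.ensemble import StackingClassifier, AdaBoostClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import train_test_split
from sklearn.datasets import make_classification

# Synthetic stand-in for windowed accelerometer features (fall vs. non-fall).
X, y = make_classification(n_samples=500, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

base_learners = [
    ("mlp1", MLPClassifier(hidden_layer_sizes=(32,), max_iter=500, random_state=1)),
    ("mlp2", MLPClassifier(hidden_layer_sizes=(64,), max_iter=500, random_state=2)),
    ("ada", AdaBoostClassifier(random_state=3)),
]

# The meta-learner (final_estimator) combines the base models' predictions.
stack = StackingClassifier(estimators=base_learners,
                           final_estimator=LogisticRegression(),
                           cv=5)
stack.fit(X_train, y_train)
print("stacked accuracy:", stack.score(X_test, y_test))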
DoctorEye: A clinically driven multifunctional platform for accurate processing of tumors in medical images
This paper presents a novel, open-access interactive platform for 3D medical image analysis, simulation and visualization, focusing on oncology images. The platform was developed through constant interaction with and feedback from expert clinicians, incorporating a thorough analysis of their requirements, with the ultimate goal of assisting in accurately delineating tumors. It allows clinicians not only to work with a large number of 3D tomographic datasets but also to efficiently annotate multiple regions of interest in the same session. Manual and semi-automatic segmentation techniques, combined with integrated correction tools, assist in the quick and refined delineation of tumors, while different users can add components related to oncology, such as tumor growth and simulation algorithms, for improving therapy planning. The platform has been tested by different users and over a large number of heterogeneous tomographic datasets to ensure stability, usability, extensibility and robustness, with promising results. Availability: The platform, a manual and tutorial videos are available at http://biomodeling.ics.forth.gr. It is free to use under the GNU General Public License.
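As an illustration of the kind of seed-based, semi-automatic segmentation such platforms typically offer (a generic region-growing sketch on a single 2D slice, not DoctorEye's actual algorithm):

# Minimal seeded region-growing sketch on a 2D slice (illustrative only).
import numpy as np
from collections import deque

def region_grow(slice2d, seed, tol=10.0):
    """Grow a region from `seed`, adding neighbours within `tol` of the seed intensity."""
    mask = np.zeros(slice2d.shape, dtype=bool)
    seed_val = float(slice2d[seed])
    queue = deque([seed])
    while queue:
        y, x = queue.popleft()
        if mask[y, x]:
            continue
        if abs(float(slice2d[y, x]) - seed_val) > tol:
            continue
        mask[y, x] = True
        for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            ny, nx = y + dy, x + dx
            if 0 <= ny < slice2d.shape[0] and 0 <= nx < slice2d.shape[1] and not mask[ny, nx]:
                queue.append((ny, nx))
    return mask

# Example: segment a bright blob in a synthetic slice.
img = np.zeros((64, 64)); img[20:40, 20:40] = 100.0
print(region_grow(img, seed=(30, 30)).sum())  # 400 pixels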
Using Synchronic and Diachronic Relations for Summarizing Multiple Documents Describing Evolving Events
In this paper we present a fresh look at the problem of summarizing evolving events from multiple sources. After a discussion concerning the nature of evolving events, we introduce a distinction between linearly and non-linearly evolving events. We then present a general methodology for the automatic creation of summaries from evolving events. At its heart lie the notions of Synchronic and Diachronic cross-document Relations (SDRs), whose aim is the identification of similarities and differences between sources from a synchronic and diachronic perspective. SDRs do not connect documents or textual elements found therein, but structures one might call messages. Applying this methodology yields a set of messages and the relations (SDRs) connecting them, that is, a graph which we call a grid. We show how such a grid can serve as the starting point of a Natural Language Generation system. The methodology is evaluated in two case studies, one for linearly evolving events (descriptions of football matches) and another for non-linearly evolving events (terrorist incidents involving hostages). In both cases we evaluate the results produced by our computational systems. Comment: 45 pages, 6 figures. To appear in the Journal of Intelligent Information Systems.
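To make the notion of a grid concrete, here is a minimal sketch of messages connected by synchronic and diachronic relations; the data structures and relation names are illustrative and not taken from the paper:

# Illustrative "grid": messages extracted from sources, connected by synchronic
# relations (same time, different sources) and diachronic relations (across time).
from dataclasses import dataclass, field

@dataclass(frozen=True)
class Message:
    source: str      # which document/source the message came from
    time: int        # publication time slot
    predicate: str   # e.g. "hostage_count"
    args: tuple      # predicate arguments, e.g. (12,)

@dataclass
class Grid:
    messages: list = field(default_factory=list)
    relations: list = field(default_factory=list)  # (relation_name, msg_a, msg_b)

    def add_synchronic(self, name, a, b):
        assert a.time == b.time and a.source != b.source
        self.relations.append((name, a, b))

    def add_diachronic(self, name, a, b):
        assert a.time < b.time
        self.relations.append((name, a, b))

m1 = Message("source_A", 1, "hostage_count", (12,))
m2 = Message("source_B", 1, "hostage_count", (10,))
m3 = Message("source_A", 2, "hostage_count", (10,))
grid = Grid(messages=[m1, m2, m3])
grid.add_synchronic("disagreement", m1, m2)   # sources differ at the same time
grid.add_diachronic("revision", m1, m3)       # the same source updates its count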
Development of a module for obtaining demographic and clinical patient data for an expert system for assessing the risk of cardiovascular disease in patients with arterial hypertension
Signaling data from cellular networks can provide a means of analyzing the efficiency of a deployed transportation system and of assisting in the formulation of transport models that predict its future use. An approach based on this type of data is especially appealing for transportation systems that need massive expansions, since it has the added benefit that no specialized equipment or installations are required, and is therefore very cost-efficient. In this context, we describe how such data can be processed and used to act as enablers for traditional transportation analysis models. We outline a layered, modular architectural framework that encompasses the entire process and present results from an initial analysis of mobile phone call data in the context of mobility, transport and transport infrastructure. Finally, we introduce the Mobility Analytics Platform, developed by Ericsson Research and tailored for mobility analysis, discuss techniques for analyzing transport supply and demand, and indicate how cell phone usage data can be used directly to analyze the status and use of the current transport infrastructure.
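As a hedged illustration of one elementary step such a pipeline can include (this is not part of the Mobility Analytics Platform; field names and data are invented), the sketch below aggregates per-user cell-attachment events into an origin-destination matrix:

# Illustrative sketch: derive a simple origin-destination (OD) matrix from
# per-user sequences of cell attachments. Field names and data are invented.
from collections import defaultdict

# (user_id, timestamp, cell_id) signaling events, assumed already anonymized.
events = [
    ("u1", 100, "cell_A"), ("u1", 200, "cell_B"), ("u1", 300, "cell_C"),
    ("u2", 110, "cell_A"), ("u2", 250, "cell_C"),
]

def od_matrix(events):
    by_user = defaultdict(list)
    for user, ts, cell in sorted(events, key=lambda e: (e[0], e[1])):
        by_user[user].append(cell)
    od = defaultdict(int)
    for cells in by_user.values():
        for origin, dest in zip(cells, cells[1:]):
            if origin != dest:          # count transitions between distinct cells
                od[(origin, dest)] += 1
    return dict(od)

print(od_matrix(events))
# {('cell_A', 'cell_B'): 1, ('cell_B', 'cell_C'): 1, ('cell_A', 'cell_C'): 1}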
Adaptation strategy to mitigate the impact of climate change on water resources in arid and semi-arid regions: a case study
The impacts of climate change and drought have become a growing concern for water resources engineers and policy makers, particularly in arid and semi-arid areas. This study aims to contribute to the development of a decision support tool that prepares water resources managers and planners for climate change adaptation. The Hydrologiska Byråns Vattenbalansavdelning (HBV, the Water Balance Department of the Hydrological Bureau) hydrologic model was used to define the boundary conditions for a reservoir capacity yield model, providing daily inflow from a representative watershed of 14,924 km² into a reservoir with a capacity of 6.80 Gm³. The reservoir capacity yield model was used to simulate climate change-induced differences in reservoir capacity needs and performance (operational probability of failure, resilience, and vulnerability). Owing to the projected precipitation reduction and potential evapotranspiration increase under the worst-case scenario (−40% precipitation and +30% potential evapotranspiration), substantial streamflow reductions of 56% and 58% are anticipated for the dry and wet seasons, respectively. Furthermore, the model simulations indicate that, under future climatic conditions, the reservoir's operational probability of failure would generally increase owing to reduced reservoir inflow. The study developed preparedness plans to combat the consequences of climate change and drought.
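For readers unfamiliar with the performance measures mentioned above, the following is a sketch of a standard (Hashimoto-style) computation of operational probability of failure, resilience and vulnerability from a simulated release series; the study's exact formulations may differ:

# Illustrative computation of probability of failure, resilience and vulnerability
# from a simulated release series versus a target demand (Hashimoto-style metrics).
import numpy as np

def performance_metrics(release, demand):
    deficit = np.maximum(demand - release, 0.0)
    failure = deficit > 0
    prob_failure = failure.mean()                       # fraction of failure periods
    # Resilience: probability that a failure period is followed by a non-failure period.
    recoveries = np.logical_and(failure[:-1], ~failure[1:]).sum()
    resilience = recoveries / failure.sum() if failure.any() else 1.0
    vulnerability = deficit[failure].mean() if failure.any() else 0.0  # mean deficit when failing
    return prob_failure, resilience, vulnerability

# Example: 120 monthly releases against a constant demand (units arbitrary).
rng = np.random.default_rng(0)
release = rng.normal(100, 20, size=120)
demand = np.full(120, 100.0)
print(performance_metrics(release, demand))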
Complaint-driven Training Data Debugging for Query 2.0
As the need for machine learning (ML) increases rapidly across all industry sectors, there is significant interest among commercial database providers in supporting "Query 2.0", which integrates model inference into SQL queries. Debugging Query 2.0 is very challenging, since an unexpected query result may be caused by bugs in the training data (e.g., wrong labels, corrupted features). In response, we propose Rain, a complaint-driven training data debugging system. Rain allows users to specify complaints over the query's intermediate or final output, and aims to return a minimum set of training examples such that, if they were removed, the complaints would be resolved. To the best of our knowledge, we are the first to study this problem. A naive solution requires retraining an exponential number of ML models. We propose two novel heuristic approaches based on influence functions, both of which require only a linear number of retraining steps. We provide an in-depth analytical and empirical analysis of the two approaches and conduct extensive experiments to evaluate their effectiveness using four real-world datasets. Results show that Rain achieves the highest recall@k among all the baselines while still returning results interactively. Comment: Proceedings of the 2020 ACM SIGMOD International Conference on Management of Data.
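To give a flavour of the influence-function machinery referred to above, the generic sketch below ranks the training points of a small convex model by their estimated effect on a "complaint" (test) loss, using the classical approximation ∇L(z_test)ᵀ H⁻¹ ∇L(z) up to scaling; this is a textbook-style illustration, not Rain's implementation:

# Generic influence-function sketch for a logistic-regression-style model
# (not Rain's implementation): estimate how removing each training point would
# change a complaint/test loss, and rank the points accordingly.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def grad(theta, x, y):
    # Per-example gradient of the logistic loss.
    return (sigmoid(x @ theta) - y) * x

def hessian(theta, X, damping=1e-2):
    p = sigmoid(X @ theta)
    W = p * (1 - p)
    return (X.T * W) @ X / len(X) + damping * np.eye(X.shape[1])

def influence_scores(theta, X_train, y_train, x_test, y_test):
    # score_i is proportional to the estimated change in the test loss
    # if training point i is removed (more negative = larger reduction).
    H_inv = np.linalg.inv(hessian(theta, X_train))
    g_test = grad(theta, x_test, y_test)
    return np.array([g_test @ H_inv @ grad(theta, x, y)
                     for x, y in zip(X_train, y_train)])

# Toy usage on synthetic data; theta is a stand-in for fitted model parameters.
rng = np.random.default_rng(1)
X = rng.normal(size=(50, 3)); y = (X[:, 0] > 0).astype(float)
theta = np.zeros(3)
scores = influence_scores(theta, X, y, X[0], y[0])
print(np.argsort(scores)[:5])  # training points whose removal most reduces the complaint loss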
A global view of the OCA2-HERC2 region and pigmentation
Mutations in the gene OCA2 are responsible for oculocutaneous albinism type 2, but polymorphisms in and around OCA2 have also been associated with normal pigment variation. In Europeans, three haplotypes in the region have been shown to be associated with eye pigmentation, and a missense SNP (rs1800407) has been associated with green/hazel eyes (Branicki et al. in Ann Hum Genet 73:160–170, 2009). In addition, a missense mutation (rs1800414) is a candidate for light skin pigmentation in East Asia (Yuasa et al. in Biochem Genet 45:535–542, 2007; Anno et al. in Int J Biol Sci 4, 2008). We have genotyped 3,432 individuals from 72 populations for 21 SNPs in the OCA2-HERC2 region, including those previously associated with eye or skin pigmentation. We report that the blue-eye-associated alleles at all three haplotypes were found at high frequencies in Europe; however, one is restricted to Europe and surrounding regions, while the other two are found at moderate to high frequencies throughout the world. We also observed that the derived allele of rs1800414 is essentially limited to East Asia, where it is found at high frequencies. Long-range haplotype tests provide evidence of selection for the blue-eye allele at the three haplotyped systems, but not for the green/hazel eye SNP allele. We also saw evidence of selection at the derived allele of rs1800414 in East Asia. Our data suggest that the haplotype restricted to Europe is the strongest marker for blue eyes globally, and add further inferential evidence that the derived allele of rs1800414 is an East Asian skin pigmentation allele.
The problems of selecting problems
We face several teaching problems in which a set of exercises has to be selected based on its capability either to make students discover typical misconceptions or to evaluate the students' knowledge. We consider four different optimization problems, developed from two basic decision problems. The first two optimization problems consist of selecting a set of exercises that reaches some required level of coverage for each topic. In the first problem we minimize the total time required to present the selected exercises, whereas in the second we maximize the surplus coverage of topics. The other two optimization problems consist of composing an exam in such a way that each student misconception reduces the overall mark of the exam to some specific required extent. In particular, we consider the problem of minimizing the size of the exam fulfilling these mark-reduction constraints, and the problem of minimizing the differences between the required mark losses due to each misconception and the actual ones in the composed exam. For each optimization problem, we formally establish its approximation hardness and solve it heuristically using a genetic algorithm. We report experimental results for a case study based on a set of real exercises from Discrete Mathematics, a Computer Science degree subject.
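A sketch of how the first optimization variant (reach the required coverage of each topic while minimizing total presentation time) can be attacked with a simple genetic algorithm; the encoding, penalty scheme and data below are illustrative, not the paper's:

# Illustrative GA for selecting exercises that reach required topic coverage
# while minimizing total time (penalty-based fitness; not the paper's exact GA).
import random

random.seed(0)
N_EX, N_TOPICS, POP, GENS = 30, 5, 60, 200
time_cost = [random.randint(5, 20) for _ in range(N_EX)]
coverage = [[random.randint(0, 3) for _ in range(N_TOPICS)] for _ in range(N_EX)]
required = [6] * N_TOPICS

def fitness(sel):                      # lower is better
    total_time = sum(t for t, s in zip(time_cost, sel) if s)
    shortfall = sum(max(0, required[k] - sum(coverage[i][k] for i in range(N_EX) if sel[i]))
                    for k in range(N_TOPICS))
    return total_time + 1000 * shortfall   # heavy penalty for unmet coverage

def crossover(a, b):
    cut = random.randrange(1, N_EX)
    return a[:cut] + b[cut:]

def mutate(sel, rate=0.05):
    return [1 - g if random.random() < rate else g for g in sel]

pop = [[random.randint(0, 1) for _ in range(N_EX)] for _ in range(POP)]
for _ in range(GENS):
    pop.sort(key=fitness)
    parents = pop[:POP // 2]
    children = [mutate(crossover(random.choice(parents), random.choice(parents)))
                for _ in range(POP - len(parents))]
    pop = parents + children

best = min(pop, key=fitness)
print("selected exercises:", [i for i, s in enumerate(best) if s], "fitness:", fitness(best))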