66 research outputs found

    Variable Selection in Maximum Mean Discrepancy for Interpretable Distribution Comparison

    Full text link
    Two-sample testing decides whether two datasets are generated from the same distribution. This paper studies variable selection for two-sample testing, the task being to identify the variables (or dimensions) responsible for the discrepancies between the two distributions. This task is relevant to many problems of pattern analysis and machine learning, such as dataset shift adaptation, causal inference and model validation. Our approach builds on a two-sample test based on the Maximum Mean Discrepancy (MMD). We optimise the Automatic Relevance Detection (ARD) weights defined for individual variables to maximise the power of the MMD-based test. For this optimisation, we introduce sparse regularisation and propose two methods for dealing with the issue of selecting an appropriate regularisation parameter. One method determines the regularisation parameter in a data-driven way, and the other aggregates the results of different regularisation parameters. We confirm the validity of the proposed methods by systematic comparisons with baseline methods, and demonstrate their usefulness in exploratory analysis of high-dimensional traffic simulation data. Preliminary theoretical analyses are also provided, including a rigorous definition of variable selection for two-sample testing.
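
    As a concrete illustration of the optimisation described above, the following is a minimal sketch, not the authors' code: it trains per-dimension ARD weights of a Gaussian kernel by maximising a plain (biased) MMD² estimate with an L1 sparsity penalty, whereas the paper optimises a test-power criterion and proposes specific ways to choose the regularisation parameter. All names and the surrogate objective are assumptions made for the example.

```python
# Minimal sketch (not the paper's implementation): ARD-weighted Gaussian-kernel
# MMD with an L1 penalty on the per-dimension weights. The surrogate objective
# here is MMD^2 itself; the paper maximises a test-power criterion instead.
import torch

def ard_kernel(x, y, log_w):
    # Gaussian kernel with per-dimension ARD weights w = exp(log_w) >= 0
    w = torch.exp(log_w)
    d = (x.unsqueeze(1) - y.unsqueeze(0)) ** 2        # (n, m, dim) squared differences
    return torch.exp(-(d * w).sum(-1))

def mmd2(x, y, log_w):
    # Biased estimator of MMD^2 between samples x (n, dim) and y (m, dim)
    kxx = ard_kernel(x, x, log_w).mean()
    kyy = ard_kernel(y, y, log_w).mean()
    kxy = ard_kernel(x, y, log_w).mean()
    return kxx + kyy - 2 * kxy

def fit_ard_weights(x, y, lam=0.05, steps=500, lr=0.05):
    # Maximise MMD^2 minus an L1 sparsity penalty on the ARD weights;
    # dimensions keeping large weights are candidates for the discrepancy.
    log_w = torch.zeros(x.shape[1], requires_grad=True)
    opt = torch.optim.Adam([log_w], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        loss = -mmd2(x, y, log_w) + lam * torch.exp(log_w).sum()
        loss.backward()
        opt.step()
    return torch.exp(log_w).detach()
```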

    Comparison of predictive and descriptive models in order to plan the monitoring and research on the rock partridge (Alectoris graeca) in the North Eastern Alps

    Get PDF
    As part of the implementation of the Management Plan for the Alpi Carniche region (SPA IT3321001, SCI IT3320001, SCI IT3320002, SCI IT3320003, SCI IT3320004) and the realization of the monitoring plan referred to in art. 8 of RL No. 7/2008 (Friuli Venezia Giulia), predictive and descriptive models for the presence and abundance of the rock partridge Alectoris graeca saxatilis were developed and tested. In 2010 the monitoring plan was carried out in spring (play-back censuses) and summer (pointing dog censuses) in 10 sample areas to assess the presence, abundance and reproductive success of the species. These areas were identified through expert knowledge and predictive models developed by superimposing, on the regional UTM 1x1 km grid quadrants, CORINE Biotopes habitat parameters (open vegetation coverage >50% and open + transitional vegetation coverage >80%), slope (>10%) and elevation (1000-2200 m above sea level), subsequently ranked from 0 to 4 to form a suitability index. The census results related to UTM quadrants (n = 46, 40% with presence of partridges) and to buffer areas (100 m radius) created around the locations of the observed animals and the transect points of the censuses (n = 89) were described by linear selection models containing habitat classes from the Habitat Map of Friuli Venezia Giulia (Map of the Nature at the scale 1:50,000, ISPRA 2009) and morphological characteristics such as slope, elevation and aspect. The descriptive models selected different variables according to the season (reproductive and post-reproductive), identifying the presence of Eastern Alpine calcicolous larch with moorland as one of the most important variables for defining habitat suitability. Moreover, the descriptive models using the smaller spatial scale (100 m buffer) seemed to describe the presence and abundance of the species better. The predictive models, however, proved inadequate to describe the presence of the species and should be used with caution when planning monitoring activities. The research was supported by the Friuli Venezia Giulia Autonomous Region.
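
    The 0-4 suitability ranking described above can be sketched as follows; the thresholds are taken from the abstract, while the one-point-per-satisfied-criterion scoring is an assumption about how the index is assembled, not the published model.

```python
# Illustrative sketch only: rank 1x1 km UTM grid cells 0-4 by counting how many
# of the habitat/morphology criteria from the abstract each cell satisfies.
# The one-point-per-criterion scoring is an assumption, not the published model.

def suitability_index(cell):
    criteria = [
        cell["open_cover_pct"] > 50,                  # open vegetation coverage > 50%
        cell["open_plus_transitional_pct"] > 80,      # open + transitional coverage > 80%
        cell["slope_pct"] > 10,                       # slope > 10%
        1000 <= cell["elevation_m"] <= 2200,          # elevation 1000-2200 m a.s.l.
    ]
    return sum(criteria)                              # 0 (unsuitable) .. 4 (most suitable)

example_cell = {"open_cover_pct": 62, "open_plus_transitional_pct": 85,
                "slope_pct": 18, "elevation_m": 1650}
print(suitability_index(example_cell))  # -> 4
```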

    Neuroendocrine Dysregulation in Irritable Bowel Syndrome Patients: A Pilot Study

    Get PDF
    BACKGROUND/AIMS: Irritable bowel syndrome (IBS) is a multifactorial disorder involving dysregulation of the brain-gut axis. Our aim was to evaluate neuroendocrine activity in IBS. METHODS: Thirty IBS patients and 30 healthy subjects were enrolled. Psychological symptoms were evaluated by questionnaires. Urinary 5-hydroxyindoleacetic acid (5-HIAA), plasma serotonin, endothelin, neuropeptide Y (NPY), and plasma and urinary cortisol levels were evaluated. Fourteen IBS subjects underwent microneurography to obtain multiunit recordings of efferent postganglionic muscle sympathetic nerve activity (MSNA). RESULTS: Prevalent psychological symptoms in IBS were maladjustment (60%), trait (40%) and state (17%) anxiety, obsessive-compulsive disorders (23%), and depressive symptoms (23%). IBS patients showed increased NPY (31.9 [43.7] vs 14.8 [18.1] pmol/L, P = 0.006), serotonin (214.9 [182.6] vs 141.0 [45.5] pg/mL, P = 0.010), and endothelin (1.1 [1.4] vs 2.1 [8.1], P = 0.054) compared to healthy subjects. Moreover, plasma NPY, endothelin, cortisol and serotonin, and urinary 5-HIAA were associated with some psychological disorders (P < 0.05). Despite similar resting MSNA, after the cold pressor test IBS patients showed a blunted increase in MSNA burst frequency (+4.1 vs +7.8 bursts/minute, P = 0.048; +30.1% vs +78.1%, P = 0.023). Baseline MSNA tended to be associated with urinary cortisol (ρ = 0.557, P = 0.059), and changes in heart rate and MSNA after mental stress were associated with urinary (ρ = 0.682, P = 0.021) and plasma cortisol (ρ = 0.671, P = 0.024), respectively. CONCLUSION: Higher concentrations of endothelin, NPY, and serotonin were found to be associated with some psychological disorders in IBS patients, together with an altered cardiovascular autonomic reactivity to acute stressors compared to healthy subjects.

    Storage and Ingestion Systems in Support of Stream Processing: A Survey

    Get PDF
    Under the pressure of massive, exponentially increasing amounts of heterogeneous data that are generated faster and faster, Big Data analytics applications have seen a shift from batch processing to stream processing, which can reduce the time needed to obtain meaningful insight dramatically. Stream processing is particularly well suited to address the challenges of fog/edge computing: much of this massive data comes from Internet of Things (IoT) devices and needs to be continuously funneled through an edge infrastructure towards centralized clouds. Thus, it is only natural to process data on their way as much as possible rather than wait for streams to accumulate on the cloud. Unfortunately, state-of-the-art stream processing systems are not well suited for this role: the data are accumulated (ingested), processed and persisted (stored) separately, often using different services hosted on different physical machines/clusters. Furthermore, there is only limited support for advanced data manipulations, which often forces application developers to introduce custom solutions and workarounds. In this survey article, we characterize the main state-of-the-art stream storage and ingestion systems. We identify the key aspects and discuss limitations and missing features in the context of stream processing for fog/edge and cloud computing. The goal is to help practitioners understand and prepare for potential bottlenecks when using such state-of-the-art systems. In particular, we discuss both functional (partitioning, metadata, search support, message routing, backpressure support) and non-functional aspects (high availability, durability, scalability, latency vs. throughput). As a conclusion of our study, we advocate for a unified stream storage and ingestion system to speed up data management and reduce I/O redundancy (both in terms of storage space and network utilization).
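
    To illustrate two of the functional aspects listed above (partitioning and backpressure support), here is a minimal, generic sketch of a key-partitioned ingestion buffer with bounded queues; it is an illustrative toy under stated assumptions, not the design of any system covered by the survey.

```python
# Generic illustration (not any specific surveyed system): key-based partitioning
# plus backpressure via bounded per-partition queues. put() blocks the producer
# when the target partition is full, so pressure propagates upstream instead of
# data being dropped.
import queue
from hashlib import blake2b

class PartitionedIngestBuffer:
    def __init__(self, num_partitions=4, capacity_per_partition=1000):
        self.partitions = [queue.Queue(maxsize=capacity_per_partition)
                           for _ in range(num_partitions)]

    def _partition_for(self, key):
        # Stable hash so records with the same key always land in the same partition
        digest = blake2b(key.encode(), digest_size=4).digest()
        return int.from_bytes(digest, "big") % len(self.partitions)

    def put(self, key, record, timeout=None):
        # Blocks (backpressure) while the partition's bounded queue is full
        self.partitions[self._partition_for(key)].put(record, timeout=timeout)

    def poll(self, partition, timeout=1.0):
        # Each consumer reads one partition, preserving per-key ordering
        try:
            return self.partitions[partition].get(timeout=timeout)
        except queue.Empty:
            return None
```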

    FLASH radiotherapy with electrons: issues related to the production, monitoring, and dosimetric characterization of the beam

    Get PDF
    Various in vivo experimental works carried out on different animals and organs have shown that it is possible to reduce the damage caused to healthy tissue while preserving the therapeutic efficacy on the tumor tissue, by drastically reducing the total time of dose delivery (<200 ms). This effect, called the FLASH effect, immediately attracted considerable attention within the radiotherapy community, due to the possibility of widening the therapeutic window and effectively treating tumors that appear radioresistant to conventional techniques. Despite the experimental evidence, the radiobiological mechanisms underlying the FLASH effect and the beam parameters contributing to its optimization are not yet known in detail. In order to fully understand the FLASH effect, it may be worthwhile to investigate some alternatives which can further improve the tools adopted so far, in terms of both linac technology and dosimetric systems. This work investigates the problems and solutions concerning the realization of an electron accelerator dedicated to FLASH therapy and optimized for in vivo experiments. Moreover, the work discusses the saturation problems of the most common radiotherapy dosimeters when used under the very high dose-per-pulse conditions of FLASH and provides some preliminary experimental data on their behavior.
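
    To make the timing constraint concrete, a back-of-the-envelope calculation shows why sub-200 ms delivery implies mean dose rates far above conventional radiotherapy; the fraction size and the conventional dose rate used below are illustrative assumptions, not figures from the article.

```python
# Illustrative back-of-the-envelope numbers only (assumed, not from the article).
fraction_dose_gy = 10.0            # assumed fraction size
flash_delivery_s = 0.2             # < 200 ms total delivery time, as stated above
conventional_rate_gy_per_s = 0.1   # assumed order of magnitude for a conventional linac

flash_rate = fraction_dose_gy / flash_delivery_s            # 50 Gy/s mean dose rate
print(flash_rate, flash_rate / conventional_rate_gy_per_s)  # ~50 Gy/s, ~500x conventional
```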

    Corrigendum: FLASH Radiotherapy With Electrons: Issues Related to the Production, Monitoring, and Dosimetric Characterization of the Beam

    Get PDF
    In the original article, the following authors were missing: Luigi Faillace, Lucia Giuliano, Mauro Migliorati, Luigi Palumbo. The corrected Author Contributions statement appears below. Affiliation 3, ‘Sapienza University of Rome, Rome, Italy’, is also added for authors LF, LG, MM, and LP. The authors apologize for these errors and state that this does not change the scientific conclusions of the article in any way. The original article has been updated

    Knowledge Based Open Entity Matching

    Get PDF
    In this work we argue for the definition of a knowledge-based entity matching framework for the implementation of a reliable and incrementally scalable solution. Such a knowledge base is formed by an ontology and a set of entity matching rules suitable to be applied as a reliable equational theory in the context of the Semantic Web. In particular, we show that, relying on the existence of a set of contextual mappings to ease the semantic heterogeneity characterizing descriptions on the Web, a knowledge-based solution can perform comparably to, and sometimes better than, existing state-of-the-art solutions. We further argue that a knowledge-based solution to the open entity matching problem ought to be considered under the open world assumption, as in some cases the descriptions to be matched may not contain the information necessary to make an accurate matching decision. The main goal of this work is to show that the proposed framework is suitable for pursuing a reliable solution to the entity matching problem, regardless of the set of rules defined for the adopted ontology. In fact, we believe that the structural and syntactic heterogeneity affecting data on the Web undermines the definition of a unique global solution. However, we argue that a knowledge-driven approach, considering the semantic and meta-properties of the compared attributes, can provide important benefits and lead to more reliable solutions. To achieve this goal, we implement several experiments to evaluate different sets of rules, testing our thesis and learning important lessons for future developments. The sets of rules that we consider to bootstrap the proposed solution are the result of two complementary processes: first, we investigate whether capturing the matching knowledge employed by people in taking entity matching decisions, by relying on machine learning techniques, can produce an effective set of rules (bottom-up strategy); second, we investigate the application of formal ontology tools to analyze the features defined in the ontology and support the definition of entity matching rules (top-down strategy). Moreover, we argue that by merging the rules resulting from these complementary processes, we can define a set of rules that can reliably support entity matching decisions in an open context.
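
    A minimal sketch of how rule-based matching under the open world assumption could look is given below; the rule, attribute names and three-valued outcome are illustrative assumptions, not rules from the proposed knowledge base.

```python
# Illustrative sketch only: a three-valued, rule-based entity matcher.
# Under the open world assumption a rule may return UNKNOWN when the
# descriptions lack the attributes needed for a reliable decision.
from enum import Enum

class Decision(Enum):
    MATCH = "match"
    NON_MATCH = "non-match"
    UNKNOWN = "unknown"        # open world: not enough evidence either way

def same_isbn_rule(a, b):
    # Example rule: ISBN treated as an identifying property (assumption)
    if "isbn" not in a or "isbn" not in b:
        return Decision.UNKNOWN
    return Decision.MATCH if a["isbn"] == b["isbn"] else Decision.NON_MATCH

def match(a, b, rules):
    # First rule reaching a definite decision wins; otherwise stay UNKNOWN
    for rule in rules:
        decision = rule(a, b)
        if decision is not Decision.UNKNOWN:
            return decision
    return Decision.UNKNOWN

print(match({"isbn": "978-0"}, {"title": "X"}, [same_isbn_rule]))  # Decision.UNKNOWN
```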

    Employability and profiling systems: some reflections arising from a literature review and a survey of key informants in Veneto

    No full text
    In recent years, owing also to the global economic crisis, public employment services in Italy have witnessed a growing disproportion between their (human and material) resources and the number of users and services provided. The gap between the amount of resources invested in Italy and the level of resources invested in other European countries is so significant as to make any comparison difficult. This is why the need to modernize these services with tools supporting user profiling is felt both at the operational level and from a managerial/strategic perspective. Based on the results of a Delphi survey conducted in cooperation with experts in the field of labour policies and employment services who operate in the Veneto region, it is argued that profiling systems capable of measuring different components of employability at the individual level would be very useful, in Italy as in several other countries, particularly for resource allocation and activity planning purposes. Several models could be implemented: the debate is open; however, it is hoped that action will be taken quickly.