984 research outputs found

    Effective Classification using a small Training Set based on Discretization and Statistical Analysis

    This work deals with the problem of producing a fast and accurate data classification, learned from a possibly small set of records that are already classified. The proposed approach is based on the framework of the so-called Logical Analysis of Data (LAD), enriched with information obtained from statistical considerations on the data. A number of discrete optimization problems are solved in the different steps of the procedure, but their computational demand can be controlled. The accuracy of the proposed approach is compared to that of the standard LAD algorithm, of Support Vector Machines and of the Label Propagation algorithm on publicly available datasets of the UCI repository. Encouraging results are obtained and discussed.

    A Combinatorial Optimization Approach to the Selection of Statistical Units

    In some large statistical surveys, the set of units that will constitute the scope of the survey must be selected. We focus on the real case of a Census of Agriculture, where the units are farms. Surveying each unit has a cost and contributes a different portion of the whole information. The goal is to determine a subset of units that has the minimum total surveying cost and represents at least a given portion of the total information. Uncertainty also arises, because the portion of information corresponding to each unit is not perfectly known before surveying it. The proposed approach is based on combinatorial optimization, and the arising decision problems are modeled as multidimensional binary knapsack problems. Experimental results show the effectiveness of the proposed approach.
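
    As a rough illustration of the selection problem described in this abstract, the sketch below solves a simplified, one-dimensional version by dynamic programming: pick the cheapest subset of units whose information adds up to at least a required amount. The function name `select_units`, the integer information scores and the toy numbers are assumptions made for the example; the paper itself models the problem as multidimensional binary knapsack problems with uncertainty, which this sketch does not attempt to reproduce.

```python
from typing import List, Tuple


def select_units(costs: List[float], info: List[int], required: int) -> Tuple[float, List[int]]:
    """Cheapest subset of units whose integer information values sum to >= required.

    Hypothetical helper for illustration only (single constraint, no uncertainty).
    """
    inf = float("inf")
    # best[k] = (cost, chosen units) of the cheapest subset covering at least k information
    best = [(0.0, [])] + [(inf, [])] * required
    for unit, (cost, weight) in enumerate(zip(costs, info)):
        # Go downward so that each unit is used at most once (0/1 choice).
        for k in range(required, 0, -1):
            prev_cost, prev_units = best[max(0, k - weight)]
            if prev_cost + cost < best[k][0]:
                best[k] = (prev_cost + cost, prev_units + [unit])
    return best[required]


# Toy instance: four farms, information expressed in integer points.
cost, units = select_units(costs=[4, 3, 5, 2], info=[30, 20, 40, 10], required=60)
print(cost, units)
```

    If no subset reaches the required amount, the returned cost is infinite, which a caller can use to detect infeasibility.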

    Identifying e-Commerce in Enterprises by means of Text Mining and Classification Algorithms

    Monitoring specific features of enterprises, for example the adoption of e-commerce, is an important and basic task for several economic activities. This type of information is usually obtained by means of surveys, which are costly due to the amount of personnel involved in the task. An automatic detection of this information would allow considerable savings. This can actually be performed by relying on computer engineering, since in general this information is publicly available on-line through the corporate websites. This work describes how to convert the detection of e-commerce into a supervised classification problem, where each record is obtained from the automatic analysis of one corporate website, and the class is the presence or the absence of e-commerce facilities. The automatic generation of such data records requires the use of several Text Mining phases; in particular we compare six strategies based on the selection of best words and best n-grams. After this, we classify the obtained dataset by means of four classification algorithms: Support Vector Machines; Random Forest; Statistical and Logical Analysis of Data; Logistic Classifier. This turns out to be a difficult classification problem. However, after a careful design and set-up of the whole procedure, the results on a practical case of Italian enterprises are encouraging.
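
    To make the idea of turning website text into classification records more concrete, here is a minimal sketch that extracts word n-grams, keeps the ones that best separate the two classes, and encodes each text as a binary record. The scoring rule (absolute difference in document frequency between classes), the function names and the toy texts are all assumptions for illustration; the abstract does not state which selection criteria the six compared strategies use.

```python
from collections import Counter
from typing import List, Tuple

NGram = Tuple[str, ...]


def ngrams(text: str, n: int) -> List[NGram]:
    """Word n-grams of a text after lowercasing and whitespace splitting."""
    words = text.lower().split()
    return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]


def best_ngrams(docs: List[str], labels: List[int], n: int, k: int) -> List[NGram]:
    """Top-k n-grams by |doc frequency in class 1 - doc frequency in class 0| (assumed criterion)."""
    pos, neg = Counter(), Counter()
    for text, label in zip(docs, labels):
        # Count each n-gram at most once per document.
        (pos if label == 1 else neg).update(set(ngrams(text, n)))
    n_pos = max(1, sum(labels))
    n_neg = max(1, len(labels) - sum(labels))
    score = {g: abs(pos[g] / n_pos - neg[g] / n_neg) for g in set(pos) | set(neg)}
    # Ties are broken alphabetically so the result is deterministic.
    return sorted(score, key=lambda g: (-score[g], g))[:k]


def to_record(text: str, selected: List[NGram], n: int) -> List[int]:
    """Binary record: 1 if the selected n-gram occurs in the text, else 0."""
    present = set(ngrams(text, n))
    return [int(g in present) for g in selected]


docs = ["add to cart and checkout", "add to cart now",
        "about our company history", "contact our company"]
labels = [1, 1, 0, 0]  # 1 = e-commerce present, 0 = absent (toy data)
selected = best_ngrams(docs, labels, n=2, k=2)
print(selected, to_record("add to cart today", selected, n=2))
```

    The resulting records could then be passed to any of the classifiers named in the abstract.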

    A min-cut approach to functional regionalization, with a case study of the Italian local labour market areas

    In several economic, statistical and geographical applications, a territory must be subdivided into functional regions. Such regions are not fixed and politically delimited, but should be identified by analyzing the interactions among all the constituent localities of the territory. This is a delicate and important task that often turns out to be computationally difficult. In this work we propose an innovative approach to this problem based on the solution of minimum cut problems over an undirected graph, called here the transitions graph. The proposed procedure guarantees that the obtained regions satisfy all the statistical conditions required for this type of problem. Results on real-world instances show the effectiveness of the proposed approach.
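
    For readers unfamiliar with minimum cuts, the sketch below computes a minimum s-t cut on a small undirected weighted graph using augmenting paths found by breadth-first search (the Edmonds-Karp scheme). It is a generic textbook routine, not the authors' regionalization procedure; the function name `min_cut`, the locality labels and the edge weights are hypothetical, and both terminals are assumed to appear in at least one edge.

```python
from collections import deque
from typing import Dict, Set, Tuple


def min_cut(flows: Dict[Tuple[str, str], float], s: str, t: str) -> Tuple[float, Set[str]]:
    """Return (cut weight, nodes on the s side) of a minimum s-t cut."""
    # Residual capacities: an undirected edge gives capacity in both directions.
    cap: Dict[str, Dict[str, float]] = {}
    for (u, v), w in flows.items():
        cap.setdefault(u, {}).setdefault(v, 0.0)
        cap.setdefault(v, {}).setdefault(u, 0.0)
        cap[u][v] += w
        cap[v][u] += w

    def search() -> Dict[str, str]:
        # BFS over arcs with positive residual capacity; maps node -> parent.
        parent = {s: s}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v, c in cap[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        return parent

    total = 0.0
    while True:
        parent = search()
        if t not in parent:
            # No augmenting path left: the reachable nodes form the s side of a minimum cut.
            return total, set(parent)
        path = []
        v = t
        while v != s:
            path.append((parent[v], v))
            v = parent[v]
        push = min(cap[u][v] for u, v in path)
        for u, v in path:
            cap[u][v] -= push
            cap[v][u] += push
        total += push


# Toy graph: weights stand for interaction intensity between localities.
weight, side = min_cut({("A", "B"): 10, ("B", "C"): 1, ("C", "D"): 8, ("A", "C"): 2}, "A", "D")
print(weight, sorted(side))
```

    In this toy graph the weakest separation between A and D splits the strongly linked pairs A-B and C-D, which is the intuition behind using cuts to delimit regions with few interactions across their borders.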

    Logical analysis of data as a tool for the analysis of probabilistic discrete choice behavior

    Probabilistic Discrete Choice Models (PDCM) have been extensively used to interpret the behavior of heterogeneous decision makers that face discrete alternatives. The classification approach of Logical Analysis of Data (LAD) uses discrete optimization to generate patterns, which are logic formulas characterizing the different classes. Patterns can be seen as rules explaining the phenomenon under analysis. In this work we discuss how LAD can be used as the first phase of the specification of PDCM. Since in this task the number of patterns generated may be extremely large, and many of them may be nearly equivalent, additional processing is necessary to obtain practically meaningful information. Hence, we propose computationally viable techniques to obtain small sets of patterns that constitute meaningful representations of the phenomenon and allow the discovery of significant associations between subsets of explanatory variables and the output. We consider the complex socio-economic problem of the analysis of the utilization of the Internet in Italy, using real data gathered by the Italian National Institute of Statistics.

    Exploring the Potentialities of Automatic Extraction of University Webometric Information

    The main objective of this work is to show the potentialities of recently developed approaches for automatic knowledge extraction directly from the universities’ websites. The information automatically extracted can be potentially updated with a frequency higher than once per year, and be safe from manipulations or misinterpretations. Moreover, this approach allows us flexibility in collecting indicators about the efficiency of universities’ websites and their effectiveness in disseminating key contents. These new indicators can complement traditional indicators of scientific research (e.g. number of articles and number of citations) and teaching (e.g. number of students and graduates) by introducing further dimensions to allow new insights for “profiling” the analyzed universities. The main findings of this study concern the evaluation of the potential in digitalization of universities, in particular by presenting techniques for the automatic extraction of information from the web to build indicators of quality and impact of universities’ websites. These indicators can complement traditional indicators and can be used to identify groups of universities with common features using clustering techniques working with the above indicators

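    Sperimentazione di una tecnica naturale di decontaminazione di sedimenti marini di dragaggio per il riutilizzo come terreno agrario [Testing of a natural technique for the decontamination of dredged marine sediments for reuse as agricultural soil]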

    The proposed technique is based on the use of plants (Paspalum v., Tamarix g., Spartium j.) and an organic amendment, with the following objectives: (1) sediment decontamination; (2) physical, chemical and biological amelioration of sediments. Good results were obtained in terms of adaptation of the plants used, decrease in contamination (about 20% for metals and 70% for hydrocarbons) and increase in nutrient content and microbial activity. Moreover, proper monitoring of irrigation made it possible to reduce the volume of leachate to zero while maintaining field capacity and decreasing the salinity of the medium. The experiment was carried out at pilot scale, treating 80 m³ of sediment with AGRIPORT technology.

    Sac enlargement due to seroma after endovascular abdominal aortic aneurysm repair with the Endologix PowerLink device

    A patient who had undergone endovascular repair of an abdominal aortic aneurysm with the Endologix PowerLink bifurcated system presented with delayed aortic aneurysm enlargement due to assumed endotension. He was treated with aortic sac evacuation and wrapping of the endograft. This is the first report of endotension and aneurysm sac enlargement after implantation of the PowerLink endograft

    Life-stage dependent response of the epiphytic lichen Lobaria pulmonaria to climate

    Lichens are poikilohydric organisms, whose internal water content tends to reflect external humidity conditions. After drying, they can reactivate their metabolic activity through water vapor uptake or liquid water input. Thus, lichen water-related functional traits are important as they are involved in the duration of the hydrated period. Models predicting the effect of environmental conditions on lichens are based mainly on the presence or absence of adult thalli. Nevertheless, ecological conditions required by lichens might vary during their life cycle, for example during propagule establishment or in the first stages of thallus development. Little is known about the different ecological requirements at the different development stages in lichens. In this work, we measured water holding capacity (WHC) and specific thallus mass (STM) of adult and juvenile thalli of the model species Lobaria pulmonaria along a climatic gradient to constrain the process-based model LiBry. The LiBry model allows accounting for the productivity of lichens with different physiological strategies under various environmental conditions. We simulated the activity and performance of adult and juvenile thalli in 9 regions of Italy and Corsica. The model was used to test whether adult thalli of L. pulmonaria have a higher survival probability due to their higher aerodynamic resistance. Under current climatic conditions, the LiBry model predicts a higher survival probability for adults, with absolute survival rates of both life stages decreasing with increasing temperature. Adult thalli also show higher active time, STM, and relative growth rate (RGR). We discuss the main implications of our simulation outputs, provide future perspectives and possible implementations of the LiBry model.