256 research outputs found

    A Unifying View of Multiple Kernel Learning

    Recent research on multiple kernel learning has led to a number of approaches for combining kernels in regularized risk minimization. The proposed approaches include different formulations of objectives and varying regularization strategies. In this paper we present a unifying general optimization criterion for multiple kernel learning and show how existing formulations are subsumed as special cases. We also derive the criterion's dual representation, which is suitable for general smooth optimization algorithms. Finally, we evaluate multiple kernel learning in this framework analytically using a Rademacher complexity bound on the generalization error and empirically in a set of experiments.

    Models of asthma: density-equalizing mapping and output benchmarking

    Despite the large number of experimental studies already conducted on bronchial asthma, further insights into the molecular basis of the disease are required to establish new therapeutic approaches. As a basis for this research, different animal models of asthma have been developed in recent years. However, precise bibliometric data on the use of the different models do not yet exist. The present study was therefore conducted to establish a database of the existing experimental approaches. Density-equalizing algorithms were used, and data were retrieved from a Thomson Institute for Scientific Information database. During the period from 1900 to 2006, a total of 3,489 filed items were connected to animal models of asthma, the first being published in 1968. The studies were published by 52 countries, with the US, Japan and the UK being the most productive suppliers, participating in 55.8% of all published items. With the average citation rate per item taken as an indicator of research quality, Switzerland ranked first (30.54/item) and New Zealand second among countries with more than 10 published studies. The 10 most productive journals included 4 with a main focus on allergy and immunology and 4 with a main focus on the respiratory system. Two journals focussed on pharmacology or pharmacy. Among all assigned subject categories examined for a relation to animal models of asthma, immunology ranked first. Relating the number of published items to animal species showed that mice were the preferred species, followed by guinea pigs. In summary, the density-equalizing calculations indicate that the use of animal models of asthma is restricted to a relatively small number of countries. There are also differences in the use of species. These differences are based on variations in the research focus, as assessed by subject category analysis.

    Machine Learning Models that Remember Too Much

    Machine learning (ML) is becoming a commodity. Numerous ML frameworks and services are available to data holders who are not ML experts but want to train predictive models on their data. It is important that ML models trained on sensitive inputs (e.g., personal images or documents) not leak too much information about the training data. We consider a malicious ML provider who supplies model-training code to the data holder, does not observe the training, but then obtains white- or black-box access to the resulting model. In this setting, we design and implement practical algorithms, some of them very similar to standard ML techniques such as regularization and data augmentation, that "memorize" information about the training dataset in the model, yet the model remains as accurate and predictive as a conventionally trained model. We then explain how the adversary can extract the memorized information from the model. We evaluate our techniques on standard ML tasks for image classification (CIFAR10), face recognition (LFW and FaceScrub), and text analysis (20 Newsgroups and IMDB). In all cases, we show how our algorithms create models that have high predictive power yet allow accurate extraction of subsets of their training data.

    Scoliosis: density-equalizing mapping and scientometric analysis

    Background: Publications related to scoliosis have increased enormously. A differentiation between publications of major and minor importance has become difficult even for experts. Scientometric data on developments and tendencies in scoliosis research has not been available to date. The aim of the current study was to evaluate the scientific efforts of scoliosis research both quantitatively and qualitatively. Methods: Large-scale data analysis, density-equalizing algorithms and scientometric methods were used to evaluate both the quantity and quality of research achievements of scientists studying scoliosis. Density-equalizing algorithms were applied to data retrieved from ISI-Web. Results: From 1904 to 2007, 8,186 items pertaining to scoliosis were published and included in the database. The studies were published in 76 countries: the USA, the U.K. and Canada being the most productive centers. The Washington University (St. Louis, Missouri) was identified as the most prolific institution during that period, and orthopedics represented by far the most productive medical discipline. "BRADFORD, DS" is the most productive author (146 items), and "DANSEREAU, J" is the author with the highest scientific impact (h-index of 27). Conclusion: Our results suggest that currently established measures of research output (i.e. impact factor, h-index) should be evaluated critically because phenomena, such as self-citation and co-authorship, distort the results and limit the value of the conclusions that may be drawn from these measures. Qualitative statements are just tractable by the comparison of the parameters with respect to multiple linkages. In order to obtain more objective evaluation tools, new measurements need to be developed.

    Density-equalizing mapping and scientometric benchmarking of European allergy research

    Due to the great socioeconomic burden of allergic diseases, research in this field, which is important for environmental medicine, is currently increasing. The European Union has therefore initiated the Global Allergy and Asthma European Network (GA2LEN). However, despite increasing research in recent years, detailed scientometric analyses have not yet been conducted. This study is the first scientometric analysis in a field of growing interest. It analyses scientific contributions in European allergy research between 2001 and 2007. Three different meetings of the European Academy of Allergy and Clinical Immunology were analysed for contributions, and an increase in both the amount of research and the number of networks was found.

    Security Evaluation of Support Vector Machines in Adversarial Environments

    Support Vector Machines (SVMs) are among the most popular classification techniques adopted in security applications like malware detection, intrusion detection, and spam filtering. However, if SVMs are to be incorporated in real-world security systems, they must be able to cope with attack patterns that can either mislead the learning algorithm (poisoning), evade detection (evasion), or gain information about their internal parameters (privacy breaches). The main contributions of this chapter are twofold. First, we introduce a formal general framework for the empirical evaluation of the security of machine-learning systems. Second, according to our framework, we demonstrate the feasibility of evasion, poisoning and privacy attacks against SVMs in real-world security problems. For each attack technique, we evaluate its impact and discuss whether (and how) it can be countered through an adversary-aware design of SVMs. Our experiments are easily reproducible thanks to open-source code that we have made available, together with all the employed datasets, on a public repository. Comment: 47 pages, 9 figures; chapter accepted into book 'Support Vector Machine Applications

    Scientometric Analysis and Combined Density-Equalizing Mapping of Environmental Tobacco Smoke (ETS) Research

    Background: Passive exposure to environmental tobacco smoke (ETS) is estimated to exert a major burden of disease. Numerous countries have now taken legal action to protect the population against ETS, and numerous studies have been conducted in this field. Scientometric methods should therefore be used to analyze the accumulated data, since no such approach has been available so far. Methods and Results: A combination of scientometric methods and novel visualizing procedures was used, including density-equalizing mapping and radar charting techniques. 6,580 ETS-related studies published between 1900 and 2008 were identified in the ISI database. Using different scientometric approaches, a continuous increase of both quantitative and qualitative parameters was found. The combination with density-equalizing calculations demonstrated a leading position of the United States (2,959 items published) in terms of quantitative research activities. Charting techniques demonstrated that there are numerous bi- and multilateral networks between different countries and institutions in this field. Again, a leading position of American institutions was found. Conclusions: This is the first comprehensive scientometric analysis of data on global scientific activities in the field o

    Probabilistic Clustering of Time-Evolving Distance Data

    We present a novel probabilistic clustering model for objects that are represented via pairwise distances and observed at different time points. The proposed method utilizes the information given by adjacent time points to find the underlying cluster structure and obtain a smooth cluster evolution. This approach allows the number of objects and clusters to differ at every time point, and the objects do not need to be identified across time points. Further, the model does not require the number of clusters to be specified in advance; it is instead determined automatically using a Dirichlet process prior. We validate our model on synthetic data, showing that the proposed method is more accurate than state-of-the-art clustering methods. Finally, we use our dynamic clustering model to analyze and illustrate the evolution of brain cancer patients over time.