429 research outputs found

    Machine learning on a budget

    Thesis (Ph.D.)--Boston University. In a typical discriminative learning setting, a set of labeled training examples is given, and the goal is to learn a decision rule that accurately classifies (or labels) unseen test examples. Much of machine learning research has focused on improving accuracy, but more recently the costs of learning and decision making have become more important. Such costs arise both during training and testing. Labeling data for training is often an expensive process. During testing, acquiring or processing measurements for every decision is also costly. This work deals with two problems: how to reduce the amount of labeled data during training, and how to minimize measurement cost in making decisions during testing, while maintaining system accuracy. The first part falls into an area known as active learning. It deals with the problem of selecting a small subset of examples to label, from a pool of unlabeled data, for training a good classifier. This problem is relevant in many applications where a large collection of unlabeled data is readily available but labeling an instance requires an expensive expert (for example, a radiologist annotating a medical image). We study active learning in the boosting framework. We develop a practical algorithm that labels examples to maximally reduce the space of feasible classifiers. We show that, under certain assumptions, our strategy achieves the generalization error performance of a system trained on the entire data set while only selecting logarithmically many samples to label. In the second part, we study sequential classifiers under budget constraints. In many systems, such as medical diagnosis and homeland security, sensors have varying acquisition costs, and these costs account for delay, throughput or monetary value. While some decisions require all measurements, it is often unnecessary to use every modality to classify every example. The problem is therefore to learn a system that, for every decision, sequentially selects sensors to meet a measurement budget while minimizing classification error. Initially, we study the case where the order in which sensor measurements are acquired is given. For every instance, our system has to decide whether to seek more measurements from the next sensor or to terminate by classifying based on the available information. We use Bayesian analysis of this problem to construct a novel multi-stage empirical risk objective and directly learn sequential decision functions from training data. We provide practical algorithms for binary and multi-class settings and derive generalization error guarantees. We compare our approach to alternative strategies on real-world data. In the last section, we explore a decision system in which the order of sensors is no longer fixed. We investigate how to combine ideas from reinforcement and imitation learning with empirical risk minimization to learn a dynamic sensor selection policy.

    The Mechanisms of Formation of the Seasonal Variations of 137Cs Content in the Organisms of Freshwater Fish

    The work is motivated by the fact that the period of the year in which the specific activity of 137Cs in fish reaches its maximum has not yet been determined. The mechanisms that form seasonal variations of 137Cs content in the organisms of freshwater fish were analysed by methods of mathematical modelling. It is shown that 137Cs content can reach its maximum at any time of the year. The period of maximum 137Cs content in fish of different species depends on the feeding intensity of the fish and on the 137Cs content of their food items. The work was carried out at the Department of Freshwater Radioecology of the IG NAS of Ukraine.

    Multi-Stage Classifier Design

    In many classification systems, sensing modalities have different acquisition costs. It is often unnecessary to use every modality to classify a majority of examples. We study a multi-stage system in a prediction-time cost reduction setting, where the full data is available for training, but for a test example, measurements in a new modality can be acquired at each stage for an additional cost. We seek decision rules that reduce the average measurement acquisition cost. We formulate an empirical risk minimization (ERM) problem for a multi-stage reject classifier, wherein the stage k classifier either classifies a sample using only the measurements acquired so far or rejects it to the next stage, where more attributes can be acquired for a cost. To solve the ERM problem, we show that the optimal reject classifier at each stage is a combination of two binary classifiers, one biased towards positive examples and the other biased towards negative examples. We use this parameterization to construct a stage-by-stage global surrogate risk, develop an iterative algorithm in the boosting framework, and present convergence and generalization results. We test our work on synthetic, medical and explosives detection datasets. Our results demonstrate that substantial cost reduction without a significant sacrifice in accuracy is achievable.
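    The reject-classifier idea in this abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the boosting-based training is omitted, and the threshold classifiers, the costs and all names below (`threshold_on_first`, `reject_classifier`, `classify`, `COSTS`) are hypothetical choices made for illustration. The only assumption taken from the abstract is that a stage classifies when a positive-biased and a negative-biased binary classifier agree, and otherwise rejects the sample to the next stage, where one more measurement is paid for.

```python
from typing import Callable, List, Sequence, Tuple

REJECT = 0  # sentinel meaning "defer to the next stage"

Classifier = Callable[[Sequence[float]], int]


def threshold_on_first(t: float) -> Classifier:
    """Hypothetical binary classifier: +1 if the first measurement exceeds t."""
    return lambda xs: 1 if xs[0] > t else -1


def reject_classifier(h_pos: Classifier, h_neg: Classifier) -> Classifier:
    """Output the common label when both classifiers agree, else reject."""
    def decide(xs: Sequence[float]) -> int:
        a, b = h_pos(xs), h_neg(xs)
        return a if a == b else REJECT
    return decide


def classify(xs: Sequence[float], stages: List[Classifier],
             final: Classifier, costs: Sequence[float]) -> Tuple[int, float]:
    """Run the stages in order; stage k only sees the first k+1 measurements.

    Returns the predicted label and the total acquisition cost paid.
    """
    spent = 0.0
    for k, decide in enumerate(stages):
        spent += costs[k]              # pay for the stage-k measurement
        label = decide(xs[: k + 1])
        if label != REJECT:
            return label, spent        # confident enough: stop early
    spent += costs[len(stages)]        # pay for the last measurement
    return final(xs), spent            # the final stage must always classify


# A low threshold says +1 more often (biased towards positives);
# a high threshold says -1 more often (biased towards negatives).
stage1 = reject_classifier(threshold_on_first(-0.5), threshold_on_first(0.5))
STAGES = [stage1]
COSTS = [1.0, 5.0]  # assumed costs: cheap first sensor, expensive second one


def final_stage(xs: Sequence[float]) -> int:
    """Hypothetical last-stage rule that uses every measurement."""
    return 1 if sum(xs) > 0 else -1


if __name__ == "__main__":
    for sample in ([0.9, -2.0], [0.1, -2.0], [-0.9, 2.0]):
        print(sample, classify(sample, STAGES, final_stage, COSTS))
```

    In this toy setup only the ambiguous sample (first measurement between the two thresholds) pays for the second sensor, which is the source of the average cost saving the abstract describes.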

    Salt deposits of the Ufimian Formation in the Solikamsk depression

    The paleogeographic and tectonic conditions under which the Ufimian deposits (Lower Permian) accumulated were reconstructed from a study of more than 2000 wells within the Solikamsk depression. The most complete cross-section of the salt-marl formation (9 large layers of rock salt and gypsum rock) was studied. On this basis, a modified stratification scheme for the salt-marl sequence is proposed. The sequence was subdivided both where the salt layers are completely preserved and where they are only partially preserved. In this scheme the sequence is divided into 3 large cyclothems, and the cyclothems into series of cyclites. Each cyclite records a complete cycle of evaporite sedimentation. Maps were constructed for each salt layer. They show the configuration of the salt lagoon and the migration of its depocenter within the Solikamsk depression during the Ufimian age. The cross-section shows a vertical change in composition: salt-bearing rocks are replaced by carbonate rocks, which corresponds to a general transgression in the region. In the upper layers of the salt-marl sequence, facies replacement of salt by gypsum rocks was identified. Analysis of the configuration of the reconstructed lagoon through geological time demonstrates its connection with regional tectonic events and with salt tectonics in the Kungurian deposits.

    Optimal choice of information security in automated systems via Markov cyber-attack models

    One of the main problems in providing information security for automated systems is the absence of unified approaches to the quantitative evaluation of their efficiency and reliability. In this article, we consider one approach to this problem, based on cyber-attack models described in terms of Markov chains with absorbing states. In particular, we describe one such model in detail; in contrast to similar models by other authors, it allows attacks to have different durations. The model also provides distinct absorbing states, each associated with the successful implementation of a particular cyber-attack. These features allow us to introduce two security metrics that can be used to evaluate the efficiency of the security remedies applied: the mean time to security failure and the mean risk of attack implementation. Using these metrics, we formulate several optimization problems that are of interest in the development and design of secured automated systems. We show that these problems belong to the class of non-linear integer programming problems, and we therefore suggest an efficient algorithm for solving them based on the concept of sequential analysis of variants. A program has been developed for studying Markov security models that take the duration of a computer attack into account, and an example is given of solving one of the optimization problems, whose solution is an optimal set of security remedies. This solution minimizes the cost of the security remedies subject to constraints on the mean time to security failure.
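    The two metrics named in this abstract can be illustrated with standard absorbing Markov chain calculations. The sketch below is not the authors' model: the states, transition probabilities, mean attack durations and losses are hypothetical numbers, and defining the mean risk as the absorption probabilities weighted by per-attack losses is an assumption. With `Q` the transitions among transient states, `R` the transitions into absorbing states and `TAU` the mean time spent per visit, the mean time to absorption solves (I - Q) t = TAU and the absorption probabilities solve (I - Q) B = R.

```python
from typing import List

Matrix = List[List[float]]


def solve(a: Matrix, b: List[float]) -> List[float]:
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented copy
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def identity_minus(q: Matrix) -> Matrix:
    """Return the matrix I - q."""
    n = len(q)
    return [[(1.0 if i == j else 0.0) - q[i][j] for j in range(n)]
            for i in range(n)]


def mean_time_to_absorption(q: Matrix, tau: List[float]) -> List[float]:
    """Expected time until absorption from each transient state."""
    return solve(identity_minus(q), tau)


def absorption_probabilities(q: Matrix, r: Matrix) -> Matrix:
    """b[i][j]: probability of ending in absorbing state j from state i."""
    a = identity_minus(q)
    cols = [solve(a, [row[j] for row in r]) for j in range(len(r[0]))]
    return [[cols[j][i] for j in range(len(cols))] for i in range(len(q))]


def mean_risk(probs: List[float], losses: List[float]) -> float:
    """Assumed risk definition: expected loss over the absorbing states."""
    return sum(p * c for p, c in zip(probs, losses))


# Hypothetical model. Transient states: 0 = idle, 1 = attack A under way,
# 2 = attack B under way. Absorbing states: A succeeded, B succeeded.
Q = [[0.0, 0.6, 0.4],
     [0.9, 0.0, 0.0],
     [0.8, 0.0, 0.0]]
R = [[0.0, 0.0],
     [0.1, 0.0],
     [0.0, 0.2]]
TAU = [10.0, 2.0, 5.0]   # assumed mean hours per visit; attacks differ in duration
LOSS = [100.0, 50.0]     # assumed loss if attack A or attack B succeeds

if __name__ == "__main__":
    t = mean_time_to_absorption(Q, TAU)
    b = absorption_probabilities(Q, R)
    print("mean time to security failure from idle:", t[0])
    print("mean risk from idle:", mean_risk(b[0], LOSS))
```

    Under these assumptions, a candidate set of security remedies would change `Q`, `R` or `TAU`, and the optimization problems described in the abstract would compare the resulting metrics against the cost of the remedies.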

    Seismic profile solution under conditions of thawing permafrost with technogenic depression

    This article presents the results of geophysical research into certain types of geodynamic phenomena that generally occur in areas with permafrost soil layers. It gives a detailed description of the methods used to study the refracting interface of a technogenically thawed ground area under conditions of flooded pressure, using the refraction and reflection methods. The problem of detecting the nature of changes in the properties and configuration of the permafrost and thawed soil layers in the embankment foundation is solved in practice by the integrated application of seismic methods that, in theory, differ in the kinematics of elastic wave propagation and in the methods used to interpret the seismic profile.

    Molecular alterations as target for therapy in metastatic osteosarcoma: a review of literature

    Treating metastatic osteosarcoma (OS) remains a challenge in oncology. Current treatment strategies target the primary tumour rather than metastases and have limited efficacy in the treatment of metastatic disease. Metastatic cells have specific features that render them less sensitive to therapy, and targeting these features might enhance the efficacy of current treatment. A detailed study of the biological characteristics and behaviour of metastatic OS cells may provide a rational basis for innovative treatment strategies. The aim of this review is to give an overview of the biological changes in metastatic OS cells, of the preclinical and clinical efforts targeting the different steps in OS metastasis, and of how these contribute to designing a metastasis-directed treatment for OS.