39 research outputs found

    Hide and Seek: Scaling Machine Learning for Combinatorial Optimization via the Probabilistic Method

    Applying deep learning to solve real-life instances of hard combinatorial problems has tremendous potential. Research in this direction has focused on the Boolean satisfiability (SAT) problem, both because of its theoretical centrality and its practical importance. A major roadblock, though, is that training sets are restricted to random formulas several orders of magnitude smaller than formulas of practical interest, raising serious concerns about generalization. This is because labeling random formulas of increasing size rapidly becomes intractable. By exploiting the probabilistic method in a fundamental way, we remove this roadblock entirely: we show how to generate correctly labeled random formulas of any desired size, without having to solve the underlying decision problem. Moreover, the difficulty of the classification task for the formulas produced by our generator is tunable by varying a simple scalar parameter. This opens up an entirely new level of sophistication for the machine learning methods that can be brought to bear on Satisfiability. Using our generator, we train existing state-of-the-art models for the task of predicting satisfiability on formulas with 10,000 variables. We find that they do no better than random guessing. As a first indication of what can be achieved with the new generator, we present a novel classifier that performs significantly better than random guessing 99% of the time on the same datasets, for most difficulty levels. Crucially, unlike past approaches that learn based on syntactic features of a formula, our classifier performs its learning on a short prefix of a solver's computation, an approach that we expect to be of independent interest.
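The core idea of labeling formulas without solving them can be illustrated with a minimal planted-k-SAT sketch. This is an illustration only, not the paper's generator: naive planting of a hidden solution introduces statistical biases that trivially leak the label, which is exactly the failure mode the paper's probabilistic-method construction is designed to avoid. All names and parameters below are hypothetical.

```python
import random

def planted_ksat(n_vars, n_clauses, k=3, seed=0):
    """Sample a random k-SAT formula guaranteed satisfiable by a hidden
    ("planted") assignment, so the SAT label is known without a solver.

    Clauses are drawn uniformly among those the planted assignment
    satisfies; unsatisfied candidates are simply rejected and resampled.
    """
    rng = random.Random(seed)
    planted = [rng.choice([True, False]) for _ in range(n_vars)]
    clauses = []
    while len(clauses) < n_clauses:
        vs = rng.sample(range(n_vars), k)
        clause = [(v, rng.choice([True, False])) for v in vs]  # (variable, polarity)
        # keep only clauses the planted assignment satisfies
        if any(planted[v] == pol for v, pol in clause):
            clauses.append(clause)
    return clauses, planted

formula, assignment = planted_ksat(n_vars=50, n_clauses=210)
# every clause is satisfied by `assignment` by construction
assert all(any(assignment[v] == pol for v, pol in c) for c in formula)
```

The rejection step biases literal-occurrence statistics toward the planted assignment, so a classifier can "cheat" on this distribution; avoiding such leakage while keeping labels free is the nontrivial part the abstract refers to.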

    Machine Learning Techniques and Cell Irradiation

    This is a study of how Machine Learning (ML) techniques can be used to predict the biological outcome of irradiating cells with ionizing radiation. The problem consists of predicting the Relative Biological Effectiveness (RBE) and the α and β coefficients of the linear-quadratic model. These quantities are continuous, so in essence this is a multi-variable regression problem. In our case, RBE quantifies the magnitude of the radiation's effect on cell survival, while the α and β coefficients represent the linear and quadratic contributions of dose to cell death, respectively. Three ML algorithms were used on two separate datasets: Gradient Boosting Decision Trees (GBDT), Random Forest regression (RF), and Support Vector Regression (SVR), with two different implementations of GBDT. As a further trial, a voting regressor (VR) was applied across all the previous models, predicting the dependent variables by a consensus method. The algorithms employed differ radically in architecture and approach; the aim was to combine different techniques and show that a combined model fares better in predictive performance and generalization. The results show that CatBoost, one of the GBDT implementations, performs comparatively better in prediction and generalization, and that the voting regressor also fares well in both respects. Our datasets consist of ion and charged-particle (mainly HZE) irradiation experiments and contain features describing the irradiated target, such as cell line, cell-cycle phase, and tumorous state, as well as properties of the radiation, such as Linear Energy Transfer (LET), specific energy, ion species, and irradiation modality. The datasets are quite small, compiled from experiments performed with different methods, and generally cannot be treated as a black box; they are shown to be somewhat noisy and to contain multiple collinearities. The work as a whole is meant to showcase the usefulness of ensemble and consensus techniques in predicting the aforementioned quantities. Improving the prediction of these quantities prior to radiotherapy treatment is feasible with a larger and more cohesive dataset. A more in-depth feature analysis is also needed to determine which features are most decisive for prediction, and further integration of photon-irradiation data is required to make our estimators more generally applicable across radiobiology.
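The consensus step of the voting regressor reduces to a (weighted) average of the base regressors' predictions. A minimal sketch of that combination step, with hypothetical RBE predictions standing in for the fitted GBDT/RF/SVR models:

```python
import numpy as np

def voting_predict(model_preds, weights=None):
    """Consensus ('voting') regression: combine each base regressor's
    predictions by a (weighted) mean across models."""
    P = np.asarray(model_preds, dtype=float)  # shape (n_models, n_samples)
    return np.average(P, axis=0, weights=weights)

# hypothetical RBE predictions from three already-fitted base models
gbdt = [1.8, 2.4, 3.1]
rf   = [1.7, 2.6, 2.9]
svr  = [2.0, 2.3, 3.3]
consensus = voting_predict([gbdt, rf, svr])  # element-wise mean
```

In practice the same idea is available off the shelf, e.g. scikit-learn's `VotingRegressor`, which fits the base estimators and averages their predictions.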

    Synergistic exploitation of geoinformation methods for post-earthquake 3D mapping of Vrisa traditional settlement, Lesvos Island, Greece

    The aim of this paper is to present the methodology followed and the results obtained by the synergistic exploitation of geo-information methods towards 3D mapping of the impact of the catastrophic earthquake of June 12th, 2017 on the traditional settlement of Vrisa on the island of Lesvos, Greece. A campaign took place for collecting: a) more than 150 ground control points using an RTK system, b) more than 20,000 high-resolution terrestrial and aerial images using cameras and Unmanned Aircraft Systems, and c) 140 point clouds from a 3D Terrestrial Laser Scanner. The Structure from Motion method has been applied to the high-resolution terrestrial and aerial photographs to produce accurate and very detailed 3D models of the damaged buildings of the Vrisa settlement. Additionally, two orthophoto maps and Digital Surface Models have been created, with spatial resolutions of 5cm and 3cm, respectively. The first orthophoto map was created just one day after the earthquake, and the second one a month later. In parallel, 3D laser scanning data have been exploited to validate the accuracy of the 3D models and of the RTK measurements used for the geo-registration of all the above-mentioned datasets. The significant advantages of the proposed methodology are: a) the coverage of large-scale areas; b) the production of 3D models with very high spatial resolution; and c) the support of post-earthquake management and reconstruction processes of the Vrisa village, since such 3D information can serve all stakeholders, whether national and/or local organizations.

    A numerical algorithm for the optimal placement of inset maps

    The purpose of this paper is to present an algorithmic formulation addressing the cartographic problem of siting an inset map at specific map locations under spatial and cartographic constraints. The first part of the paper aims at: (a) presenting a numerical algorithm that solves the above siting problem under such constraints, and (b) investigating the effectiveness of this numerical algorithm for a more general geographical problem, that of siting an anthropogenic structure or object of rectangular shape in suitable areas. The second part of this paper showcases the computational implementation of the above algorithm for addressing the cartographic problem of inset map placement in areas with land discontinuity.
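The rectangular-siting problem can be sketched as a brute-force feasibility search over a discretized map. This is a simplified illustration under assumed inputs, not the paper's algorithm, which handles richer spatial and cartographic constraints; the grid, function name, and scanning order are all hypothetical.

```python
import numpy as np

def place_inset(occupied, h, w):
    """Brute-force search for an h x w inset-map window over free map cells.

    `occupied` is a 2D boolean array marking cells the inset must not
    cover (land, labels, other map elements). Returns the (row, col) of
    the first feasible top-left corner in row-major scan order, or None
    if no position fits.
    """
    rows, cols = occupied.shape
    for r in range(rows - h + 1):
        for c in range(cols - w + 1):
            if not occupied[r:r + h, c:c + w].any():
                return (r, c)
    return None

# toy map: land occupies the left half, sea (free space) the right half
occupied = np.zeros((4, 6), dtype=bool)
occupied[:, :3] = True
spot = place_inset(occupied, h=2, w=3)  # -> (0, 3): first free 2x3 window
```

A real implementation would rank feasible positions by cartographic criteria (distance from coastline, visual balance) rather than return the first hit.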

    Inset Mapper: A software tool in Island cartography

    Island cartography deals with special cartographic problems confronted in the portrayal of island regions and demands the use of specially developed software tools. One of the most commonly faced problems is the need to create inset maps for very small, and sometimes isolated, islands that must be displayed in the main map. This paper presents the methodology followed for the development of the Inset Mapper (IM) software toolbox, describes the toolbox, and showcases its ability to create inset maps in island regions. The IM software tool assists in selecting the most appropriate position and scale of the inset map in an island region.

    Critical analysis of materialism currents: d'Espagnat and Comte-Sponville

    The present thesis aims to outline the contemporary debate on the subject of materialism and its use as a conceptual framework within which the current state of the natural sciences is interpretable. In particular, we attempt to intervene in the dialogue between d'Espagnat and Comte-Sponville on basic issues in the philosophy of science. The first chapter concerns the evolution of materialism from its very beginnings up until almost today. This is not a trivial matter; on the contrary, studying the history of ideas is fundamental to a dialectical method and a component of the answers we attempt to give. The dialogue of the two philosophers raises major issues related to the philosophy of science. d'Espagnat takes a critical stance on the issue of reality and the notion of matter. He relies on the denial of mechanistic ontology, but makes the mistake of adopting a spiritual pole, which undermines his realist point of view. Comte-Sponville, on the other hand, supports neomaterialism, which, as we will see, entails many problems. Far from demonizing either of the two, we try to isolate the reasonable points of both and to outline a dialectical materialist method for interpreting the corpus of knowledge of the modern natural sciences. In brief, the issues raised include reality in quantum mechanics, causality, locality, non-separability, and discontinuity.
    Δημήτρης Λ. Παπακωνσταντίνο

    Tackling Negative Representation: The Use of Storytelling As a Critical Pedagogical Tool for Positive Representation of Roma

    This article focuses on the negative representation of Roma in Greece in the early twenty-first century. It investigates how negative feedback takes the form of a self-fulfilling prophecy that suppresses the self-esteem of young Roma and maintains a distance between Romani identity and education, despite several positive yet little-known examples of Romani scientists and scholars. The article questions how negative Romani images can be reversed in order to enhance Roma's educational success. The importance of innovative educational activities based on Romani literature, critical multiculturalism, and the parameter of Romani bilingualism is highlighted. In particular, the article focuses on the power and the echo that stories (storytelling) can have, where protagonists have a Romani connection or identity and are portrayed as positive models, both within classrooms with Romani students and within a society where the idea of Romani literature is a fantasy.

    Network meta-analysis results against a fictional treatment of average performance: treatment effects and ranking metric.

    BACKGROUND Network meta-analysis (NMA) produces complex outputs as many comparisons between interventions are of interest. The estimated relative treatment effects are usually displayed in a forest plot or in a league table, and several ranking metrics are calculated and presented. METHODS In this paper, we estimate relative treatment effects of each competing treatment against a fictional treatment of average performance using the 'deviation from the means' coding that has been used to parametrize categorical covariates in regression models. We then use this alternative parametrization of the NMA model to present a ranking metric (PreTA: Preferable Than Average), interpreted as the probability that a treatment is better than a fictional treatment of average performance. RESULTS We illustrate the alternative parametrization of the NMA model using two networks of interventions: a network of 18 antidepressants for acute depression and a network of four interventions for heavy menstrual bleeding. We also use these two networks to highlight differences between PreTA and existing ranking metrics. We further examine the agreement between PreTA and existing ranking metrics in 232 networks of interventions and conclude that their agreement depends on the precision with which relative effects are estimated. CONCLUSIONS A forest plot with NMA relative treatment effects using 'deviation from means' coding could complement the presentation of NMA results in large networks and in the absence of an obvious reference treatment. PreTA is a viable alternative to existing probabilistic ranking metrics that naturally incorporates uncertainty.
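The 'deviation from the means' reparametrization and the PreTA metric can be sketched numerically. The posterior draws below are hypothetical stand-ins; in a real analysis they would come from fitting the full NMA model, and "larger effect = better" is an assumption of this toy example.

```python
import numpy as np

rng = np.random.default_rng(0)
K = 4
# hypothetical posterior draws of relative effects vs. a common reference
# treatment (e.g. log odds ratios), shape (n_draws, K)
samples = rng.normal(loc=[0.0, 0.3, -0.2, 0.5], scale=0.15, size=(4000, K))

# 'deviation from the means' coding: re-express each treatment's effect
# relative to the average of all K treatments instead of the reference
dev = samples - samples.mean(axis=1, keepdims=True)

# PreTA: probability that each treatment is preferable to the fictional
# average treatment (assuming larger effect = better)
preta = (dev > 0).mean(axis=0)
```

Because PreTA is computed from the full posterior of the deviations, it reflects estimation uncertainty directly: imprecisely estimated treatments get PreTA values pulled toward 0.5.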