974 research outputs found

    Adversarial content manipulation for analyzing and improving model robustness

    The recent rapid progress in machine learning systems has opened up many real-world applications, from recommendation engines on web platforms to safety-critical systems like autonomous vehicles. A model deployed in the real world will often encounter inputs far from its training distribution. For example, a self-driving car might come across a black stop sign in the wild. To ensure safe operation, it is vital to quantify the robustness of machine learning models to such out-of-distribution data before releasing them into the real world. However, the standard paradigm of benchmarking machine learning models with fixed-size test sets drawn from the same distribution as the training data is insufficient to identify these corner cases efficiently. In principle, if we could generate all valid variations of an input and measure the model response, we could quantify and guarantee model robustness locally. Yet, doing this with real-world data is not scalable. In this thesis, we propose an alternative, using generative models to create synthetic data variations at scale and test robustness of target models to these variations. We explore methods to generate semantic data variations in a controlled fashion across visual and text modalities. We build generative models capable of performing controlled manipulation of data, like changing visual context, editing the appearance of an object in images or changing the writing style of text. Leveraging these generative models, we propose tools to study robustness of computer vision systems to input variations and systematically identify failure modes. In the text domain, we deploy these generative models to improve diversity of image captioning systems and perform writing style manipulation to obfuscate private attributes of the user. Our studies quantifying model robustness explore two kinds of input manipulations, model-agnostic and model-targeted. 
The model-agnostic manipulations leverage human knowledge to choose the kinds of changes without considering the target model being tested. This includes automatically editing images to remove objects not directly relevant to the task and create variations in visual context. Alternatively, in the model-targeted approach the input variations performed are directly adversarially guided by the target model. For example, we adversarially manipulate the appearance of an object in the image to fool an object detector, guided by the gradients of the detector. Using these methods, we measure and improve the robustness of various computer vision systems (specifically image classification, segmentation, object detection and visual question answering systems) to semantic input variations.
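
The model-targeted idea in the abstract above (changing an input in the direction indicated by the target model's gradients) can be illustrated with a minimal sketch. This is not the thesis's method, which uses generative models to edit object appearance. It swaps in a single fast-gradient-sign style step on a toy logistic "detector" over a three-number feature vector. All names (`score`, `gradient_wrt_input`, `adversarial_step`), the weights and the step size `eps` are hypothetical and chosen only for illustration.

```python
import math

def score(w, b, x):
    """Toy 'detector' (assumption): logistic score that the object is present."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1.0 / (1.0 + math.exp(-z))

def gradient_wrt_input(w, b, x):
    """Gradient of the logistic score with respect to each input feature."""
    s = score(w, b, x)
    return [s * (1.0 - s) * wi for wi in w]

def adversarial_step(w, b, x, eps):
    """Shift every feature by eps against the gradient sign, lowering the score."""
    g = gradient_wrt_input(w, b, x)
    signs = [1 if gi > 0 else -1 if gi < 0 else 0 for gi in g]
    return [xi - eps * si for xi, si in zip(x, signs)]

# Hypothetical detector weights and an input the detector scores highly.
w, b = [2.0, -1.0, 0.5], 0.1
x = [0.8, 0.2, 0.5]
x_adv = adversarial_step(w, b, x, eps=0.3)

print("score before:", score(w, b, x))
print("score after: ", score(w, b, x_adv))
```

The perturbation is bounded by `eps` per feature, so the manipulated input stays close to the original while the detector's confidence drops. In a semantic setting, the same gradient signal would presumably steer the parameters of a generative model rather than raw features.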

    Change blindness: eradication of gestalt strategies

    Arrays of eight texture-defined rectangles were used as stimuli in a one-shot change blindness (CB) task where there was a 50% chance that one rectangle would change orientation between two successive presentations separated by an interval. CB was eliminated by cueing the target rectangle in the first stimulus, reduced by cueing in the interval and unaffected by cueing in the second presentation. This supports the idea that a representation was formed that persisted through the interval before being 'overwritten' by the second presentation (Landman et al, 2003 Vision Research 43 149–164). Another possibility is that participants used some kind of grouping or Gestalt strategy. To test this we changed the spatial position of the rectangles in the second presentation by shifting them along imaginary spokes (by ±1 degree) emanating from the central fixation point. There was no significant difference in performance between this and the standard task [F(1,4)=2.565, p=0.185]. This may suggest that Gestalt grouping is not used as a strategy in these tasks, and it lends further weight to the argument that objects may be stored in and retrieved from a pre-attentional store during this task.
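
As a quick consistency check on the reported statistic, the p-value for F(1,4)=2.565 can be recomputed from first principles. The sketch below relies on two standard facts: an F statistic with (1, v) degrees of freedom is the square of a t statistic with v degrees of freedom, and the t distribution with 4 degrees of freedom has a closed-form CDF. The function names are hypothetical, and this is only an illustration, not the authors' analysis code.

```python
import math

def t_cdf_4df(t):
    """Closed-form CDF of Student's t distribution with 4 degrees of freedom."""
    u = 1.0 + t * t / 4.0
    return 0.5 + 0.375 * (t / math.sqrt(u)) * (1.0 - (t * t / u) / 12.0)

def p_value_f_1_4(f_stat):
    """Upper-tail p-value of an F(1, 4) statistic via the identity F(1, v) = t(v)**2."""
    t = math.sqrt(f_stat)
    return 2.0 * (1.0 - t_cdf_4df(t))

p = p_value_f_1_4(2.565)
print(f"p = {p:.4f}")
```

The result lands close to the reported p=0.185, well above the conventional 0.05 threshold, which matches the abstract's statement that the difference was not significant.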

    Investigation of protein-protein interactions: multibody docking, association/dissociation kinetics and macromolecular crowding

    Protein-protein interactions are central to understanding how cells carry out their wide array of functions and metabolic procedures. Conventional studies on specific protein interactions focus either on details of one-to-one binding interfaces, or on large networks that require a priori knowledge of binding strengths. Moreover, specific protein interactions occurring within a crowded macromolecular environment, which is precisely the case for interactions in a real cell, are often under-investigated. A macromolecular simulation package, called BioSimz, has been developed to perform Langevin dynamics simulations on multiple protein-protein interactions at atomic resolution, aimed at bridging the gaps between structural, kinetic and crowding studies on protein-protein interactions. Simulations on twenty-seven experimentally determined protein-protein interactions indicated that the use of contact frequency information of proteins forming specific encounters can guide docking algorithms towards the most likely binding regions. Further evidence from eleven benchmarked protein interactions showed that the association rate constant of a complex, k_on, can be estimated, with good agreement to experimental values, based on the retention time of its specific encounter. Performing these simulations with ten types of environmental protein crowders suggests, from the change in k_on, that macromolecular crowding improves the association kinetics of slower-binding proteins, while it damps the association kinetics of fast, electrostatics-driven protein-protein interactions. It is hypothesised, based on evidence from docking, kinetics and crowding, that the dynamics of specific protein-protein encounters is vitally important in determining their association affinity. 
There are multiple factors by which encounter dynamics, and subsequently k_on, can be influenced, such as anchor residues, long-range forces, and environmental steering via crowders’ electrostatics and/or volume exclusion. The capacity to emulate these conditions on a common platform not only provides a holistic view of interacting dynamics, but also offers the possibility of evaluating and engineering protein-protein interactions from aspects that have not been explored before.

    The large-scale environments of radio-loud active galactic nuclei and their evolution across cosmic time

    Emerging from the cosmic web, galaxy clusters are the most massive gravitationally bound structures in the universe. Thought to have begun their assembly at 2 [...] 1.5 where major assembly is in progress. The search for galaxy clusters at high redshift has so far been only mildly successful, and only a handful of clusters at z > 1.5 have been confirmed. Because this redshift range was essentially unreachable with previous instrumentation, it was dubbed a ‘redshift desert’. The work presented in this thesis has made a major contribution to this field. The Clusters Around Radio-Loud AGN (CARLA) survey, a 400 hr targeted Warm Spitzer program, observed 420 radio-loud AGN (active galactic nuclei) at 1.3 [...] 1.5. We also showed that radio-loud AGN reside in denser environments than similarly massive galaxies. This makes high-redshift clusters around radio-loud AGN particularly interesting, as they can reveal how galaxies in the most massive dark matter halos assembled. A complementary project, HERGE (Herschel Radio Galaxy Evolution Project), observed a sample of 71 radio galaxies at 1 < z < 5 at far-IR wavelengths with the Herschel Space Observatory. Supporting data in the mid-IR, partially in the near-IR and at sub-mm wavelengths allow cluster fields to be studied in more detail. A pilot project on a single field showed that we can identify cluster members and constrain their star-formation properties. These projects laid the foundation for future work, which will make a significant impact on understanding the formation of the most massive structures over several billion years.

    Reducing Inequality And Poverty During Liberalisation In China: Rural And Agricultural Experiences And Policy Options

    Liberalisation is designed to help growth and alleviate poverty by removing impediments that stop people and regions from specialising and trading. The process known as core liberalisation (CL) has three components: it frees markets in goods and services, land, capital, and labour; phases out non-market influences on prices; and clarifies property rights. In the case of China, CL accompanied rapid, robust economic growth and a reduction in poverty. However, from the mid-1980s, inequality – among regions, between city and village, and within rural communities – soared, leaving stubborn poverty increasingly concentrated in ‘rural poverty islands’ (RPIs). By 2001, almost 40 per cent of China’s poor – but only about a fifth of the population – lived in these RPIs. This paper analyses evidence of liberalisation in China and the factors limiting the gains from CL for poor people and regions, and provides policy recommendations. Keywords: Inequality, Poverty, Trade Liberalisation, Rural, Agriculture, China.

    INTEGRATED MODELING AND MONITORING FOR A HEALTHY AND SUSTAINABLE BUILDING ENVIRONMENT

    The transmission of airborne diseases indoors is a significant challenge to public health. Buildings are hotspots for viral transmission, which can result in adverse effects on human health and quality of life, especially considering that individuals spend approximately 87% of their time indoors. The emergence of the COVID-19 pandemic has highlighted the importance of considering health aspects during the development of sustainable built environments. Consequently, maintaining a healthy, sustainable, and comfortable built environment represents a major challenge for facilities management teams. However, research on the infection risks associated with emerging pandemics is still in its infancy, and the effectiveness of intervention strategies remains uncertain. Furthermore, the complex interplay between health, energy consumption, and human comfort remains poorly understood, impeding the development of comprehensive control strategies that encompass all three critical dimensions of building sustainability. In addition, existing technologies are limited in their ability to conduct real-time monitoring, while current communication methods between occupants and facilities management teams are not sufficiently effective, user-friendly, or informative. These deficiencies hinder their ability to address the pressing needs of occupants during pandemics. To address these challenges, this dissertation proposes a convergent framework that integrates modeling, simulation, and monitoring methodologies for the development and maintenance of a sustainable built environment. Airborne transmission risks were first modeled and estimated under different epidemic scenarios, allowing for the evaluation of various intervention strategies. 
Facility data was then used to develop methods for modeling and simulating the dimensions of energy consumption and thermal comfort, allowing for the identification of tradeoff relationships among health, energy, and comfort, and quantitatively analyzing the impact of indoor environments through HVAC control strategies on the three major dimensions. Finally, an integrated platform was developed to enable the real-time assessment of health, energy, and comfort, including monitoring, visualization, and conversational communication functionalities. The developed framework thus encompasses modeling, simulation, monitoring, and communication capabilities and can be widely adopted by facility management teams, providing insights and guidance to governments and policymakers based on their specific needs. The applicability of the framework extends beyond specific pandemics and can be used to address a broader range of infectious diseases

    Satellites as probes of dark matter and gravitational theories

    The Milky Way hosts about 150 globular clusters and at least 17 dwarf spheroidal galaxies. These satellites experience a constantly changing gravitational field on their orbits. Close encounters with the Galactic bulge and passages through the Galactic disk enhance the effect of the constantly changing tidal field. As a consequence, satellite member stars can leave their host's gravitational potential. For globular clusters, internal mechanisms such as two-body relaxation also result in a loss of stars. Hence, the globular clusters are constantly losing stars and are being dissolved. In this thesis I investigate 17 globular clusters for signs of dissolution. Specifically, we study the two-dimensional distribution of (potential) cluster member stars on the sky using photometric data from the Sloan Digital Sky Survey. We use a color-magnitude weighted counting algorithm to count the stars around the globular clusters. We detect the known tidal tails of Pal 5 and NGC 5466. Further, we confirm some previous findings of possible tidal features for NGC 5053 and NGC 6341. For NGC 4147, we observe for the first time complex two-dimensional features, resembling a multiple-arm morphology. For almost all clusters in our sample we observe a halo of extra-tidal stars. We observe no new large-scale tidal features for our sample of clusters containing stars brighter than ~22.5 mag. The lack of large-scale tidal tails is compatible with theoretical predictions of the destruction timescales for the clusters in our sample. We also observe the two-dimensional distribution of stars around three dwarf spheroidal galaxies: Sextans, Leo II, and Ursa Minor. Each galaxy reveals a unique structure. The main, luminous body of Sextans does not fill the tidal radius. We observe an off-center peak of highest stellar density. For Leo II, we observe an almost symmetric structure, compatible with the theory that Leo II has never come close to the Milky Way. 
We detect the complex structure of Ursa Minor, with two off-center peaks. We observe no large-scale structure emanating from this dwarf galaxy. We further investigate the possibility of a line-of-sight depth of Sextans and Ursa Minor by studying the thickness of the blue horizontal branch. For Sextans, we observe an increasing thickness with increasing radius, comparable with the photometric error. Only detailed modeling will be able to show the significance of this varying thickness. For Ursa Minor, the increase in horizontal branch thickness is negligible compared to the photometric error. Hence, Ursa Minor shows no sign of a significant line-of-sight depth. The distribution of red and blue horizontal branch stars was investigated for Sextans. The "red" population is much more concentrated. The density peaks of the two populations do not coincide. Further, we investigated one globular cluster in particular, Pal 14. This cluster is sparse and at a remote location in the Galaxy. We aim to answer the question of whether Pal 14 is governed by classical or modified Newtonian dynamics. We measured the radial velocity of 17 red giant branch stars and (probable) AGB stars with UVES@VLT and the Keck I telescope. The resulting line-of-sight velocity dispersion is comparable to the theoretical predictions for the case of classical dynamics. The predicted value for modified dynamics is about twice as large as the observed value. With HST images we derived the cluster's mass function and computed its total mass. The main sequence mass function slope is flatter than the canonical value; the cluster seems to be depleted in lower mass stars. For a given cluster mass, N-body simulations predict the line-of-sight velocity dispersion in modified dynamics. The measured mass for Pal 14 requires a much larger velocity dispersion in modified Newtonian dynamics than we have measured. 
This leads to the conclusion that if Pal 14 is on a circular orbit, modified dynamics cannot explain the low velocity dispersion and the measured mass simultaneously.

    Resilient Infrastructure and Building Security
