On information captured by neural networks: connections with memorization and generalization
Despite the popularity and success of deep learning, there is limited
understanding of when, how, and why neural networks generalize to unseen
examples. Since learning can be seen as extracting information from data, we
formally study information captured by neural networks during training.
Specifically, we start with viewing learning in presence of noisy labels from
an information-theoretic perspective and derive a learning algorithm that
limits label noise information in weights. We then define a notion of unique
information that an individual sample provides to the training of a deep
network, shedding some light on the behavior of neural networks on examples
that are atypical, ambiguous, or belong to underrepresented subpopulations. We
relate example informativeness to generalization by deriving nonvacuous
generalization gap bounds. Finally, by studying knowledge distillation, we
highlight the important role of data and label complexity in generalization.
Overall, our findings contribute to a deeper understanding of the mechanisms
underlying neural network generalization. Comment: PhD thesis.
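The per-example "unique information" idea above can be illustrated with a crude leave-one-out proxy: refit a simple model with and without each training example and measure how far the solution moves. This is only a hedged sketch on a toy ridge-regression problem, not the thesis's formal information-theoretic definition; all data and parameters below are illustrative.

```python
import numpy as np

def fit_ridge(X, y, lam=0.1):
    # Closed-form ridge regression solution.
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

def loo_influence(X, y, lam=0.1):
    # Distance the weight vector moves when each example is held out:
    # a rough proxy for that example's unique contribution to training.
    w_full = fit_ridge(X, y, lam)
    scores = []
    for i in range(len(y)):
        mask = np.arange(len(y)) != i
        scores.append(np.linalg.norm(w_full - fit_ridge(X[mask], y[mask], lam)))
    return np.array(scores)

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=50)
y[0] += 5.0                      # corrupt one label: an "atypical" example
scores = loo_influence(X, y)
print(scores.argmax())           # the corrupted example stands out
```

Typical, redundant examples barely move the solution, while the mislabeled example dominates, mirroring the intuition that atypical or mislabeled samples carry the most unique information.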
Applications of Digital Terrain Modeling to Address Problems in Geomorphology and Engineering Geology
This dissertation uses digital terrain modeling and computational methods to yield insight into three topics: 1) evaluating the influence of glacial topography on fluvial sediment transport in the Teton Range, WY, 2) integrating regional airborne lidar, UAV lidar, and structure from motion photogrammetry to characterize decadal-scale movement of slow-moving landslides in northern Kentucky, and 3) applying machine learning methods to surficial geologic mapping.
The role of topography as a boundary condition that controls the efficiency of fluvial erosion in the Teton Range, Wyoming, was investigated by using existing lidar data to delineate surficial geologic units, geometrically reconstruct the depth to bedrock, and estimate the sediment volume and sediment production rate in two catchments. This data was coupled with seismic reflection data in the bay into which these catchments drain. We found that while the sediment production rate of 0.17 ± 0.02 mm/yr is similar to the uplift rate of the Teton Range, only about 2.6% of the post-glacial sediment has been transported out of the catchments, and the denudation rate is just 0.004 ± 0.001 mm/yr. We conclude that once the topography has been altered by glaciers, which flatten the valley bottom and steepen the valley walls, rivers are incapable of evacuating the sediment effectively. Sediment will be trapped in the valleys until the next glacial advance, or until uplift steepens the system such that rivers can once again become efficient.
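The quoted budget numbers are internally consistent: scaling the sediment production rate by the exported fraction reproduces the stated denudation rate. A back-of-envelope check (the actual values come from the volume reconstructions described above):

```python
# Consistency check of the sediment budget quoted above.
production_rate = 0.17      # mm/yr, catchment-averaged sediment production
fraction_exported = 0.026   # ~2.6% of post-glacial sediment left the catchments

# If only 2.6% of produced sediment is exported, the effective denudation
# (export) rate is the production rate scaled by that fraction.
denudation_rate = production_rate * fraction_exported
print(round(denudation_rate, 3))   # 0.004 mm/yr, matching the quoted value
```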
Repeat digital terrain surveys can be used to quantify changes to the Earth’s surface. Challenges include determining the threshold of change that can be detected when combining topographic data acquired by different platforms and of varying quality. To quantify the threshold of detectable elevation change in a slow-moving colluvial landslide in northern Kentucky over 14 years using county-wide lidar, uncrewed aerial vehicle (UAV) structure-from-motion (SfM) surveys, and a UAV lidar survey, we used the statistics of noise from elevation difference maps in areas outside of the landslide. We found that the threshold of detectable elevation change ranges from 0.05 to 0.20 m, depending on the survey combination, and that detectable change in the landslide was found between all surveys, including those separated by only 2 weeks.
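A standard way to set such a detection threshold is to propagate each survey's vertical uncertainty in quadrature and apply a confidence multiplier. This is a hedged stand-in for the study's empirical noise-statistics approach; the sigma values below are illustrative, not the study's measured ones.

```python
import math

def min_detectable_change(sigma_a, sigma_b, k=1.96):
    # Minimum detectable elevation change (metres) between two surveys with
    # vertical uncertainties sigma_a and sigma_b, at ~95% confidence (k=1.96).
    return k * math.sqrt(sigma_a**2 + sigma_b**2)

# e.g. county lidar (~0.06 m) differenced against a UAV SfM survey (~0.04 m)
print(round(min_detectable_change(0.06, 0.04), 3))  # 0.141 m
```

The result falls inside the 0.05–0.20 m range reported above; better-matched, lower-noise survey pairs push the threshold toward the low end.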
For most users, geologic maps may convey a level of certainty that obscures the decisions and interpretations made by the mapper. The combination of machine learning and digital terrain data provides a new method for producing geologic maps which can also convey and preserve the underlying uncertainty. We test the performance of machine learning methods to accurately map the surficial geology of two quadrangles in Kentucky using 31 variables derived from lidar data, including surface roughness, slope, topographic position, and residual topography. The performance of eight machine learning methods was compared, and the importance of each variable was measured. The classifier with the highest accuracy using just the most important variables was used to produce surficial geologic maps in six areas, with resulting accuracies ranging from 0.795 to 0.931. The uncertainty resulting from the machine learning process is conveyed using gradations of color, which can be modified depending on the needs of the map user.
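The final step above, turning classifier output into a colour gradation, amounts to mapping per-pixel class probabilities to a scalar uncertainty. A hedged sketch using normalised predictive entropy (the probabilities below are made up; the study's classifier and colour scheme may differ):

```python
import numpy as np

def predictive_uncertainty(probs):
    """Normalised entropy in [0, 1]: 0 = fully confident, 1 = uniform."""
    probs = np.clip(probs, 1e-12, 1.0)
    entropy = -np.sum(probs * np.log(probs), axis=-1)
    return entropy / np.log(probs.shape[-1])

confident = np.array([0.97, 0.01, 0.01, 0.01])   # e.g. one clear map unit
ambiguous = np.array([0.4, 0.3, 0.2, 0.1])       # mixed signal between units
print(predictive_uncertainty(confident) < predictive_uncertainty(ambiguous))  # True
```

Pixels with high normalised entropy would then receive a washed-out colour, preserving the mapper's (here, the model's) uncertainty on the published map.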
Asylum and immigration policy, policy communities and the British news media: a case study in policy-making
This research investigation examines the policy communities and networks (PC&N) perspective as a tool for understanding the influence of the news media in shaping the policy agenda. It does so by examining the evolution of two case studies in a new policy arena, asylum and immigration, from policy initiative to policy reversal. In order to understand how the dynamics of discourse shape the development of the policy agenda, it is fundamental to first understand the nature of information flow in social settings. Policy communities and networks provide the appropriate social setting in which to explore the role of the news media, as they facilitate the flow in which information is constructed, distributed, and absorbed within them. Existing literature on the influence of the news media on the development of opinion making is extensive; however, literature on the influence of the news media on the development of policy making is emergent. By applying the PC&N perspective to understanding the role of the news media in issue definition, decision making, and policy change, this research investigation contributes to the literature on both, as well as to the emergent literature on the influence of the news media on immigration and asylum policy itself. In addition, through its empirical examination of the evolution of case-study asylum and immigration policy reversals, this research investigation utilises a new methodology, content analysis, to identify the existence, nature, and membership of policy communities and networks and of insider groups active within them. In providing strong evidence that the policy communities and networks perspective is a valid approach for understanding the nature of policymaking and the role of the news media in shaping policy agendas, it also provides an alternative approach to examining policy making in an emergent field of policy science research, asylum and immigration policy network analysis.
Computing Interpretable Representations of Cell Morphodynamics
Shape changes (morphodynamics) are one of the principal ways cells interact with their environments and perform key intrinsic behaviours like division. These dynamics arise from a myriad of complex signalling pathways that often organise with emergent simplicity to carry out critical functions including predation, collaboration and migration. A powerful method for analysis can therefore be to quantify this emergent structure, bypassing the low-level complexity. Enormous image datasets are now available to mine. However, it can be difficult to uncover interpretable representations of the global organisation of these heterogeneous dynamic processes. Here, such representations were developed for interpreting morphodynamics in two key areas: mode of action (MoA) comparison for drug discovery (developed using the economically devastating Asian soybean rust crop pathogen) and 3D migration of immune system T cells through extracellular matrices (ECMs). For MoA comparison, population development over a 2D space of shapes (morphospace) was described using two models with condition-dependent parameters: a top-down model of diffusive development over Waddington-type landscapes, and a bottom-up model of tip growth. A variety of landscapes were discovered, describing phenotype transitions during growth, and possible perturbations in the tip growth machinery that cause this variation were identified. For interpreting T cell migration, a new 3D shape descriptor that incorporates key polarisation information was developed, revealing low-dimensionality of shape, and the distinct morphodynamics of run-and-stop modes that emerge at minute timescales were mapped. Periodically oscillating morphodynamics that include retrograde deformation flows were found to underlie active translocation (run mode). Overall, it was found that highly interpretable representations could be uncovered while still leveraging the enormous discovery power of deep learning algorithms. 
The results show that whole-cell morphodynamics can be a convenient and powerful place to search for structure, with potentially life-saving applications in medicine and biocide discovery as well as immunotherapeutics.
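The low-dimensionality finding above (a few shape modes explaining most variation) is typically checked with PCA on the shape descriptors. A hedged sketch on synthetic "shape vectors" generated from two latent modes; the thesis's actual descriptors and deep-learning pipeline are more involved.

```python
import numpy as np

rng = np.random.default_rng(1)
n_cells, n_features, n_modes = 200, 40, 2
latent = rng.normal(size=(n_cells, n_modes))          # two true shape modes
mixing = rng.normal(size=(n_modes, n_features))
shapes = latent @ mixing + 0.05 * rng.normal(size=(n_cells, n_features))

# PCA via SVD of the centred descriptor matrix.
centred = shapes - shapes.mean(axis=0)
_, s, _ = np.linalg.svd(centred, full_matrices=False)
explained = s**2 / np.sum(s**2)
print(round(float(explained[:2].sum()), 2))  # first two modes dominate
```

When the first few components carry nearly all the variance, the population's morphodynamics can be interpreted in a low-dimensional morphospace, as described above.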
Efficient instance and hypothesis space revision in Meta-Interpretive Learning
Inductive Logic Programming (ILP) is a form of Machine Learning. The goal of ILP is to induce hypotheses, as logic programs, that generalise training examples. ILP is characterised by high expressivity, generalisation ability, and interpretability. Meta-Interpretive Learning (MIL) is a state-of-the-art sub-field of ILP. However, current MIL approaches have limited efficiency: the sample and learning complexity are, respectively, polynomial and exponential in the number of clauses. My thesis is that improvements in the sample and learning complexity can be achieved in MIL through instance and hypothesis space revision. Specifically, we investigate 1) methods that revise the instance space, 2) methods that revise the hypothesis space, and 3) methods that revise both the instance and the hypothesis spaces for achieving more efficient MIL.
First, we introduce a method for building training sets with active learning in Bayesian MIL. Instances are selected by maximising the entropy. We demonstrate that this method can reduce the sample complexity and supports efficient learning of agent strategies. Second, we introduce a new method for revising the MIL hypothesis space with predicate invention. Our method generates predicates bottom-up from the background knowledge related to the training examples. We demonstrate that this method is complete and can reduce the learning and sample complexity. Finally, we introduce a new MIL system called MIGO for learning optimal two-player game strategies. MIGO learns from playing: its training sets are built from the sequence of actions it chooses. Moreover, MIGO revises its hypothesis space with Dependent Learning: it first solves simpler tasks and can reuse any learned solution for solving more complex tasks. We demonstrate that MIGO significantly outperforms both classical and deep reinforcement learning. The methods presented in this thesis open exciting perspectives for efficiently learning theories with MIL in a wide range of applications including robotics, modelling of agent strategies, and game playing.
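The entropy-maximising selection step described above can be illustrated in miniature: under a uniform posterior over candidate hypotheses, query the instance whose predicted label is most uncertain. This is a hedged sketch with toy threshold hypotheses, not the thesis's Bayesian MIL machinery.

```python
import math

# Hypothesis class: threshold rules h_t(x) = [x >= t] for t in 0..10,
# standing in for candidate logic programs; posterior is uniform over them.
instances = list(range(10))
hypotheses = [lambda x, t=t: x >= t for t in range(11)]

def label_entropy(x, hs):
    # Entropy of the predicted label for instance x under the posterior.
    p = sum(h(x) for h in hs) / len(hs)        # P(label = 1)
    if p in (0.0, 1.0):
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))

best = max(instances, key=lambda x: label_entropy(x, hypotheses))
print(best)  # a mid-range instance, which splits the hypothesis space most evenly
```

Labelling the selected instance eliminates roughly half the hypotheses, which is why entropy-based selection can reduce sample complexity.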
Signals and Images in Sea Technologies
Life below water is the 14th Sustainable Development Goal (SDG) envisaged by the United Nations and is aimed at conserving and sustainably using the oceans, seas, and marine resources for sustainable development. It is not difficult to argue that signal and image technologies may play an essential role in achieving the foreseen targets linked to SDG 14. Besides increasing the general knowledge of ocean health by means of data analysis, methodologies based on signal and image processing can be helpful in environmental monitoring, in protecting and restoring ecosystems, in finding new sensor technologies for green routing and eco-friendly ships, in providing tools for implementing best practices for sustainable fishing, as well as in defining frameworks and intelligent systems for enforcing sea law and making the sea a safer and more secure place. Imaging is also a key element in the exploration of the underwater world for various purposes, ranging from the predictive maintenance of sub-sea pipelines and other infrastructure projects to the discovery, documentation, and protection of sunken cultural heritage. The scope of this Special Issue encompasses investigations into techniques and ICT approaches and, in particular, the study and application of signal- and image-based methods and, in turn, exploration of the advantages of their application in the previously mentioned areas.
Recent Advances in Single-Particle Tracking: Experiment and Analysis
This Special Issue of Entropy, titled “Recent Advances in Single-Particle Tracking: Experiment and Analysis”, contains a collection of 13 papers concerning different aspects of single-particle tracking, a popular experimental technique that has deeply penetrated molecular biology and statistical and chemical physics. Presenting original research, yet written in an accessible style, this collection will be useful both for newcomers to the field and for more experienced researchers looking for a reference. Several papers are written by authorities in the field, and the topics cover aspects of experimental setups, analytical methods of tracking data analysis, a machine learning approach to data and, finally, some more general issues related to diffusion.
Distilling the neural correlates of conscious somatosensory perception
The ability to consciously perceive the world profoundly defines our lives as human beings. Somehow, our brains process information in a way that allows us to become aware of the images, sounds, touches, smells, and tastes surrounding us. Yet our understanding of the neurobiological processes that generate perceptual awareness is very limited. One of the most contested questions in the neuroscientific study of conscious perception is whether awareness arises from the activity of early sensory brain regions, or instead requires later processing in widespread supramodal networks. It has been suggested that the conflicting evidence supporting these two perspectives may be the result of methodological confounds in classical experimental tasks. In order to infer perceptual awareness in these tasks, participants need to report the contents of their perception. This means that the neural signals underlying the emergence of perceptual awareness often cannot be dissociated from pre- and postperceptual processes. Consequently, some of the previously observed effects may not be correlates of awareness after all but instead may have resulted from task requirements.
In this thesis, I investigate this possibility in the somatosensory modality. To scrutinise the task dependence of the neural correlates of somatosensory awareness, I developed an experimental paradigm that controls for the most common experimental confounds. In a somatosensory-visual matching task, participants were required to detect electrical target stimuli at ten different intensity levels. Instead of reporting their perception directly, they compared their somatosensory percepts to simultaneously presented visual cues that signalled stimulus presence or absence and then reported a match or mismatch accordingly. As a result, target detection was decorrelated from working memory and reports, the behavioural relevance of detected and undetected stimuli was equated, the influence of attentional processes was mitigated, and perceptual uncertainty was varied in a controlled manner. Results from a functional magnetic resonance imaging (fMRI) study and an electroencephalography (EEG) study showed that, when controlled for task demands, the neural correlates of somatosensory awareness were restricted to relatively early activity (~150 ms) in secondary somatosensory regions. In contrast, late activity (>300 ms) indicative of processing in frontoparietal networks occurred irrespective of stimulus awareness, and activity in anterior insular, anterior cingulate, and supplementary motor cortex was associated with processing perceptual uncertainty and reports. These results add novel evidence to the early-local vs. late-global debate and favour the view that perceptual awareness emerges at the level of modality-specific sensory cortices.
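The decorrelation at the heart of the matching task can be verified in simulation: when the visual cue is independent of the stimulus, the match/mismatch report is the logical XNOR of perception and cue, so the motor report carries no information about whether the target was perceived. A hedged sketch with illustrative detection probabilities:

```python
import numpy as np

rng = np.random.default_rng(7)
n = 10_000
perceived = (rng.random(n) < 0.5).astype(float)    # did the participant detect a target?
cue_present = (rng.random(n) < 0.5).astype(float)  # independent visual cue: presence/absence
report_match = (perceived == cue_present).astype(float)  # "match" iff percept agrees with cue

# The report is statistically decorrelated from perception.
r = np.corrcoef(perceived, report_match)[0, 1]
print(abs(r) < 0.05)  # True: near-zero correlation
```

This is why neural activity tied to the report cannot masquerade as a correlate of awareness in this paradigm.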
Mixture Models in Machine Learning
Modeling with mixtures is a powerful method in the statistical toolkit that can be used for representing the presence of sub-populations within an overall population. In many applications ranging from financial models to genetics, a mixture model is used to fit the data. The primary difficulty in learning mixture models is that the observed data set does not identify the sub-population to which an individual observation belongs. Although mixture models have been studied for more than a century, their theoretical guarantees remain unknown for several important settings.
In this thesis, we look at three groups of problems. The first part is aimed at estimating the parameters of a mixture of simple distributions. We ask the following question: How many samples are necessary and sufficient to learn the latent parameters? We propose several approaches for this problem that include complex-analytic tools to connect statistical distances between pairs of mixtures with the characteristic function. We show sufficient sample complexity guarantees for mixtures of popular distributions (including Gaussian, Poisson, and Geometric). For many distributions, our results provide the first sample complexity guarantees for parameter estimation in the corresponding mixture. Using these techniques, we also provide improved lower bounds on the Total Variation distance between Gaussian mixtures with two components and demonstrate new results in some sequence reconstruction problems.
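For concreteness, the parameter-estimation problem above can be solved in the well-separated case by the textbook EM algorithm for a two-component 1-D Gaussian mixture. This is a hedged sketch of the standard estimator, not the thesis's characteristic-function technique; all data below are synthetic.

```python
import numpy as np

rng = np.random.default_rng(3)
# Latent sub-populations: N(-2, 1) and N(3, 1), mixed 50/50.
x = np.concatenate([rng.normal(-2.0, 1.0, 500), rng.normal(3.0, 1.0, 500)])

mu = np.array([-1.0, 1.0]); sigma = np.array([1.0, 1.0]); pi = np.array([0.5, 0.5])
for _ in range(50):
    # E-step: posterior responsibility of each component for each point
    # (the 1/sqrt(2*pi) constant cancels in the normalisation).
    dens = pi * np.exp(-0.5 * ((x[:, None] - mu) / sigma) ** 2) / sigma
    resp = dens / dens.sum(axis=1, keepdims=True)
    # M-step: responsibility-weighted means, variances, and mixing weights
    nk = resp.sum(axis=0)
    mu = (resp * x[:, None]).sum(axis=0) / nk
    sigma = np.sqrt((resp * (x[:, None] - mu) ** 2).sum(axis=0) / nk)
    pi = nk / len(x)

print(np.round(np.sort(mu), 1))  # close to the true means (-2, 3)
```

The sample-complexity question above asks how many such draws are needed before estimators like this one pin down the latent parameters; EM itself comes with no such guarantee in general.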
In the second part, we study Mixtures of Sparse Linear Regressions, where the goal is to learn the best set of linear relationships between the scalar responses (i.e., labels) and the explanatory variables (i.e., features). We focus on a scenario where a learner is able to choose the features to get the labels. To tackle the high dimensionality of data, we further assume that the linear maps are also sparse, i.e., have only a few prominent features among many. For this setting, we devise algorithms with sub-linear (as a function of the dimension) sample complexity guarantees that are also robust to noise.
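The mixture-of-regressions objective can be illustrated with alternating minimisation on two scalar regressions: assign each sample to the line that fits it best, refit each line, and repeat. This hedged sketch omits the sparsity and chosen-feature aspects central to the thesis; slopes and noise levels are illustrative.

```python
import numpy as np

rng = np.random.default_rng(5)
n = 400
x = rng.uniform(-1, 1, n)
z = rng.random(n) < 0.5                       # latent component of each sample
y = np.where(z, 2.0 * x, -3.0 * x) + 0.05 * rng.normal(size=n)

a = np.array([1.0, -1.0])                     # initial slope guesses
for _ in range(10):
    resid = np.abs(y[:, None] - x[:, None] * a)   # residual against each line
    assign = resid.argmin(axis=1)                 # hard assignment step
    for k in (0, 1):
        sel = assign == k
        a[k] = np.sum(x[sel] * y[sel]) / np.sum(x[sel] ** 2)  # least-squares refit

print(np.round(np.sort(a), 1))  # recovers slopes near -3 and 2
```

The difficulty the abstract refers to is visible here: the data never reveal which component generated each sample, so the algorithm must infer assignments and parameters jointly.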
In the final part, we study Mixtures of Sparse Linear Classifiers in the same setting as above. Given a set of features and the binary labels, the objective of this task is to find a set of hyperplanes in the space of features such that for any (feature, label) pair, there exists a hyperplane in the set that justifies the mapping. We devise efficient algorithms with sub-linear sample complexity guarantees for learning the unknown hyperplanes under similar sparsity assumptions as above. To that end, we propose several novel techniques that include tensor decomposition methods and combinatorial designs.
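The consistency condition above ("for any (feature, label) pair, there exists a hyperplane in the set that justifies the mapping") is easy to state as a check. A hedged sketch with illustrative data; the thesis's learning algorithms, which must find such a set, are far more involved.

```python
import numpy as np

def explains(hyperplanes, X, labels):
    # Sign of every hyperplane on every point: shape (n_points, n_planes).
    preds = np.sign(X @ hyperplanes.T)
    # Each point is justified if at least one hyperplane reproduces its label.
    return bool(np.all(np.any(preds == labels[:, None], axis=1)))

W = np.array([[1.0, 0.0],      # hyperplane through the origin: sign(x1)
              [0.0, 1.0]])     # hyperplane through the origin: sign(x2)
X = np.array([[2.0, -1.0],
              [-2.0, 1.0],
              [-1.0, -2.0]])
labels = np.array([1.0, 1.0, -1.0])
print(explains(W, X, labels))  # True: every pair is justified by some hyperplane
```

A point such as (1, 1) with label -1 would violate the condition for this pair of hyperplanes, since both classify it as +1.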