On the Dark Side of Calibration for Modern Neural Networks
Modern neural networks are highly miscalibrated, which poses a significant challenge to the reliable use of deep neural networks (DNNs) in safety-critical systems. Many recently proposed approaches have demonstrated substantial progress in improving DNN calibration. However, they hardly touch upon refinement, which historically has been an essential aspect of calibration. Refinement indicates the separability of a network's correct and incorrect predictions. This paper presents a theoretically and empirically supported exposition for reviewing a model's calibration and refinement. First, we show the breakdown of expected calibration error (ECE) into predicted confidence and refinement. Building on this result, we highlight that regularisation-based calibration focuses only on naively reducing a model's confidence, which comes at a severe cost to the model's refinement. We support our claims through rigorous empirical evaluations of many state-of-the-art calibration approaches on standard datasets. We find that many calibration approaches, such as label smoothing and mixup, lower the utility of a DNN by degrading its refinement. Even under natural data shift, this calibration-refinement trade-off holds for the majority of calibration methods. These findings call for an urgent retrospective into some popular pathways taken for modern DNN calibration.
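The ECE mentioned above is the standard binned calibration metric; a minimal sketch of how it is computed (equal-width confidence bins, the usual convention) is:

```python
import numpy as np

def expected_calibration_error(confidences, correct, n_bins=10):
    """Binned ECE: confidence-vs-accuracy gap, weighted by bin population."""
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=float)
    # assign each prediction to an equal-width confidence bin
    bins = np.minimum((confidences * n_bins).astype(int), n_bins - 1)
    ece = 0.0
    for b in range(n_bins):
        mask = bins == b
        if not mask.any():
            continue
        gap = abs(confidences[mask].mean() - correct[mask].mean())
        ece += mask.mean() * gap  # mask.mean() = fraction of samples in the bin
    return ece

# 80% confidence matched by 80% accuracy: ECE is (near) zero
print(expected_calibration_error([0.8] * 10, [1] * 8 + [0] * 2))
```

Note that ECE only scores agreement between confidence and accuracy; a model that always predicts with 60% confidence and 60% accuracy has zero ECE while offering no separation of correct from incorrect predictions, which is exactly the refinement concern the paper raises.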
Extracting information from informal communication
Thesis (Ph.D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2007. Includes bibliographical references (leaves 89-93).

This thesis focuses on the problem of extracting information from informal communication. Textual informal communication, such as e-mail, bulletin boards, and blogs, has become a vast information resource. However, such information is poorly organized and difficult for a computer to understand due to its lack of editing and structure. Thus, techniques that work well for formal text, such as newspaper articles, may prove insufficient on informal text. One focus of ours is to advance the state of the art for sub-problems of the information extraction task. We make contributions to the problems of named entity extraction, co-reference resolution, and context tracking, channelling our efforts toward methods that are particularly applicable to informal communication.

We also consider a type of information that is somewhat unique to informal communication: preferences and opinions. Individuals often express their opinions on products and services in such communication, and others may read these "reviews" to try to predict their own experiences. However, humans do a poor job of aggregating and generalizing large sets of data. We develop techniques that can perform the job of predicting unobserved opinions. We address both the single-user case, where information about the items is known, and the multi-user case, where we can generalize opinions without external information. Experiments on large-scale rating data sets validate our approach.

by Jason D. M. Rennie. Ph.D.
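The multi-user opinion-prediction task described here is commonly framed as collaborative filtering on a partially observed ratings matrix. The thesis's own models are not reproduced here; the sketch below shows only the generic idea, using a simple rank-k matrix factorization trained by SGD (all values and hyper-parameters are illustrative):

```python
import numpy as np

# Toy ratings matrix: rows = users, columns = items, 0 = unobserved.
R = np.array([
    [5, 4, 0, 1],
    [4, 5, 1, 0],
    [1, 0, 5, 4],
], dtype=float)

def predict_ratings(R, k=2, epochs=1500, lr=0.02, reg=0.02, seed=0):
    """Fill unobserved entries via rank-k matrix factorization trained by SGD."""
    rng = np.random.default_rng(seed)
    n_users, n_items = R.shape
    U = 0.1 * rng.standard_normal((n_users, k))   # user factors
    V = 0.1 * rng.standard_normal((n_items, k))   # item factors
    observed = np.argwhere(R > 0)
    for _ in range(epochs):
        for u, i in observed:
            err = R[u, i] - U[u] @ V[i]
            # simultaneous L2-regularized gradient step on both factors
            U[u], V[i] = (U[u] + lr * (err * V[i] - reg * U[u]),
                          V[i] + lr * (err * U[u] - reg * V[i]))
    return U @ V.T  # dense prediction for every (user, item) pair

pred = predict_ratings(R)
print(np.round(pred, 1))
```

Users 0 and 1 rate similarly, and user 1 disliked item 2, so the model predicts a low unobserved rating for user 0 on item 2; this generalization from co-rating patterns alone, without item metadata, is the multi-user setting the abstract describes.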
Real-Time Deep Learning-Based Face Recognition System
This research proposes real-time deep learning-based face recognition algorithms using MATLAB and Python. Face recognition is the process through which people are identified from facial images, and the technology is applied broadly in biometrics, security information, access control, etc. A facial recognition system can be built in two steps: first, the facial features are extracted; second, the features are classified. Deep learning, specifically the convolutional neural network (CNN), has recently driven much of the progress in face recognition technology. The CNN is one of the deep learning approaches and has shown excellent performance in many fields, such as image recognition on large training datasets (e.g., ImageNet). However, due to hardware limitations and insufficient training datasets, high performance is often not achieved. Therefore, in this work, transfer learning is used to improve the performance of the face recognition system even for a small number of images. Two pre-trained models are used: GoogLeNet CNN (in MATLAB) and FaceNet (in Python). Transfer learning performs fine-tuning on the last layer of the CNN model for new classification tasks. FaceNet presents a unified system for face verification (is this the same person?), recognition (who is this person?), and clustering (finding common people among these faces), based on learning a Euclidean embedding per image with a deep convolutional network.
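The last-layer fine-tuning described above amounts to freezing the pretrained network as a fixed feature extractor and training only a fresh classification layer on its outputs. A minimal numpy sketch of that idea (the frozen "extractor" and all dimensions here are stand-ins, not the actual GoogLeNet or FaceNet models):

```python
import numpy as np

rng = np.random.default_rng(0)
W_frozen = rng.standard_normal((64, 16))  # stand-in for pretrained CNN weights

def embed(x):
    """Frozen 'pretrained' feature extractor (penultimate-layer stand-in)."""
    return np.tanh(x @ W_frozen)

def train_head(X, y, n_classes, epochs=300, lr=0.5):
    """Train only a fresh softmax layer on top of the frozen features."""
    F = embed(X)                          # features are computed once; never updated
    W = np.zeros((F.shape[1], n_classes))
    b = np.zeros(n_classes)
    Y = np.eye(n_classes)[y]              # one-hot labels
    for _ in range(epochs):
        logits = F @ W + b
        logits -= logits.max(axis=1, keepdims=True)  # numerical stability
        P = np.exp(logits)
        P /= P.sum(axis=1, keepdims=True)
        grad = (P - Y) / len(X)           # softmax cross-entropy gradient
        W -= lr * (F.T @ grad)
        b -= lr * grad.sum(axis=0)
    return W, b

# Toy "identities": two classes with different input statistics
X = np.vstack([rng.standard_normal((30, 64)) + 2.0,
               rng.standard_normal((30, 64)) - 2.0])
y = np.array([0] * 30 + [1] * 30)
W, b = train_head(X, y, n_classes=2)
accuracy = (np.argmax(embed(X) @ W + b, axis=1) == y).mean()
print(accuracy)
```

In the actual systems, the same pattern is realized by replacing the final fully connected layer of the pretrained model with a new layer sized to the new identity classes, and training only that layer (or it plus a few top layers) on the small dataset.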
Machine learning and electronic health records
In this work, we investigate the benefits and complications of using machine learning on electronic health record (EHR) data. We survey some recent literature and conduct experiments on real data collected from hospital EHR systems.

Master's thesis in informatics. INF399; MAMN-INF; MAMN-PRO.
Deep learning for supernovae detection
In future astronomical sky surveys it will be humanly impossible to classify the tens of thousands of candidate transients detected per night. This thesis explores the potential of state-of-the-art machine learning algorithms to handle this burden more accurately and quickly than trained astronomers. To this end, deep learning methods are applied to classify transients using real-world data from the Sloan Digital Sky Survey. Using cutting-edge training techniques, several convolutional neural networks are trained and their hyper-parameters tuned; the resulting models outperform previous approaches, and human labelling errors are found to be the primary obstacle to further improvement. Tuning and optimising the deep models took in excess of 700 hours on a 4-Titan X GPU cluster.
Machine Learning of Scientific Events: Classification, Detection, and Verification
Classification and segmentation of objects using machine learning algorithms have been widely used in a large variety of scientific domains in the past few decades. With the exponential growth in the number of ground-based, air-borne, and space-borne observatories, Heliophysics has been taking full advantage of such algorithms in many automated tasks, obtaining valuable knowledge by detecting solar events and analyzing big-picture patterns. Although in many cases the strengths of general-purpose algorithms seem to be transferable to problems of scientific domains where scientific events are of interest, in practice there are some critical issues, which I address in this dissertation. First, I discuss the four main categories of such issues, and in the following chapters I present real-world examples and the different approaches I take to tackling them. In Chapter II, I take a classical path to the classification of three solar events: Active Regions, Coronal Holes, and Quiet Suns. I optimize a set of ten image parameters and improve the classification performance by up to 36%. In Chapter III, in contrast, I utilize an automated feature-extraction algorithm, i.e., a deep neural network, for the detection and segmentation of another solar event, namely solar filaments. Using an off-the-shelf algorithm, I overcome several of the issues of the existing detection module, while facing an important challenge: the lack of an appropriate evaluation metric for verification of the segmentations. In Chapter IV, I introduce a novel metric to provide more accurate verification, especially for salient objects with fine structures. This metric, called Multi-Scale Intersection over Union (MIoU), is a fusion of two concepts: fractal dimension from geometry, and Intersection over Union (IoU), a popular metric for segmentation verification. Through several experiments I examine the advantages of using MIoU over IoU, and I conclude the chapter with a follow-through on the segmentation results of the previously implemented filament detection module.
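The exact formula for MIoU is given in the dissertation; the sketch below only illustrates the underlying idea of combining box counting (the basis of fractal dimension) with IoU: evaluate overlap on box-counting coarsenings of the masks at several scales, so that a prediction which matches a fine structure's coarse geometry still gets credit even when pixel-level overlap is poor. Averaging plainly over scales here is my assumption, not the dissertation's aggregation rule:

```python
import numpy as np

def iou(a, b):
    """Intersection over Union of two boolean masks."""
    inter = np.logical_and(a, b).sum()
    union = np.logical_or(a, b).sum()
    return inter / union if union else 1.0

def coarsen(mask, s):
    """Box-counting coarsening: a coarse cell is on if any pixel in its s x s box is on."""
    h, w = mask.shape
    H, W = h // s * s, w // s * s
    return mask[:H, :W].reshape(H // s, s, W // s, s).any(axis=(1, 3))

def multiscale_iou(a, b, scales=(1, 2, 4, 8)):
    """Average IoU across box-counting scales (illustrative reading of MIoU)."""
    return float(np.mean([iou(coarsen(a, s), coarsen(b, s)) for s in scales]))

# Thin diagonal "filament" vs. a prediction shifted by a single pixel
g = np.eye(32, dtype=bool)
p = np.roll(g, 1, axis=1)
print(iou(g, p), multiscale_iou(g, p))  # plain IoU is 0.0; multiscale gives partial credit
```

This shows the failure mode motivating the metric: for hair-thin salient objects like filaments, a one-pixel misalignment drives plain IoU to zero, while a scale-aware score still reflects that the predicted shape is essentially correct.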