On the Dark Side of Calibration for Modern Neural Networks
Modern neural networks are highly miscalibrated, which poses a significant challenge to the reliable use of deep neural networks (DNNs) in safety-critical systems. Many recently proposed approaches have demonstrated substantial progress in improving DNN calibration. However, they hardly touch upon refinement, which historically has been an essential aspect of calibration. Refinement indicates the separability of a network's correct and incorrect predictions. This paper presents a theoretically and empirically supported exposition for reviewing a model's calibration and refinement. First, we show the breakdown of expected calibration error (ECE) into predicted confidence and refinement. Building on this result, we highlight that regularisation-based calibration focuses only on naively reducing a model's confidence, which comes at a severe cost to the model's refinement. We support our claims through rigorous empirical evaluations of many state-of-the-art calibration approaches on standard datasets. We find that many calibration approaches, such as label smoothing and mixup, lower the utility of a DNN by degrading its refinement. Even under natural data shift, this calibration-refinement trade-off holds for the majority of calibration methods. These findings call for an urgent retrospective into some popular pathways taken for modern DNN calibration.
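The ECE mentioned above is the standard binned calibration metric; a minimal sketch of how it is computed (equal-width confidence bins, the usual convention) is:

```python
import numpy as np

def expected_calibration_error(confidences, correct, n_bins=10):
    """Binned ECE: confidence-vs-accuracy gap, weighted by bin population."""
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=float)
    # assign each prediction to an equal-width confidence bin
    bins = np.minimum((confidences * n_bins).astype(int), n_bins - 1)
    ece = 0.0
    for b in range(n_bins):
        mask = bins == b
        if not mask.any():
            continue
        gap = abs(confidences[mask].mean() - correct[mask].mean())
        ece += mask.mean() * gap  # mask.mean() = fraction of samples in the bin
    return ece

# 80% confidence matched by 80% accuracy: ECE is (near) zero
print(expected_calibration_error([0.8] * 10, [1] * 8 + [0] * 2))
```

Note that ECE only scores agreement between confidence and accuracy; a model that always predicts with 60% confidence and 60% accuracy has zero ECE while offering no separation of correct from incorrect predictions, which is exactly the refinement concern the paper raises.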
Extracting information from informal communication
Thesis (Ph.D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2007. Includes bibliographical references (leaves 89-93).

This thesis focuses on the problem of extracting information from informal communication. Textual informal communication, such as e-mail, bulletin boards, and blogs, has become a vast information resource. However, such information is poorly organized and difficult for a computer to understand due to its lack of editing and structure. Thus, techniques that work well for formal text, such as newspaper articles, may prove insufficient on informal text. One focus of ours is to advance the state of the art for sub-problems of the information extraction task. We make contributions to the problems of named entity extraction, co-reference resolution, and context tracking, channelling our efforts toward methods that are particularly applicable to informal communication.

We also consider a type of information that is somewhat unique to informal communication: preferences and opinions. Individuals often express their opinions on products and services in such communication, and others may read these "reviews" to try to predict their own experiences. However, humans do a poor job of aggregating and generalizing large sets of data. We develop techniques that can perform the job of predicting unobserved opinions. We address both the single-user case, where information about the items is known, and the multi-user case, where we can generalize opinions without external information. Experiments on large-scale rating data sets validate our approach.

by Jason D. M. Rennie. Ph.D.
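The multi-user opinion-prediction task described here is commonly framed as collaborative filtering on a partially observed ratings matrix. The thesis's own models are not reproduced here; the sketch below shows only the generic idea, using a simple rank-k matrix factorization trained by SGD (all values and hyper-parameters are illustrative):

```python
import numpy as np

# Toy ratings matrix: rows = users, columns = items, 0 = unobserved.
R = np.array([
    [5, 4, 0, 1],
    [4, 5, 1, 0],
    [1, 0, 5, 4],
], dtype=float)

def predict_ratings(R, k=2, epochs=1500, lr=0.02, reg=0.02, seed=0):
    """Fill unobserved entries via rank-k matrix factorization trained by SGD."""
    rng = np.random.default_rng(seed)
    n_users, n_items = R.shape
    U = 0.1 * rng.standard_normal((n_users, k))   # user factors
    V = 0.1 * rng.standard_normal((n_items, k))   # item factors
    observed = np.argwhere(R > 0)
    for _ in range(epochs):
        for u, i in observed:
            err = R[u, i] - U[u] @ V[i]
            # simultaneous L2-regularized gradient step on both factors
            U[u], V[i] = (U[u] + lr * (err * V[i] - reg * U[u]),
                          V[i] + lr * (err * U[u] - reg * V[i]))
    return U @ V.T  # dense prediction for every (user, item) pair

pred = predict_ratings(R)
print(np.round(pred, 1))
```

Users 0 and 1 rate similarly, and user 1 disliked item 2, so the model predicts a low unobserved rating for user 0 on item 2; this generalization from co-rating patterns alone, without item metadata, is the multi-user setting the abstract describes.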
Real-Time Deep Learning-Based Face Recognition System
This research proposes real-time deep learning-based face recognition algorithms using MATLAB and Python. Face recognition is the process through which people are identified from facial images, and the technology is applied broadly in biometrics, security information, access control, etc. A facial recognition system can be built in two steps: first, the facial features are extracted; second, the features are classified. Deep learning, specifically the convolutional neural network (CNN), has recently driven much of the progress in face recognition technology. The CNN is one of the deep learning approaches and has shown excellent performance in many fields, such as image recognition on large training datasets (e.g., ImageNet). However, due to hardware limitations and insufficient training datasets, high performance is often not achieved. Therefore, in this work, transfer learning is used to improve the performance of the face recognition system even for a small number of images. Two pre-trained models are used: GoogLeNet CNN (in MATLAB) and FaceNet (in Python). Transfer learning performs fine-tuning on the last layer of the CNN model for new classification tasks. FaceNet presents a unified system for face verification (is this the same person?), recognition (who is this person?), and clustering (finding common people among these faces), based on learning a Euclidean embedding per image with a deep convolutional network.
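The last-layer fine-tuning described above amounts to freezing the pretrained network as a fixed feature extractor and training only a fresh classification layer on its outputs. A minimal numpy sketch of that idea (the frozen "extractor" and all dimensions here are stand-ins, not the actual GoogLeNet or FaceNet models):

```python
import numpy as np

rng = np.random.default_rng(0)
W_frozen = rng.standard_normal((64, 16))  # stand-in for pretrained CNN weights

def embed(x):
    """Frozen 'pretrained' feature extractor (penultimate-layer stand-in)."""
    return np.tanh(x @ W_frozen)

def train_head(X, y, n_classes, epochs=300, lr=0.5):
    """Train only a fresh softmax layer on top of the frozen features."""
    F = embed(X)                          # features are computed once; never updated
    W = np.zeros((F.shape[1], n_classes))
    b = np.zeros(n_classes)
    Y = np.eye(n_classes)[y]              # one-hot labels
    for _ in range(epochs):
        logits = F @ W + b
        logits -= logits.max(axis=1, keepdims=True)  # numerical stability
        P = np.exp(logits)
        P /= P.sum(axis=1, keepdims=True)
        grad = (P - Y) / len(X)           # softmax cross-entropy gradient
        W -= lr * (F.T @ grad)
        b -= lr * grad.sum(axis=0)
    return W, b

# Toy "identities": two classes with different input statistics
X = np.vstack([rng.standard_normal((30, 64)) + 2.0,
               rng.standard_normal((30, 64)) - 2.0])
y = np.array([0] * 30 + [1] * 30)
W, b = train_head(X, y, n_classes=2)
accuracy = (np.argmax(embed(X) @ W + b, axis=1) == y).mean()
print(accuracy)
```

In the actual systems, the same pattern is realized by replacing the final fully connected layer of the pretrained model with a new layer sized to the new identity classes, and training only that layer (or it plus a few top layers) on the small dataset.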
Machine learning and electronic health records
In this work, we investigate the benefits and complications of using machine learning on electronic health record (EHR) data. We survey some recent literature and conduct experiments on real data collected from hospital EHR systems.

Master's thesis in informatics. INF399; MAMN-INF; MAMN-PRO.
Deep learning for supernovae detection
In future astronomical sky surveys it will be humanly impossible to classify the tens of thousands of candidate transients detected per night. This thesis explores the potential of state-of-the-art machine learning algorithms to handle this burden more accurately and quickly than trained astronomers. To this end, deep learning methods are applied to classify transients using real-world data from the Sloan Digital Sky Survey. Using cutting-edge training techniques, several convolutional neural networks are trained and their hyper-parameters tuned; the resulting models outperform previous approaches, and human labelling errors are found to be the primary obstacle to further improvement. Tuning and optimising the deep models took in excess of 700 hours on a 4-Titan X GPU cluster.
Machine Learning of Scientific Events: Classification, Detection, and Verification
Classification and segmentation of objects using machine learning algorithms have been widely used in a large variety of scientific domains in the past few decades. With the exponential growth in the number of ground-based, air-borne, and space-borne observatories, Heliophysics has been taking full advantage of such algorithms in many automated tasks, obtaining valuable knowledge by detecting solar events and analyzing big-picture patterns. Although in many cases the strengths of general-purpose algorithms seem to be transferable to problems of scientific domains where scientific events are of interest, in practice there are some critical issues, which I address in this dissertation. First, I discuss the four main categories of such issues, and in the following chapters I present real-world examples and the different approaches I take to tackling them. In Chapter II, I take a classical path to the classification of three solar events: Active Regions, Coronal Holes, and Quiet Suns. I optimize a set of ten image parameters and improve the classification performance by up to 36%. In Chapter III, in contrast, I utilize an automated feature-extraction algorithm, i.e., a deep neural network, for the detection and segmentation of another solar event, namely solar filaments. Using an off-the-shelf algorithm, I overcome several of the issues of the existing detection module, while facing an important challenge: the lack of an appropriate evaluation metric for verification of the segmentations. In Chapter IV, I introduce a novel metric to provide more accurate verification, especially for salient objects with fine structures. This metric, called Multi-Scale Intersection over Union (MIoU), is a fusion of two concepts: fractal dimension from geometry, and Intersection over Union (IoU), a popular metric for segmentation verification. Through several experiments I examine the advantages of using MIoU over IoU, and I conclude the chapter with a follow-through on the segmentation results of the previously implemented filament detection module.
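The exact formula for MIoU is given in the dissertation; the sketch below only illustrates the underlying idea of combining box counting (the basis of fractal dimension) with IoU: evaluate overlap on box-counting coarsenings of the masks at several scales, so that a prediction which matches a fine structure's coarse geometry still gets credit even when pixel-level overlap is poor. Averaging plainly over scales here is my assumption, not the dissertation's aggregation rule:

```python
import numpy as np

def iou(a, b):
    """Intersection over Union of two boolean masks."""
    inter = np.logical_and(a, b).sum()
    union = np.logical_or(a, b).sum()
    return inter / union if union else 1.0

def coarsen(mask, s):
    """Box-counting coarsening: a coarse cell is on if any pixel in its s x s box is on."""
    h, w = mask.shape
    H, W = h // s * s, w // s * s
    return mask[:H, :W].reshape(H // s, s, W // s, s).any(axis=(1, 3))

def multiscale_iou(a, b, scales=(1, 2, 4, 8)):
    """Average IoU across box-counting scales (illustrative reading of MIoU)."""
    return float(np.mean([iou(coarsen(a, s), coarsen(b, s)) for s in scales]))

# Thin diagonal "filament" vs. a prediction shifted by a single pixel
g = np.eye(32, dtype=bool)
p = np.roll(g, 1, axis=1)
print(iou(g, p), multiscale_iou(g, p))  # plain IoU is 0.0; multiscale gives partial credit
```

This shows the failure mode motivating the metric: for hair-thin salient objects like filaments, a one-pixel misalignment drives plain IoU to zero, while a scale-aware score still reflects that the predicted shape is essentially correct.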