1,399 research outputs found

Using manual and automated annotations to search images by semantic similarity

Author: CGM Snoek
CP Town
H Turtle
HD Wactlar
João Magalhães
JZ Wang
M Flickner
MJ Swain
N Rasiwasia
N Vasconcelos
Stefan Rüger
T Volkmer
Y Rui
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2012

Crossref

Open Research Online (The Open University)

On the Differential Privacy of Bayesian Inference

Author: Dimitrakakis Christos
Rubinstein Benjamin
Zhang Zuhe
Publication date: 22/12/2015

We study how to communicate findings of Bayesian inference to third parties, while preserving the strong guarantee of differential privacy. Our main contributions are four different algorithms for private Bayesian inference on probabilistic graphical models. These include two mechanisms for adding noise to the Bayesian updates, either directly to the posterior parameters, or to their Fourier transform so as to preserve update consistency. We also utilise a recently introduced posterior sampling mechanism, for which we prove bounds for the specific but general case of discrete Bayesian networks; and we introduce a maximum-a-posteriori private mechanism. Our analysis includes utility and privacy bounds, with a novel focus on the influence of graph structure on privacy. Worked examples and experiments with Bayesian naïve Bayes and Bayesian linear regression illustrate the application of our mechanisms.

Comment: AAAI 2016, Feb 2016, Phoenix, Arizona, United States
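The idea of adding noise directly to posterior parameters can be illustrated with a minimal sketch. It is not the paper's mechanism: it assumes a single Beta-Bernoulli model, binary 0/1 records, and the textbook Laplace mechanism. The names `laplace_noise` and `private_beta_posterior` are hypothetical.

```python
import random


def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # is distributed as Laplace(0, scale).
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_beta_posterior(data, epsilon, prior=(1.0, 1.0), rng=None):
    """Return noisy Beta posterior parameters for 0/1 observations.

    Assumption: neighbouring datasets differ by replacing one record,
    so the (successes, failures) pair changes by at most 2 in L1 norm.
    Laplace noise of scale 2/epsilon on each count then gives
    epsilon-differential privacy; clamping is post-processing.
    """
    rng = rng or random.Random()
    successes = sum(data)
    failures = len(data) - successes
    scale = 2.0 / epsilon
    noisy_successes = max(0.0, successes + laplace_noise(scale, rng))
    noisy_failures = max(0.0, failures + laplace_noise(scale, rng))
    return prior[0] + noisy_successes, prior[1] + noisy_failures


if __name__ == "__main__":
    observations = [1, 0, 1, 1, 0, 1]
    print(private_beta_posterior(observations, epsilon=1.0))
```

A smaller `epsilon` gives stronger privacy and noisier posterior parameters, which is the utility and privacy trade-off the abstract refers to.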

arXiv.org e-Print Archive

HAL - Lille 3

INRIA a CCSD electronic archive server

HAL Descartes

Chalmers Research

Chalmers Publication Library

Hal-Diderot

Association for the Advancement of Artificial Intelligence: AAAI Publications

A lossless online Bayesian classifier.

Author: Liew Alan Wee-Chung
Nguyen Thi Thu Thuy
Nguyen Tien Thanh
Sharma Rabi
Publication venue: 'Elsevier BV'
Publication date: 16/03/2019

We are living in a world progressively driven by data. Besides the issue that big data cannot be stored entirely in main memory, as traditional offline learning methods require, the problem of learning from data that can only be collected over time is also very prevalent. Consequently, there is a need for online methods that can handle sequentially arriving data and offer the same accuracy as offline methods. In this paper, we introduce a new lossless online Bayesian-based classifier which uses the arriving data one item at a time and discards each item right after use. The lossless property of our proposed method guarantees that it can reach the same prediction performance as its offline counterpart regardless of the incremental training order. Experimental results demonstrate its superior performance over many well-known state-of-the-art online learning methods in the literature.
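The "lossless" property can be illustrated with a generic count-based classifier. This is not the paper's algorithm: it is a minimal categorical naive Bayes sketch whose state consists only of counts, so the learned model is identical whatever the arrival order. The class name `OnlineCategoricalNB` and the toy data are hypothetical.

```python
import math
from collections import Counter


class OnlineCategoricalNB:
    """Naive Bayes over categorical features, updated one example at a time."""

    def __init__(self):
        self.class_counts = Counter()
        self.feature_counts = {}  # (label, feature index) -> Counter of values
        self.values = {}          # feature index -> set of values seen so far

    def update(self, x, y):
        # Only counts are kept, so the example can be discarded after this call.
        self.class_counts[y] += 1
        for j, v in enumerate(x):
            self.feature_counts.setdefault((y, j), Counter())[v] += 1
            self.values.setdefault(j, set()).add(v)

    def predict(self, x):
        total = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label in sorted(self.class_counts):
            n = self.class_counts[label]
            score = math.log(n / total)
            for j, v in enumerate(x):
                counts = self.feature_counts.get((label, j), {})
                k = len(self.values.get(j, ())) or 1
                # Add-one smoothing keeps unseen values from zeroing the score.
                score += math.log((counts.get(v, 0) + 1) / (n + k))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


if __name__ == "__main__":
    stream = [(("sunny", "hot"), "no"), (("rainy", "cool"), "yes"),
              (("sunny", "cool"), "yes"), (("rainy", "hot"), "no")]
    model = OnlineCategoricalNB()
    for features, label in stream:
        model.update(features, label)
    print(model.predict(("sunny", "cool")))
```

Because integer counts commute under addition, feeding the same stream in any order produces the same counts and therefore the same predictions as a batch fit over all the data.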

Open Access Institutional Repository at Robert Gordon University

Assessing Completeness of Solvency and Financial Condition Reports through the use of Machine Learning and Text Classification

Author: Nugent Ruairí
Publication venue: Dublin Institute of Technology
Publication date: 01/01/2018

Text mining is a method for extracting useful information from unstructured data through the identification and exploration of large amounts of text. It is a valuable support tool for organisations. It enables a greater understanding and identification of relevant business insights from text. Critically, it identifies connections between information within texts that would otherwise go unnoticed. Its application is prevalent in areas such as marketing and political science; however, until recently it has been largely overlooked within economics. Central banks are beginning to investigate the benefits of machine learning, sentiment analysis and natural language processing in light of the large amount of unstructured data available to them. This includes news articles, financial contracts, social media, supervisory and market intelligence, and regulatory reports. In this research paper, a dataset consisting of Solvency and Financial Condition Reports (SFCRs), which are required by regulation, is analysed to determine whether machine learning and text classification can assist in assessing the completeness of SFCRs. Completeness is determined by whether or not the document adheres to nine European guidelines. Natural language processing and supervised machine learning techniques are implemented to classify pages of the report as belonging to one of the guidelines.

Arrow@TUDublin

System (for) Tracking Equilibrium and Determining Incline (STEADI)

Author: Silvina Agastya
Publication date: 01/07/2013

The goal of this project was to design and implement a smartphone-based wearable system, STEADI, that detects fall events in real time. Rather than relying on expensive customised hardware, STEADI was implemented cost-effectively on a generic mobile computing device. To detect a fall event, we propose a fall detector that uses the accelerometer available in a mobile phone. The system is divided into two stages: signal processing and classification. For signal processing, both a median filter and a high-pass filter are used. The median filter enhances the signal by removing impulsive noise while preserving the signal shape, and the high-pass filter emphasises transitions in the signal. To recognise a fall event, STEADI implements two methods: a simple threshold analysis that determines whether or not a fall has occurred (threshold-based), and a more sophisticated Naïve Bayes classification method that differentiates falling from other mobile activities. Our experimental results show that applying the signal processing and Naïve Bayes classification together increases accuracy by more than 20% compared with using the threshold-based method alone. The Naïve Bayes classifier achieved an overall detection accuracy of 95%. Furthermore, an external sensor is introduced to enhance accuracy further. In addition to fall detection, the system can report the location of the fall event using Google Maps and the smartphone's GPS, and sends a message to the caretaker via SMS.
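The signal-processing and threshold stages described in this abstract can be sketched as follows. This is an illustrative reconstruction, not the STEADI implementation: the window size, the use of a first difference as the high-pass filter, the threshold of 1.5, and the function names are all assumptions, and the input is assumed to be a non-empty list of acceleration magnitudes.

```python
from statistics import median


def median_filter(signal, window=3):
    # Replicate the edge samples so the output has the same length as the input.
    # `window` is assumed to be odd.
    half = window // 2
    padded = [signal[0]] * half + list(signal) + [signal[-1]] * half
    return [median(padded[i:i + window]) for i in range(len(signal))]


def high_pass(signal):
    # A first difference is a very simple high-pass filter:
    # it is zero on flat segments and large at abrupt transitions.
    return [0.0] + [signal[i] - signal[i - 1] for i in range(1, len(signal))]


def detect_fall(magnitudes, threshold=1.5, window=3):
    # Threshold-based detection on the filtered signal (hypothetical threshold).
    smoothed = median_filter(magnitudes, window)
    transitions = high_pass(smoothed)
    return any(abs(t) > threshold for t in transitions)


if __name__ == "__main__":
    sensor_glitch = [1.0, 1.0, 9.0, 1.0, 1.0]
    sustained_change = [1.0, 1.0, 1.0, 3.0, 3.0, 3.0, 1.0, 1.0]
    print(detect_fall(sensor_glitch), detect_fall(sustained_change))
```

In this sketch a single-sample spike is removed by the median filter and ignored, while a sustained change survives filtering and trips the threshold. A Naïve Bayes stage, as in the abstract, would replace the fixed threshold with a classifier trained on features of the filtered signal.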

MURAL - Maynooth University Research Archive Library

Irish Universities

NUI Maynooth Eprint Archive

Maynooth University ePrints and eTheses Archive

Semi-Supervised Learning For Identifying Opinions In Web Content

Author: Yu Ning
Publication venue: [Bloomington, Ind.] : Indiana University
Publication date: 01/01/2011

Thesis (Ph.D.) - Indiana University, Information Science, 2011

Opinions published on the World Wide Web (Web) offer opportunities for detecting personal attitudes regarding topics, products, and services. The opinion detection literature indicates that both a large body of opinions and a wide variety of opinion features are essential for capturing subtle opinion information. Although a large amount of opinion-labeled data is preferable for opinion detection systems, opinion-labeled data is often limited, especially at sub-document levels, and manual annotation is tedious, expensive and error-prone. This shortage of opinion-labeled data is less challenging in some domains (e.g., movie reviews) than in others (e.g., blog posts). While a simple method for improving accuracy in challenging domains is to borrow opinion-labeled data from a non-target data domain, this approach often fails because of the domain transfer problem: opinion detection strategies designed for one data domain generally do not perform well in another domain. However, while it is difficult to obtain opinion-labeled data, unlabeled user-generated opinion data are readily available. Semi-supervised learning (SSL) requires only limited labeled data to automatically label unlabeled data and has achieved promising results in various natural language processing (NLP) tasks, including traditional topic classification; but SSL has been applied in only a few opinion detection studies. This study investigates the application of four different SSL algorithms in three types of Web content: edited news articles, semi-structured movie reviews, and the informal and unstructured content of the blogosphere. SSL algorithms are also evaluated for their effectiveness in sparse data situations and domain adaptation. Research findings suggest that, when there is limited labeled data, SSL is a promising approach for opinion detection in Web content. Although the contributions of SSL varied across data domains, significant improvement was demonstrated for the most challenging data domain, the blogosphere, when a domain transfer-based SSL strategy was implemented.
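How a small labeled set can be used to label unlabeled data automatically is easiest to see in self-training, one common SSL scheme. The abstract does not name its four algorithms, so this is a generic sketch rather than the thesis's method. It assumes one-dimensional numeric features, at least two classes, a nearest-centroid base classifier, and a hypothetical confidence `margin`.

```python
def centroids(labeled):
    # Mean feature value per class, computed from (value, label) pairs.
    sums = {}
    for x, y in labeled:
        total, count = sums.get(y, (0.0, 0))
        sums[y] = (total + x, count + 1)
    return {y: total / count for y, (total, count) in sums.items()}


def self_train(labeled, unlabeled, margin=1.0, max_rounds=10):
    """Iteratively move confidently classified points into the labeled set."""
    labeled = list(labeled)
    pool = list(unlabeled)
    for _ in range(max_rounds):
        cents = centroids(labeled)
        confident, remaining = [], []
        for x in pool:
            ranked = sorted(cents, key=lambda y: abs(x - cents[y]))
            best, runner_up = ranked[0], ranked[1]
            # Confidence: how much closer the best centroid is than the next one.
            gap = abs(x - cents[runner_up]) - abs(x - cents[best])
            if gap >= margin:
                confident.append((x, best))
            else:
                remaining.append(x)
        if not confident:
            break  # nothing new was labeled, so further rounds change nothing
        labeled.extend(confident)
        pool = remaining
    return labeled, pool


if __name__ == "__main__":
    seed = [(0.0, "neg"), (10.0, "pos")]
    print(self_train(seed, [1.0, 9.0, 5.0]))
```

Points that stay ambiguous remain in the unlabeled pool instead of being forced into a class, which limits the propagation of labeling errors between rounds.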

IUScholarWorks (University of Indiana)

Improving Floating Search Feature Selection using Genetic Algorithm

Author: Homsapaya Kanyanut
Sornil Ohm
Publication venue: LPPM ITBis Lembah Dempo
Publication date: 01/12/2017

Classification, a process for predicting the class of a given input, is one of the most fundamental tasks in data mining. Classification performance is negatively affected by noisy data, so selecting features relevant to the problem is a critical step in classification, especially when applied to large datasets. In this article, a novel filter-based floating search technique is proposed that selects an optimal set of features for classification. A genetic algorithm is employed to improve the quality of the features selected by the floating search method in each iteration. A criterion function is applied to select relevant and high-quality features that can improve classification accuracy. The proposed method was evaluated using 20 standard machine learning datasets of various sizes and complexity. The results show that the proposed method is effective in general across different classifiers and performs well in comparison with recently reported techniques. In addition, applying the proposed method with a support vector machine provides the best performance among the classifiers studied and outperforms previously reported results on the majority of the datasets.
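The floating search that the article builds on can be sketched as plain sequential floating forward selection. The genetic-algorithm refinement and the article's criterion function are not reproduced here. The `criterion` callable, the feature names, and the weights below are hypothetical, and `target_size` is assumed not to exceed the number of features.

```python
def floating_forward_selection(features, criterion, target_size):
    """Select `target_size` features by forward steps with conditional removals."""
    selected = []
    best_by_size = {}  # subset size -> best criterion value seen at that size
    while len(selected) < target_size:
        # Forward step: add the single feature that maximises the criterion.
        candidates = [f for f in features if f not in selected]
        best_feature = max(candidates, key=lambda f: criterion(selected + [f]))
        selected = selected + [best_feature]
        size = len(selected)
        best_by_size[size] = max(best_by_size.get(size, float("-inf")),
                                 criterion(selected))
        # Floating step: drop features while that beats the best smaller subset.
        while len(selected) > 2:
            drop = max(selected,
                       key=lambda f: criterion([g for g in selected if g != f]))
            reduced = [g for g in selected if g != drop]
            score = criterion(reduced)
            if score > best_by_size.get(len(reduced), float("-inf")):
                selected = reduced
                best_by_size[len(reduced)] = score
            else:
                break
    return selected


if __name__ == "__main__":
    weights = {"a": 3.0, "b": 2.0, "c": 1.0, "d": 0.5}

    def additive_criterion(subset):
        # Toy filter criterion: sum of per-feature relevance scores.
        return sum(weights[f] for f in subset)

    print(floating_forward_selection(list(weights), additive_criterion, 3))
```

With a purely additive criterion the backward step never fires. It matters only when features interact, so that a feature added early becomes redundant once later ones are included.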

Journal of ICT Research and Applications

Directory of Open Access Journals

ITB Journal