Data Imputation through the Identification of Local Anomalies
We introduce a comprehensive statistical framework, in a model-free setting, for a complete treatment of localized data corruptions due to severe noise sources, e.g., an occluder in the case of a visual recording. Within this
framework, we propose i) a novel algorithm to efficiently separate, i.e.,
detect and localize, possible corruptions from a given suspicious data instance
and ii) a Maximum A Posteriori (MAP) estimator to impute the corrupted data. As
a generalization of the Euclidean distance, we also propose a novel distance measure based on the ranked deviations among the data attributes, which is empirically shown to be superior in separating the corruptions. Our algorithm
first splits the suspicious instance into parts through a binary partitioning
tree in the space of data attributes and iteratively tests those parts to
detect local anomalies using the nominal statistics extracted from an
uncorrupted (clean) reference data set. Once each part is labeled as anomalous
vs normal, the corresponding binary patterns over this tree that characterize
corruptions are identified and the affected attributes are imputed. Under a
certain conditional independence structure assumed for the binary patterns, we
analytically show that the false alarm rate of the introduced algorithm in
detecting the corruptions is independent of the data and can be directly set
without any parameter tuning. The proposed framework is tested over several
well-known machine learning data sets with synthetically generated corruptions;
and experimentally shown to yield remarkable improvements in classification performance, with strong corruption-separation capabilities. Our experiments also indicate that the proposed algorithms outperform the typical approaches and are robust to varying training-phase conditions.
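The abstract only describes the ranked-deviation distance at a high level. A minimal sketch of one form such a measure could take is below; the function name, the truncation parameter `k`, and the choice to keep the `k` smallest ranked deviations are illustrative assumptions, not the paper's definition. With `k` equal to the number of attributes it reduces to the Euclidean distance, matching the claim that the measure generalizes it.

```python
import numpy as np

def ranked_deviation_distance(x, y, k=None):
    """Hypothetical sketch: sort the per-attribute absolute deviations
    and keep only the k smallest, so that a few extreme deviations
    (possible corruptions) do not dominate the distance."""
    dev = np.abs(x - y)          # per-attribute deviations
    dev_sorted = np.sort(dev)    # ranked deviations, ascending
    if k is None:
        k = len(dev)             # no truncation: plain Euclidean distance
    return np.sqrt(np.sum(dev_sorted[:k] ** 2))
```

For example, if one attribute of `y` is corrupted by a large offset, truncating to `k = d - 1` suppresses that attribute's contribution while the full-`k` (Euclidean) distance is dominated by it.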
Technocracy inside the rule of law: challenges in the foundations of legal norms
Technocracy is usually opposed to democracy. Here, another perspective is taken: technocracy is contrasted with the rule of law. In trying to understand the contemporary dynamics of the rule of law, two main types of legal systems (in a broad sense) have to be distinguished: first, legal norms, studied by the science of law; second, scientific laws (which include the legalities of the different sciences and communities). Both contain normative prescriptions, but they differ in their source: while legal norms are the expression of the will of the normative authority, technical prescriptions can be derived from scientific laws, which are grounded in the commonly supposed objectivity of scientific knowledge about reality. Both also impose sanctions, but in the legal norm these refer to what is established by the norm itself, while in the scientific legality they consist in the reward or punishment derived from the efficacy or inefficacy of the action in reaching its end. The mode of legitimation also differs: legal norms must have followed the formal procedures and must not have contravened any fundamental right, whereas the validity of technical norms depends on their theoretical foundations or on their efficacy. Nowadays, scientific knowledge has become an important feature of policy-making, and contradictions can arise between these legal systems. These conflicts are especially grave when the recognition or exercise of fundamental rights is used instrumentally, or when rights are violated in order to increase the efficacy of policies. A political system is technocratic when, in case of contradiction, the scientific law finally prevails.
Multidimensional Scaling on Multiple Input Distance Matrices
Multidimensional Scaling (MDS) is a classic technique that seeks vectorial
representations for data points, given the pairwise distances between them.
However, in recent years, data are usually collected from diverse sources or
have multiple heterogeneous representations. To the best of our knowledge, how to perform multidimensional scaling on multiple input distance matrices remains unsolved. In
this paper, we first define this new task formally. Then, we propose a new
algorithm called Multi-View Multidimensional Scaling (MVMDS) by considering
each input distance matrix as one view. Our algorithm is able to learn the
weights of views (i.e., distance matrices) automatically by exploring the
consensus information and complementary nature of views. Experimental results
on synthetic as well as real datasets demonstrate the effectiveness of MVMDS.
We hope that our work encourages a wider consideration of MDS in the many domains where it is needed.
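To make the task concrete: classical MDS embeds points from a single distance matrix via the double-centered Gram matrix. The sketch below extends it to several input matrices by embedding a weighted average of the views' Gram matrices; this is a naive baseline under assumed uniform weights, not the MVMDS algorithm itself, which learns the view weights automatically.

```python
import numpy as np

def classical_mds(D, dim=2):
    """Classical MDS from one pairwise-distance matrix D (n x n)."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n      # centering matrix
    B = -0.5 * J @ (D ** 2) @ J              # double-centered Gram matrix
    w, V = np.linalg.eigh(B)
    idx = np.argsort(w)[::-1][:dim]          # top eigenpairs
    return V[:, idx] * np.sqrt(np.maximum(w[idx], 0))

def multiview_mds_baseline(Ds, dim=2, weights=None):
    """Naive multi-view baseline (NOT MVMDS): embed a fixed weighted
    average of the views' Gram matrices; uniform weights by default."""
    if weights is None:
        weights = np.ones(len(Ds)) / len(Ds)
    n = Ds[0].shape[0]
    J = np.eye(n) - np.ones((n, n)) / n
    B = sum(wv * (-0.5) * J @ (D ** 2) @ J for wv, D in zip(weights, Ds))
    w, V = np.linalg.eigh(B)
    idx = np.argsort(w)[::-1][:dim]
    return V[:, idx] * np.sqrt(np.maximum(w[idx], 0))
```

Learning the weights instead of fixing them is exactly where the consensus and complementary information across views enters in the paper's formulation.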
Empathy, Simulation, and Neuroscience: A Phenomenological Case Against Simulation Theory
In recent years, some simulation theorists have claimed that the discovery of mirror neurons provides empirical support for the position that mind reading is, at some basic level, simulation. The purpose of this essay is to question that claim. I begin by providing brief context for the current mind reading debate and then develop an influential simulationist account of mind reading. I then draw on the works of Edmund Husserl and Edith Stein to develop an alternative, phenomenological account. In conclusion, I offer multiple objections against simulation theory and argue that the empirical evidence mirror neurons offer us does not necessarily support the view that empathy is simulation.
Can k-NN imputation improve the performance of C4.5 with small software project data sets? A comparative evaluation
Missing data is a widespread problem that can affect the ability to use data to construct effective prediction systems. We investigate a common machine learning technique that can tolerate missing values, namely C4.5, to predict cost using six real-world software project databases. We analyze the predictive performance after using the k-NN missing data imputation technique, to see whether it is better to tolerate missing data or to impute missing values first and then apply the C4.5 algorithm. For the investigation, we simulated three missingness mechanisms, three missing data patterns, and five missing data percentages. We found that k-NN imputation can improve the prediction accuracy of C4.5, and that both C4.5 and k-NN are little affected by the missingness mechanism; however, the missing data pattern and the missing data percentage have a strong negative impact on prediction (or imputation) accuracy, particularly when the missing data percentage exceeds 40%.
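The k-NN imputation step described above can be sketched as follows. This is a generic version, assuming Euclidean distance over the attributes observed in both rows and a mean over the neighbours' values; the paper's exact configuration (distance metric, choice of k, aggregation) may differ.

```python
import numpy as np

def knn_impute(X, k=2):
    """Generic k-NN imputation sketch: fill each missing entry (NaN)
    with the mean of that attribute over the k complete rows nearest
    to the incomplete row, comparing only its observed attributes."""
    X = X.astype(float).copy()
    complete = X[~np.isnan(X).any(axis=1)]   # donor pool: fully observed rows
    for i in range(X.shape[0]):
        miss = np.isnan(X[i])
        if not miss.any():
            continue
        obs = ~miss
        d = np.sqrt(((complete[:, obs] - X[i, obs]) ** 2).sum(axis=1))
        nn = complete[np.argsort(d)[:k]]     # k nearest complete rows
        X[i, miss] = nn[:, miss].mean(axis=0)
    return X
```

The imputed matrix can then be fed to any learner that requires complete data, which is the comparison the study makes against C4.5's built-in tolerance of missing values.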
Rice seed image classification based on HOG descriptor with missing values imputation
Rice is a primary source of food consumed by almost half of the world's population, and rice quality mainly depends on the purity of the rice seed. To ensure the purity of a rice variety, the recognition process is an essential stage. In this paper, we first propose to use the histogram of oriented gradients (HOG) descriptor to characterize rice seed images. Since the image sizes vary, the features extracted by HOG have different dimensions and cannot be used directly by a classifier; we therefore apply several imputation methods to fill in the missing data of the HOG descriptors. The experiments are conducted on the VNRICE benchmark dataset to evaluate the proposed approach.
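The size mismatch the abstract describes can be illustrated with a stripped-down HOG-style descriptor and one simple imputation choice. Everything below is an assumption for illustration: a bare per-cell orientation histogram (no block normalization, unlike full HOG), NaN-padding to the longest descriptor, and per-position mean imputation as one of the "several imputation methods" the paper compares.

```python
import numpy as np

def hog_cells(img, cell=8, bins=9):
    """Minimal HOG-style descriptor: per-cell histograms of unsigned
    gradient orientations (0-180 deg), magnitude-weighted. The output
    length depends on the image size, which causes the dimension
    mismatch the paper addresses."""
    gy, gx = np.gradient(img.astype(float))
    mag = np.hypot(gx, gy)
    ang = np.rad2deg(np.arctan2(gy, gx)) % 180.0
    h, w = img.shape
    feats = []
    for r in range(0, h - cell + 1, cell):
        for c in range(0, w - cell + 1, cell):
            m = mag[r:r + cell, c:c + cell].ravel()
            a = ang[r:r + cell, c:c + cell].ravel()
            hist, _ = np.histogram(a, bins=bins, range=(0, 180), weights=m)
            feats.append(hist)
    return np.concatenate(feats)

def pad_and_mean_impute(descs):
    """Pad variable-length descriptors with NaN to a common length,
    then fill each NaN with that position's mean over the dataset
    (one illustrative imputation choice among several)."""
    L = max(len(d) for d in descs)
    M = np.full((len(descs), L), np.nan)
    for i, d in enumerate(descs):
        M[i, :len(d)] = d
    means = np.nanmean(M, axis=0)        # per-position means over observed values
    rows, cols = np.where(np.isnan(M))
    M[rows, cols] = means[cols]
    return M
```

After this step all descriptors share one fixed length and can be passed to a standard classifier.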