A survey of outlier detection methodologies
Outlier detection has been used for centuries to detect and, where appropriate, remove anomalous observations from data. Outliers arise due to mechanical faults, changes in system behaviour, fraudulent behaviour, human error, instrument error or simply through natural deviations in populations. Their detection can identify system faults and fraud before they escalate with potentially catastrophic consequences. It can also identify errors and remove their contaminating effect on the data set, thereby purifying the data for processing. The original outlier detection methods were arbitrary, but principled and systematic techniques are now used, drawn from the full gamut of Computer Science and Statistics. In this paper, we introduce a survey of contemporary techniques for outlier detection. We identify their respective motivations and distinguish their advantages and disadvantages in a comparative review.
Blockchain: A Graph Primer
Bitcoin and its underlying technology Blockchain have become popular in
recent years. Designed to facilitate a secure distributed platform without
central authorities, Blockchain is heralded as a paradigm that will be as
powerful as Big Data, Cloud Computing and Machine Learning. Blockchain
incorporates novel ideas from various fields such as public key encryption and
distributed systems. As such, a reader often comes across resources that
explain the Blockchain technology from a certain perspective only, leaving the
reader with more questions than before. We will offer a holistic view on
Blockchain. Starting with a brief history, we will give the building blocks of
Blockchain, and explain their interactions. As graph mining has become a major
part of its analysis, we will elaborate on graph theoretical aspects of the
Blockchain technology. We also devote a section to the future of Blockchain and
explain how extensions like Smart Contracts and Decentralized Autonomous
Organizations will function. Without assuming any reader expertise, our aim is
to provide a concise but complete description of the Blockchain technology.
Comment: 16 pages, 8 figures
A taxonomy framework for unsupervised outlier detection techniques for multi-type data sets
The term "outlier" can generally be defined as an observation that is significantly different from
the other values in a data set. The outliers may be instances of error or indicate events. The
task of outlier detection aims at identifying such outliers in order to improve the analysis of
data and further discover interesting and useful knowledge about unusual events within numerous
application domains. In this paper, we report on contemporary unsupervised outlier detection
techniques for multiple types of data sets and provide a comprehensive taxonomy framework and
two decision trees to select the most suitable technique based on the data set. Furthermore, we
highlight the advantages, disadvantages and performance issues of each class of outlier detection
techniques under this taxonomy framework.
Time-series analyses of Monterey Bay coastal microbial picoplankton using a ‘genome proxy’ microarray
To investigate the temporal, spatial and phylogenetic resolution of marine microbial community structure and variability, we designed and expanded a genome proxy array (an oligonucleotide microarray targeting marine microbial genome fragments and genomes), evaluated it against metagenomic sequencing, and applied it to time-series samples from the Monterey Bay. The expanded array targeted 268 microbial genotypes across much of the known diversity of cultured and uncultured marine microbes. The target abundances measured by the array were highly correlated with pyrosequence-based abundances (linear regression R² = 0.85–0.91, P < 0.0001). Fifty-seven samples from ∼4 years in Monterey Bay were examined with the array, spanning the photic zone (0 m), the base of the surface mixed layer (30 m) and the subphotic zone (200 m). A significant portion of the expanded genome proxy array's targets showed signal (95 out of 268 targets present in ≥ 1 sample). The multi-year community survey showed the consistent presence of a core group of common and abundant targeted taxa at each depth in Monterey Bay, higher variability among shallow than deep samples, and episodic occurrences of more transient marine genotypes. The abundance of the most dominant genotypes peaked after strong episodic upwelling events. The genome-proxy array's ability to track populations of closely related genotypes indicated population shifts within several abundant target taxa, with specific populations in some cases clustering by depth or oceanographic season. Although 51 cultivated organisms were targeted (representing 19% of the array), the majority of targets detected and of total target signal (85% and ∼92% respectively) were from uncultivated genotypes, often those derived from Monterey Bay. The array provided a relatively cost-effective approach (∼$15 per array) for surveying the natural history of uncultivated lineages.
Gordon and Betty Moore Foundation; National Science Foundation (U.S.) (Science and Technology Center Award EF0424599); National Science Foundation (U.S.) (Microbial Observatory Award MCB-0348001); United States. Dept. of Energy. Office of Science
Enhancement of external wall decoration material for the building in safety inspection method
As buildings wear out, external wall tiles or attachments often fall off, sometimes causing human injuries. At present, middle- and high-rise buildings are inspected mainly by visual inspection. Inspection results obtained with this method are affected by subjectivity, safety and cost. This study aims to provide a lower-cost and more efficient evaluation method for inspecting the status of buildings’ external walls. The proposed method mounts Forward Looking Infrared (FLIR) technology and high-resolution photographic equipment on an Unmanned Aerial Vehicle (UAV), which can improve the image recording of the detection process, as well as the overall visual detection technology, and solve the existing visual detection problems faced by inspectors. The images obtained by visual inspection and UAV high-resolution video are also used to develop a suitable visual evaluation process and test table for external walls. Through the test results of several cases, the deterioration status and maintenance needs are assessed according to the degree of the performance indicators. The study finds that the proposed mechanism detects the abnormal status of external walls or ancillary structures more efficiently and at lower cost, and is easy to use in practice.
Comprehensive profiling of rhizome-associated alternative splicing and alternative polyadenylation in moso bamboo (Phyllostachys edulis).
Moso bamboo (Phyllostachys edulis) is one of the fastest-spreading plants in the world, due in part to its well-developed rhizome system. However, the post-transcriptional mechanism for the development of the rhizome system in bamboo has not been comprehensively studied. We therefore used a combination of single-molecule long-read sequencing technology and polyadenylation site sequencing (PAS-seq) to re-annotate the bamboo genome, and identify genome-wide alternative splicing (AS) and alternative polyadenylation (APA) in the rhizome system. In total, 145 522 mapped full-length non-chimeric (FLNC) reads were analyzed, resulting in the correction of 2241 mis-annotated genes and the identification of 8091 previously unannotated loci. Notably, more than 42 280 distinct splicing isoforms were derived from 128 667 intron-containing FLNC reads, including a large number of AS events associated with rhizome systems. In addition, we characterized 25 069 polyadenylation sites from 11 450 genes, 6311 of which have APA sites. Further analysis of intronic polyadenylation revealed that LTR/Gypsy and LTR/Copia were two major transposable elements within the intronic polyadenylation region. Furthermore, this study provided a quantitative atlas of poly(A) usage. Several hundred differential poly(A) sites in the rhizome-root system were identified. Taken together, these results suggest that post-transcriptional regulation may have a vital role in the underground rhizome-root system.