50 research outputs found

    An Improved Overlapping Clustering Algorithm to Detect Outlier

    The MCOKE algorithm, which assigns data objects to multiple clusters, is known for its simplicity and effectiveness. Its drawback is that it uses maxdist as a global threshold for assigning objects to one or more clusters, and maxdist is sensitive to outliers. Outliers in a dataset can therefore significantly reduce the effectiveness of maxdist for overlapping clustering. In this paper, outlier detection is incorporated into the MCOKE algorithm so that it can detect and remove outliers that would otherwise participate in the calculation used to assign objects to one or more clusters. The improved MCOKE algorithm provides better identification of overlapping clusters. Performance was evaluated using the F1 score. The evaluation showed that the outlier detection achieved a higher accuracy rate in identifying abnormal data (outliers) when applied to real datasets.
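
The role of maxdist described in this abstract can be illustrated with a minimal Python sketch. It assumes, as MCOKE is commonly described, that maxdist is the largest distance between any object and its nearest centroid, and that an object joins every cluster whose centroid lies within maxdist. The abstract does not say which outlier detection method the authors use, so the median-based cutoff (`outlier_factor`), the fixed centroids, and all function and variable names below are hypothetical.

```python
import math
import statistics


def overlapping_assign(points, centroids, outlier_factor=None):
    """Return (maxdist, memberships) for fixed centroids.

    memberships maps a point index to the list of cluster indices
    whose centroid lies within maxdist of that point.
    """
    # Distance from every point to every centroid.
    all_dists = [[math.dist(p, c) for c in centroids] for p in points]
    # Distance from every point to its nearest centroid.
    own = [min(row) for row in all_dists]

    # Hypothetical outlier rule (NOT from the paper): drop points whose
    # nearest-centroid distance exceeds outlier_factor * median distance.
    if outlier_factor is None:
        keep = [True] * len(points)
    else:
        cutoff = outlier_factor * statistics.median(own)
        keep = [d <= cutoff for d in own]

    # maxdist is computed from the retained points only.
    maxdist = max(d for d, k in zip(own, keep) if k)

    memberships = {}
    for i, row in enumerate(all_dists):
        if keep[i]:
            memberships[i] = [j for j, d in enumerate(row) if d <= maxdist]
    return maxdist, memberships


# Toy data: two tight groups, one point halfway between, one far outlier.
pts = [(0.0, 0.0), (1.0, 0.0), (4.0, 0.0), (5.0, 0.0), (2.5, 0.0), (30.0, 0.0)]
cents = [(0.5, 0.0), (4.5, 0.0)]

maxdist_raw, members_raw = overlapping_assign(pts, cents)
maxdist_clean, members_clean = overlapping_assign(pts, cents, outlier_factor=5.0)
```

On this toy data the single far point inflates maxdist so much that nearly every object falls into both clusters; once it is removed, only the point midway between the groups overlaps.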

    Computerized Analysis of Magnetic Resonance Images to Study Cerebral Anatomy in Developing Neonates

    The study of cerebral anatomy in developing neonates is of great importance for the understanding of brain development during the early period of life. This dissertation therefore focuses on three challenges in the modelling of cerebral anatomy in neonates during brain development. The methods that have been developed all use Magnetic Resonance Images (MRI) as source data. To facilitate study of vascular development in the neonatal period, a set of image analysis algorithms is developed to automatically extract and model cerebral vessel trees. The whole process consists of cerebral vessel tracking from automatically placed seed points, vessel tree generation, and vasculature registration and matching. These algorithms have been tested on clinical Time-of-Flight (TOF) MR angiographic datasets. To facilitate study of the neonatal cortex, a complete cerebral cortex segmentation and reconstruction pipeline has been developed. Segmentation of the neonatal cortex is not effectively done by existing algorithms designed for the adult brain because the contrast between grey and white matter is reversed. This causes pixels containing tissue mixtures to be incorrectly labelled by conventional methods. The neonatal cortical segmentation method that has been developed is based on a novel expectation-maximization (EM) method with explicit correction for mislabelled partial volume voxels. Based on the resulting cortical segmentation, an implicit surface evolution technique is adopted for the reconstruction of the cortex in neonates. The performance of the method is investigated by performing a detailed landmark study. To facilitate study of cortical development, a cortical surface registration algorithm for aligning the cortical surface is developed. The method first inflates extracted cortical surfaces and then performs a non-rigid surface registration using free-form deformations (FFDs) to remove residual misalignment.
Validation experiments using data labelled by an expert observer demonstrate that the method can capture local changes and follow the growth of specific sulci.

    Probabilistic analysis of euclidean multi depot vehicle routing and related problems

    We consider a generalization of the classical traveling salesman problem: the multi depot vehicle routing problem (MDVRP). Let $D$ be a set of $k$ depots and $P$ be a set of $n$ customers in $[0,1]^d$ with the usual Euclidean metric. A multi depot vehicle routing tour is a set of disjoint cycles such that all customers are covered and each cycle contains exactly one depot. The goal is to find a tour of minimum length. $L(D,P)$ denotes the length of an optimal MDVRP tour for depot set $D$ and customer set $P$. It is evident that the asymptotic behavior of $L(D,P)$ for $n$ tending to infinity depends on the customer-depot ratio $n/k$. We study three cases: $k=o(n)$, $k=\lambda n+o(n)$ for a constant $\lambda>0$, and $k=\Omega(n^{1+\varepsilon})$ for $\varepsilon>0$. In the first two cases we show that $L(D,P)$ divided by $n^{(d-1)/d}$ converges completely to a constant if the customers and depots are given by iid random variables. In the last case we prove that the expected tour length divided by $n^{(d-1)/d}$ and multiplied by $k^{1/d}$ converges to a constant if the customers and depots are given by iid random variables with uniform distribution.

    Design and analysis of algorithms for similarity search based on intrinsic dimension

    One of the most fundamental operations employed in data mining tasks such as classification, cluster analysis, and anomaly detection is that of similarity search. It has been used in numerous fields of application such as multimedia, information retrieval, recommender systems and pattern recognition. Specifically, a similarity query aims to retrieve from the database the objects most similar to a query object, where the underlying similarity measure is usually expressed as a distance function. The cost of processing similarity queries has typically been assessed in terms of the representational dimension of the data involved, that is, the number of features used to represent individual data objects. It is generally the case that high representational dimension results in a significant increase in the processing cost of similarity queries. This relation is often attributed to an amalgamation of phenomena, collectively referred to as the curse of dimensionality. However, the observed effects of dimensionality in practice may not be as severe as expected. This has led to the development of models quantifying the complexity of data in terms of some measure of the intrinsic dimensionality. The generalized expansion dimension (GED) is one such model, which estimates the intrinsic dimension in the vicinity of a query point q through the observation of the ranks and distances of pairs of neighbors with respect to q. This dissertation is mainly concerned with the design and analysis of search algorithms based on the GED model. In particular, three variants of the similarity search problem are considered: adaptive similarity search, flexible aggregate similarity search, and subspace similarity search. The good practical performance of the proposed algorithms demonstrates the effectiveness of dimensionality-driven design of search algorithms.
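
The idea of estimating intrinsic dimension from the ranks and distances of two neighbors of q can be sketched as follows. The sketch assumes the commonly stated two-ball form of GED, log(k2/k1) / log(r2/r1), where r1 and r2 are the distances from q to its k1-th and k2-th nearest neighbors; the abstract itself does not give the formula, and the function name and the brute-force neighbor search are hypothetical simplifications.

```python
import math


def ged_estimate(points, q, k1, k2):
    """Estimate intrinsic dimension near q from two neighbor ranks.

    Uses log(k2 / k1) / log(r2 / r1), where r1 and r2 are the
    distances from q to its k1-th and k2-th nearest neighbors.
    """
    if not 0 < k1 < k2 <= len(points):
        raise ValueError("need 0 < k1 < k2 <= number of points")
    # Brute-force neighbor distances, sorted ascending.
    dists = sorted(math.dist(q, p) for p in points)
    r1, r2 = dists[k1 - 1], dists[k2 - 1]
    if r1 <= 0 or r2 <= r1:
        raise ValueError("need 0 < r1 < r2 for a finite estimate")
    return math.log(k2 / k1) / math.log(r2 / r1)


# Points spaced evenly along a line embedded in 2-D: the representational
# dimension is 2, but the data around q is one-dimensional.
line = [(float(i), 0.0) for i in range(1, 101)]
dim_line = ged_estimate(line, (0.0, 0.0), 10, 40)  # 1.0
```

The example shows the distinction the abstract draws: the estimate reflects how fast the neighbor count grows with radius, not the number of stored features.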

    Courbure discrète : théorie et applications

    The present volume contains the proceedings of the 2013 Meeting on discrete curvature, held at CIRM, Luminy, France. The aim of this meeting was to bring together researchers from various backgrounds, ranging from mathematics to computer science, with a focus on both theory and applications. With 27 invited talks and 8 posters, the conference attracted 70 researchers from all over the world. The challenge of finding a common ground on the topic of discrete curvature was met with success, and these proceedings are a testimony of this work

    Integrating Remote Sensing Techniques into Forest Monitoring: Selected Topics with a Focus on Thermal Remote Sensing

    Sustainable management of natural resources, in particular of forests, is of great importance to preserve the ecological, environmental and economic benefits of forests for future generations. An enhanced understanding of the current situation and ongoing trends of forests, e.g. through policy interventions, is crucial to managing the forest wisely. In this context, forest monitoring is essential for collecting the base data required and for observing trends. Despite the wide range of approved methods and techniques for both close-range and satellite-based remote sensing monitoring, ongoing forest monitoring research is still grappling with specific and unresolved questions: the data acquired must be more reliable, in particular over a long-term period, and costs need to be reduced through advancements in both methods and technology that offer easier and more feasible ways of interpreting data. This thesis comprises a number of focused studies, each with its own specific research questions, and aims to explore the benefits of innovative methods and technologies. The main emphasis of the studies presented is the integration of close-range and satellite-based remote sensing for enhancing the efficiency of forest monitoring. Manuscript I discusses thermal canopy photography, a new field of application. This approach takes advantage of the large differences in temperature between sky and non-sky pixels and overcomes the inconsistencies of finding an optimal threshold. For an unambiguous separation of “sky” and “non-sky” pixels, a global threshold of 0 °C was defined. Currently, optical or hemispherical canopy photography is the most widely used method to extract crown-related variables.
However, a number of aspects, such as exposure, illumination conditions, and threshold definition, present a challenge in optical canopy photography and dramatically influence the result; consequently, comparing results derived from optical canopy photography at different points in time is not advisable. For forest monitoring, where repeated measurements of the canopy cover on the same plots are undertaken, it is therefore of utmost importance to devise a standard protocol to estimate changes in and compare the canopy covers. This paper offers such a protocol by introducing thermal canopy photography, a feasible and accurate method, and examines the strong correlation (R² = 0.96) of canopy closure values derived from thermal and optical image pairs. Thermal photography, as a close-range remote sensing technique, also aids data collection and analysis in other contexts, for instance to expand our knowledge about bamboo tree species: information about the maturity of bamboo culms is of utmost importance for managing bamboo stands because only then is the process of lignification finished and the culm technically stronger and more resistant to insect and fungal attacks. The findings of a study (Manuscript III) conducted in Pereira, Colombia, show small differences in culm surface temperature between culms of different ages for the bamboo species Guadua angustifolia K., which may be a sign of maturity. The surface temperature of 12 culms was measured after sunrise using the thermal camera system FLIR 60Ebx. This study shows an innovative close-range remote sensing technique which may support researchers’ determination of the maturity of bamboo culms. This research is in its inception phase and our results are the first of this kind. For the analysis of thermal imagery time-series data in particular, Manuscript (IV) offers a new methodology using advanced statistical methods.
Otsu thresholding, an automatic segmentation technique, is used in a first processing step. O’Sullivan penalized splines are used to estimate the temperature profile extracted from the canopy leaf temperature. A final comparison of the different profiles is done by constructing simultaneous confidence bands. The result shows an approximately significant difference in canopy leaf temperature. For this study, we successfully cooperated with the Center for Statistics at Göttingen University (Prof. Kneib). The second close-range remote sensing technology employed in this thesis is terrestrial laser scanning, which is used here to enhance our understanding of buttressed trees. Big trees with an irregular non-convex shape are important contributors to aboveground biomass in tropical forests, but an accurate estimation of their biomass is still a challenge and often remains biased. Allometric equations including tree diameter and height as predictors are currently used in tropical forests, but they are often not calibrated for such large and irregular trees where measuring the diameter is quite difficult. Against this background, Manuscript II shows the result of the 3D analysis of 12 buttressed trees. This study was conducted in the Botanical Garden of Bogor, Indonesia, using a state-of-the-art terrestrial laser scanner. The findings allow for new insights into the irregular geometry of buttressed trees, and the methodological approach employed in this paper will help to improve volume and biomass models for this kind of tree. The results suggest a strong relationship (R² = 0.87) between cross-sectional areas at diameter above buttress (DAB) height and the actual tree basal area measured at 1.3 m height. The accuracy of field biomass estimates is crucial if the data are used to calibrate models to predict the forest biomass at the landscape level using remote sensing imagery.
The linkage between technology and methodology in the context of forest monitoring remote sensing enhances our knowledge in extracting more reliable information on tree cover estimation. The pre-processing of satellite images plays a crucial role in the processing workflow, and the illumination correction in particular has a direct effect on the estimated tree cover. Manuscript IV evaluates four DEMs (Pleiades DSM, SRTM30, SRTM V4.1 and SRTM-X) that are available for the area of Shitai County (Anhui Province, Southeast China) for the purpose of an optimized illumination correction and tree cover estimation from optical RapidEye satellite images. The findings presented in this study suggest that the change in tree cover is contingent on the respective digital elevation models used for pre-processing the data. Imagery corrected with the freely available SRTM30 DEM with 30 m resolution leads to a higher accuracy in the estimation of tree cover based on the high-resolution and cost-intensive Pleiades DEM. These manuscripts ultimately seek to resolve some of the issues and provide answers to some of the detailed questions that still persist at different steps of the forest monitoring process. In future, these new and innovative methods and technologies may be integrated into forest monitoring programs.
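
The two thresholding steps mentioned in this abstract, a fixed global threshold of 0 °C for sky versus non-sky pixels and Otsu thresholding as an automatic alternative, can be sketched in plain Python. The sketch assumes that sky pixels are the ones colder than the threshold and that canopy closure is the fraction of non-sky pixels; the abstract does not spell out either point. The Otsu variant below works directly on the sorted sample values instead of a binned histogram, and the toy temperature grid and all names are hypothetical.

```python
def canopy_closure(image, threshold_c=0.0):
    """Fraction of pixels at or above threshold_c (assumed non-sky)."""
    pixels = [t for row in image for t in row]
    non_sky = sum(1 for t in pixels if t >= threshold_c)
    return non_sky / len(pixels)


def otsu_threshold(values):
    """Threshold maximizing between-class variance (Otsu's criterion).

    Candidate thresholds are midpoints between consecutive distinct
    sorted values, so no histogram binning is needed.
    """
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    best_t, best_var = xs[0], -1.0
    cum = 0.0
    for i in range(1, n):
        cum += xs[i - 1]
        if xs[i] == xs[i - 1]:
            continue  # no valid split between equal values
        w0 = i / n                      # weight of the lower class
        w1 = 1.0 - w0                   # weight of the upper class
        mu0 = cum / i                   # mean of the lower class
        mu1 = (total - cum) / (n - i)   # mean of the upper class
        var_between = w0 * w1 * (mu0 - mu1) ** 2
        if var_between > best_var:
            best_var = var_between
            best_t = (xs[i - 1] + xs[i]) / 2.0
    return best_t


# Hypothetical 2 x 4 thermal image in degrees Celsius.
thermal = [
    [-40.0, 12.0, 14.0, -38.0],
    [15.0, 16.0, -35.0, 13.0],
]

closure_fixed = canopy_closure(thermal)  # 5 of 8 pixels are non-sky
auto_t = otsu_threshold([t for row in thermal for t in row])
closure_auto = canopy_closure(thermal, auto_t)
```

When sky and canopy temperatures are as well separated as in this toy grid, the fixed and automatic thresholds classify every pixel identically, which is consistent with the abstract's argument that a global threshold avoids the threshold-selection problem of optical photography.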