
    Medicinal Plant Image Segmentation Using the K-Means and Otsu Methods

    Segmentation is the most important step in the object identification process, and machine-learning-based segmentation of regions of interest in true-color images is the most difficult task in computer vision. Segmentation separates foreground from background, reducing a 3-layer RGB image to a single layer to obtain a complete image without noise, and this greatly affects the accuracy of image identification. In addition, we use several image processing operators, such as filtering, hole filling, and area opening, to remove image areas that are not needed. In this study, we tested images of 5 types of medicinal flowers using k-means segmentation with k=2 and k=3, as well as the Otsu method. Each segmentation method is applied to every image to obtain the corresponding pattern. The goal is to extract the important areas that the image identification algorithm can process. This research uses 250 images and produces 750 patterns for the identification process. The result obtained is 96% for identifying the flower type Taraxacum laeticolor Dahlst with the k-means segmentation method at k=2.
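The abstract does not give implementation details, so the following is only a minimal sketch of the textbook Otsu threshold on a flat list of grayscale values. The function name `otsu_threshold` and the sample pixel values are assumptions for illustration, not the authors' code.

```python
def otsu_threshold(pixels, levels=256):
    """Return the threshold t that maximizes between-class variance.

    Pixels with value <= t form the low class, pixels > t the high class.
    """
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0      # pixel count of the low class
    sum0 = 0    # intensity sum of the low class
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0:
            continue        # low class still empty
        if w1 == 0:
            break           # high class empty from here on
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        var_between = w0 * w1 * (mu0 - mu1) ** 2
        if var_between > best_var:
            best_t, best_var = t, var_between
    return best_t


# Hypothetical data: four dark background pixels, four bright foreground pixels.
pixels = [10, 12, 11, 13, 200, 205, 210, 198]
t = otsu_threshold(pixels)
mask = [p > t for p in pixels]  # True marks foreground
```

On a real image the pixel list would come from the single-layer (grayscale) version of the RGB input; the conversion and the post-processing operators are left out here.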

    Clustering Algorithms for Spatial Big Data

    In our time, people and devices constantly generate data. User activity generates data about needs and preferences, as well as the quality of users' experiences, in different ways: e.g. streaming a video, looking at the news, searching for a restaurant or a hotel, playing a game with others, making purchases, driving a car. Even when people put their devices in their pockets, the network is generating location and other data that keeps services running and ready to use. These rapid developments in the availability of and access to data, in particular spatially referenced data in different areas, have induced the need for better analysis techniques to understand the various phenomena. Spatial clustering algorithms, which group similar spatial objects into classes, can be used to identify areas sharing common characteristics. The aim of this paper is to analyze the performance of three different clustering algorithms, i.e. the Density-Based Spatial Clustering of Applications with Noise (DBSCAN) algorithm, the Fast Search by Density Peak (FSDP) algorithm and the classic K-means algorithm (K-Means), in the analysis of spatial big data. We propose a modification of the FSDP algorithm in order to improve its efficiency in large databases. The applications concern both synthetic data sets and satellite images.
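To make the density-based idea concrete, here is a minimal sketch of the standard DBSCAN procedure on 2D points, using a brute-force neighbor search. It is a generic textbook version, not the paper's modified FSDP algorithm, and the sample points, `eps`, and `min_pts` values are assumptions chosen for illustration.

```python
from math import dist


def dbscan(points, eps, min_pts):
    """Label each point with a cluster id (0, 1, ...) or -1 for noise."""
    UNVISITED, NOISE = None, -1
    labels = [UNVISITED] * len(points)

    def neighbors(i):
        # Brute-force O(n) range query; includes the point itself.
        return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE           # may later become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster     # border point reached from a core point
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster
            nbrs = neighbors(j)
            if len(nbrs) >= min_pts:    # j is a core point: keep expanding
                queue.extend(nbrs)
    return labels


# Hypothetical data: two tight groups and one isolated point.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (50, 50)]
labels = dbscan(points, eps=1.5, min_pts=3)
```

The brute-force neighbor query makes this quadratic in the number of points; for spatial big data a spatial index would replace it.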

    Stepwise Model Reconstruction of Robotic Manipulator Based on Data-Driven Method

    Research on the dynamics of robotic manipulators provides promising support for model-based control. In general, rigorous first-principles-based dynamics modeling and accurate identification of mechanism parameters are critical to achieving high precision in model-based control, while data-driven model reconstruction provides an alternative approach to this process. Taking the activation level of the data as an indicator, this paper classifies the collected robotic manipulator data using the K-means clustering algorithm. With fundamental prior knowledge, we find the dynamical properties behind each class of data separately. Afterwards, the sparse identification of nonlinear dynamics (SINDy) method is used to reconstruct the dynamics model of the robotic manipulator step by step according to the activation level of the classified data. The simulation results show that the proposed method not only reduces the complexity of the basis function library, enabling the application of the SINDy method to multi-degree-of-freedom robotic manipulators, but also decreases the influence of data noise on the regression results. Finally, dynamic control based on the reconstructed model is deployed on the experimental platform, and the experimental results demonstrate the effectiveness of the proposed method. Comment: 8 pages, 11 figures

    A Clustering Algorithm Based on an Ensemble of Dissimilarities: An Application in the Bioinformatics Domain

    Clustering algorithms such as k-means depend heavily on choosing an appropriate distance metric that accurately reflects the object proximities. A wide range of dissimilarities may be defined, and they often lead to different clustering results. Choosing the best dissimilarity is an ill-posed problem, and learning a general distance from the data is a complex task, particularly for high-dimensional problems. Therefore, an appealing approach is to learn an ensemble of dissimilarities. In this paper, we have developed a semi-supervised clustering algorithm that learns a linear combination of dissimilarities, considering incomplete knowledge in the form of pairwise constraints. The minimization of the loss function is based on a robust and efficient quadratic optimization algorithm. In addition, a regularization term controls the complexity of the learned distance metric, avoiding overfitting. The algorithm has been applied to the identification of tumor samples using gene expression profiles, where domain experts often provide incomplete knowledge in the form of pairwise constraints. We report that the proposed algorithm outperforms a standard semi-supervised clustering technique available in the literature, as well as clustering results based on a single dissimilarity. The improvement is particularly relevant for applications with a high level of noise.
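The core object here, a linear combination of dissimilarities, can be written as D = w_1 D_1 + ... + w_K D_K. The sketch below only shows how such a combination is formed once the weights are known; in the paper the weights are learned from pairwise constraints, which is not reproduced here. The function name, the two toy matrices, and the weight values are hypothetical.

```python
def combine_dissimilarities(matrices, weights):
    """Return the weighted sum of square dissimilarity matrices."""
    if len(matrices) != len(weights):
        raise ValueError("need exactly one weight per dissimilarity matrix")
    n = len(matrices[0])
    return [
        [sum(w * m[i][j] for w, m in zip(weights, matrices)) for j in range(n)]
        for i in range(n)
    ]


# Hypothetical dissimilarities between two samples under two measures.
euclidean_like = [[0.0, 2.0], [2.0, 0.0]]
correlation_like = [[0.0, 0.5], [0.5, 0.0]]

# Hypothetical fixed weights; the paper learns these from pairwise constraints.
combined = combine_dissimilarities([euclidean_like, correlation_like], [0.25, 0.5])
```

The combined matrix can then be passed to any clustering method that accepts precomputed dissimilarities.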

    K-OPLS package: Kernel-based orthogonal projections to latent structures for prediction and interpretation in feature space

    Background: Kernel-based classification and regression methods have been successfully applied to modelling a wide variety of biological data. The Kernel-based Orthogonal Projections to Latent Structures (K-OPLS) method offers unique properties facilitating separate modelling of predictive variation and structured noise in the feature space. While providing prediction results similar to other kernel-based methods, K-OPLS features enhanced interpretational capabilities, allowing detection of unanticipated systematic variation in the data such as instrumental drift, batch variability or unexpected biological variation. Results: We demonstrate an implementation of the K-OPLS algorithm for MATLAB and R, licensed under the GNU GPL and available at http://www.sourceforge.net/projects/kopls/. The package includes essential functionality and documentation for model evaluation (using cross-validation), training and prediction of future samples. Also incorporated is a set of diagnostic tools and plot functions to simplify the visualisation of data, e.g. for detecting trends or for identification of outlying samples. The utility of the software package is demonstrated by means of a metabolic profiling data set from a biological study of hybrid aspen. Conclusion: The properties of the K-OPLS method are well suited for analysis of biological data, which in conjunction with the availability of the outlined open-source package provides a comprehensive solution for kernel-based analysis in bioinformatics applications.

    Principal component-based image segmentation: a new approach to outline in vitro cell colonies

    The in vitro clonogenic assay is a technique to study the ability of a cell to form a colony in a culture dish. By optical imaging, dishes with stained colonies can be scanned and assessed digitally. Identification, segmentation and counting of stained colonies play a vital part in high-throughput screening and quantitative assessment of biological assays. Image processing of such photographed or scanned assays can be affected by acquisition artifacts such as background noise and spatially varying illumination, and by contaminants in the suspension medium. Although existing approaches tackle these issues, the segmentation quality requires further improvement, particularly on noisy and low-contrast images. In this work, we present an objective and versatile machine learning procedure to address these issues by characterizing, extracting and segmenting the colonies of interest using principal component analysis, k-means clustering and a modified watershed segmentation algorithm. The intention is to automatically identify visible colonies through spatial texture assessment and accordingly discriminate them from the background in preparation for subsequent segmentation. The proposed segmentation algorithm yielded quality similar to manual counting by human observers. High F1 scores (>0.9) and low root-mean-square errors (around 14%) underlined good agreement with ground truth data. Moreover, it outperformed a recent state-of-the-art method. The methodology will be an important tool in future cancer research applications.

    Model structure selection using an integrated forward orthogonal search algorithm assisted by squared correlation and mutual information

    Model structure selection plays a key role in non-linear system identification. The first step in non-linear system identification is to determine which model terms should be included in the model. Once significant model terms have been determined, a model selection criterion can then be applied to select a suitable model subset. The well-known Orthogonal Least Squares (OLS) type algorithms are among the most efficient and commonly used techniques for model structure selection. However, it has been observed that OLS type algorithms may occasionally select incorrect model terms or yield a redundant model subset in the presence of particular noise structures or input signals. A very efficient Integrated Forward Orthogonal Search (IFOS) algorithm, which is assisted by the squared correlation and mutual information, and which incorporates a Generalised Cross-Validation (GCV) criterion and hypothesis tests, is introduced to overcome these limitations in model structure selection.
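As a rough illustration of forward orthogonal search, the sketch below greedily picks the candidate term whose orthogonalized regressor has the largest squared correlation with the output. It covers only that basic step: the mutual-information assistance, the GCV criterion, and the hypothesis tests of IFOS are not included, and all names and data are hypothetical.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def forward_orthogonal_search(candidates, y, n_terms):
    """Greedily select n_terms candidate names by squared correlation with y.

    candidates maps a term name to its regressor column (a list of floats).
    Each candidate is orthogonalized against the already selected terms
    (Gram-Schmidt) before it is scored.
    """
    selected, basis = [], []
    yy = dot(y, y)
    for _ in range(n_terms):
        best = None
        for name, col in candidates.items():
            if name in selected:
                continue
            q = list(col)
            for b in basis:
                coef = dot(q, b) / dot(b, b)
                q = [qi - coef * bi for qi, bi in zip(q, b)]
            qq = dot(q, q)
            if qq < 1e-12:
                continue    # candidate is linearly dependent on selected terms
            score = dot(y, q) ** 2 / (yy * qq)
            if best is None or score > best[0]:
                best = (score, name, q)
        if best is None:
            break
        selected.append(best[1])
        basis.append(best[2])
    return selected


# Hypothetical data: y depends on x1 and x2 only; x3 is a distractor.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [1.0, -1.0, 1.0, -1.0, 1.0]
x3 = [2.0, 0.0, 1.0, 3.0, 1.0]
y = [3.0 * a + 0.5 * b for a, b in zip(x1, x2)]
terms = forward_orthogonal_search({"x1": x1, "x2": x2, "x3": x3}, y, n_terms=2)
```

In this noise-free toy case the search recovers the two true terms; the failure cases named in the abstract arise with particular noise structures or input signals, which this example does not exercise.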

    A regularizing iterative ensemble Kalman method for PDE-constrained inverse problems

    We introduce a derivative-free computational framework for approximating solutions to nonlinear PDE-constrained inverse problems. The aim is to merge ideas from iterative regularization with ensemble Kalman methods from Bayesian inference to develop a stable, derivative-free method that is easy to implement in applications where the PDE (forward) model is only accessible as a black box. The method can be derived as an approximation of the regularizing Levenberg-Marquardt (LM) scheme [14] in which the derivative of the forward operator and its adjoint are replaced with empirical covariances from an ensemble of elements from the admissible space of solutions. The resulting ensemble method consists of an update formula that is applied to each ensemble member and that has a regularization parameter selected in a similar fashion to the one in the LM scheme. Moreover, an early termination of the scheme is proposed according to a discrepancy-principle-type criterion. The proposed method can also be viewed as a regularizing version of standard Kalman approaches, which are often unstable unless ad hoc fixes, such as covariance localization, are implemented. We provide a numerical investigation of the conditions under which the proposed method inherits the regularizing properties of the LM scheme of [14]. More concretely, we study the effect of ensemble size, number of measurements, selection of initial ensemble and tunable parameters on the performance of the method. The numerical investigation is carried out with synthetic experiments on two model inverse problems: (i) identification of conductivity in a Darcy flow model and (ii) electrical impedance tomography with the complete electrode model. We further demonstrate the potential application of the method in solving shape identification problems by means of a level-set approach for the parameterization of unknown geometries.
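The update formula is not spelled out in the abstract, so the following is a sketch under stated assumptions: a scalar unknown, a scalar observation, and a gain of the form C_uw / (C_ww + alpha * gamma), where the empirical covariances stand in for derivatives of the forward map. The parameter `alpha` is passed in by hand; the paper's rule for selecting it, the discrepancy-principle stopping criterion, and any perturbation of the observations are all omitted. Every name and value below is hypothetical.

```python
def ensemble_kalman_step(ensemble, forward, y, gamma, alpha):
    """Apply one regularized ensemble update to a list of scalar parameters.

    ensemble : current parameter values, one per ensemble member
    forward  : black-box forward model, called once per member
    y        : observed data (scalar)
    gamma    : observation noise variance
    alpha    : regularization parameter (chosen by hand in this sketch)
    """
    n = len(ensemble)
    preds = [forward(u) for u in ensemble]
    u_mean = sum(ensemble) / n
    w_mean = sum(preds) / n
    # Empirical cross-covariance and output covariance replace derivatives.
    c_uw = sum((u - u_mean) * (w - w_mean) for u, w in zip(ensemble, preds)) / (n - 1)
    c_ww = sum((w - w_mean) ** 2 for w in preds) / (n - 1)
    gain = c_uw / (c_ww + alpha * gamma)
    return [u + gain * (y - w) for u, w in zip(ensemble, preds)]


# Hypothetical linear forward model G(u) = 2u with true parameter u = 2.
updated = ensemble_kalman_step(
    ensemble=[0.0, 1.0, 2.0],
    forward=lambda u: 2.0 * u,
    y=4.0,
    gamma=0.01,
    alpha=1.0,
)
posterior_mean = sum(updated) / len(updated)
```

Only forward-model evaluations are needed, which is what makes the approach usable when the PDE solver is a black box. Larger values of `alpha` shrink the gain and make each step more conservative.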