146 research outputs found

    Efficient Variations of the Quality Threshold Clustering Algorithm

    Clustering gene expression data such that the diameters of the clusters formed are no greater than a specified threshold prompted the development of the Quality Threshold Clustering (QTC) algorithm. It iteratively forms clusters of non-increasing size until all points are clustered; the largest cluster is always selected first. The QTC algorithm applies in many other domains that require a similar quality guarantee based on cluster diameter. The worst-case complexity of the original QTC algorithm is O(n^5). Since practical applications often involve large datasets, researchers called for more efficient versions of the QTC algorithm. This dissertation aimed to develop and evaluate efficient variations of the QTC algorithm that guarantee a maximum cluster diameter while producing partitions that are similar to those produced by the original QTC algorithm. The QTC algorithm is expensive because it considers forming clusters around every item in the dataset. This dissertation addressed this issue by developing methods for selecting a small subset of promising items around which to form clusters. A second factor that adversely affects the efficiency of the QTC algorithm is the computational cost of updating cluster diameters as new items are added to clusters. This dissertation proposed and evaluated alternate methods that meet the cluster diameter constraint without repeatedly updating the cluster diameters. The variations of the QTC algorithm developed in this dissertation were evaluated on benchmark datasets using two measures: execution time and quality of solutions produced. Execution times were compared to the time taken to execute the most efficient published implementation of the QTC algorithm. Since the partitions produced by the proposed variations are not guaranteed to be identical to those produced by the original algorithm, the Jaccard measure of partition similarity was used to measure the quality of the solutions.
The findings of this research were threefold. First, the Stochastic QTC alone was not computationally helpful: to produce partitions that were acceptably similar to those found by the deterministic QTCs, the algorithm had to be seeded with a large number of centers (ntry ≈ n). Second, the preprocessed-data methods are desirable since they reduce the complexity of the search for candidate cluster points. Third, radius-based methods are promising since they produce partitions that are acceptably similar to those found by the deterministic QTCs in significantly less time.
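
The procedure summarized in this abstract can be illustrated with a minimal Python sketch of a basic quality-threshold clustering loop, together with a pair-counting Jaccard similarity between two partitions. This is an illustration under stated assumptions, not the dissertation's implementation: the function names `qt_cluster` and `jaccard_partition_similarity`, the use of Euclidean distance, and the tie-breaking order are all choices made for this example.

```python
import math
from itertools import combinations


def qt_cluster(points, max_diameter, dist=math.dist):
    """Greedy quality-threshold clustering sketch; returns lists of point indices."""
    remaining = list(range(len(points)))
    clusters = []
    while remaining:
        best = []
        # Grow a candidate cluster around every unclustered point (the costly step).
        for seed in remaining:
            cluster, diameter = [seed], 0.0
            candidates = [i for i in remaining if i != seed]
            while candidates:
                # Diameter the cluster would have after adding each candidate.
                grown = {
                    i: max(diameter, max(dist(points[i], points[m]) for m in cluster))
                    for i in candidates
                }
                nxt = min(candidates, key=grown.get)
                if grown[nxt] > max_diameter:
                    break  # no remaining candidate fits within the threshold
                cluster.append(nxt)
                diameter = grown[nxt]
                candidates.remove(nxt)
            if len(cluster) > len(best):
                best = cluster
        # Keep the largest candidate cluster and remove its points from the pool.
        clusters.append(best)
        chosen = set(best)
        remaining = [i for i in remaining if i not in chosen]
    return clusters


def jaccard_partition_similarity(labels_a, labels_b):
    """Pairs co-clustered in both partitions / pairs co-clustered in at least one."""
    both = either = 0
    for i, j in combinations(range(len(labels_a)), 2):
        same_a = labels_a[i] == labels_a[j]
        same_b = labels_b[i] == labels_b[j]
        both += same_a and same_b
        either += same_a or same_b
    # Assumption: two all-singleton partitions are treated as identical.
    return both / either if either else 1.0


# Hypothetical toy data, chosen only to exercise the functions.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (50, 50)]
clusters = qt_cluster(points, max_diameter=2.0)
```

Because a candidate cluster is rebuilt around every remaining point on each pass, the sketch makes visible why seeding from a small subset of promising items, or avoiding repeated diameter updates, can reduce the running time.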

    Automatic Parameter Adaptation for Multi-object Tracking

    Object tracking quality usually depends on video context (e.g. object occlusion level, object density). To decrease this dependency, this paper presents a learning approach that adapts the tracker parameters to context variations. In an offline phase, satisfactory tracking parameters are learned for video context clusters. In the online control phase, once a context change is detected, the tracking parameters are tuned using the learned values. The experimental results show that the proposed approach outperforms recent state-of-the-art trackers. This paper makes two contributions: (1) a classification method of video sequences to learn offline tracking parameters, and (2) a new method to tune online tracking parameters using tracking context. Comment: International Conference on Computer Vision Systems (ICVS) (2013)

    In situ analysis for intelligent control

    We report a pilot study on in situ analysis of backscatter data for intelligent control of a scientific instrument on an Autonomous Underwater Vehicle (AUV), carried out at the Monterey Bay Aquarium Research Institute (MBARI). The objective of the study is to investigate techniques which use machine intelligence to enable event-response scenarios. Specifically, we analyse a set of techniques for automated sample acquisition in the water column using an electro-mechanical "Gulper", designed at MBARI. This is a syringe-like sampling device, carried onboard an AUV. The techniques we use in this study are clustering algorithms, intended to identify the important distinguishing characteristics of bodies of points within a data sample. We demonstrate that the complementary features of two clustering approaches can offer robust identification of interesting features in the water column, which, in turn, can support automatic event-response control in the use of the Gulper.

    Individual Behavior Modeling with Sensors Using Process Mining

    Understanding human behavior can assist in the adoption of satisfactory health interventions and improved care. One of the main problems lies in the definition of human behaviors, as human activities depend on multiple variables and are dynamic in nature. Although smart homes have advanced in recent years and contributed to unobtrusive human behavior tracking, artificial intelligence has not yet coped with the variability and dynamism of these behaviors. Process mining is an emerging discipline capable of adapting to the nature of high-variate data and extracting knowledge to define behavior patterns. In this study, we analyze data from 25 in-house residents acquired with indoor location sensors by means of process mining clustering techniques, which yield workflows of human behavior inside the house. Data are clustered by adjusting two variables: the similarity index and the Euclidean distance between workflows. Thereafter, two main models are created: (1) a workflow view to analyze the characteristics of the discovered clusters and the information they reveal about human behavior and (2) a calendar view, in which common behaviors are rendered in the form of a calendar, making it possible to detect relevant patterns depending on the day of the week and the season of the year. Three representative patients, who exhibited stable, unstable, and complex behaviors respectively according to the proposed approach, are investigated.
This approach provides details of human behavior in the form of a workflow model, revealing user paths, frequent transitions between rooms, and the time the user spent in each room. In addition, showing the results in the calendar view increases the readability and visual appeal of human behaviors, allowing us to detect patterns occurring on special days. This research was funded by ITACA SABIEN and partially supported by CONICYT REDI 170136. Dogan, O.; Martinez-Millana, A.; Rojas, E.; Sepulveda, M.; Munoz Gama, J.; Traver Salcedo, V.; Fernández Llatas, C. (2019). Individual Behavior Modeling with Sensors Using Process Mining. Electronics. 8(7):1-17. https://doi.org/10.3390/electronics8070766

    Reducing the Time Requirement of k-Means Algorithm

    Traditional k-means and most k-means variants are still computationally expensive for large datasets such as microarray data, which are large and have a large dimension size d. In k-means clustering, we are given a set of n data points in d-dimensional space R^d and an integer k. The problem is to determine a set of k points in R^d, called centers, so as to minimize the mean squared distance from each data point to its nearest center. In this work, we develop a novel k-means algorithm, which is simple but more efficient than the traditional k-means and the recent enhanced k-means. Our new algorithm is based on the recently established relationship between principal component analysis and k-means clustering. We provide a correctness proof for this algorithm. Results obtained from testing the algorithm on three biological datasets and six non-biological datasets (three of which are real, while the other three are simulated) also indicate that our algorithm is empirically faster than other known k-means algorithms. We assessed the quality of our algorithm's clusters against the clusters of a known structure using the Hubert-Arabie Adjusted Rand index (ARI_HA). We found that when k is close to d, the quality is good (ARI_HA > 0.8) and when k is not close to d, the quality of our new k-means algorithm is excellent (ARI_HA > 0.9). In this paper, the emphasis is on reducing the time requirement of the k-means algorithm and on its application to microarray data, motivated by the desire to create a tool for clustering and malaria research. However, the new clustering algorithm can be used for other clustering needs as long as an appropriate measure of distance between the centroids and the members is used. This has been demonstrated in this work on six non-biological datasets.
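
The k-means objective stated in this abstract (the mean squared distance from each data point to its nearest center) can be written down directly. The short Python sketch below only evaluates that objective for a given set of centers; it does not implement the PCA-based algorithm the paper proposes. The function name `mean_squared_distance` and the toy data are hypothetical, introduced here for illustration.

```python
def mean_squared_distance(points, centers):
    """Mean over all points of the squared Euclidean distance to the nearest center."""
    total = 0.0
    for p in points:
        # Squared distance from p to each center; keep the smallest.
        total += min(sum((a - b) ** 2 for a, b in zip(p, c)) for c in centers)
    return total / len(points)


# Hypothetical toy data: four points on a line, compared under k = 2 and k = 1.
points = [(0.0, 0.0), (2.0, 0.0), (10.0, 0.0), (12.0, 0.0)]
cost_two_centers = mean_squared_distance(points, [(1.0, 0.0), (11.0, 0.0)])
cost_one_center = mean_squared_distance(points, [(6.0, 0.0)])
```

Any k-means variant, including faster ones, can be compared on this quantity, since it is the value the choice of centers is meant to minimize.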

    Microarray-based gene expression profiles of silkworm brains

    Background: Molecular genetic studies of Bombyx mori have led to profound advances in our understanding of the regulation of development. The Bombyx mori brain, as a main endocrine organ, plays important regulatory roles in various biological processes. Microarray technology allows the genome-wide analysis of gene expression patterns in silkworm brains. Results: We reported microarray-based gene expression profiles in silkworm brains at four stages, namely V7, P1, P3 and P5. A total of 4,550 genes were transcribed in at least one selected stage. First, clustering algorithms separated the expressed genes into stably expressed genes and variably expressed genes. The results of the gene ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) analysis of stably expressed genes showed that the ribosomal and oxidative phosphorylation pathways were the principal pathways. Secondly, four clusters of genes with significantly different expression patterns were observed in the 1,175 variably expressed genes. Thirdly, thirty-two neuropeptide genes, six neuropeptide-like precursor genes, and 117 cuticular protein genes were expressed in selected developmental stages. Conclusion: Major characteristics of the transcriptional profiles in the brains of Bombyx mori at specific development stages were presented in this study. Our data provided useful information for future research.

    Common Arc Method for Diffraction Pattern Orientation

    Very short pulses of x-ray free-electron lasers opened the way to obtaining diffraction signal from single particles beyond the radiation dose limit. For 3D structure reconstruction, many patterns are recorded with the object in an unknown orientation. We describe a method for orientation of continuous diffraction patterns of non-periodic objects, utilizing intensity correlations in the curved intersections of the corresponding Ewald spheres, hence named Common Arc orientation. The present implementation of the algorithm optionally takes into account the Friedel law, handles missing data, and is capable of determining the point group of symmetric objects. Its performance is demonstrated on simulated diffraction datasets, and verification of the results indicates high orientation accuracy even at low signal levels. The Common Arc method fills a gap in the wide palette of orientation methods. Comment: 16 pages, 10 figures

    New insights into Dehalococcoides mccartyi metabolism from a reconstructed metabolic network-based systems-level analysis of D. mccartyi transcriptomes

    Organohalide respiration, mediated by Dehalococcoides mccartyi, is a useful bioremediation process that transforms ground water pollutants and known human carcinogens such as trichloroethene and vinyl chloride into benign ethenes. Successful application of this process depends on a fundamental understanding of the respiration and metabolism of D. mccartyi. Reductive dehalogenases, encoded by rdhA genes of these anaerobic bacteria, exclusively catalyze organohalide respiration and drive metabolism. To better elucidate D. mccartyi metabolism and physiology, we analyzed available transcriptomic data for a pure isolate (Dehalococcoides mccartyi strain 195) and a mixed microbial consortium (KB-1) using the previously developed pan-genome-scale reconstructed metabolic network of D. mccartyi. The transcriptomic data, together with available proteomic data, helped confirm transcription and expression of the majority of genes in D. mccartyi genomes. A composite genome of two highly similar D. mccartyi strains (KB-1 Dhc) from the KB-1 metagenome sequence was constructed, and operon prediction was conducted for this composite genome and other single genomes. This operon analysis, together with the quality threshold clustering analysis of transcriptomic data, helped generate experimentally testable hypotheses regarding the function of a number of hypothetical proteins and the poorly understood mechanism of energy conservation in D. mccartyi. We also identified functionally enriched important clusters (13 for strain 195 and 11 for KB-1 Dhc) of co-expressed metabolic genes using information from the reconstructed metabolic network. This analysis highlighted some metabolic genes and processes, including lipid metabolism, energy metabolism, and transport, that potentially play important roles in organohalide respiration.
Overall, this study shows the importance of an organism's metabolic reconstruction in analyzing various "omics" data to obtain improved understanding of the metabolism and physiology of the organism.

    Expression profile of cuticular genes of silkworm, Bombyx mori

    Background: Insect cuticle plays essential roles in many physiological functions. During molting and metamorphosis tremendous changes occur in silkworm cuticle where multiple proteins exist and genes encoding them constitute about 1.5% of all Bombyx mori genes. Results: In an effort to determine their expression profiles, a microarray-based investigation was carried out using mRNA collected from larvae to pupae. The results showed that a total of 6676 genes involved in various functions and physiological pathways were activated. The vast majority (93%) of cuticular protein genes were expressed in selected stages with varying expression patterns. There was no correlation between expression patterns and the presence of conserved motifs. Twenty-six RR genes distributed in chromosome 22 were co-expressed at the larval and wandering stages. The 2 kb upstream regions of these genes were further analyzed and three putative elements were identified. Conclusions: Data from the present study provide, for the first time, a comprehensive expression profile of genes in silkworm epidermal tissues and evidence that putative elements exist to allow massive production of mRNAs from specific cuticular protein genes.

    Assisted labeling for spam account detection on twitter

    Online Social Networks (OSNs) have become increasingly popular both because of their ease of use and their availability through almost any smart device. Unfortunately, these characteristics also make OSNs a target for users interested in performing malicious activities, such as spreading malware and performing phishing attacks. In this paper we address the problem of spam detection on Twitter, providing a novel method to support the creation of large-scale annotated datasets. More specifically, URL inspection and tweet clustering are performed in order to detect some common behaviors of spammers and legitimate users. Finally, the manual annotation effort is further reduced by grouping similar users according to some characteristics. Experimental results show the effectiveness of the proposed approach.