220 research outputs found

    Gravity Theory-Based Affinity Propagation Clustering Algorithm and Its Applications

    The original Affinity Propagation (AP) clustering algorithm uses the Euclidean distance between data samples as its only similarity measure. This has serious limitations for high-dimensional, sparse data: with a single similarity measure, both the convergence and the clustering accuracy of the algorithm suffer. On the other hand, the formation of galaxies in the universe can be viewed as a clustering process, in which the interactions between celestial bodies are governed by universal gravitation. This paper introduces ideas from the Density Peak (DP) clustering algorithm and from gravitation into the AP algorithm, constructs a density property for calculating similarity, and proposes the Gravity-based Affinity Propagation clustering algorithm (GAP). The proposed algorithm calculates the similarity of sample points more accurately by using the local density of the corresponding points, and then uses the gravity formula to update the similarity matrix. The clustering process can thus be seen as sample points spontaneously attracting each other through 'gravitation'. Experimental results show that the convergence performance of GAP is clearly improved over that of AP, and that its clustering results are better.
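
    The abstract does not give the exact GAP formulas, so the following is only a minimal sketch of the general idea: each point's local density plays the role of a mass, and pairwise similarity follows a Newton-style attraction. The cutoff-based density count, the form rho_i * rho_j / d^2, and the names `local_density` and `gravity_similarity` are assumptions made for illustration, not the paper's definitions.

```python
import math

def local_density(points, cutoff):
    """Density Peak style count: neighbours closer than `cutoff` (assumed form)."""
    n = len(points)
    rho = []
    for i in range(n):
        count = sum(
            1 for j in range(n)
            if j != i and math.dist(points[i], points[j]) < cutoff
        )
        rho.append(count + 1)  # +1 keeps every "mass" strictly positive
    return rho

def gravity_similarity(points, cutoff, eps=1e-12):
    """Assumed similarity s[i][j] = rho_i * rho_j / d_ij^2; diagonal left at 0."""
    rho = local_density(points, cutoff)
    n = len(points)
    sim = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                d2 = math.dist(points[i], points[j]) ** 2
                sim[i][j] = rho[i] * rho[j] / (d2 + eps)  # eps guards d = 0
    return sim

# Three points close together and one far away.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0)]
sim = gravity_similarity(points, cutoff=0.5)
print(sim[0][1] > sim[0][3])  # dense, nearby points attract more strongly
```

    A matrix like `sim` could then be passed to any AP implementation that accepts a precomputed similarity matrix, since AP only requires that larger values mean "more similar".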

    ISIM: Iterative Self-Improved Model for Weakly Supervised Segmentation

    Weakly Supervised Semantic Segmentation (WSSS) is a challenging task that aims to learn segmentation labels from class-level labels. In the literature, information obtained from Class Activation Maps (CAMs) is widely exploited for WSSS. However, because CAMs are obtained from a classification network, they focus on the most discriminative parts of objects and therefore provide incomplete prior information for segmentation. In this study, to obtain CAMs that are more coherent with segmentation labels, we propose a framework that employs an iterative approach in a modified encoder-decoder-based segmentation model, which simultaneously supports classification and segmentation tasks. As no ground-truth segmentation labels are given, the same model also generates the pseudo-segmentation labels with the help of dense Conditional Random Fields (dCRF). As a result, the proposed framework becomes an iterative self-improved model. Experiments with DeepLabv3 and UNet models show a significant gain on the Pascal VOC12 dataset, and the DeepLabv3 application improves the current state-of-the-art metric by 2.5%. The implementation associated with the experiments can be found at https://github.com/cenkbircanoglu/isim.
    Comment: This paper was submitted to IJCV on 15 Nov 2021 and the reviewers decided to reject it. After getting the reviews, we wanted to study more. Unfortunately, one of the authors had severe issues (COVID-19 vaccination). After one year, the study was outdated and similar studies had been published. So, we leave the study in an archive in case it might have some effect on the literature.

    Level of service criteria of urban walking environment in Indian context using cluster analysis

    To know how well roadways accommodate pedestrian travel, or how pedestrian-friendly they are, it is necessary to assess walking conditions. Such an assessment also helps in evaluating and prioritizing the needs of existing roadways for sidewalk construction. Estimation of Pedestrian Level of Service (PLOS) is the most common approach to assessing the quality of operations of pedestrian facilities. The focus of this study is to identify and assess a suitable methodology for evaluating PLOS for off-street pedestrian facilities in the Indian context. Defining level of service criteria for urban off-street pedestrian facilities is basically a classification problem, and cluster analysis is found to be the most suitable technique for solving it. Cluster analysis groups objects based on the information found in the data describing their relationships. Six methods are employed to define PLOS criteria in this study: K-means, Hierarchical Agglomerative Clustering (HAC), Fuzzy c-means (FCM), Self-Organizing Map (SOM) in Artificial Neural Network (ANN), Affinity Propagation (AP) and Genetic Algorithm Fuzzy (GA Fuzzy) clustering. Four parameters, namely pedestrian space, flow rate, speed and volume-to-capacity (v/c) ratio, are considered to classify the PLOS categories of off-street pedestrian facilities. From the analysis, six LOS categories (A, B, C, D, E and F), each with different ranges of the four parameters, are defined. The study finds that pedestrians experience the good levels of service "A", "B" and "C" more often than the poor levels of service "D", "E" and "F". Of the six clustering methods, K-means is found to be the most suitable for classifying PLOS in the Indian context.
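
    As a rough illustration of how K-means could partition four-parameter observations into six service levels, here is a minimal sketch of Lloyd's algorithm. The feature rows are synthetic placeholders (not the study's data), the values are assumed to be pre-scaled to roughly [0, 1], and the rule that letters clusters A to F by decreasing mean pedestrian space is an assumption made only for this example.

```python
import math

def kmeans(rows, k, iters=50):
    """Plain Lloyd's algorithm; starts from k evenly spaced rows (deterministic)."""
    step = len(rows) // k
    centroids = [rows[i * step] for i in range(k)]
    labels = None
    for _ in range(iters):
        # Assignment step: nearest centroid by Euclidean distance.
        new_labels = [
            min(range(k), key=lambda c: math.dist(row, centroids[c]))
            for row in rows
        ]
        if new_labels == labels:
            break  # assignments are stable, so the centroids are too
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [r for r, lab in zip(rows, labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, centroids

# Synthetic rows: (space, flow rate, speed, v/c ratio), two per hypothetical level.
rows = []
for g in range(6):
    t = g / 5
    base = (1 - t, t, 1 - t, t)
    rows.append(base)
    rows.append(tuple(v + 0.02 for v in base))

labels, centroids = kmeans(rows, k=6)

# Assumed lettering rule: more pedestrian space means a better level of service.
order = sorted(range(6), key=lambda c: -centroids[c][0])
letter = {c: "ABCDEF"[rank] for rank, c in enumerate(order)}
los = [letter[lab] for lab in labels]
print(los)
```

    With real survey data the four parameters have different units, so some form of scaling before clustering would matter much more than it does in this toy example.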

    Label-Efficient Learning for Object Recognition

    Dissertation (Ph.D.), Seoul National University Graduate School, College of Engineering, Department of Electrical and Computer Engineering, 2023. 2. Sungroh Yoon.
    Advances in deep neural network approaches have produced tremendous progress in object recognition tasks, but this progress has come at the cost of annotating a huge number of training images with explicit localization cues. Using object recognition in real-life applications requires a large variety of object classes and a great deal of labeled data for each class. However, producing pixel-level annotations for each object class is laborious and hampers the expansion of object classes. The need for such expensive annotations is sidestepped by weakly supervised learning, in which a DNN is trained on images with some form of abbreviated annotation that is cheaper than explicit localization cues. In this dissertation, we study methods that use various forms of weak supervision, i.e., image-level class labels, out-of-distribution data, and bounding box labels. We first study image-level class labels for weakly supervised semantic segmentation. Most weakly supervised methods based on image-level class labels depend on attribution maps from a trained classifier, but their focus tends to be restricted to a small discriminative region of the target object. We theoretically discuss the root cause of this problem and propose three novel techniques to address it. However, when built on class labels only, the produced localization maps are known to suffer from confusion between foreground and background cues, i.e., spurious correlation. We address the spurious correlation problem by utilizing out-of-distribution data. Finally, methods based on class labels cannot separate different instances of the same class, which is essential for instance segmentation. Therefore, we utilize bounding box labels for weakly supervised instance segmentation, as boxes provide information about individual objects and their locations. Experimental results show that the annotation cost of learning semantic segmentation and instance segmentation can be significantly reduced: on the challenging Pascal VOC dataset, we achieve 89% of the performance of the fully supervised equivalent using only class labels, which reduces the label cost by 91%. In addition, we achieve 96% of the performance of the fully supervised equivalent using bounding box labels, which reduces the label cost by 83%. We expect that the methods introduced in this dissertation will be helpful for applying deep learning based object recognition in a variety of domains and scenarios.
    Contents: 1 Introduction. 2 Background: 2.1 Object Recognition; 2.2 Weak Supervision; 2.3 Preliminary Algorithms (2.3.1 Attribution Methods for Image Classifier; 2.3.2 Refinement Techniques of Localization Maps). 3 Learning with Image-Level Class Labels: 3.1 Introduction; 3.2 Related Work (3.2.1 FickleNet: Stochastic Inference Approach; 3.2.2 Other Recent Approaches); 3.3 Anti-Adversarially Manipulated Attribution (3.3.1 Adversarial Attack; 3.3.2 Proposed Method; 3.3.3 Experiments; 3.3.4 Discussion; 3.3.5 Analysis of Results by Class); 3.4 Reducing Information Bottleneck (3.4.1 Information Bottleneck; 3.4.2 Motivation; 3.4.3 Proposed Method; 3.4.4 Experiments); 3.5 Summary. 4 Learning with Auxiliary Data: 4.1 Introduction; 4.2 Related Work; 4.3 Methods (4.3.1 Collecting the Hard Out-of-Distribution Data; 4.3.2 Learning with the Hard Out-of-Distribution Data; 4.3.3 Training Segmentation Networks); 4.4 Experiments (4.4.1 Experimental Setup; 4.4.2 Experimental Results; 4.4.3 Analysis and Discussion); 4.5 Analysis of OoD Collection Process; 4.6 Integrating Proposed Methods; 4.7 Summary. 5 Learning with Bounding Box Labels: 5.1 Introduction; 5.2 Related Work; 5.3 Methods (5.3.1 Revisiting Object Detectors; 5.3.2 Bounding Box Attribution Map; 5.3.3 Training the Segmentation Network); 5.4 Experiments (5.4.1 Experimental Setup; 5.4.2 Weakly Supervised Instance Segmentation; 5.4.3 Weakly Supervised Semantic Segmentation; 5.4.4 Ablation Study); 5.5 Detailed Analysis of the BBAM; 5.6 Summary. 6 Conclusion: 6.1 Dissertation Summary; 6.2 Limitations and Future Direction. Abstract (In Korean).

    Two and three dimensional segmentation of multimodal imagery

    The role of segmentation in image understanding and analysis, computer vision, pattern recognition, remote sensing and medical imaging has grown significantly in recent years, owing to accelerated scientific advances in the acquisition of image data. This low-level analysis step is critical to numerous applications, with the primary goal of expediting and improving the effectiveness of subsequent high-level operations by providing a condensed and pertinent representation of image information. In this research, we propose a novel unsupervised segmentation framework that segregates 2-D/3-D image data across multiple modalities (color, remote-sensing and biomedical imaging) into meaningful non-overlapping partitions using several spatial-spectral attributes. Initially, our framework exploits the information obtained from detecting edges inherent in the data. To this end, using a vector gradient detection technique, pixels without edges are grouped and individually labeled to partition an initial portion of the input image content. Pixels with higher gradient densities are included through the dynamic generation of segments as the algorithm progresses, producing an initial region map. Subsequently, texture modeling is performed, and the obtained gradient, texture and intensity information, together with the initial region map, is used in a multivariate refinement procedure that fuses groups with similar characteristics to yield the final output segmentation. Experimental results, compared against published state-of-the-art segmentation techniques for color as well as multi/hyperspectral imagery, demonstrate the advantages of the proposed method. Furthermore, to improve computational efficiency, we propose an extension of this methodology in a multi-resolution framework, demonstrated on color images. Finally, this research also encompasses a 3-D extension of the algorithm, demonstrated on medical (Magnetic Resonance Imaging / Computed Tomography) volumes.

    Fuzzy Logic

    This book introduces the capability of Fuzzy Logic in the development of emerging technologies. It consists of sixteen chapters showing various applications in the fields of Bioinformatics, Health, Security, Communications, Transportation, Financial Management, and Energy and Environment Systems. The book is a major reference source for all those concerned with applied intelligent systems. The intended readers are researchers, engineers, medical practitioners, and graduate students interested in fuzzy logic systems.

    K-Means and Alternative Clustering Methods in Modern Power Systems

    As power systems evolve by integrating renewable energy sources, distributed generation, and electric vehicles, the complexity of managing these systems increases. With growing data accessibility and advances in computational capabilities, clustering algorithms, including K-means, are becoming essential tools for researchers in analyzing, optimizing, and modernizing power systems. This paper presents a comprehensive review of over 440 articles published through 2022, emphasizing the application of K-means clustering, a widely recognized and frequently used algorithm, along with alternative clustering methods within modern power systems. The main contributions of this study include a bibliometric analysis to understand the historical development and wide-ranging applications of K-means clustering in power systems. This research also thoroughly examines K-means, its variants, and its potential limitations and advantages. Furthermore, the study explores alternative clustering algorithms that can complement or substitute K-means. Prominent examples include K-medoids, Time-series K-means, BIRCH, Bayesian clustering, HDBSCAN, CLIQUE, SPECTRAL, SOMs, TICC, and swarm-based methods, which broaden the understanding and applications of clustering methodologies in modern power systems. The paper highlights the wide-ranging applications of these techniques, from load forecasting and fault detection to power quality analysis and system security assessment. The review observes that the number of publications employing clustering algorithms within modern power systems is following an exponential upward trend. This emphasizes the need for professionals to understand the various clustering methods, including their benefits and potential challenges, in order to incorporate the most suitable ones into their studies.

    Single-Microphone Speech Enhancement and Separation Using Deep Learning

    The cocktail party problem comprises the challenging task of understanding a speech signal in a complex acoustic environment, where multiple speakers and background noise signals simultaneously interfere with the speech signal of interest. A signal processing algorithm that can effectively increase the intelligibility and quality of speech signals in such complicated acoustic situations is highly desirable, especially for applications involving mobile communication devices and hearing assistive devices. Thanks to the re-emergence of machine learning techniques, today known as deep learning, the challenges involved with such algorithms might be overcome. In this PhD thesis, we study and develop deep learning-based techniques for two sub-disciplines of the cocktail party problem: single-microphone speech enhancement and single-microphone multi-talker speech separation. Specifically, we conduct an in-depth empirical analysis of the generalization capability of modern deep learning-based single-microphone speech enhancement algorithms. We show that the performance of such algorithms is closely linked to the training data, and that good generalizability can be achieved with carefully designed training data. Furthermore, we propose uPIT, a deep learning-based algorithm for single-microphone speech separation, and we report state-of-the-art results on a speaker-independent multi-talker speech separation task. Additionally, we show that uPIT works well for joint speech separation and enhancement without explicit prior knowledge about the noise type or number of speakers. Finally, we show that deep learning-based speech enhancement algorithms designed to minimize the classical short-time spectral amplitude mean squared error lead to enhanced speech signals which are essentially optimal in terms of STOI, a state-of-the-art speech intelligibility estimator.
    Comment: PhD thesis, 233 pages
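
    The abstract names uPIT (utterance-level Permutation Invariant Training) without describing it. The sketch below shows only the core idea of a permutation-invariant loss evaluated over a whole utterance: the error is computed for every assignment of estimated sources to target sources, and the assignment with the lowest error across all frames is used. The function name `upit_loss`, the nested-list data layout, and the toy spectra are assumptions for illustration; the thesis itself trains neural networks on spectral features.

```python
from itertools import permutations

def upit_loss(estimates, targets):
    """Utterance-level permutation-invariant MSE.

    estimates, targets: lists of S sources, each a list of T frames,
    each frame a list of F magnitude values.
    Returns (loss, perm), where perm[s] is the target index matched to
    estimate s for the entire utterance.
    """
    n_src = len(estimates)

    def mse(est, tgt):
        total = sum(
            (e - t) ** 2
            for est_frame, tgt_frame in zip(est, tgt)
            for e, t in zip(est_frame, tgt_frame)
        )
        return total / sum(len(frame) for frame in est)

    best = None
    for perm in permutations(range(n_src)):
        # One permutation is scored over all frames, not frame by frame.
        loss = sum(mse(estimates[s], targets[perm[s]]) for s in range(n_src)) / n_src
        if best is None or loss < best[0]:
            best = (loss, perm)
    return best

# Two toy sources, two frames, two frequency bins; the estimates come out swapped.
targets = [[[1.0, 0.0], [1.0, 0.0]], [[0.0, 1.0], [0.0, 1.0]]]
estimates = [[[0.0, 0.9], [0.1, 1.0]], [[1.0, 0.0], [0.9, 0.0]]]
loss, perm = upit_loss(estimates, targets)
print(perm)  # estimate 0 is matched to target 1, estimate 1 to target 0
```

    Enumerating permutations costs S! evaluations, which is manageable for the two- or three-talker mixtures typical of this task.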