355 research outputs found

    Analysis of single‑ and dual‑dictionary strategies in pedestrian classifcation

    Get PDF
    Sparse coding has recently been a hot topic in visual tasks in image processing and computer vision. It has applications and brings benefts in reconstruction-like tasks and in classifcation-like tasks as well. However, regarding binary classifcation problems, there are several choices to learn and use dictionaries that have not been studied. In particular, how single-dictionary and dual-dictionary approaches compare in terms of classifcation performance is largely unexplored. We compare three single-dictionary strategies and two dual-dictionary strategies for the problem of pedestrian classifcation (“pedestrian” vs “background” images). In each of these fve cases, images are represented as the sparse coefcients induced from the respective dictionaries, and these coefcients are the input to a regular classifer both for training and subsequent classifcation of novel unseen instances. Experimental results with the INRIA pedestrian dataset suggest, on the one hand, that dictionaries learned from only one of the classes, even from the background class, are enough for obtaining competitive good classifcation performance. On the other hand, while better performance is generally obtained when instances of both classes are used for dictionary learning, the representation induced by a single dictionary learned from a set of instances from both classes provides comparable or even superior performance over the representations induced by two dictionaries learned separately from the pedestrian and background classes

    Discovering a Domain Knowledge Representation for Image Grouping: Multimodal Data Modeling, Fusion, and Interactive Learning

    Get PDF
    In visually-oriented specialized medical domains such as dermatology and radiology, physicians explore interesting image cases from medical image repositories for comparative case studies to aid clinical diagnoses, educate medical trainees, and support medical research. However, general image classification and retrieval approaches fail in grouping medical images from the physicians\u27 viewpoint. This is because fully-automated learning techniques cannot yet bridge the gap between image features and domain-specific content for the absence of expert knowledge. Understanding how experts get information from medical images is therefore an important research topic. As a prior study, we conducted data elicitation experiments, where physicians were instructed to inspect each medical image towards a diagnosis while describing image content to a student seated nearby. Experts\u27 eye movements and their verbal descriptions of the image content were recorded to capture various aspects of expert image understanding. This dissertation aims at an intuitive approach to extracting expert knowledge, which is to find patterns in expert data elicited from image-based diagnoses. These patterns are useful to understand both the characteristics of the medical images and the experts\u27 cognitive reasoning processes. The transformation from the viewed raw image features to interpretation as domain-specific concepts requires experts\u27 domain knowledge and cognitive reasoning. This dissertation also approximates this transformation using a matrix factorization-based framework, which helps project multiple expert-derived data modalities to high-level abstractions. To combine additional expert interventions with computational processing capabilities, an interactive machine learning paradigm is developed to treat experts as an integral part of the learning process. Specifically, experts refine medical image groups presented by the learned model locally, to incrementally re-learn the model globally. This paradigm avoids the onerous expert annotations for model training, while aligning the learned model with experts\u27 sense-making

    Learning Discriminative Feature Representations for Visual Categorization

    Get PDF
    Learning discriminative feature representations has attracted a great deal of attention due to its potential value and wide usage in a variety of areas, such as image/video recognition and retrieval, human activities analysis, intelligent surveillance and human-computer interaction. In this thesis we first introduce a new boosted key-frame selection scheme for action recognition. Specifically, we propose to select a subset of key poses for the representation of each action via AdaBoost and a new classifier, namely WLNBNN, is then developed for final classification. The experimental results of the proposed method are 0.6% - 13.2% better than previous work. After that, a domain-adaptive learning approach based on multiobjective genetic programming (MOGP) has been developed for image classification. In this method, a set of primitive 2-D operators are randomly combined to construct feature descriptors through the MOGP evolving and then evaluated by two objective fitness criteria, i.e., the classification error and the tree complexity. Later, the (near-)optimal feature descriptor can be obtained. The proposed approach can achieve 0.9% ∼ 25.9% better performance compared with state-of-the-art methods. Moreover, effective dimensionality reduction algorithms have also been widely used for obtaining better representations. In this thesis, we have proposed a novel linear unsupervised algorithm, termed Discriminative Partition Sparsity Analysis (DPSA), explicitly considering different probabilistic distributions that exist over the data points, simultaneously preserving the natural locality relationship among the data. All these above methods have been systematically evaluated on several public datasets, showing their accurate and robust performance (0.44% - 6.69% better than the previous) for action and image categorization. Targeting efficient image classification , we also introduce a novel unsupervised framework termed evolutionary compact embedding (ECE) which can automatically learn the task-specific binary hash codes. It is regarded as an optimization algorithm which combines the genetic programming (GP) and a boosting trick. The experimental results manifest ECE significantly outperform others by 1.58% - 2.19% for classification tasks. In addition, a supervised framework, bilinear local feature hashing (BLFH), has also been proposed to learn highly discriminative binary codes on the local descriptors for large-scale image similarity search. We address it as a nonconvex optimization problem to seek orthogonal projection matrices for hashing, which can successfully preserve the pairwise similarity between different local features and simultaneously take image-to-class (I2C) distances into consideration. BLFH produces outstanding results (0.017% - 0.149% better) compared to the state-of-the-art hashing techniques

    Approachable Error Bounded Lossy Compression

    Get PDF
    Compression is commonly used in HPC applications to move and store data. Traditional lossless compression, however, does not provide adequate compression of floating point data often found in scientific codes. Recently, researchers and scientists have turned to lossy compression techniques that approximate the original data rather than reproduce it in order to achieve desired levels of compression. Typical lossy compressors do not bound the errors introduced into the data, leading to the development of error bounded lossy compressors (EBLC). These tools provide the desired levels of compression as mathematical guarantees on the errors introduced. However, the current state of EBLC leaves much to be desired. The existing EBLC all have different interfaces requiring codes to be changed to adopt new techniques; EBLC have many more configuration options than their predecessors, making them more difficult to use; and EBLC typically bound quantities like point wise errors rather than higher level metrics such as spectra, p-values, or test statistics that scientists typically use. My dissertation aims to provide a uniform interface to compression and to develop tools to allow application scientists to understand and apply EBLC. This dissertation proposal presents three groups of work: LibPressio, a standard interface for compression and analysis; FRaZ/LibPressio-Opt frameworks for the automated configuration of compressors using LibPressio; and work on tools for analyzing errors in particular domains

    EG-ICE 2021 Workshop on Intelligent Computing in Engineering

    Get PDF
    The 28th EG-ICE International Workshop 2021 brings together international experts working at the interface between advanced computing and modern engineering challenges. Many engineering tasks require open-world resolutions to support multi-actor collaboration, coping with approximate models, providing effective engineer-computer interaction, search in multi-dimensional solution spaces, accommodating uncertainty, including specialist domain knowledge, performing sensor-data interpretation and dealing with incomplete knowledge. While results from computer science provide much initial support for resolution, adaptation is unavoidable and most importantly, feedback from addressing engineering challenges drives fundamental computer-science research. Competence and knowledge transfer goes both ways

    Artificial Intelligence for Multimedia Signal Processing

    Get PDF
    Artificial intelligence technologies are also actively applied to broadcasting and multimedia processing technologies. A lot of research has been conducted in a wide variety of fields, such as content creation, transmission, and security, and these attempts have been made in the past two to three years to improve image, video, speech, and other data compression efficiency in areas related to MPEG media processing technology. Additionally, technologies such as media creation, processing, editing, and creating scenarios are very important areas of research in multimedia processing and engineering. This book contains a collection of some topics broadly across advanced computational intelligence algorithms and technologies for emerging multimedia signal processing as: Computer vision field, speech/sound/text processing, and content analysis/information mining

    Cycling Through the Pandemic : Tactical Urbanism and the Implementation of Pop-Up Bike Lanes in the Time of COVID-19

    Get PDF
    Provides an international overview on how tactical urbanism was implemented to give more space to cycling Demonstrates the conceptual framework surrounding tactical urbanism and how it plays out theoretically Proposes new methodological insights to understand the effects of tactical urbanism intervention

    Fast catheter segmentation and tracking based on x-ray fluoroscopic and echocardiographic modalities for catheter-based cardiac minimally invasive interventions

    Get PDF
    X-ray fluoroscopy and echocardiography imaging (ultrasound, US) are two imaging modalities that are widely used in cardiac catheterization. For these modalities, a fast, accurate and stable algorithm for the detection and tracking of catheters is required to allow clinicians to observe the catheter location in real-time. Currently X-ray fluoroscopy is routinely used as the standard modality in catheter ablation interventions. However, it lacks the ability to visualize soft tissue and uses harmful radiation. US does not have these limitations but often contains acoustic artifacts and has a small field of view. These make the detection and tracking of the catheter in US very challenging. The first contribution in this thesis is a framework which combines Kalman filter and discrete optimization for multiple catheter segmentation and tracking in X-ray images. Kalman filter is used to identify the whole catheter from a single point detected on the catheter in the first frame of a sequence of x-ray images. An energy-based formulation is developed that can be used to track the catheters in the following frames. We also propose a discrete optimization for minimizing the energy function in each frame of the X-ray image sequence. Our approach is robust to tangential motion of the catheter and combines the tubular and salient feature measurements into a single robust and efficient framework. The second contribution is an algorithm for catheter extraction in 3D ultrasound images based on (a) the registration between the X-ray and ultrasound images and (b) the segmentation of the catheter in X-ray images. The search space for the catheter extraction in the ultrasound images is constrained to lie on or close to a curved surface in the ultrasound volume. The curved surface corresponds to the back-projection of the extracted catheter from the X-ray image to the ultrasound volume. Blob-like features are detected in the US images and organized in a graphical model. The extracted catheter is modelled as the optimal path in this graphical model. Both contributions allow the use of ultrasound imaging for the improved visualization of soft tissue. However, X-ray imaging is still required for each ultrasound frame and the amount of X-ray exposure has not been reduced. The final contribution in this thesis is a system that can track the catheter in ultrasound volumes automatically without the need for X-ray imaging during the tracking. Instead X-ray imaging is only required for the system initialization and for recovery from tracking failures. This allows a significant reduction in the amount of X-ray exposure for patient and clinicians.Open Acces

    EG-ICE 2021 Workshop on Intelligent Computing in Engineering

    Get PDF
    The 28th EG-ICE International Workshop 2021 brings together international experts working at the interface between advanced computing and modern engineering challenges. Many engineering tasks require open-world resolutions to support multi-actor collaboration, coping with approximate models, providing effective engineer-computer interaction, search in multi-dimensional solution spaces, accommodating uncertainty, including specialist domain knowledge, performing sensor-data interpretation and dealing with incomplete knowledge. While results from computer science provide much initial support for resolution, adaptation is unavoidable and most importantly, feedback from addressing engineering challenges drives fundamental computer-science research. Competence and knowledge transfer goes both ways
    corecore