A stochastic segmentation method for interesting region detection and image retrieval
The explosive growth of digital photos calls for an efficient image retrieval system so that digital images can be organized, shared, and reused. Current content-based image retrieval (CBIR) systems face challenges in every aspect: image representation, classification and indexing. The image representation of current CBIR systems is of such low quality that the background is often mixed with the objects, which makes the signature of an image less distinguishable or even misleading. An image classifier connects low-level features with high-level concepts, and low-quality features only make the task of bridging the semantic gap harder.
A new system has been developed to tackle these challenges more efficiently. My contributions consist of: (a) A stochastic image segmentation algorithm that achieves a better balance between integrity and oversegmentation. The algorithm estimates the average contour conformation, obtains more accurate results, and is attractive both for feature extraction from consumer photos and for tissue segmentation in 3D medical images. (b) A new interesting-region detection method that seamlessly integrates GMM and SVM in one scheme. It shows that patterns of common interest can be efficiently learned using the interesting-region classifier. (c) An exploration of the popularity and usability of the metadata produced by the more than 200 camera models sold on the market; this metadata is used both for interesting-region detection and for image classification. Such incorporation of camera metadata had been overlooked in the computer vision community for decades. (d) A new high-dimensional GMM estimator that tackles the oscillation of the principal dimensionality of a GMM on high-dimensional real-world datasets by estimating the average conformation along the evolution history. (e) An image retrieval system that supports query by keyword, query by example, and ontology browsing.
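Contribution (d) builds on standard expectation-maximization for Gaussian mixtures. As background, here is a minimal 1-D EM sketch; this is not the thesis's high-dimensional estimator, and the function name and quantile-based initialization are illustrative choices only:

```python
import numpy as np

def em_gmm_1d(x, k=2, iters=50):
    """Minimal EM for a 1-D Gaussian mixture (illustration only)."""
    mu = np.quantile(x, np.linspace(0.1, 0.9, k))  # spread-out initial means
    var = np.full(k, x.var())
    pi = np.full(k, 1.0 / k)
    for _ in range(iters):
        # E-step: posterior responsibility of each component for each point
        dens = pi * np.exp(-0.5 * (x[:, None] - mu) ** 2 / var) \
               / np.sqrt(2 * np.pi * var)
        r = dens / dens.sum(axis=1, keepdims=True)
        # M-step: re-estimate mixing weights, means and variances
        nk = r.sum(axis=0)
        mu = (r * x[:, None]).sum(axis=0) / nk
        var = np.maximum((r * (x[:, None] - mu) ** 2).sum(axis=0) / nk, 1e-6)
        pi = nk / len(x)
    return mu, var, pi
```

The variance floor (`1e-6`) guards against component collapse, a standard precaution in EM implementations.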
Precise Proximal Femur Fracture Classification for Interactive Training and Surgical Planning
We demonstrate the feasibility of a fully automatic computer-aided diagnosis (CAD) tool, based on deep learning, that localizes and classifies proximal femur fractures on X-ray images according to the AO classification. The proposed framework aims to improve patient treatment planning and to support the training of trauma surgery residents. A database of 1347 clinical radiographic studies was collected. Radiologists and trauma surgeons annotated all fractures with bounding boxes and provided a classification according to the AO standard. The proposed CAD tool reaches an F1-score of 87% and an AUC of 0.95 when classifying radiographs into types "A", "B" and "not-fractured"; these figures improve to 94% and 0.98 when distinguishing fractured from not-fractured cases. Prior localization of the fracture improves performance with respect to full-image classification, and 100% of the predicted centers of the region of interest are contained in the manually provided bounding boxes. The system retrieves on average 9 relevant images (from the same class) out of 10 cases. Our CAD scheme localizes, detects and further classifies proximal femur fractures, achieving results comparable to expert-level and state-of-the-art performance. Our auxiliary localization model was highly accurate in predicting the region of interest in the radiograph. We further investigated several verification strategies for its adoption into the daily clinical routine, and presented a sensitivity analysis of the ROI size and image retrieval as a clinical use case.
Comment: Accepted at IPCAI 2020 and IJCARS
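The reported F1-score and AUC are standard binary-classification metrics; a minimal numpy sketch of how such figures are computed (function names are illustrative, not from the paper):

```python
import numpy as np

def f1_score(y_true, y_pred):
    """F1 = harmonic mean of precision and recall (binary labels)."""
    tp = np.sum((y_true == 1) & (y_pred == 1))
    fp = np.sum((y_true == 0) & (y_pred == 1))
    fn = np.sum((y_true == 1) & (y_pred == 0))
    prec = tp / (tp + fp)
    rec = tp / (tp + fn)
    return 2 * prec * rec / (prec + rec)

def auc(y_true, scores):
    """AUC = probability a random positive scores above a random negative."""
    pos = scores[y_true == 1]
    neg = scores[y_true == 0]
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (greater + 0.5 * ties) / (len(pos) * len(neg))
```

The pairwise formulation of AUC is exact but quadratic; production code typically sorts scores instead.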
An Extreme Learning Machine-Relevance Feedback Framework for Enhancing the Accuracy of a Hybrid Image Retrieval System
Searching, indexing and retrieving images from a massive database is a challenging task, and the solution to these problems is an efficient image retrieval system. In this paper, a hybrid content-based image retrieval system is proposed in which different attributes of an image, such as texture, color and shape, are extracted using the gray-level co-occurrence matrix (GLCM), color moments and various region-props procedures, respectively. A hybrid feature vector (HFV) is formed by integrating the feature vectors of the three individual visual attributes. This HFV is given as input to an extreme learning machine (ELM) classifier, a feed-forward neural network with a single hidden layer of neurons. The ELM efficiently predicts the class of the query image based on the pre-trained data. Finally, to capture high-level human semantic information, relevance feedback (RF) is used to retrain the ELM. The advantage of the proposed system is that the combination of the ELM-RF framework yields a modified learning and intelligent classification system. To measure the efficiency of the proposed system, parameters such as precision, recall and accuracy are evaluated. Average precisions of 93.05%, 81.03%, 75.8% and 90.14% are obtained on the Corel-1K, Corel-5K, Corel-10K and GHIM-10K benchmark datasets, respectively. The experimental analysis shows that the implemented technique outperforms many state-of-the-art hybrid CBIR approaches.
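An ELM trains only its output layer: the hidden-layer weights are random and fixed, and the output weights are a least-squares solution. A minimal sketch of that idea (the class name, hidden-layer size and tanh activation are assumptions, not the paper's exact configuration):

```python
import numpy as np

class ELM:
    """Single-hidden-layer extreme learning machine (sketch)."""

    def __init__(self, n_hidden=20, seed=0):
        self.n_hidden = n_hidden
        self.rng = np.random.default_rng(seed)

    def fit(self, X, Y):
        # Random input weights and biases stay fixed after initialization.
        self.W = self.rng.standard_normal((X.shape[1], self.n_hidden))
        self.b = self.rng.standard_normal(self.n_hidden)
        H = np.tanh(X @ self.W + self.b)          # hidden-layer activations
        self.beta = np.linalg.pinv(H) @ Y         # least-squares output weights
        return self

    def predict(self, X):
        H = np.tanh(X @ self.W + self.b)
        return (H @ self.beta).argmax(axis=1)     # class index per sample
```

Because training reduces to one pseudoinverse, fitting is fast, which is what makes retraining under relevance feedback practical.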
An Automatic Image Content Retrieval Method for better Mobile Device Display User Experiences
A growing number of commercially available mobile phones come with integrated high-resolution digital cameras. This enables a new class of dedicated applications for image analysis, such as mobile visual search, image cropping, object detection, content-based image retrieval and image classification. In this paper, a new mobile application for image content retrieval and classification on mobile device displays is proposed to enrich the visual experience of users. The application extracts a number of images based on the content of a given image, using visual saliency methods to detect the most important regions from a perceptual viewpoint. First, the most important areas are extracted as the local maxima of a 2D saliency function. Next, a salient region is cropped using the bounding box centred on each local maximum of the thresholded saliency map of the image. Then, each image crop is fed into an image classification system based on SVM and SIFT descriptors to detect the class of object present in the crop. The ImageNet repository was used as the reference for semantic category classification. The mobile application was implemented on the Android platform using a client-server architecture: the mobile client sends the photo taken by the camera to the server, which processes the image and returns the results (image contents such as image crops and related target classes) to the client. The application was run on thousands of pictures and showed encouraging results towards a better visual experience for users of mobile displays.
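The cropping step described above, local maxima of a thresholded saliency map followed by a fixed bounding box around each maximum, can be sketched with numpy; the function names and the 8-neighbour maximum test are illustrative assumptions, not the paper's implementation:

```python
import numpy as np

def local_maxima(sal, thresh=0.5):
    """Row/col indices of local maxima of a 2-D saliency map above thresh."""
    p = np.pad(sal, 1, constant_values=-np.inf)
    # Stack the 8 neighbour views of each pixel and take their maximum.
    nb = np.stack([p[i:i + sal.shape[0], j:j + sal.shape[1]]
                   for i in range(3) for j in range(3) if (i, j) != (1, 1)])
    mask = (sal > nb.max(axis=0)) & (sal >= thresh)
    return np.argwhere(mask)

def crop_around(img, center, size=32):
    """Fixed-size crop whose bounding box is centred on a saliency maximum."""
    r, c = center
    h = size // 2
    r0, c0 = max(r - h, 0), max(c - h, 0)
    return img[r0:r0 + size, c0:c0 + size]
```

Each returned crop would then be passed to the SVM/SIFT classifier stage.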
K-Space at TRECVid 2007
In this paper we describe K-Space participation in TRECVid 2007. K-Space participated in two tasks: high-level feature extraction and interactive search. We present our approaches for each of these activities and provide a brief analysis of our results. Our high-level feature submission utilized multi-modal low-level features which included visual, audio and temporal elements. Specific concept detectors (such as face detectors) developed by K-Space partners were also used. We experimented with different machine learning approaches, including logistic regression and support vector machines (SVM). Finally, we also experimented with both early and late fusion for feature combination. This year we also participated in interactive search, submitting 6 runs. We developed two interfaces which both utilized the same retrieval functionality. Our objective was to measure the effect of context, which was supported to different degrees in each interface, on user performance. The first of the two systems was a ‘shot’ based interface, where the results from a query were presented as a ranked list of shots. The second interface was ‘broadcast’ based, where results were presented as a ranked list of broadcasts. Both systems made use of the outputs of our high-level feature submission as well as low-level visual features.
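Early and late fusion, both of which the abstract reports experimenting with, differ only in where the modalities are combined: before classification (feature level) or after (score level). A minimal numpy sketch; the function names and equal weighting are illustrative:

```python
import numpy as np

def early_fusion(feat_a, feat_b):
    """Early fusion: concatenate per-modality feature vectors,
    then train a single classifier on the joint representation."""
    return np.concatenate([feat_a, feat_b], axis=1)

def late_fusion(scores_a, scores_b, w=0.5):
    """Late fusion: train one classifier per modality,
    then combine their (comparably scaled) output scores."""
    return w * scores_a + (1 - w) * scores_b
```

Late fusion requires the per-modality scores to be on comparable scales, which is why score normalization usually precedes the weighted sum.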