Saliency Prediction for Mobile User Interfaces
We introduce models for saliency prediction for mobile user interfaces. A
mobile interface may include elements like buttons, text, etc. in addition to
natural images which enable performing a variety of tasks. Saliency in natural
images is a well studied area. However, given the difference in what
constitutes a mobile interface, and the usage context of these devices, we
postulate that saliency prediction for mobile interface images requires a fresh
approach. Mobile interface design involves operating on elements, the building
blocks of the interface. We first collected eye-gaze data from mobile devices
for a free-viewing task. Using this data, we develop a novel autoencoder-based
multi-scale deep learning model that provides saliency prediction at the mobile
interface element level. Compared to saliency prediction approaches developed
for natural images, we show that our approach performs significantly better on
a range of established metrics.
Comment: Paper accepted at WACV 201
Beyond 2D-grids: a dependence maximization view on image browsing
Ideally, one would like to perform image search using an intuitive and friendly approach. Many existing image search engines, however, present users with sets of images arranged on the screen in a single default order, typically by relevance to the query. While this certainly has its advantages, a more flexible and intuitive way would arguably be to sort images into arbitrary structures such as grids, hierarchies, or spheres so that images that are visually or semantically alike are placed together. This paper focuses on designing such a navigation system for image browsers. This is a challenging task because an arbitrary layout structure makes it difficult, if not impossible, to compute cross-similarities between images and structure coordinates, the main ingredient of traditional layout approaches. For this reason, we resort to a recently developed machine learning technique: kernelized sorting. It is a general technique for matching pairs of objects from different domains without requiring cross-domain similarity measures, and hence it elegantly allows sorting images into arbitrary structures. Moreover, we extend it so that some images can be preselected, for instance to form the tip of the hierarchy, allowing users to subsequently navigate through the search results in the lower levels in an intuitive way.
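The idea of matching images to layout positions using only within-domain similarities can be illustrated with a minimal sketch. It assumes scalar image features, a one-dimensional row of slots, and an RBF kernel on both sides; the names `rbf_kernel`, `center`, and `kernelized_sort` are hypothetical. For brevity it searches all permutations exhaustively, which only works for a handful of items and is not the optimization procedure of the paper.

```python
import itertools
import math

def rbf_kernel(points, bandwidth=1.0):
    """Within-domain similarity matrix for a list of scalar points."""
    return [[math.exp(-((a - b) ** 2) / (2 * bandwidth ** 2)) for b in points]
            for a in points]

def center(k):
    """Double-center a symmetric kernel matrix (remove row and column means)."""
    n = len(k)
    row_mean = [sum(row) / n for row in k]
    total_mean = sum(row_mean) / n
    return [[k[i][j] - row_mean[i] - row_mean[j] + total_mean for j in range(n)]
            for i in range(n)]

def kernelized_sort(features, slots):
    """Return perm, where perm[i] is the slot index assigned to image i.

    The score is the dependence between the two centered kernel matrices
    under the candidate matching; no image-to-slot similarity is needed.
    """
    kc = center(rbf_kernel(features))
    lc = center(rbf_kernel(slots))
    n = len(features)

    def dependence(perm):
        return sum(kc[i][j] * lc[perm[i]][perm[j]]
                   for i in range(n) for j in range(n))

    # Exhaustive search: an assumption made for illustration only.
    return max(itertools.permutations(range(n)), key=dependence)

# Hypothetical toy data: one scalar feature per image, four slots in a row.
features = [0.0, 2.0, 1.0, 3.0]
slots = [0.0, 1.0, 2.0, 3.0]
assignment = kernelized_sort(features, slots)
print(assignment)
```

Because the kernels depend only on distances, the row layout and its mirror image score equally, so the sketch may return either ordering; in both, images with similar features end up in neighbouring slots.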
Comparison of the K-Nearest Neighbor Algorithm for Mango Variety Classification Based on Gray Level Co-Occurrence Matrix Features and Color Features (original title: Perbandingan Algoritma K-Nearest Neighbor Untuk Klasifikasi Jenis Mangga Menggunakan Berdasarkan Fitur Gray Level Co-Occurrence Matric dan Fitur Warna)
Indonesia is a country whose human and natural resources have the potential to build a national fruit industry, and most of its population earns a living from farming. Agricultural products include rice, corn, and others [1][2]. Fruit crops cultivated in Indonesia include avocado, pineapple, longan, banana, mango, and others. Most Indonesians are very fond of planting mango trees in their yards or gardens. However, people are often mistaken about the variety of mango they have planted. A model or method is therefore needed to classify mango varieties, which can be recognized from characteristics such as shape, texture, and color. Several methods have been proposed and implemented for classifying mango varieties, but the average accuracy obtained is below 80%. This study proposes an approach that uses k-nearest neighbor with genetic algorithm optimization, together with gray level co-occurrence matrix features and color features of mango leaves; the dataset consists of 800 leaf images. Using the genetic algorithm for optimization succeeded in increasing the accuracy of the k-nearest neighbor method. The highest accuracy, 93.50%, was obtained at k=3, whereas k-nearest neighbor without optimization achieved an accuracy of 93.00% at k=1.
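As a rough illustration of the pipeline named in this abstract, the following sketch computes a gray level co-occurrence matrix, derives three common texture features from it, and classifies with k-nearest neighbor. The toy images, the labels `variety_a` and `variety_b`, the choice of features, and all function names are assumptions for illustration; the color features and the genetic algorithm optimization described in the study are omitted.

```python
from collections import Counter
import math

def glcm(image, levels, dx=1, dy=0):
    """Normalized gray level co-occurrence matrix for one pixel offset."""
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    total = 0
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[image[r][c]][image[r2][c2]] += 1
                total += 1
    return [[v / total for v in row] for row in counts]

def glcm_features(p):
    """Contrast, energy, and homogeneity of a normalized GLCM."""
    n = len(p)
    cells = [(i, j) for i in range(n) for j in range(n)]
    contrast = sum(p[i][j] * (i - j) ** 2 for i, j in cells)
    energy = sum(p[i][j] ** 2 for i, j in cells)
    homogeneity = sum(p[i][j] / (1 + abs(i - j)) for i, j in cells)
    return [contrast, energy, homogeneity]

def knn_predict(train_x, train_y, query, k=3):
    """Majority vote among the k training vectors closest to the query."""
    order = sorted(range(len(train_x)),
                   key=lambda i: math.dist(train_x[i], query))
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]

# Hypothetical 4x4 "leaf textures" quantized to 4 gray levels.
smooth = [[0, 0, 0, 0]] * 4
gradient = [[0, 1, 2, 3]] * 4
checker = [[0, 3, 0, 3], [3, 0, 3, 0]] * 2
noisy = [[0, 3, 1, 3], [3, 0, 3, 1]] * 2
stripes = [[0, 0, 3, 3]] * 4  # unseen query image

train_x = [glcm_features(glcm(img, 4)) for img in (smooth, gradient, checker, noisy)]
train_y = ["variety_a", "variety_a", "variety_b", "variety_b"]
prediction = knn_predict(train_x, train_y, glcm_features(glcm(stripes, 4)), k=3)
print(prediction)
```

In practice the feature values would usually be rescaled before computing distances, since contrast can dominate energy and homogeneity by an order of magnitude.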
Healthcare Event and Activity Logging.
The health of patients in the intensive care unit (ICU) can change frequently and inexplicably. Crucial events and activities responsible for these changes often go unnoticed. This paper introduces healthcare event and action logging (HEAL), which automatically and unobtrusively monitors and reports on events and activities that occur in a medical ICU room. HEAL uses a multimodal distributed camera network to monitor and identify ICU activities and estimate sanitation-event qualifiers. At the core is a novel approach to infer person roles based on semantic interactions, a critical requirement in many healthcare settings where individuals' identities must not be revealed. The proposed approach for activity representation identifies a basis of contextual aspects and estimates aspect weights for proper action representation and reconstruction. The flexibility of the proposed algorithms enables the identification of people's roles by associating them with inferred interactions and detected activities. A fully working prototype system is developed, tested in a mock ICU room, and then deployed in two ICU rooms at a community hospital, thus offering unique capabilities for data gathering and analytics. The proposed method achieves a role identification accuracy of 84% and a backtracking role identification accuracy of 79% for obscured roles using interaction and appearance features on real ICU data. Detailed experimental results are provided in the context of four event-sanitation qualifiers: clean, transmission, contamination, and unclean.
Discrete Multi-modal Hashing with Canonical Views for Robust Mobile Landmark Search
Mobile landmark search (MLS) has recently received increasing attention for its
great practical value. However, it still remains unsolved due to two important
challenges. One is high bandwidth consumption of query transmission, and the
other is the huge visual variations of query images sent from mobile devices.
In this paper, we propose a novel hashing scheme, named as canonical view based
discrete multi-modal hashing (CV-DMH), to handle these problems via a novel
three-stage learning procedure. First, a submodular function is designed to
measure visual representativeness and redundancy of a view set. With it,
canonical views, which capture key visual appearances of a landmark with limited
redundancy, are efficiently discovered with an iterative mining strategy.
Second, multi-modal sparse coding is applied to transform visual features from
multiple modalities into an intermediate representation. It can robustly and
adaptively characterize visual contents of varied landmark images with certain
canonical views. Finally, compact binary codes are learned on the intermediate
representation within a tailored discrete binary embedding model which
preserves visual relations of images measured with canonical views and removes
the involved noise. In this part, we develop a new augmented Lagrangian
multiplier (ALM) based optimization method to directly solve for the discrete
binary codes. We can not only explicitly deal with the discrete constraint, but
also consider the bit-uncorrelated constraint and balance constraint together.
Experiments on real world landmark datasets demonstrate the superior
performance of CV-DMH over several state-of-the-art methods
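To make the role of compact binary codes more concrete, the sketch below ranks a small database by Hamming distance to a query code and checks the balance property, where each bit is set in half of the codes. It does not implement CV-DMH itself: the 8-bit codes, the landmark names, and the helper functions are made up for illustration, and how the codes are learned is outside its scope.

```python
def hamming(a, b):
    """Number of differing bits between two codes stored as integers."""
    return bin(a ^ b).count("1")

def rank_by_hamming(query, database):
    """Database keys ordered from nearest to farthest code."""
    return sorted(database, key=lambda name: hamming(query, database[name]))

def bit_balance(codes, n_bits):
    """Fraction of codes with each bit set, most significant bit first."""
    return [sum((code >> bit) & 1 for code in codes) / len(codes)
            for bit in reversed(range(n_bits))]

# Hypothetical 8-bit codes for four landmarks.
database = {
    "tower": 0b10110010,
    "bridge": 0b01001101,
    "temple": 0b10100110,
    "harbor": 0b01011001,
}
query = 0b10110011  # code of a query photo, one byte to transmit

ranking = rank_by_hamming(query, database)
balance = bit_balance(database.values(), 8)
print(ranking)
print(balance)
```

Sending one byte instead of an image or a floating-point descriptor is what keeps the query bandwidth low, and a balanced, uncorrelated code makes each transmitted bit as informative as possible.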
Image Retrieval Based on Fuzzy Edge and Trum Fuzzy Histogram
In recent years, many methods based on color features, such as the fuzzy color histogram, have been applied in content-based image retrieval (CBIR) systems. Most of these methods cannot determine the accurate colors of pixels, especially combined colors, and capture only the overall distribution of color in the image; they are therefore not efficient for image retrieval. To address these problems, this paper proposes a weight vector factor in the Trum fuzzy histogram. However, such methods still represent only the overall distribution of the color feature in an image and do not consider any spatial information, such as the relative positions of objects in the image. They therefore do not provide robust techniques for retrieving images with complex spatial arrangements. Since edge pixels are important locations in an image and delineate its objects, and since similar images often have similar backgrounds, we use a competitive fuzzy edge detection algorithm that effectively categorizes image pixels into 5 classes: 4 edge classes in different directions and 1 background class. After the pixels are categorized, a feature vector is determined for each class, comprising the Trum fuzzy color histogram and position information. We compared the proposed method with the fuzzy histogram method and the compound neighborhood fuzzy entropy method with the color-place feature. The test results show the high efficiency of the proposed method for image retrieval from the COREL database, including 3000 images.
Image-Based Airborne Sensors: A Combined Approach for Spectral Signatures Classification through Deterministic Simulated Annealing
The advancing technology of high-resolution airborne image sensors, including those on board Unmanned Aerial Vehicles, demands automatic solutions for processing, either on-line or off-line, the huge amounts of image data sensed during the flights. The classification of natural spectral signatures in images is one potential application. The current trend in classification is oriented towards the combination of simple classifiers. In this paper we propose a combined strategy based on the Deterministic Simulated Annealing (DSA) framework. The simple classifiers used are the well-tested supervised parametric Bayesian estimator and Fuzzy Clustering. DSA is an optimization approach that minimizes an energy function. The main contribution of DSA is its ability to avoid local minima during the optimization process thanks to the annealing scheme. It outperforms the simple classifiers used for the combination and some combined strategies, including a scheme based on fuzzy cognitive maps and an optimization approach based on the Hopfield neural network paradigm.