Image mining: trends and developments
[Abstract]: Advances in image acquisition and storage technology have led to tremendous growth in very large and detailed image databases. These images, if analyzed, can reveal useful information to human users. Image mining deals with the extraction of implicit knowledge, image data relationships, and other patterns not explicitly stored in the images. Image mining is more than an extension of data mining to the image domain: it is an interdisciplinary endeavor that draws upon expertise in computer vision, image processing, image retrieval, data mining, machine learning, databases, and artificial intelligence. In this paper, we examine the research issues in image mining and current developments in the field, particularly image mining frameworks and state-of-the-art techniques and systems. We also identify some future research directions for image mining.
Olfaction-enhanced multimedia: Perspectives and challenges
Olfaction—or smell—is one of the last challenges which multimedia and multimodal applications have to conquer. Enhancing such applications with olfactory stimuli has the potential to create a more complex—and richer—user multimedia experience, by heightening the sense of reality and diversifying user interaction modalities. Nonetheless, olfaction-enhanced multimedia remains a challenging research area. Recently, however, there have been initial signs of olfaction-enhanced applications in multimedia, with olfaction being used towards a variety of goals, including notification alerts, enhancing the sense of reality in immersive applications, and branding, to name but a few. Since the goal of a multimedia application is to inform and/or entertain users, achieving quality olfaction-enhanced multimedia applications from the users' perspective is vital to the success and continuity of these applications. Accordingly, in this paper we focus on investigating the user-perceived experience of olfaction-enhanced multimedia applications, with the aim of discovering the quality evaluation factors that are important from a user's perspective and thereby ensuring the continued advancement and success of olfaction-enhanced multimedia applications.
High-level feature detection from video in TRECVid: a 5-year retrospective of achievements
Successful and effective content-based access to digital video requires fast, accurate and scalable methods to determine the video content automatically. A variety of contemporary approaches to this rely on text taken from speech within the video, or on matching one video frame against others using low-level characteristics like colour, texture, or shapes, or on determining and matching objects appearing within the video. Possibly the most important technique, however, is one which determines the presence or absence of a high-level or semantic feature within a video clip or shot. By utilizing dozens, hundreds or even thousands of such semantic features we can support many kinds of content-based video navigation. Critically, however, this depends on being able to determine whether each feature is or is not present in a video clip.

The last 5 years have seen much progress in the development of techniques to determine the presence of semantic features within video. This progress can be tracked in the annual TRECVid benchmarking activity, where dozens of research groups measure the effectiveness of their techniques on common data using an open, metrics-based approach. In this chapter we summarise the work done on the TRECVid high-level feature task, showing the progress made year-on-year. This provides a fairly comprehensive statement on where the state of the art is regarding this important task, not just for one research group or for one approach, but across the spectrum. We then use this past and ongoing work as a basis for highlighting the trends that are emerging in this area, and the questions which remain to be addressed before we can achieve large-scale, fast and reliable high-level feature detection on video.
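The idea of supporting content-based navigation with many per-shot semantic feature scores can be sketched minimally as follows. The feature names, confidence values, and the additive scoring rule are illustrative assumptions, not taken from TRECVid systems.

```python
# Hedged sketch: rank video shots by combined detector confidences for a set
# of semantic features. Feature names and scores below are hypothetical.

def rank_shots(shots, wanted_features):
    """Rank shot ids by the sum of confidences for the wanted features.

    shots: dict mapping shot id -> {feature name: confidence in [0, 1]}
    wanted_features: feature names a content-based query asks for
    """
    def score(shot_scores):
        return sum(shot_scores.get(f, 0.0) for f in wanted_features)
    return sorted(shots, key=lambda sid: score(shots[sid]), reverse=True)

# Hypothetical detector output for three shots.
shots = {
    "shot_001": {"outdoor": 0.9, "person": 0.2, "vehicle": 0.1},
    "shot_002": {"outdoor": 0.3, "person": 0.8, "vehicle": 0.7},
    "shot_003": {"outdoor": 0.6, "person": 0.6, "vehicle": 0.2},
}

ranking = rank_shots(shots, ["person", "vehicle"])
print(ranking)  # shot_002 ranks first: highest combined person+vehicle score
```

Real systems weight and calibrate detector outputs rather than summing raw scores, but the sketch shows why per-shot feature presence is the critical primitive.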
An information assistant system for the prevention of tunnel vision in crisis management
In the crisis management environment, tunnel vision is a set of biases in decision makers' cognitive processes which often leads to an incorrect understanding of the real crisis situation, biased perception of information, and improper decisions. The tunnel vision phenomenon is a consequence of both the challenges of the task and the natural limitations of human cognition. An information assistant system is proposed with the purpose of preventing tunnel vision. The system serves as a platform for monitoring the on-going crisis event: all information goes through the system before it arrives at the user. The system enhances data quality, reduces data quantity, and presents the crisis information in a manner that prevents or mitigates the user's cognitive overload. While working with such a system, the users (crisis managers) are expected to be more likely to stay aware of the actual situation, stay open-minded to possibilities, and make proper decisions.
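The mediation idea described above — every report passes through quality-enhancement and quantity-reduction stages before presentation — can be sketched as a simple pipeline. The stage names and the cleaning/deduplication rules are illustrative assumptions, not the paper's actual processing.

```python
# Hedged sketch of the mediation pipeline: incoming crisis reports pass
# through processing stages before reaching the crisis manager. The concrete
# rules (trimming, deduplication, display cap) are illustrative assumptions.

def enhance_quality(reports):
    # e.g. normalise whitespace and drop empty reports
    return [r.strip() for r in reports if r.strip()]

def reduce_quantity(reports):
    # e.g. drop exact duplicates while preserving arrival order
    seen, unique = set(), []
    for r in reports:
        if r not in seen:
            seen.add(r)
            unique.append(r)
    return unique

def present(reports, limit=5):
    # cap what is shown at once to avoid cognitive overload
    return reports[:limit]

incoming = ["Fire at depot ", "", "Fire at depot", "Road A1 blocked"]
shown = present(reduce_quantity(enhance_quality(incoming)))
print(shown)  # duplicates and empty reports removed before display
```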
K-Space at TRECVid 2008
In this paper we describe K-Space's participation in TRECVid 2008 in the interactive search task. For 2008 the K-Space group performed one of the largest interactive video information retrieval experiments conducted in a laboratory setting, with three institutions participating in a multi-site, multi-system experiment. In total 36 users participated, 12 each from Dublin City University (DCU, Ireland), the University of Glasgow (GU, Scotland) and Centrum Wiskunde & Informatica (CWI, the Netherlands). Three user interfaces were developed: two from DCU, which were also used in 2007, as well as an interface from GU. All interfaces leveraged the same search service. Using a Latin squares arrangement, each user conducted 12 topics, yielding 6 runs per site, 18 in total. We officially submitted 3 of these runs to NIST for evaluation, with an additional expert run using a 4th system. Our submitted runs performed around the median. In this paper we present an overview of the search system utilized, the experimental setup, and a preliminary analysis of our results.
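The Latin squares arrangement mentioned above balances topic order across users so that order effects do not confound the comparison. A minimal sketch using a cyclic Latin square follows; the cyclic construction is an assumption, since the abstract does not specify which Latin square was used.

```python
# Hedged sketch of a Latin-squares topic ordering: each user gets a rotation
# of the topic list, so across n users every topic appears in every position
# exactly once. The cyclic construction is an assumption for illustration.

def latin_square_orders(topics):
    n = len(topics)
    return [[topics[(row + col) % n] for col in range(n)] for row in range(n)]

topics = list(range(1, 13))           # 12 search topics
orders = latin_square_orders(topics)  # one topic order per user

# Every topic appears once per row (user) and once per column (position).
assert all(sorted(row) == topics for row in orders)
assert all(sorted(orders[r][c] for r in range(12)) == topics
           for c in range(12))
```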
Using humanoid robots to study human behavior
Our understanding of human behavior advances as our humanoid robotics work progresses, and vice versa. This team's work focuses on trajectory formation and planning, learning from demonstration, oculomotor control, and interactive behaviors. They are programming robotic behavior based on how we humans "program" behavior in, or train, each other.
Project RISE: Recognizing Industrial Smoke Emissions
Industrial smoke emissions pose a significant concern to human health. Prior works have shown that using Computer Vision (CV) techniques to identify smoke as visual evidence can influence the attitude of regulators and empower citizens to pursue environmental justice. However, existing datasets are not of sufficient quality or quantity to train the robust CV models needed to support air quality advocacy. We introduce RISE, the first large-scale video dataset for Recognizing Industrial Smoke Emissions. We adopted a citizen science approach, collaborating with local community members to annotate whether a video clip has smoke emissions. Our dataset contains 12,567 clips from 19 distinct views from cameras that monitored three industrial facilities. These daytime clips span 30 days over two years, including all four seasons. We ran experiments using deep neural networks to establish a strong performance baseline and reveal smoke recognition challenges. Our survey study discusses community feedback, and our data analysis reveals opportunities for integrating citizen scientists and crowd workers into the application of Artificial Intelligence for social good.
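With 19 distinct camera views over three facilities, one natural evaluation question is whether a smoke recognizer generalizes to viewpoints it was not trained on. A minimal sketch of a view-held-out split follows; the clip records, view names, and split protocol here are hypothetical, not the paper's actual setup.

```python
# Hedged sketch: split clips by camera view so no view appears in both
# training and test sets, probing generalization to unseen viewpoints.
# Clip records and view names are hypothetical illustrations.

def split_by_view(clips, test_views):
    train = [c for c in clips if c["view"] not in test_views]
    test = [c for c in clips if c["view"] in test_views]
    return train, test

clips = [
    {"id": 1, "view": "facility_a_north", "smoke": 1},
    {"id": 2, "view": "facility_a_north", "smoke": 0},
    {"id": 3, "view": "facility_b_east", "smoke": 1},
    {"id": 4, "view": "facility_c_west", "smoke": 0},
]

train, test = split_by_view(clips, test_views={"facility_b_east"})
print([c["id"] for c in train], [c["id"] for c in test])
```

A random split over clips would leak near-identical frames from the same camera into both sets; splitting on the view key avoids that.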
A Multimedia Approach to Game-Based Training: Exploring the Effects of the Modality and Temporal Contiguity Principles on Learning in a Virtual Environment
There is increasing interest in using video games to deliver training to individuals learning new skills or tasks. However, current research lacks a clear method for developing effective instructional material when these games are used as training tools, and for explaining how gameplay may affect learning. The literature contains multiple approaches to training and game-based training (GBT), but generally lacks a foundational, theoretically grounded account of how people learn specifically from video games and how to design instructional guidance within these gaming environments. This study investigated instructional delivery within GBT. Video games are a form of multimedia, consisting of both imagery and sounds. The Cognitive Theory of Multimedia Learning (CTML; Mayer, 2005) explicitly describes how people learn from multimedia information consisting of a combination of narration (words) and animation (pictures). This study empirically examined the effects of the modality and temporal contiguity principles on learning in a game-based virtual environment. Based on these principles, it was hypothesized that receiving either voice or embedded training would result in better performance on learning measures, and that receiving a combination of voice and embedded training would lead to better performance on learning measures than all other instructional conditions. A total of 128 participants received training on the role and procedures of the combat lifesaver: a non-medical soldier who receives additional training on combat-relevant lifesaving medical procedures. Training sessions involved an instructional presentation manipulated along the modality (voice or text) and temporal contiguity (embedded in the game or presented before gameplay) principles. Instructional delivery was manipulated in a 2x2 between-subjects design with four instructional conditions: Upfront-Voice, Upfront-Text, Embedded-Voice, and Embedded-Text.
Results indicated that: (1) upfront instruction led to significantly better retention performance than embedded instruction regardless of delivery modality; (2) voice-based instruction led to better transfer performance than text-based instruction regardless of presentation timing; (3) no differences in performance were observed on the simple application test between any instructional conditions; and (4) a significant modality-by-temporal-contiguity interaction was obtained. Simple effects analysis indicated differing effects of modality within the embedded instruction group, with voice recipients performing better than text recipients (p = .012). Individual group comparisons revealed that the upfront-voice group performed better on retention than both embedded groups (p = .006), the embedded-voice group performed better on transfer than the upfront-text group (p = .002), and the embedded-voice group performed better on the complex application test than the embedded-text group (p = .012). Findings indicated partial support for the application of the modality and temporal contiguity principles of CTML in interactive GBT. Combining gameplay (i.e., practice) with instructional presentation both helps and hinders working memory's ability to process information. Findings also suggest that expanding CTML into game-based training may fundamentally change how a person processes information as a function of the specific type of knowledge being taught. Results will drive future systematic research to test and determine the most effective means of designing instruction for interactive GBT. Further theoretical and practical implications are discussed.
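The 2x2 modality-by-temporal-contiguity design above tests an interaction: whether the voice-vs-text gap differs between upfront and embedded timing. A sketch of the interaction contrast on cell means follows; the scores are invented for illustration only and are not the study's data.

```python
# Hedged sketch of the 2x2 between-subjects design: modality (voice/text)
# crossed with timing (upfront/embedded). Scores are illustrative only --
# NOT the study's data. The interaction is the difference of differences.

from statistics import mean

scores = {  # hypothetical retention scores per condition
    ("upfront", "voice"):  [8, 9, 7],
    ("upfront", "text"):   [7, 8, 6],
    ("embedded", "voice"): [6, 7, 5],
    ("embedded", "text"):  [4, 5, 3],
}

cell = {k: mean(v) for k, v in scores.items()}

# Does the voice-vs-text gap differ between timing conditions?
gap_upfront = cell[("upfront", "voice")] - cell[("upfront", "text")]
gap_embedded = cell[("embedded", "voice")] - cell[("embedded", "text")]
interaction = gap_embedded - gap_upfront
print(gap_upfront, gap_embedded, interaction)
```

A nonzero interaction contrast (here the modality gap is larger under embedded timing) is what the study's significant modality-by-temporal-contiguity interaction reflects, subject to a significance test on the real data.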