
448 research outputs found

Fast concurrent object classification and localization

Author: Darrell Trevor
Lee John J.
Yeh Tom
Publication date: 10/06/2008

Object localization and classification are important problems in computer vision. However, in many applications, exhaustive search over all class labels and image locations is computationally prohibitive. While several methods have been proposed to make either classification or localization more efficient, few have dealt with both tasks simultaneously. This paper proposes an efficient method for concurrent object localization and classification based on a data-dependent multi-class branch-and-bound formalism. Existing bag-of-features classification schemes, which can be expressed as weighted combinations of feature counts, can be readily adapted to our method. We present experimental results that demonstrate the merit of our algorithm in terms of classification accuracy, localization accuracy, and speed, compared to baseline approaches including exhaustive search, the ISM method, and single-class branch and bound.

DSpace@MIT
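The abstract above notes that bag-of-features classifiers can be written as weighted combinations of feature counts, and that such scores fit a branch-and-bound search. The following is only a minimal sketch of that idea, not the authors' implementation: the names `window_score`, `upper_bound`, and `classify_window`, the tuple layouts, and the toy weights are all assumptions made for illustration. The bound shown is the standard one for additive scores (positive weights counted over the largest candidate box, negative weights over the smallest), which is what lets a search discard whole sets of boxes at once.

```python
from typing import Dict, List, Tuple

Feature = Tuple[int, int, int]   # (x, y, visual_word_id), hypothetical layout
Box = Tuple[int, int, int, int]  # (left, top, right, bottom), inclusive


def inside(x: int, y: int, box: Box) -> bool:
    """Return True if point (x, y) lies within the inclusive box."""
    left, top, right, bottom = box
    return left <= x <= right and top <= y <= bottom


def window_score(features: List[Feature], weights: Dict[int, float], box: Box) -> float:
    """Score a box as the sum of per-visual-word weights of the features it contains."""
    return sum(weights.get(w, 0.0) for x, y, w in features if inside(x, y, box))


def upper_bound(features: List[Feature], weights: Dict[int, float],
                outer: Box, inner: Box) -> float:
    """Bound the score of every box B with inner <= B <= outer (by containment).

    Positive weights can contribute at most what the outer box holds;
    negative weights must contribute at least what the inner box holds.
    """
    pos = sum(max(weights.get(w, 0.0), 0.0) for x, y, w in features if inside(x, y, outer))
    neg = sum(min(weights.get(w, 0.0), 0.0) for x, y, w in features if inside(x, y, inner))
    return pos + neg


def classify_window(features: List[Feature],
                    class_weights: Dict[str, Dict[int, float]],
                    box: Box) -> Tuple[str, float]:
    """Pick the class whose weighted feature counts give the highest score in the box."""
    scores = {c: window_score(features, w, box) for c, w in class_weights.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]


# Toy data: four features and two hypothetical class weight tables.
features = [(1, 1, 0), (2, 2, 1), (8, 8, 0), (5, 5, 2)]
class_weights = {
    "car": {0: 1.0, 1: -0.5, 2: 2.0},
    "face": {0: -1.0, 1: 2.0},
}
label, score = classify_window(features, class_weights, (0, 0, 3, 3))
```

A full search would repeatedly split the set of candidate boxes (and class labels), keep the subsets ordered by `upper_bound`, and stop when the best subset contains a single box; that loop is omitted here to keep the sketch short.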

GUI Testing Using Computer Vision

Author: Chang Tsung-Hsiang
Miller Robert C.
Yeh Tom
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2010

Testing a GUI's visual behavior typically requires human testers to interact with the GUI and to observe whether the expected results of interaction are presented. This paper presents a new approach to GUI testing that uses computer vision to let testers automate their tasks. Testers can write a visual test script that uses images to specify which GUI components to interact with and what visual feedback to observe. Testers can also generate visual test scripts by demonstration. By recording both input events and screen images, it is possible to extract the images of components interacted with and the visual feedback seen by the demonstrator, and generate a visual test script automatically. We show that a variety of GUI behavior can be tested using this approach. Also, we show how this approach can facilitate good testing practices such as unit testing, regression testing, and test-driven development. National Science Foundation (U.S.) (Grant number IIS-0447800); Quanta Computer (Firm) (TParty project)

DSpace@MIT

Crossref

Empower Children in Nigeria to Design the Future of Artificial Intelligence (AI) through Writing

Author: Adejoro Cornelius
Arn Luise
Schwartz Larissa
Yeh Tom
Publication date: 15/03/2023

This paper presents a new approach to engaging children in Nigeria to share their views of AI. This approach is centered on an inclusive writing contest in which children in a secondary school in Abuja write about AI to compete for prizes and share their writings with others. A preliminary analysis of the first 11 articles we received shows diverse gender and ethnic representation and conveys cultural values and perspectives distinct from those of children in Western countries. This finding suggests future work to conduct in-depth cross-cultural analysis of the articles and to replicate similar writing contests to engage children in other underrepresented countries.

arXiv.org e-Print Archive

Sikuli: Using GUI screenshots for search and automation

Author: Chang Tsung-Hsiang
Miller Robert C.
Yeh Tom
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2009

We present Sikuli, a visual approach to search and automation of graphical user interfaces using screenshots. Sikuli allows users to take a screenshot of a GUI element (such as a toolbar button, icon, or dialog box) and query a help system using the screenshot instead of the element's name. Sikuli also provides a visual scripting API for automating GUI interactions, using screenshot patterns to direct mouse and keyboard events. We report a web-based user study showing that searching by screenshot is easy to learn and faster to specify than keywords. We also demonstrate several automation tasks suitable for visual scripting, such as map navigation and bus tracking, and show how visual scripting can improve interactive help systems previously proposed in the literature.

CiteSeerX

DSpace@MIT

Crossref
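The Sikuli abstract above describes using screenshot patterns to direct mouse events. As a rough illustration of that idea only, the sketch below locates a small pattern inside a larger screen image and returns the point a script might click. It is not Sikuli's API or matching method: `find_pattern` and `click_target` are hypothetical names, images are assumed to be plain 2D lists of integer pixel values, and matching is exact, whereas a practical tool would need to tolerate small pixel differences.

```python
from typing import List, Optional, Tuple

Image = List[List[int]]  # rows of integer pixel values (assumed representation)


def find_pattern(screen: Image, pattern: Image) -> Optional[Tuple[int, int]]:
    """Return (left, top) of the first exact occurrence of pattern in screen, or None."""
    ph, pw = len(pattern), len(pattern[0])
    sh, sw = len(screen), len(screen[0])
    for top in range(sh - ph + 1):
        for left in range(sw - pw + 1):
            # Compare the pattern row by row against the screen slice at this offset.
            if all(screen[top + dy][left:left + pw] == pattern[dy] for dy in range(ph)):
                return (left, top)
    return None


def click_target(screen: Image, pattern: Image) -> Optional[Tuple[int, int]]:
    """Return the (x, y) centre of the matched region, where a click would be sent."""
    hit = find_pattern(screen, pattern)
    if hit is None:
        return None
    left, top = hit
    return (left + len(pattern[0]) // 2, top + len(pattern) // 2)


# Toy 4x5 "screen" containing a 2x2 "button" at column 1, row 1.
screen = [
    [0, 0, 0, 0, 0],
    [0, 1, 2, 0, 0],
    [0, 3, 4, 0, 0],
    [0, 0, 0, 0, 0],
]
button = [[1, 2], [3, 4]]
target = click_target(screen, button)
```

In a visual script, the returned coordinates would be handed to whatever mechanism generates the mouse event; a missing pattern (a `None` result) is the natural place for a test script to report a failure.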