
    DocMIR: An automatic document-based indexing system for meeting retrieval

    This paper describes the DocMIR system, which automatically captures, analyzes and indexes meetings, conferences, lectures, etc., by taking advantage of the documents projected during the events (e.g. slideshows, budget tables, figures). For instance, the system can automatically apply these procedures to a lecture and index the event according to the presented slides and their contents. For indexing, the system requires neither specific software installed on the presenter's computer nor any conscious intervention of the speaker throughout the presentation. The only material required by the system is the speaker's electronic presentation file. Even if it is not provided, the system still temporally segments the presentation and offers a simple storyboard-like browsing interface. The system runs on several capture boxes connected to cameras and microphones that record events synchronously. Once the recording is over, indexing is performed automatically by analyzing the content of the captured video containing the projected documents: detecting scene changes, identifying the documents, computing their durations and extracting their textual content. Each of the captured images is identified from a repository containing all original electronic documents, captured audio-visual data and metadata created during post-production. The identification is based on document signatures, which hierarchically structure features from both the layout structure and the color distributions of the document images. Video segments are finally enriched with the textual content of the identified original documents, which further facilitates query and retrieval without using OCR. The signature-based indexing method proposed in this article is robust, works with low-resolution images and can be applied to several other applications, including real-time document recognition, multimedia IR and augmented reality systems.
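    As a rough illustration of the scene-change step described above, the sketch below flags likely slide transitions in a captured projection video using sampled frame differencing. The sampling interval, resize resolution and thresholds are illustrative assumptions, not DocMIR's actual algorithm.

```python
# Minimal sketch of slide-change detection in a captured projection video.
# Sampling interval, resize resolution and thresholds are illustrative
# assumptions, not DocMIR's actual algorithm.
import cv2
import numpy as np

def detect_slide_changes(video_path, pixel_thresh=25, change_ratio=0.10, sample_every=15):
    """Return indices of sampled frames where the projected document likely changed."""
    cap = cv2.VideoCapture(video_path)
    changes, prev_gray, idx = [], None, 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if idx % sample_every == 0:
            gray = cv2.cvtColor(cv2.resize(frame, (160, 120)), cv2.COLOR_BGR2GRAY)
            if prev_gray is not None:
                # fraction of pixels that changed noticeably since the last sample
                changed = np.mean(cv2.absdiff(gray, prev_gray) > pixel_thresh)
                if changed > change_ratio:
                    changes.append(idx)
            prev_gray = gray
        idx += 1
    cap.release()
    return changes
```

    The gaps between successive change points then give each projected document's on-screen duration, which feeds the index.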

    Provision of rock shed slope retaining structures: a landslide mitigation measure at Simpang Pulai - Blue Valley, Perak

    The construction industry is a highly challenging industry, not only in Malaysia but throughout the world, falling within the 3D scope of dirty, difficult and dangerous work. The industry is also among the largest contributors to GDP, at 7.4 percent in 2016, even though it is among the largest contributors to safety incidents, namely accidents (CIDB, 2017). Therefore, the responsible parties should take the problems faced seriously so that the industry can compete at the international level.

    Information Technology and Human Factors to Enhance Design and Constructability Review Processes in Construction

    Emerging information and communication technology (ICT) has had an enormous effect on the building architecture, engineering, construction and operation (AECO) fields in recent decades. The effects have resonated in several disciplines, such as project information flow, design representation and communication, and Building Information Modeling (BIM) approaches. However, these effects can potentially impact communication and coordination of virtual design content in both the design and construction phases. Therefore, given the great potential of emerging technologies in construction projects, it is essential to understand how these technologies influence virtual design information within organizations as well as individuals' behaviors. This research focuses on understanding current emerging technologies and their impacts on projects' virtual design information and communication among project stakeholders within AECO organizations. Dissertation/Thesis. Doctoral Dissertation, Civil and Environmental Engineering, 201

    Augmented reality for assisted production in an industrial environment

    Smart factories are becoming more and more common, and Augmented Reality (AR) is a pillar of the transition to Industry 4.0 and smart manufacturing. AR can improve many industrial processes such as training, maintenance, assembly, quality control, remote collaboration and others. AR has the potential to revolutionize the way information is accessed, used and exchanged, extending users' perception and improving their performance. This work proposes a Pervasive AR tool, created in collaboration with industrial partners, to support the training of operators on industrial shop floors while performing production operations. A Human-Centered Design (HCD) methodology was used to identify operators' difficulties and challenges and to define requirements. After initial meetings with stakeholders, an AR prototype was designed and developed to allow the configuration and visualization of AR content on the shop floor. Several meetings and user studies were conducted to evaluate the developed tools and to improve their usability and features. Comparisons were conducted between the proposed Head Mounted Display (HMD) solution, the method currently used on the shop floor and alternative mobile-based AR solutions. The results of the user studies suggest that the proposed AR system can significantly improve the performance of novice operators (by up to 70% compared with the method currently used on the shop floor). Master's in Computer and Telematics Engineering.

    MIDV-2019: Challenges of the modern mobile-based document OCR

    Recognition of identity documents using mobile devices has become a topic of a wide range of computer vision research. The portfolio of methods and algorithms for solving such tasks as face detection, document detection and rectification, text field recognition, and others, is growing, and the scarcity of datasets has become an important issue. One of the openly accessible datasets for evaluating such methods is MIDV-500, containing video clips of 50 identity document types in various conditions. However, the variability of capturing conditions in MIDV-500 did not address some of the key issues, mainly significant projective distortions and different lighting conditions. In this paper we present the MIDV-2019 dataset, containing video clips shot with modern high-resolution mobile cameras, with strong projective distortions and low lighting conditions. A description of the added data is presented, along with experimental baselines for text field recognition in different conditions. The dataset is available for download at ftp://smartengines.com/midv-500/extra/midv-2019/. Comment: 6 pages, 3 figures, 3 tables, 18 references, submitted and accepted to the 12th International Conference on Machine Vision (ICMV 2019).
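    Since the new clips add strong projective distortions, a typical preprocessing step for such frames is to rectify the detected document quadrilateral with a perspective transform. A minimal OpenCV sketch follows, assuming the four document corners have already been found by a detector; the corner ordering and output size (roughly ID-1 card proportions) are illustrative assumptions.

```python
# Rectify a projectively distorted document frame, assuming the four corners
# of the document have already been detected. The corner ordering and the
# output size (roughly ID-1 card proportions) are illustrative assumptions.
import cv2
import numpy as np

def rectify_document(frame, corners, out_w=856, out_h=540):
    """corners: 4x2 array ordered top-left, top-right, bottom-right, bottom-left."""
    src = np.asarray(corners, dtype=np.float32)
    dst = np.array([[0, 0], [out_w - 1, 0],
                    [out_w - 1, out_h - 1], [0, out_h - 1]], dtype=np.float32)
    H = cv2.getPerspectiveTransform(src, dst)            # homography from 4 point pairs
    return cv2.warpPerspective(frame, H, (out_w, out_h))  # fronto-parallel document image
```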

    Educational Handheld Video: Examining Shot Composition, Graphic Design, And Their Impact On Learning

    Formal features of video such as shot composition and graphic design can weigh heavily on the success or failure of educational videos. Many studies have assessed the proper use of these techniques given the psychological expectations that viewers have for video programming (Hawkins et al., 2002; Kenny, 2002; Lang, Zhou, Schwardtz, Bolls, & Potter, 2000; McCain, Chilberg, & Wakshlag, 1977; McCain & Repensky, 1972; Miller, 2005; Morris, 1984; Roe, 1998; Schmitt, Anderson, & Collins, 1999; Sherman & Etling, 1991; Tannenbaum & Fosdick, 1960; Wagner, 1953). This study examined formal features within the context of the newly emerging distribution method of viewing video productions on mobile handheld devices. Shot composition and graphic design were examined in the context of an educational video to measure whether they had any influence on user perceptions of learning and on learning outcomes. The two formal features were modified for display on 24-inch screens and on 3.5-inch or smaller screens. From a sample of 132 undergraduate college students, participants were shown one of the four modified treatments and then given a test to measure whether the modified formal features had any impact on learning outcomes. No significant differences were found between the treatment groups as a result of the manipulation of formal features.

    A novel video game peripheral for detecting fine hand motion and providing haptic feedback

    Thesis (S.B.)--Massachusetts Institute of Technology, Dept. of Mechanical Engineering, 2012. Cataloged from PDF version of thesis. Includes bibliographical references (p. 51-53). This thesis documents the design and implementation of a game controller glove that employs optical tracking technology to detect movement of the hand and fingers. The vision algorithm captures an image from a webcam in real time and determines the centroids of colored sections on a glove worn by the player, assigning a distinctive identifier to each section, which is associated with a 3D model retrieved from a preexisting library. A Vivitouch artificial muscle module is also mounted on the top of the glove to provide vibratory haptic feedback to the user. The system has been user tested, and a number of potential use scenarios have been conceived for integration of the controller in various gaming applications. By Samantha N. Powers and Lauren K. Gust. S.B.
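    The thesis's vision algorithm finds the centroids of coloured glove sections in each webcam frame. Below is a minimal sketch of one common way to do this, using HSV thresholding and image moments in OpenCV; the colour range and camera index are illustrative assumptions rather than the authors' calibration.

```python
# Locate the centroid of one coloured glove section in a webcam frame using
# HSV thresholding and image moments. The HSV range and camera index are
# illustrative assumptions; a real system would calibrate a range per section.
import cv2
import numpy as np

def find_section_centroid(frame, hsv_lo=(100, 120, 70), hsv_hi=(130, 255, 255)):
    hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
    mask = cv2.inRange(hsv, np.array(hsv_lo), np.array(hsv_hi))  # isolate one colour band
    m = cv2.moments(mask, binaryImage=True)
    if m["m00"] == 0:                   # section not visible in this frame
        return None
    return (m["m10"] / m["m00"], m["m01"] / m["m00"])  # (x, y) centroid in pixels

cap = cv2.VideoCapture(0)               # default webcam assumed at index 0
ok, frame = cap.read()
if ok:
    print(find_section_centroid(frame))
cap.release()
```

    In the full system, one such centroid per coloured section would drive the corresponding part of the 3D hand model.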

    Efficiency Testing of an Electronic Speed Controller

    This project required the development of a rig that could experimentally determine the efficiency of an Electronic Speed Controller (ESC). The selected design focuses on measuring the losses due to heat from the device and comparing them to its input power. The selected design is a flow rig that utilizes the heat equation q = ṁ·c_p·ΔT. The rig provides a steady-state measurement of the ESC heat output by passing a known mass flow rate of air across the ESC and measuring the temperature difference. It uses a flowmeter to determine ṁ, thermocouples to determine ΔT, and a table lookup to determine c_p. After testing with a known heat source, a DC-powered silicon heater, it was found that at a flow rate of 30 L/min the rig is able to capture greater than 90% of the heat emitted over a range of 2.5 to 10 watts. This rig is recommended for use in testing the heat losses of an Electronic Speed Controller as well as any other small-form-factor, constant-heat-emitting devices.
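    For a sense of the magnitudes involved, the heat captured by the rig follows directly from q = ṁ·c_p·ΔT. The short worked sketch below uses assumed, illustrative values for the air properties and the temperature rise, not measurements from the project.

```python
# Worked example of q = mdot * c_p * dT for the flow rig.
# Air properties and the temperature rise are illustrative assumptions,
# not measurements from the project.
RHO_AIR = 1.20     # kg/m^3, density of air near room temperature
CP_AIR = 1005.0    # J/(kg*K), specific heat of air at constant pressure

flow_lpm = 30.0                          # volumetric flow rate, L/min
mdot = RHO_AIR * flow_lpm / 1000 / 60    # mass flow rate, kg/s (~6.0e-4)
delta_T = 12.0                           # K, assumed rise across the ESC

q = mdot * CP_AIR * delta_T              # W of heat carried away by the air
print(f"mdot = {mdot:.2e} kg/s, q = {q:.2f} W")   # ~7.2 W, within the tested 2.5-10 W range
```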

    HandSight: A Touch-Based Wearable System to Increase Information Accessibility for People with Visual Impairments

    Many activities of daily living such as getting dressed, preparing food, wayfinding, or shopping rely heavily on visual information, and the inability to access that information can negatively impact the quality of life for people with vision impairments. While numerous researchers have explored solutions for assisting with visual tasks that can be performed at a distance, such as identifying landmarks for navigation or recognizing people and objects, few have attempted to provide access to nearby visual information through touch. Touch is a highly attuned means of acquiring tactile and spatial information, especially for people with vision impairments. By supporting touch-based access to information, we may help users to better understand how a surface appears (e.g., document layout, clothing patterns), thereby improving their quality of life. To address this gap in research, this dissertation explores methods to augment a visually impaired user’s sense of touch with interactive, real-time computer vision to access information about the physical world. These explorations span three application areas: reading and exploring printed documents, controlling mobile devices, and identifying colors and visual textures. At the core of each application is a system called HandSight that uses wearable cameras and other sensors to detect touch events and identify surface content beneath the user’s finger. To create HandSight, we designed and implemented the physical hardware, developed signal processing and computer vision algorithms, and designed real-time feedback that enables users to interpret visual or digital content. We involved visually impaired users throughout the design and development process, conducting several user studies to assess usability and robustness and to improve our prototype designs. The contributions of this dissertation include: (i) developing and iteratively refining HandSight, a novel wearable system to assist visually impaired users in their daily lives; (ii) evaluating HandSight across a diverse set of tasks, and identifying tradeoffs of a finger-worn approach in terms of physical design, algorithmic complexity and robustness, and usability; and (iii) identifying broader design implications for future wearable systems and for the fields of accessibility, computer vision, augmented and virtual reality, and human-computer interaction.

    Content-based indexing of low resolution documents

    In multimedia presentations, the trend of attendees taking pictures of slides that interest them using capture devices is gaining popularity. To enhance the usefulness of these images, the captured images can be linked to an image or video database. The database can be used for file archiving, teaching and learning, research and knowledge management, all of which involve image search. However, the above-mentioned devices, including cameras and mobile phones, produce low-resolution images degraded by poor lighting and noise. Content-Based Image Retrieval (CBIR) is considered among the most interesting and promising fields as far as image search is concerned. Image search involves finding images in a given database that are similar to a known query image. This thesis concerns methods for identifying documents captured with such devices, as well as a technique for retrieving images from an indexed image database; both apply digital image processing techniques. To build an indexed structure for fast and high-quality content-based image retrieval, existing representative signatures and the key indexes used have been revised. Retrieval performance relies heavily on how the indexing is done. Existing retrieval approaches include those based on shape, colour and texture features. Considering these features relative to individual databases, most retrieval approaches perform poorly on low-resolution documents, consume a lot of time and, in some cases, return irrelevant images for a given query image. The identification and indexing method proposed in this thesis uses a Visual Signature (VS), which consists of the captured slide's textual layout information, shape moments and spatial distribution of colour. This signature-based approach enables fast and efficient matching to fulfil the needs of real-time applications. It can also overcome low-resolution document problems such as noisy images, varying lighting conditions and complex backgrounds. We present hierarchical indexing techniques based on trees and clustering. K-means clustering is used for visual features such as colour, since their spatial distribution provides good global information about an image. Tree indexes for the extracted layout and shape features are structured hierarchically, and Euclidean distance is used to find similar images for CBIR. The proposed indexing scheme is assessed using recall and precision, standard measures of CBIR retrieval performance. We developed a CBIR system and conducted various retrieval experiments with the fundamental aim of comparing retrieval accuracy. A new algorithm that can be used with integrated visual signatures, especially in late-fusion queries, was introduced; it reduces the shortcomings associated with normalisation in early fusion techniques. Slides from conference, lecture and meeting presentations are used as real data to compare the proposed technique's performance with that of existing approaches. These findings present exciting possibilities, as the CBIR system is able to produce high-quality results even for queries that use low-resolution documents.
    In the future, the use of multimodal signatures, relevance feedback and artificial intelligence techniques is recommended in the CBIR system to further enhance performance.
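    As a rough illustration of the colour component of such a visual signature and its cluster-then-match retrieval, here is a minimal sketch; the histogram size, number of clusters and use of scikit-learn are illustrative assumptions rather than the thesis's exact implementation.

```python
# Sketch of a colour-based visual signature with k-means indexing and
# Euclidean matching. Histogram size, cluster count and the use of
# scikit-learn are illustrative assumptions, not the thesis's implementation.
import cv2
import numpy as np
from sklearn.cluster import KMeans

def colour_signature(image_bgr, bins=8):
    """Normalised 3D colour histogram flattened into a feature vector."""
    hist = cv2.calcHist([image_bgr], [0, 1, 2], None,
                        [bins, bins, bins], [0, 256, 0, 256, 0, 256]).flatten()
    return hist / (hist.sum() + 1e-9)

def build_index(signatures, n_clusters=16):
    """Cluster the database signatures so a query searches only one cluster."""
    return KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit(signatures)

def retrieve(query_sig, km, signatures, top_k=5):
    """signatures: (N, D) array; returns indices of the top_k nearest matches."""
    cluster = km.predict(query_sig[None, :])[0]
    members = np.where(km.labels_ == cluster)[0]
    dists = np.linalg.norm(signatures[members] - query_sig, axis=1)   # Euclidean
    return members[np.argsort(dists)[:top_k]]
```

    In the full signature, the layout and shape-moment features would sit alongside this colour vector, with the tree index narrowing the candidate set before the final Euclidean comparison.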