4,481 research outputs found

    Investigating non-visual eye movements non-intrusively: Comparing manual and automatic annotation styles

    Non-visual eye-movements (NVEMs) are eye movements that do not serve the provision of visual information. As of yet, their cognitive origins and meaning remain under-explored in eye-movement research. The first problem presenting itself in pursuit of their study is one of annotation: in virtue of their being non-visual, they are not necessarily bound to a specific surface or object of interest, rendering conventional eye-trackers non-ideal for their study. This, however, makes it potentially viable to investigate them without requiring high-resolution data. In this report, we present two approaches to annotating NVEM data – one of them grid-based, involving manual annotation in ELAN (Max Planck Institute for Psycholinguistics: The Language Archive, 2019), the other one Cartesian coordinate-based, derived algorithmically through OpenFace (Baltrušaitis et al., 2018). We evaluated a) the two approaches in themselves, e.g. in terms of consistency, as well as b) their compatibility, i.e. the possibilities of mapping one to the other. In the case of a), we found good overall consistency in both approaches; in the case of b), there is evidence that the OpenFace gaze estimations can eventually be mapped onto the manual coding grid.
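
To make the idea of mapping coordinate-based gaze estimates onto a coding grid concrete, here is a minimal Python sketch. It is not the authors' mapping. It assumes the input is a pair of gaze angles in radians (such as the `gaze_angle_x` and `gaze_angle_y` columns that OpenFace writes), and the 3x3 grid size, the 0.6 rad range, and the function names are illustrative assumptions.

```python
def angle_to_bin(angle, limit, bins):
    """Map an angle in [-limit, limit] to a bin index 0..bins-1, clamping outliers."""
    clamped = max(-limit, min(limit, angle))
    # Scale to [0, 1], then to a bin index; the upper edge falls into the last bin.
    fraction = (clamped + limit) / (2 * limit)
    return min(bins - 1, int(fraction * bins))


def gaze_to_cell(angle_x, angle_y, limit=0.6, bins=3):
    """Return a (row, column) grid cell for one gaze estimate.

    `limit` (radians) and `bins` are hypothetical values; which row or column
    corresponds to "up" or "left" depends on the tracker's sign convention.
    """
    return angle_to_bin(angle_y, limit, bins), angle_to_bin(angle_x, limit, bins)


def agreement(manual_cells, automatic_cells):
    """Fraction of frames on which manual and automatic grid labels coincide."""
    if len(manual_cells) != len(automatic_cells) or not manual_cells:
        raise ValueError("need two non-empty label sequences of equal length")
    matches = sum(m == a for m, a in zip(manual_cells, automatic_cells))
    return matches / len(manual_cells)
```

A simple frame-by-frame agreement score like `agreement` is only one way to judge compatibility; chance-corrected measures would be a natural refinement.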

    Grasp-sensitive surfaces

    Grasping objects with our hands allows us to skillfully move and manipulate them. Hand-held tools further extend our capabilities by adapting precision, power, and shape of our hands to the task at hand. Some of these tools, such as mobile phones or computer mice, already incorporate information processing capabilities. Many other tools may be augmented with small, energy-efficient digital sensors and processors. This allows graspable objects to learn about the user grasping them and to support the user's goals. For example, the way we grasp a mobile phone might indicate whether we want to take a photo or call a friend with it, and thus serve as a shortcut to that action. A power drill might sense whether the user is grasping it firmly enough and refuse to turn on if this is not the case. And a computer mouse could distinguish between intentional and unintentional movement and ignore the latter. This dissertation gives an overview of grasp sensing for human-computer interaction, focusing on technologies for building grasp-sensitive surfaces and challenges in designing grasp-sensitive user interfaces. It comprises three major contributions: a comprehensive review of existing research on human grasping and grasp sensing, a detailed description of three novel prototyping tools for grasp-sensitive surfaces, and a framework for analyzing and designing grasp interaction. For nearly a century, scientists have analyzed human grasping. My literature review gives an overview of definitions, classifications, and models of human grasping. A small number of studies have investigated grasping in everyday situations. They found a much greater diversity of grasps than described by existing taxonomies. This diversity makes it difficult to directly associate certain grasps with users' goals. In order to structure related work and my own research, I formalize a generic workflow for grasp sensing. 
It comprises *capturing* of sensor values, *identifying* the associated grasp, and *interpreting* the meaning of the grasp. A comprehensive overview of related work shows that implementation of grasp-sensitive surfaces is still hard, researchers often are not aware of related work from other disciplines, and intuitive grasp interaction has not yet received much attention. In order to address the first issue, I developed three novel sensor technologies designed for grasp-sensitive surfaces. These mitigate one or more limitations of traditional sensing techniques: **HandSense** uses four strategically positioned capacitive sensors for detecting and classifying grasp patterns on mobile phones. The use of custom-built high-resolution sensors allows detecting proximity and avoids the need to cover the whole device surface with sensors. User tests showed a recognition rate of 81%, comparable to that of a system with 72 binary sensors. **FlyEye** uses optical fiber bundles connected to a camera for detecting touch and proximity on arbitrarily shaped surfaces. It allows rapid prototyping of touch- and grasp-sensitive objects and requires only very limited electronics knowledge. For FlyEye I developed a *relative calibration* algorithm that allows determining the locations of groups of sensors whose arrangement is not known. **TDRtouch** extends Time Domain Reflectometry (TDR), a technique traditionally used for inspecting cable faults, for touch and grasp sensing. TDRtouch is able to locate touches along a wire, allowing designers to rapidly prototype and implement modular, extremely thin, and flexible grasp-sensitive surfaces. I summarize how these technologies cater to different requirements and significantly expand the design space for grasp-sensitive objects. Furthermore, I discuss challenges for making sense of raw grasp information and categorize interactions. 
Traditional application scenarios for grasp sensing use only the grasp sensor's data, and only for mode-switching. I argue that data from grasp sensors is part of the general usage context and should only be used in combination with other context information. For analyzing and discussing the possible meanings of grasp types, I created the GRASP model. It describes five categories of influencing factors that determine how we grasp an object: *Goal* -- what we want to do with the object, *Relationship* -- what we know and feel about the object we want to grasp, *Anatomy* -- hand shape and learned movement patterns, *Setting* -- surrounding and environmental conditions, and *Properties* -- texture, shape, weight, and other intrinsics of the object. I conclude the dissertation with a discussion of upcoming challenges in grasp sensing and grasp interaction, and provide suggestions for implementing robust and usable grasp interaction.
The ability to grasp objects with our hands allows us to manipulate them in many ways. Tools extend our abilities further by adapting the precision, power, and shape of our hands to the task. Digital tools, such as mobile phones or computer mice, also allow us to extend the capabilities of our brain and sensory organs. These devices already contain sensors and processing units. But many other tools and objects can also be augmented with tiny, efficient sensors and processing units. This allows graspable objects to learn more about the user grasping them, and makes it possible to support the user in reaching their goal. For example, the way we hold a mobile phone could reveal whether we want to take a photo or call a friend, and thus serve as a shortcut for these actions. 
A power drill could detect whether the user is really holding it securely and refuse to operate if not. And a computer mouse could distinguish between intentional and unintentional mouse movements and ignore the latter. This dissertation gives an overview of grasp sensing for human-computer interaction, with a focus on technologies for implementing grasp-sensitive surfaces and on challenges in designing grasp-sensitive user interfaces. It comprises three primary contributions to the state of research: a comprehensive overview of previous research on human grasping and grasp sensing, a detailed description of three new prototyping tools for grasp-sensitive surfaces, and a framework for the analysis and design of grasp-based interaction (*grasp interaction*). For nearly a century, scientists have studied human grasping. My overview of the state of research describes definitions, classifications, and models of human grasping. Only a few studies so far have examined grasping in everyday situations. They found a considerably greater diversity of grasp patterns than existing taxonomies can describe. This diversity makes it difficult to associate particular grasp patterns with a user's intention. To structure related work and my own research results, I formalize a general workflow for grasp sensing. It consists of *capturing* sensor values, *identifying* the associated grasps, and *interpreting* the meaning of the grasp. 
In a comprehensive overview of related work, I show that implementing grasp-sensitive surfaces is still a challenging problem, that researchers are regularly unaware of related work in neighboring fields, and that intuitive grasp interaction has so far received little attention. To address the first problem, I developed three novel sensing techniques for grasp-sensitive surfaces. Each mitigates one or more weaknesses of traditional sensing techniques: **HandSense** uses four strategically positioned capacitive sensors to recognize grasp patterns. Custom-built, high-resolution sensors make it possible to detect even the approach toward the object. In addition, the object's surface does not have to be completely covered with sensors. User tests yielded a recognition rate comparable to that of a system with 72 binary sensors. **FlyEye** uses optical fiber bundles connected to a camera to detect proximity and touch on arbitrarily shaped surfaces. It enables designers with limited electronics experience to rapidly prototype touch- and grasp-sensitive objects. For FlyEye I developed a *relative calibration* algorithm that can be used to semi-automatically arrange groups of sensors whose arrangement is unknown. **TDRtouch** extends Time Domain Reflectometry (TDR), a technique usually used to analyze cable damage. TDRtouch makes it possible to locate touches along a wire. This allows modular, extremely thin, and flexible grasp-sensitive surfaces to be developed quickly. I describe how these techniques meet different requirements and significantly expand the *design space* for grasp-sensitive objects. 
Furthermore, I discuss the challenges in making sense of grasp information and present a classification of interaction possibilities. Previous application examples for grasp sensing use only data from the grasp sensors and are limited to mode switching. I argue that this sensor data is part of the general usage context and should only be used in combination with other context information. To analyze and discuss the possible meanings of grasp types, I developed the GRASP model. It describes five categories of influencing factors that determine how we grasp an object: *Goal* -- the goal we want to achieve with the grasp, *Relationship* -- the relationship to the object, *Anatomy* -- hand shape and movement patterns, *Setting* -- environmental factors, and *Properties* -- properties of the object, such as surface texture, shape, or weight. I conclude with a discussion of new challenges in grasp sensing and grasp interaction and make suggestions for developing reliable and usable grasp interaction.
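
The abstract says that TDRtouch locates touches along a wire using Time Domain Reflectometry. The basic TDR relation can be sketched as follows, assuming a touch shows up as a reflection whose round-trip delay can be measured. The function name and the default velocity factor of 0.66 are illustrative assumptions, not values from the dissertation.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0


def reflection_distance_m(round_trip_s, velocity_factor=0.66):
    """Distance along the wire to the point that caused a reflection.

    The pulse travels to the disturbance and back, so the one-way distance is
    half of (propagation speed * round-trip time). `velocity_factor` is the
    fraction of the speed of light at which the pulse travels in the wire;
    0.66 is a placeholder and would have to be measured for a real wire.
    """
    if round_trip_s < 0:
        raise ValueError("round-trip time must be non-negative")
    if not 0 < velocity_factor <= 1:
        raise ValueError("velocity factor must be in (0, 1]")
    return velocity_factor * SPEED_OF_LIGHT_M_PER_S * round_trip_s / 2.0
```

The relation also shows why such sensing needs fast electronics: at these propagation speeds, a centimetre of wire corresponds to a round-trip delay on the order of a tenth of a nanosecond.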

    Developing a Robust Migration Workflow for Preserving and Curating Hand-held Media

    Many memory institutions hold large collections of hand-held media, which can comprise hundreds of terabytes of data spread over many thousands of data-carriers. Many of these carriers are at risk of significant physical degradation over time, depending on their composition. Unfortunately, handling them manually is enormously time-consuming, so a full and frequent evaluation of their condition is extremely expensive. It is, therefore, important to develop scalable processes for stabilizing them onto backed-up online storage where they can be subject to high-quality digital preservation management. This goes hand in hand with the need to establish efficient, standardized ways of recording metadata and to deal with defective data-carriers. This paper discusses processing approaches, workflows, technical set-up, and software solutions, and touches on staffing needs for the stabilization process. We have experimented with different disk copying robots, defined our metadata, and addressed storage issues to scale stabilization to the vast quantities of digital objects on hand-held data-carriers that need to be preserved. Working closely with the content curators, we have been able to build a robust data migration workflow and have stabilized over 16 terabytes of data in a scalable and economical manner.
    Comment: 11 pages, presented at iPres 2011. Also being published in the corresponding conference proceedings.

    Simultaneous Camera Path Optimization and Distraction Removal for Improving Amateur Video

    A major difference between amateur and professional video lies in the quality of camera paths. Previous work on video stabilization has considered how to improve amateur video by smoothing the camera path. In this paper, we show that additional changes to the camera path can further improve video aesthetics. Our new optimization method achieves multiple simultaneous goals: 1) stabilizing video content over short time scales; 2) ensuring simple and consistent camera paths over longer time scales; and 3) improving scene composition by automatically removing distractions, a common occurrence in amateur video. Our approach uses an L1 camera path optimization framework, extended to handle multiple constraints. Two passes of optimization are used to address both low-level and high-level constraints on the camera path. The experimental and user study results show that our approach outputs video that is perceptually better than both the input and the results of using stabilization only.
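
For orientation, a generic L1 camera path objective of the kind this abstract refers to can be written as below. This is a sketch of one common formulation, not the objective from the paper: the path $P_t$, the weights $w_1, w_2, w_3$, and the constraint set $\mathcal{C}_t$ are all assumed symbols.

```latex
\min_{P}\;
  w_1 \sum_t \lVert P_{t+1} - P_t \rVert_1
+ w_2 \sum_t \lVert P_{t+2} - 2P_{t+1} + P_t \rVert_1
+ w_3 \sum_t \lVert P_{t+3} - 3P_{t+2} + 3P_{t+1} - P_t \rVert_1
\quad \text{subject to } P_t \in \mathcal{C}_t \text{ for all } t
```

Because the L1 norm drives many of the first, second, and third differences exactly to zero, the optimized path tends to consist of static, constant-velocity, and constant-acceleration segments. Additional requirements, such as keeping a distracting region out of the crop window, would enter through the constraint sets.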

    Social Intelligence Design 2007. Proceedings Sixth Workshop on Social Intelligence Design

    Computational Modeling of Human Dorsal Pathway for Motion Processing

    Reliable motion estimation in videos is of crucial importance for background identification, object tracking, action recognition, event analysis, self-navigation, etc. Reconstructing the motion field in the 2D image plane is very challenging, due to variations in image quality, scene geometry, lighting condition, and most importantly, camera jittering. Traditional optical flow models assume consistent image brightness and a smooth motion field, assumptions that are violated by the unstable illumination and motion discontinuities common in real-world videos. To recognize observer (or camera) motion robustly in complex, realistic scenarios, we propose a biologically-inspired motion estimation system to overcome issues posed by real-world videos. The bottom-up model is inspired by the infrastructure as well as the functionalities of the human dorsal pathway, and the hierarchical processing stream can be divided into three stages: 1) spatio-temporal processing for local motion, 2) recognition of global motion patterns (camera motion), and 3) preemptive estimation of object motion. To extract effective and meaningful motion features, we apply a series of steerable, spatio-temporal filters to detect local motion at different speeds and directions, in a way that's selective of motion velocity. The intermediate response maps are calibrated and combined to estimate dense motion fields in local regions, and then local motions along two orthogonal axes are aggregated for recognizing planar, radial and circular patterns of global motion. We evaluate the model with an extensive, realistic video database that was collected by hand with a mobile device (iPad); the video content varies in scene geometry, lighting condition, view perspective and depth. We achieved high-quality results and demonstrated that this bottom-up model is capable of extracting high-level semantic knowledge regarding self motion in realistic scenes. 
Once the global motion is known, we segment objects from moving backgrounds by compensating for camera motion. For videos captured with non-stationary cameras, we consider global motion as a combination of camera motion (background) and object motion (foreground). To estimate foreground motion, we exploit the corollary discharge mechanism of biological systems and estimate motion preemptively. Since background motions for each pixel are collectively introduced by camera movements, we apply spatial-temporal averaging to estimate the background motion at pixel level, and the initial estimation of foreground motion is derived by comparing global motion and background motion at multiple spatial levels. The real frame signals are compared with those derived by forward predictions, refining estimations for object motion. This motion detection system is applied to detect objects against cluttered, moving backgrounds and proves efficient in locating independently moving, non-rigid regions. The core contribution of this thesis is the invention of a robust motion estimation system for complicated real-world videos, which are challenged by real sensor noise, complex natural scenes, variations in illumination and depth, and motion discontinuities. The overall system demonstrates biological plausibility and holds great potential for other applications, such as camera motion removal, heading estimation, obstacle avoidance, route planning, and vision-based navigational assistance.
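
The step of recognizing planar, radial and circular patterns of global motion can be illustrated with a small Python sketch. It is a deliberately simple stand-in rather than the thesis's filter-based model: it assumes a flow field is already available as `(x, y, u, v)` samples, with positions measured from the image centre, and scores the field against three idealized templates. All names are hypothetical.

```python
import math


def classify_global_motion(samples):
    """Label a flow field as 'planar', 'radial' or 'circular'.

    `samples` is an iterable of (x, y, u, v): a position relative to the image
    centre and the flow vector observed there.
    """
    count = 0
    sum_u = sum_v = radial = circular = 0.0
    for x, y, u, v in samples:
        r = math.hypot(x, y)
        if r == 0:
            continue  # radial and tangential directions are undefined at the centre
        count += 1
        sum_u += u
        sum_v += v
        radial += (u * x + v * y) / r    # flow component pointing away from the centre
        circular += (x * v - y * u) / r  # flow component tangential to the centre
    if count == 0:
        raise ValueError("need at least one off-centre sample")
    scores = {
        "planar": math.hypot(sum_u, sum_v) / count,  # magnitude of the mean flow
        "radial": abs(radial) / count,
        "circular": abs(circular) / count,
    }
    return max(scores, key=scores.get), scores
```

Once a pattern has been identified, subtracting the corresponding background flow from the observed flow leaves a residual that points to independently moving objects, which is the idea behind the camera motion compensation described above.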

    3D Medical Image Lossless Compressor Using Deep Learning Approaches

    Accelerated information processing, communication, and storage are major requirements of the big-data era. With the extensive rise in data availability, handy information acquisition, and growing data rates, efficient handling becomes a critical challenge. Even with advanced hardware developments and the availability of multiple Graphics Processing Units (GPUs), there is still strong demand to utilise these technologies effectively. Health-care systems are one of the domains yielding explosive data growth, especially given the abilities of modern scanners, which annually produce higher-resolution and more densely sampled medical images with increasing requirements for massive storage capacity. The bottleneck in data transmission and storage would essentially be handled with an effective compression method. Since medical information is critical and plays an influential role in diagnosis accuracy, it is strongly encouraged to guarantee exact reconstruction with no loss in quality, which is the main objective of any lossless compression algorithm. Given the revolutionary impact of Deep Learning (DL) methods in solving many tasks, including data compression, while achieving state-of-the-art results, this opens tremendous opportunities for contributions. While considerable efforts have been made to address lossy performance using learning-based approaches, less attention has been paid to lossless compression. This PhD thesis investigates and proposes novel learning-based approaches for compressing 3D medical images losslessly. Firstly, we formulate the lossless compression task as a supervised sequential prediction problem, whereby a model learns a projection function to predict a target voxel given a sequence of samples from its spatially surrounding voxels. 
Using such 3D local sampling information efficiently exploits spatial similarities and redundancies in a volumetric medical context. The proposed NN-based data predictor is trained to minimise the differences with the original data values, while the residual errors are encoded using arithmetic coding to allow lossless reconstruction. Following this, we explore the effectiveness of Recurrent Neural Networks (RNNs) as a 3D predictor for learning the mapping function from the spatial medical domain (16 bit-depths). We analyse Long Short-Term Memory (LSTM) models’ generalisability and robustness in capturing the 3D spatial dependencies of a voxel’s neighbourhood while utilising samples taken from various scanning settings. We evaluate our proposed MedZip models in compressing unseen Computerized Tomography (CT) and Magnetic Resonance Imaging (MRI) modalities losslessly, compared to other state-of-the-art lossless compression standards. This work investigates input configurations and sampling schemes for a many-to-one sequence prediction model, specifically for compressing 3D medical images (16 bit-depths) losslessly. The main objective is to determine the optimal practice for enabling the proposed LSTM model to achieve a high compression ratio and fast encoding-decoding performance. A solution to the problem of non-deterministic environments is also proposed, allowing models to run in parallel without much drop in compression performance. Experimental evaluations against well-known lossless codecs were carried out on datasets acquired by different hospitals, representing different body segments and distinct scanning modalities (i.e. CT and MRI). To conclude, we present a novel data-driven sampling scheme utilising weighted gradient scores for training LSTM prediction-based models. 
The objective is to determine whether some training samples are significantly more informative than others, specifically in medical domains where samples are available on a scale of billions. The effectiveness of models trained on the presented importance sampling scheme was evaluated against alternative strategies such as uniform, Gaussian, and sliced-based sampling.
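
The prediction-plus-residual scheme described in this abstract can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not the MedZip models: the learned LSTM predictor is replaced by a hypothetical mean of already-visited neighbours, and the arithmetic coding stage is omitted, so the sketch only shows why encoding residuals still allows exact reconstruction.

```python
def predict(volume, z, y, x):
    """Predict a voxel from neighbours visited earlier in raster order.

    Stand-in for a learned predictor: integer mean of the previous voxel
    along each axis, or 0 when no such neighbour exists.
    """
    neighbours = []
    if x > 0:
        neighbours.append(volume[z][y][x - 1])
    if y > 0:
        neighbours.append(volume[z][y - 1][x])
    if z > 0:
        neighbours.append(volume[z - 1][y][x])
    return sum(neighbours) // len(neighbours) if neighbours else 0


def encode(volume):
    """Return the residuals (actual - predicted) in raster order."""
    return [
        volume[z][y][x] - predict(volume, z, y, x)
        for z in range(len(volume))
        for y in range(len(volume[0]))
        for x in range(len(volume[0][0]))
    ]


def decode(residuals, shape):
    """Rebuild the volume exactly by repeating the encoder's predictions."""
    depth, height, width = shape
    volume = [[[0] * width for _ in range(height)] for _ in range(depth)]
    values = iter(residuals)
    for z in range(depth):
        for y in range(height):
            for x in range(width):
                # Only voxels already reconstructed are used, so the decoder's
                # prediction matches the one the encoder made.
                volume[z][y][x] = predict(volume, z, y, x) + next(values)
    return volume
```

The better the predictor, the more the residuals cluster around zero, and the fewer bits an entropy coder such as an arithmetic coder needs to store them.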