    Recent Advances in Signal Processing

    Signal processing is critical to most new technological inventions and to challenges across a wide range of applications in science and engineering. Classical signal processing techniques have largely worked with mathematical models that are linear, local, stationary, and Gaussian. They have always favored closed-form tractability over real-world accuracy. These constraints were imposed by the lack of powerful computing tools. During the last few decades, signal processing theories, developments, and applications have matured rapidly and now include tools from many areas of mathematics, computer science, physics, and engineering. This book is targeted primarily at students and researchers who want to be exposed to a wide variety of signal processing techniques and algorithms. It includes 27 chapters that can be categorized into five different areas depending on the application at hand. These five categories address, in order, image processing, speech processing, communication systems, time-series analysis, and educational packages. The book has the advantage of providing a collection of applications that are completely independent and self-contained; thus, the interested reader can choose any chapter and skip to another without losing continuity.

    Change blindness: eradication of gestalt strategies

    Arrays of eight texture-defined rectangles were used as stimuli in a one-shot change blindness (CB) task where there was a 50% chance that one rectangle would change orientation between two successive presentations separated by an interval. CB was eliminated by cueing the target rectangle in the first stimulus, reduced by cueing in the interval, and unaffected by cueing in the second presentation. This supports the idea that a representation was formed that persisted through the interval before being 'overwritten' by the second presentation (Landman et al, 2003 Vision Research 43 149–164). Another possibility is that participants used some kind of grouping or Gestalt strategy. To test this we changed the spatial position of the rectangles in the second presentation by shifting them along imaginary spokes (by ±1 degree) emanating from the central fixation point. There was no significant difference in performance between this and the standard task [F(1,4)=2.565, p=0.185]. This may suggest two things: (i) Gestalt grouping is not used as a strategy in these tasks, and (ii) objects may be stored in and retrieved from a pre-attentional store during this task, which gives further weight to that argument.
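The reported statistic can be checked by hand. The sketch below assumes the reported p is the upper-tail probability of an F distribution with (1, 4) degrees of freedom, and uses the identity that an F(1, v) statistic is the square of a Student's t statistic with v degrees of freedom. The function name `p_value_f_1_4` is hypothetical, and the closed form applies only to 4 denominator degrees of freedom.

```python
import math


def p_value_f_1_4(f_statistic):
    """Upper-tail p-value for an F statistic with (1, 4) degrees of freedom.

    Assumption: F(1, v) equals t(v) squared, so the upper tail of F is the
    two-sided tail of Student's t. For v = 4 the two-sided central
    probability has the closed form
        P(|T| <= t) = sin(theta) * (1 + cos(theta)**2 / 2),
    with theta = atan(t / sqrt(v)).
    """
    dof = 4
    t = math.sqrt(f_statistic)
    theta = math.atan(t / math.sqrt(dof))
    central = math.sin(theta) * (1.0 + 0.5 * math.cos(theta) ** 2)
    return 1.0 - central


p = p_value_f_1_4(2.565)
print(f"F(1,4) = 2.565 gives p = {p:.4f}")
```

Under these assumptions the value agrees with the reported p=0.185 to three decimal places.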

    Principles and Guidelines for Advancement of Touchscreen-Based Non-visual Access to 2D Spatial Information

    Graphical materials such as graphs and maps are often inaccessible to millions of blind and visually-impaired (BVI) people, which negatively impacts their educational prospects, ability to travel, and vocational opportunities. To address this longstanding issue, a three-phase research program was conducted that builds on and extends previous work establishing touchscreen-based haptic cuing as a viable alternative for conveying digital graphics to BVI users. Although promising, this approach poses unique challenges that can only be addressed by schematizing the underlying graphical information based on perceptual and spatio-cognitive characteristics pertinent to touchscreen-based haptic access. Towards this end, this dissertation empirically identified a set of design parameters and guidelines through a logical progression of seven experiments. Phase I investigated perceptual characteristics related to touchscreen-based graphical access using vibrotactile stimuli, with results establishing three core perceptual guidelines: (1) a minimum line width of 1mm should be maintained for accurate line-detection (Exp-1), (2) a minimum interline gap of 4mm should be used for accurate discrimination of parallel vibrotactile lines (Exp-2), and (3) a minimum angular separation of 4mm should be used for accurate discrimination of oriented vibrotactile lines (Exp-3). Building on these parameters, Phase II studied the core spatio-cognitive characteristics pertinent to touchscreen-based non-visual learning of graphical information, with results leading to the specification of three design guidelines: (1) a minimum width of 4mm should be used for supporting tasks that require tracing of vibrotactile lines and judging their orientation (Exp-4), (2) a minimum width of 4mm should be maintained for accurate line tracing and learning of complex spatial path patterns (Exp-5), and (3) vibrotactile feedback should be used as a guiding cue to support the most accurate line tracing performance (Exp-6). Finally, Phase III demonstrated that schematizing line-based maps based on these design guidelines leads to development of an accurate cognitive map. Results from Experiment-7 provide theoretical evidence in support of learning from vision and touch as leading to the development of functionally equivalent amodal spatial representations in memory. Findings from all seven experiments contribute to new theories of haptic information processing that can guide the development of new touchscreen-based non-visual graphical access solutions.

    Coping with Data Scarcity in Deep Learning and Applications for Social Good

    Recent years have seen an extremely fast evolution of the Computer Vision and Machine Learning fields: several application domains benefit from the newly developed technologies, and industries are investing a growing amount of money in Artificial Intelligence. Convolutional Neural Networks and Deep Learning have substantially contributed to the rise and diffusion of AI-based solutions, creating the potential for many disruptive new businesses. The effectiveness of Deep Learning models depends on the availability of a huge amount of training data. Unfortunately, data collection and labeling are extremely expensive in terms of both time and cost; moreover, they frequently require the collaboration of domain experts. In the first part of the thesis, I will investigate some methods for reducing the cost of data acquisition for Deep Learning applications in the relatively constrained industrial scenarios related to visual inspection. I will first assess the effectiveness of Deep Neural Networks in comparison with several classical Machine Learning algorithms that require a smaller amount of training data. I will then introduce a hardware-based data augmentation approach, which leads to a considerable performance boost by taking advantage of a novel illumination setup designed for this purpose. Finally, I will investigate the situation in which acquiring a sufficient number of training samples is not possible, in particular the most extreme case: zero-shot learning (ZSL), which is the problem of multi-class classification when no training data is available for some of the classes. Visual features designed for image classification and trained offline have been shown to be useful for ZSL to generalize towards classes not seen during training. Nevertheless, I will show that recognition performance on unseen classes can be sharply improved by learning ad hoc semantic embeddings (the pre-defined list of present and absent attributes that represent a class) and visual features, to increase the correlation between the two geometrical spaces and ease the metric learning process for ZSL. In the second part of the thesis, I will present some successful applications of state-of-the-art Computer Vision, Data Analysis and Artificial Intelligence methods. I will illustrate some solutions developed during the 2020 Coronavirus Pandemic for controlling the evolution of the disease and for reducing the spread of the virus. I will describe the first publicly available dataset for the analysis of face-touching behavior, which we annotated and distributed, and I will illustrate an extensive evaluation of several computer vision methods applied to this dataset. Moreover, I will describe the privacy-preserving solution we developed for estimating the “Social Distance” and its violations, given a single uncalibrated image in unconstrained scenarios. I will conclude the thesis with a Computer Vision solution developed in collaboration with the Egyptian Museum of Turin for digitally unwrapping mummies by analyzing their CT scans, to support archaeologists during mummy analysis and to avoid the devastating and irreversible process of physically unwrapping the bandages to remove amulets and jewels from the body.
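The zero-shot setting mentioned in this abstract can be illustrated with a generic attribute-matching sketch. This is not the thesis's method: the class names, the attribute list, and the functions `cosine_similarity` and `zero_shot_classify` are all hypothetical, and the attribute scores are assumed to come from some visual model trained only on seen classes. An unseen class is recognized by comparing predicted attribute scores with each class's pre-defined list of present and absent attributes.

```python
import math

# Hypothetical class signatures: 1 = attribute present, 0 = absent.
# Attribute order (also hypothetical): [striped, four_legged, has_wings]
CLASS_ATTRIBUTES = {
    "zebra": [1, 1, 0],  # treated as unseen: no training images assumed
    "horse": [0, 1, 0],
    "eagle": [0, 0, 1],
}


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def zero_shot_classify(attribute_scores, class_attributes):
    """Return the class whose attribute signature best matches the scores."""
    return max(
        class_attributes,
        key=lambda name: cosine_similarity(attribute_scores, class_attributes[name]),
    )


# Scores a visual model might output for an image of a striped quadruped.
predicted = zero_shot_classify([0.9, 0.8, 0.1], CLASS_ATTRIBUTES)
print(predicted)
```

The sketch keeps the class signatures fixed. The abstract argues instead for learning the semantic embedding and the visual features together, so that the two spaces correlate better.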

    Saliency Map for Visual Perception

    Humans and other primates move their eyes to select visual information from the scene; psycho-visual experiments (Constantinidis, 2005) suggest that attention is directed to visually salient locations in the image. This allows human beings to bring the fovea onto the relevant parts of the image and to interpret complex scenes in real time. In visual perception, an important result was the discovery of a limited set of visual properties (called pre-attentive) that are detected by the low-level visual system in the first 200-300 milliseconds of observation of a scene. In recent decades, much progress has been made in visual perception research by analyzing both the bottom-up (stimulus-driven) and top-down (task-dependent) processes involved in human attention. Visual saliency deals with identifying the fixation points that a human viewer would focus on in the first seconds of observing a scene.
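As a rough illustration of the bottom-up (stimulus-driven) idea, the sketch below scores each pixel of a small grayscale grid by how much it differs from the mean of its immediate neighbourhood. It is a simplified centre-surround contrast measure chosen for illustration, not the model used in the chapter; the function names `saliency_map` and `most_salient` and the `radius` parameter are assumptions.

```python
def saliency_map(image, radius=1):
    """Centre-surround contrast: |pixel - mean of its neighbours| per pixel.

    `image` is a non-empty list of equal-length rows of grayscale values.
    """
    height, width = len(image), len(image[0])
    result = [[0.0] * width for _ in range(height)]
    for r in range(height):
        for c in range(width):
            # Collect neighbours inside the image bounds, excluding the centre.
            neighbours = [
                image[rr][cc]
                for rr in range(max(0, r - radius), min(height, r + radius + 1))
                for cc in range(max(0, c - radius), min(width, c + radius + 1))
                if (rr, cc) != (r, c)
            ]
            if neighbours:
                surround = sum(neighbours) / len(neighbours)
            else:
                surround = image[r][c]  # single-pixel image: no contrast
            result[r][c] = abs(image[r][c] - surround)
    return result


def most_salient(image, radius=1):
    """Return the (row, column) of the highest-contrast pixel."""
    sal = saliency_map(image, radius)
    positions = [(r, c) for r in range(len(sal)) for c in range(len(sal[0]))]
    return max(positions, key=lambda rc: sal[rc[0]][rc[1]])


# A dark 5x5 scene with one bright pixel: the candidate fixation point.
scene = [[0.0] * 5 for _ in range(5)]
scene[2][3] = 1.0
print(most_salient(scene))
```

Top-down (task-dependent) effects are outside this sketch. It responds only to local intensity contrast.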