
    Dynamic Rate Allocation in Interactive Multi-View Video with View-Switch Prediction

    In Interactive Multi-View Video (IMVV), a scene is captured by a number of cameras positioned in an array, and the camera views are transmitted to users. The user can interact with the transmitted video content by choosing viewpoints (views from different cameras in the array), with the expectation of minimal transmission delay when switching between views. View switching delay is one of the primary concerns addressed in this thesis; the contribution is to minimize the transmission delay of the new view-switch frame through a novel process for selecting the predicted view and compressing it with transmission efficiency in mind. The work mainly considers real-time IMVV streaming, in which view switching is modeled as a discrete Markov chain whose transition probabilities are derived from a Zipf distribution, providing the information used for view switch prediction. To eliminate Round-Trip Time (RTT) transmission delay, Quantization Parameters (QP) are adaptively allocated to the remaining redundant transmitted frames to keep view switching time to a minimum, trading off video quality for the duration of the RTT. Experimental results show that the proposed method outperforms existing methods in PSNR and view switching delay, giving better viewing quality.
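The abstract above does not say how the Zipf distribution is mapped onto view transitions, so the following is only a minimal sketch under stated assumptions: candidate views are ranked by their distance from the current view in the camera array (rank 1 is staying on the current view), and each rank receives a Zipf-like weight 1/rank^s. The function names, the distance-based ranking, and the exponent `s` are illustrative choices, not the thesis's method.

```python
def zipf_transition_matrix(num_views, s=1.0):
    """Build a row-stochastic matrix P where P[i][j] is the assumed
    probability of switching from view i to view j."""
    matrix = []
    for i in range(num_views):
        # Assumption: rank = camera distance + 1, so the current view has
        # rank 1, its neighbours rank 2, and so on. Weight = 1 / rank**s.
        weights = [1.0 / (abs(i - j) + 1) ** s for j in range(num_views)]
        total = sum(weights)
        # Normalize so that each row sums to 1 (a valid Markov chain row).
        matrix.append([w / total for w in weights])
    return matrix


def predict_next_views(matrix, current_view, k=2):
    """Return the k most likely views to switch to, excluding the current one.
    Ties are broken by the lower view index."""
    row = matrix[current_view]
    candidates = [j for j in range(len(row)) if j != current_view]
    candidates.sort(key=lambda j: (-row[j], j))
    return candidates[:k]


P = zipf_transition_matrix(num_views=5, s=1.0)
print(predict_next_views(P, current_view=2))  # prints [1, 3]
```

Under these assumptions, a streaming server could send low-quality redundant frames only for the predicted views rather than for every camera, which is the kind of trade-off the abstract describes.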

    "Enriching 360-degree technologies through human-computer interaction: psychometric validation of two memory tasks"

    This doctoral dissertation explores the domain of neuropsychological assessment, with the objective of gaining a comprehensive understanding of an individual's cognitive functioning and detecting possible impairments. Traditional assessment tools, while possessing inherent value, frequently lack ecological validity when evaluating memory, as they predominantly concentrate on short-term, regulated tasks. To overcome this constraint, immersive technologies, specifically virtual reality and 360° videos, have surfaced as promising instruments for augmenting the ecological validity of cognitive assessments. This work examines the potential advantages of immersive technologies, particularly 360° videos, in enhancing memory evaluation. First, a comprehensive overview is provided of contemporary virtual reality tools employed in the assessment of memory, as well as their convergence with conventional assessment measures. Then, the study uses cluster and network analysis techniques to categorize 360° videos according to their content and applications, thereby offering significant insights into the potential of this nascent medium. The study then introduces a novel platform, Mindscape, that aims to address the existing technological disparity by making it easier for clinicians and researchers to develop cognitive tasks within immersive environments. The thesis concludes with the psychometric validation of two memory tasks, which were specifically developed with Mindscape to assess episodic and spatial memory. The findings demonstrate disparities in cognitive performance between individuals diagnosed with Mild Cognitive Impairment and those without cognitive impairments, underscoring the interrelated nature of cognitive processes and the promising prospects of virtual reality technology in improving the authenticity of real-world experiences.
Overall, this dissertation aims to respond to the demand for practical and ecologically valid neuropsychological assessments within the dynamic field of neuropsychology. It achieves this by integrating user-friendly platforms and immersive cognitive tasks into its methodology. By highlighting a shift in the field of neuropsychology towards prioritizing functional and practical assessments over theoretical frameworks, this work indicates a changing perspective within the discipline. This study highlights the potential of comprehensive and purpose-oriented assessment methods in cognitive evaluations, emphasizing the ongoing significance of research in fully comprehending the capabilities of immersive technologies.

    Beyond Transmitting Bits: Context, Semantics, and Task-Oriented Communications

    Communication systems to date primarily aim at reliably communicating bit sequences. Such an approach provides efficient engineering designs that are agnostic to the meanings of the messages or to the goal that the message exchange aims to achieve. Next generation systems, however, can be potentially enriched by folding message semantics and goals of communication into their design. Further, these systems can be made cognizant of the context in which communication exchange takes place, providing avenues for novel design insights. This tutorial summarizes the efforts to date, starting from its early adaptations, semantic-aware and task-oriented communications, covering the foundations, algorithms and potential implementations. The focus is on approaches that utilize information theory to provide the foundations, as well as the significant role of learning in semantics and task-aware communications. Comment: 28 pages, 14 figures

    Real-time 3D human body pose estimation from monocular RGB input

    Human motion capture finds extensive application in movies, games, sports and biomechanical analysis. However, existing motion capture solutions require cumbersome external and/or on-body instrumentation, or use active sensors with limits on the possible capture volume dictated by power consumption. The ubiquity and ease of deployment of RGB cameras make monocular RGB based human motion capture an extremely useful problem to solve, which would lower the barrier to entry for content creators to employ motion capture tools, and enable newer applications of human motion capture. This thesis demonstrates the first real-time monocular RGB based motion-capture solutions that work in general scene settings. They are based on developing neural network based approaches to address the ill-posed problem of estimating 3D human pose from a single RGB image, in combination with model based fitting. In particular, the contributions of this work make advances towards three key aspects of real-time monocular RGB based motion capture, namely speed, accuracy, and the ability to work for general scenes. New training datasets are proposed, for single-person and multi-person scenarios, which, together with the proposed transfer learning based training pipeline, allow learning based approaches to be appearance invariant. The training datasets are accompanied by evaluation benchmarks with multiple avenues of fine-grained evaluation. The evaluation benchmarks differ visually from the training datasets, so as to promote efforts towards solutions that generalize to in-the-wild scenes. The proposed task formulations for the single-person and multi-person case allow higher accuracy, and incorporate additional qualities such as occlusion robustness, that are helpful in the context of a full motion capture solution.
The multi-person formulations are designed to have a nearly constant inference time regardless of the number of subjects in the scene, and combined with contributions towards fast neural network inference, enable real-time 3D pose estimation for multiple subjects. Combining the proposed learning-based approaches with a model-based kinematic skeleton fitting step provides temporally stable joint angle estimates, which can be readily employed for driving virtual characters.

    Screening TED: A rhetorical analysis of the intersections of rhetoric, digital media, and pedagogy

    The presence of expertise resonates across our daily lives. Experts are called upon to advise us on which candidate is ideal for office, which type of wood is the best choice for a carpentry project, which scientist has optimal data on the effects of air pollution, which speech teacher is the best one to take for proper credit hours, and more. An expert is typically conceived as an individual who knows more about a given topic and can create stronger identification than an average person. The struggle to achieve expert status is one that is fundamentally tied to power and is reliant on the establishment of authenticity and legitimacy from audiences. It is, at its core, a struggle that utilizes rhetoric. Begun in 1984, the TED (Technology, Entertainment, and Design) conference has become a critical player in an architectonic movement to manufacture expertise. Modeled on the Lyceum and Chautauqua movements of the early American 20th century, the TED conferences have spread rapidly into public culture, most notably in the field of education via social media and online video. TED “talks” are classroom artifacts. They are teaching tools and aid in increasing learning for a more digitally native student population. Likewise, the TED conferences have become models of community engagement that work rhetorically to demonstrate the attribution and manufacturing of expertise amidst a 21st century digital world. In short, we have acknowledged TED’s growth and expansion as credible and sanctioned their identity as the harbinger of expert and inspirational ideas. The democratization of digital media, particularly video, has made it possible to increase the sharing and collaboration of ideas faster than ever before, and as our world becomes more reliant on digital devices for the receiving and sending of information, the consumption and production of information, and the attribution of expertise, the precise role of technology within pedagogy becomes increasingly complex.
My dissertation posits that TED employs current uses of digital media technologies in order to manufacture its ethos of expertise within public culture.

    An integrative computational modelling of music structure apprehension

    The University Defence Research Collaboration In Signal Processing

    This chapter describes the development of algorithms for automatic detection of anomalies from multi-dimensional, undersampled and incomplete datasets. The challenge in this work is to identify and classify behaviours as normal or abnormal, safe or threatening, from an irregular and often heterogeneous sensor network. Many defence and civilian applications can be modelled as complex networks of interconnected nodes with unknown or uncertain spatio-temporal relations. The behaviour of such heterogeneous networks can exhibit dynamic properties, reflecting evolution in both network structure (new nodes appearing and existing nodes disappearing), as well as inter-node relations. The UDRC work has addressed not only the detection of anomalies, but also the identification of their nature and their statistical characteristics. Normal patterns and changes in behaviour have been incorporated to provide an acceptable balance between true positive rate, false positive rate, performance and computational cost. Data quality measures have been used to ensure the models of normality are not corrupted by unreliable and ambiguous data. The context for the activity of each node in complex networks offers an even more efficient anomaly detection mechanism. This has allowed the development of efficient approaches which not only detect anomalies but which also go on to classify their behaviour.

    Global Shipping Container Monitoring Using Machine Learning with Multi-Sensor Hubs and Catadioptric Imaging

    We describe a framework for global shipping container monitoring using machine learning with multi-sensor hubs and infrared catadioptric imaging. A wireless mesh radio satellite tag architecture provides connectivity anywhere in the world, which is a significant improvement over legacy methods. We discuss the design and testing of a low-cost long-wave infrared catadioptric imaging device and multi-sensor hub combination as an intelligent edge computing system that, when equipped with physics-based machine learning algorithms, can interpret the scene inside a shipping container to make efficient use of expensive communications bandwidth. The histogram of oriented gradients and T-channel (HOG+) feature, as introduced for human detection on low-resolution infrared catadioptric images, is shown to be effective for various mirror shapes designed to give wide volume coverage with controlled distortion. Initial results for through-metal communication with ultrasonic guided waves show promise using the Dynamic Wavelet Fingerprint Technique (DWFT) to identify Lamb waves in a complicated ultrasonic signal.

    The University Defence Research Collaboration In Signal Processing: 2013-2018

    Signal processing is an enabling technology crucial to all areas of defence and security. It is called for whenever humans and autonomous systems are required to interpret data (i.e. the signal) output from sensors. This leads to the production of the intelligence on which military outcomes depend. Signal processing should be timely, accurate and suited to the decisions to be made. When performed well it is critical, battle-winning and probably the most important weapon which you’ve never heard of. With the plethora of sensors and data sources that are emerging in the future network-enabled battlespace, sensing is becoming ubiquitous. This makes signal processing more complicated but also brings great opportunities. The second phase of the University Defence Research Collaboration in Signal Processing was set up to meet these complex problems head-on while taking advantage of the opportunities. Its unique structure combines two multi-disciplinary academic consortia, in which many researchers can approach different aspects of a problem, with baked-in industrial collaboration enabling early commercial exploitation. This phase of the UDRC will have been running for 5 years by the time it completes in March 2018, with remarkable results. This book aims to present those accomplishments and advances in a style accessible to stakeholders, collaborators and exploiters.