Training-free Diffusion Model Adaptation for Variable-Sized Text-to-Image Synthesis

Authors: Jin Zhiyu, Li Bin, Shen Xuli, Xue Xiangyang
Publication date: 26/10/2023

Diffusion models (DMs) have recently gained attention with state-of-the-art performance in text-to-image synthesis. Following the tradition in deep learning, DMs are trained and evaluated on images of fixed size. However, users demand images with specific sizes and various aspect ratios. This paper focuses on adapting text-to-image diffusion models to handle such variety while maintaining visual fidelity. First, we observe that, during synthesis, lower-resolution images suffer from incomplete object portrayal, while higher-resolution images exhibit repetitive, disordered presentation. Next, we establish a statistical relationship indicating that attention entropy changes with token quantity, suggesting that models aggregate spatial information in proportion to image resolution. Our interpretation of these observations is that objects are incompletely depicted because of limited spatial information at low resolutions, while repetitive, disorganized presentation arises from redundant spatial information at high resolutions. From this perspective, we propose a scaling factor to alleviate the change in attention entropy and mitigate the defective patterns observed. Extensive experimental results validate the efficacy of the proposed scaling factor, enabling models to achieve better visual effects, image quality, and text alignment. Notably, these improvements are achieved without additional training or fine-tuning.

Comment: Accepted by NeurIPS 2023. 23 pages, 13 figures.
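The abstract's claim that attention entropy changes with token quantity can be illustrated with a small experiment. The sketch below is not the paper's code or its scaling factor: it draws random Gaussian queries and keys (an assumption), measures the mean entropy of one softmax attention row for different token counts, and shows that a hypothetical multiplier on the logits (`scale`, an illustrative parameter) lowers that entropy again. The function names are made up for this example.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs):
    """Shannon entropy (in nats) of a probability vector."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def mean_attention_entropy(num_tokens, dim=64, scale=None, trials=10, seed=0):
    """Mean entropy of one attention row over `num_tokens` random keys.

    Queries and keys are standard-normal vectors (an assumption made
    for illustration). `scale` multiplies the dot products and defaults
    to the usual 1/sqrt(dim).
    """
    rng = random.Random(seed)
    if scale is None:
        scale = 1.0 / math.sqrt(dim)
    total = 0.0
    for _ in range(trials):
        query = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        logits = []
        for _ in range(num_tokens):
            key = [rng.gauss(0.0, 1.0) for _ in range(dim)]
            logits.append(scale * sum(q * k for q, k in zip(query, key)))
        total += entropy(softmax(logits))
    return total / trials


if __name__ == "__main__":
    for n in (64, 256):
        print(n, round(mean_attention_entropy(n), 3))
    # A larger (hypothetical) scale sharpens the softmax and lowers entropy.
    print(round(mean_attention_entropy(256, scale=1.5 / math.sqrt(64)), 3))
```

In this toy setting, entropy grows as more tokens are attended to, and enlarging the logit scale counteracts the growth. That is the general mechanism a resolution-dependent scaling factor would rely on. The exact form of the paper's factor is not given in the abstract and is not reproduced here.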

arXiv.org e-Print Archive

Automated rapid thermal imaging systems technology

Author: Phan Long N., 1976-
Publication venue: Massachusetts Institute of Technology
Publication date: 01/01/2012

Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Mechanical Engineering, 2012. Cataloged from PDF version of thesis. Includes bibliographical references (p. 266-276).

A major source of energy savings occurs on the thermal envelope of buildings, which amounts to approximately 10% of annual energy usage in the United States. To pursue these savings, energy auditors use closed-loop energy auditing processes that include infrared thermography inspection as an important tool to assess deficiencies and identify hot thermal gradients. This process is prohibitively expensive and time-consuming. I propose fundamentally changing this approach by designing, developing, and deploying an Automated Rapid Thermal Imaging Systems Technology (ARTIST) that is capable of street-level drive-by scanning in real time. I am doing for thermal imaging what Google Earth did for visual imaging. I am mapping the world's temperature, window by window, house by house, street by street, city by city, and country by country. In doing so, I will be able to provide detailed information on where and how we are wasting energy, providing the information needed for sound economic and environmental energy policies and identifying what corrective measures can and should be taken. The fundamental contributions of this thesis relate to the ARTIST. This thesis focuses on the following topics:

* Multi-camera synthetic aperture imaging system
* 3D Radiometry
* Non-radiometric infrared camera calibration techniques
* Image enhancement algorithms
  - Hyper Resolution
    o Kinetic Super Resolution
  - Thermal Signature Identification
  - Low-Light Signal-to-Noise Enhancement using KSR

by Long N. Phan. Ph.D.

DSpace@MIT

Framing Emotive and Perspective Space : the Sundance Center for the Exhibition and Study of Film

Author: Stiling Joshua
Publication venue: DOCS@RWU
Publication date: 01/01/2012

A “Bauhaus” of academic programs, including film studies, neurology, and psychology, uses a museum and exhibition venue for the Sundance Film Festival to study how visual recognition affects the way we perceive and how it affects emotion, framing architectural perspective using filmmaking techniques.

DOCS@RWU

HELIN Digital Commons

Multi-sensor human action recognition with particular application to tennis event-based indexing

Author: Connaghan Damien
Publication venue: Dublin City University. School of Electronic Engineering
Publication date: 01/03/2013

The ability to automatically classify human actions and activities using visual sensors or by analysing body worn sensor data has been an active research area for many years. Only recently with advancements in both fields and the ubiquitous nature of low cost sensors in our everyday lives has automatic human action recognition become a reality. While traditional sports coaching systems rely on manual indexing of events from a single modality, such as visual or inertial sensors, this thesis investigates the possibility of capturing and automatically indexing events from multimodal sensor streams. In this work, we detail a novel approach to infer human actions by fusing multimodal sensors to improve recognition accuracy. State of the art visual action recognition approaches are also investigated. Firstly we apply these action recognition detectors to basic human actions in a non-sporting context. We then perform action recognition to infer tennis events in a tennis court instrumented with cameras and inertial sensing infrastructure. The system proposed in this thesis can use either visual or inertial sensors to automatically recognise the main tennis events during play. A complete event retrieval system is also presented to allow coaches to build advanced queries, which existing sports coaching solutions cannot facilitate, without an inordinate amount of manual indexing. The event retrieval interface is evaluated against a leading commercial sports coaching tool in terms of both usability and efficiency.

Irish Universities

DCU Online Research Access Service

Clasificación automática de vídeos por género (Automatic classification of videos by genre)

Author: Felipe Sánchez-Infante Loreto
Publication date: 01/01/2011

Biblos-e Archivo

The Wooster Voice (Wooster, OH), 1998-01-22

Author: Editors Wooster Voice
Publication venue: Open Works
Publication date: 22/01/1998

The American Red Cross Blood Drive hosted by Xi Chi Psi is indicative of Greek organizations' volunteering requirements. The Wooster Volunteer Network coordinates the small house program (alternatives to dorms), and holds a meeting for the next year house programs. The Soup and Bread Program is short of the required number of participants. Raymond C. Pierce, deputy assistant secretary at the Office for Civil Rights at the Department of Education speaks on the history of affirmative action. Karl Robillard writes about the IS panic and worry on page four. Tom Cvjetkovic speaks on being born and growing up in Croatia. Dale Seeds, Professor of Theater, discusses his Native American performance class. Sally Thelen writes a review of the newly released movie Titanic. Athletic updates for the week are on pages ten through twelve. https://openworks.wooster.edu/voice1991-2000/1186/thumbnail.jp

The College of Wooster

Learning visual similarities robust to bias

Author: Thong W.
Publication date: 01/01/2022

International Migration, Integration and Social Cohesion online publications

Grasp-sensitive surfaces

Author: Wimmer Raphael
Publication venue: Ludwig-Maximilians-Universität München
Publication date: 27/03/2015

Grasping objects with our hands allows us to skillfully move and manipulate them. Hand-held tools further extend our capabilities by adapting precision, power, and shape of our hands to the task at hand. Some of these tools, such as mobile phones or computer mice, already incorporate information processing capabilities. Many other tools may be augmented with small, energy-efficient digital sensors and processors. This allows graspable objects to learn about the user grasping them - and to support the user's goals. For example, the way we grasp a mobile phone might indicate whether we want to take a photo or call a friend with it - and thus serve as a shortcut to that action. A power drill might sense whether the user is grasping it firmly enough and refuse to turn on if this is not the case. And a computer mouse could distinguish between intentional and unintentional movement and ignore the latter. This dissertation gives an overview of grasp sensing for human-computer interaction, focusing on technologies for building grasp-sensitive surfaces and challenges in designing grasp-sensitive user interfaces. It comprises three major contributions: a comprehensive review of existing research on human grasping and grasp sensing, a detailed description of three novel prototyping tools for grasp-sensitive surfaces, and a framework for analyzing and designing grasp interaction. For nearly a century, scientists have analyzed human grasping. My literature review gives an overview of definitions, classifications, and models of human grasping. A small number of studies have investigated grasping in everyday situations. They found a much greater diversity of grasps than described by existing taxonomies. This diversity makes it difficult to directly associate certain grasps with users' goals. In order to structure related work and my own research, I formalize a generic workflow for grasp sensing.
It comprises *capturing* of sensor values, *identifying* the associated grasp, and *interpreting* the meaning of the grasp. A comprehensive overview of related work shows that implementation of grasp-sensitive surfaces is still hard, researchers often are not aware of related work from other disciplines, and intuitive grasp interaction has not yet received much attention. In order to address the first issue, I developed three novel sensor technologies designed for grasp-sensitive surfaces. These mitigate one or more limitations of traditional sensing techniques: **HandSense** uses four strategically positioned capacitive sensors for detecting and classifying grasp patterns on mobile phones. The use of custom-built high-resolution sensors allows detecting proximity and avoids the need to cover the whole device surface with sensors. User tests showed a recognition rate of 81%, comparable to that of a system with 72 binary sensors. **FlyEye** uses optical fiber bundles connected to a camera for detecting touch and proximity on arbitrarily shaped surfaces. It allows rapid prototyping of touch- and grasp-sensitive objects and requires only very limited electronics knowledge. For FlyEye I developed a *relative calibration* algorithm that allows determining the locations of groups of sensors whose arrangement is not known. **TDRtouch** extends Time Domain Reflectometry (TDR), a technique traditionally used for inspecting cable faults, for touch and grasp sensing. TDRtouch is able to locate touches along a wire, allowing designers to rapidly prototype and implement modular, extremely thin, and flexible grasp-sensitive surfaces. I summarize how these technologies cater to different requirements and significantly expand the design space for grasp-sensitive objects. Furthermore, I discuss challenges for making sense of raw grasp information and categorize interactions. 
Traditional application scenarios for grasp sensing use only the grasp sensor's data, and only for mode-switching. I argue that data from grasp sensors is part of the general usage context and should only be used in combination with other context information. For analyzing and discussing the possible meanings of grasp types, I created the GRASP model. It describes five categories of influencing factors that determine how we grasp an object: *Goal* -- what we want to do with the object, *Relationship* -- what we know and feel about the object we want to grasp, *Anatomy* -- hand shape and learned movement patterns, *Setting* -- surrounding and environmental conditions, and *Properties* -- texture, shape, weight, and other intrinsics of the object. I conclude the dissertation with a discussion of upcoming challenges in grasp sensing and grasp interaction, and provide suggestions for implementing robust and usable grasp interaction.
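The abstract above says TDRtouch locates touches along a wire using Time Domain Reflectometry. The general TDR relation behind that idea is that a pulse reflected at a disturbance returns after a round-trip delay, so the distance is half the delay times the propagation speed in the line. The sketch below shows only that textbook relation, not the dissertation's implementation. The function name and the default velocity factor of 0.66 (a typical figure for some coaxial cables) are assumptions for illustration.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0


def touch_position_m(round_trip_delay_s, velocity_factor=0.66):
    """Estimate the distance (in metres) to a reflection point on a line.

    round_trip_delay_s: time between sending the pulse and receiving
        its reflection, in seconds.
    velocity_factor: propagation speed in the line as a fraction of the
        speed of light (assumed value; depends on the actual wire).
    """
    if round_trip_delay_s < 0:
        raise ValueError("delay must be non-negative")
    if not 0.0 < velocity_factor <= 1.0:
        raise ValueError("velocity factor must be in (0, 1]")
    # The pulse travels to the touch point and back, hence the factor 0.5.
    return 0.5 * velocity_factor * SPEED_OF_LIGHT_M_PER_S * round_trip_delay_s


if __name__ == "__main__":
    # A 10 ns round trip on a line with the assumed velocity factor.
    print(touch_position_m(10e-9))
```

Because light covers roughly 30 cm per nanosecond in vacuum, resolving touches a few centimetres apart requires timing resolution well below a nanosecond. This is one reason such a sensor needs fast measurement electronics.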