13,128 research outputs found

    Multimedia information technology and the annotation of video

    The state of the art in multimedia information technology has not progressed to the point where a single solution is available to meet all reasonable needs of documentalists and users of video archives. In general, we do not have an optimistic view of the usability of new technology in this domain, but digitization and digital power can be expected to cause a small revolution in the area of video archiving. The volume of data leads to two views of the future: on the pessimistic side, overload of data will cause lack of annotation capacity, and on the optimistic side, there will be enough data from which to learn selected concepts that can be deployed to support automatic annotation. At the threshold of this interesting era, we make an attempt to describe the state of the art in technology. We sample the progress in text, sound, and image processing, as well as in machine learning.

    Feeling the beat where it counts: fostering multi-limb rhythm skills with the haptic drum kit

    This paper introduces and explores a tool known as the Haptic Drum Kit. The Haptic Drum Kit employs four computer-controlled vibrotactile devices, one attached to each limb via the wrists and ankles. In the mode of use discussed in this paper, haptic pulses are used to guide the playing, on a drum kit, of rhythmic patterns that require multi-limb coordination. The immediate aim is to foster rhythm skills and multi-limb coordination. A broader aim is to systematically develop skills in recognizing, identifying, memorizing, retaining, analyzing, reproducing and composing monophonic and polyphonic rhythms. We consider the implications of three different theories for this approach: the work of the music educator Dalcroze (1865-1950) [1]; the entrainment theory of human rhythm perception and production [2,3]; and sensory motor contingency theory [4]. In this paper we introduce the Haptic Drum Kit; consider the implications of the above theories for this approach; report on a design study; and identify and discuss a variety of emerging design issues. As part of the design study, audio and haptic guidance were compared for five people learning to play polyphonic drum patterns of varying complexity. The results indicate that beginning drummers are able to learn intricate drum patterns from the haptic stimuli alone, although haptic plus audio is the mode of presentation preferred by subjects.

    Detecção de eventos complexos em vídeos baseada em ritmos visuais (Complex event detection in videos based on visual rhythms)

    Advisor: Hélio Pedrini. Dissertation (master's), Universidade Estadual de Campinas, Instituto de Computação.
    Abstract: The recognition of complex events in videos currently has several important applications, particularly due to the wide availability of digital cameras in environments such as airports, train and bus stations, shopping centers, stadiums, hospitals, schools, buildings, and roads, among others. Moreover, advances in digital technology have enhanced the capabilities for detecting video events through the development of devices with high resolution, small physical size, and high sampling rates. Many works available in the literature have explored the subject from different perspectives. This work presents and evaluates a methodology for extracting a feature descriptor from visual rhythms of video sequences in order to address the video event detection problem. A visual rhythm can be seen as the projection of a video onto an image, such that the video analysis task is reduced to an image analysis problem, benefiting from its low processing cost in terms of time and complexity. To demonstrate the potential of the visual rhythm in the analysis of complex videos, three computer vision problems are selected in this work: abnormal event detection, human action classification, and gesture recognition. In the first problem, a normalcy model is learned from the traces that people leave when they walk, whereas in the other two problems representative patterns are extracted from actions. Our hypothesis is that similar videos produce similar patterns, so the action classification problem is reduced to an image classification task. Experiments conducted on well-known public datasets demonstrate that the method produces promising results at high processing rates, making it possible to work in real time.
    Even though the visual rhythm features are mainly extracted as histograms of gradients, some attempts to add optical flow features are discussed, as well as strategies for obtaining alternative visual rhythms.
    Mestrado (master's degree), Computer Science; Mestre em Ciência da Computação; 1570507, 1406910, 1374943; CAPE
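
    The projection described in the abstract above can be illustrated with a minimal sketch. The abstract does not say which pixels are sampled from each frame, so the version below assumes a single pixel column per frame; the names `visual_rhythm` and `make_toy_video` are hypothetical and are not taken from the dissertation.

```python
from typing import List, Sequence

Frame = Sequence[Sequence[int]]  # one grayscale frame: rows of pixel values


def visual_rhythm(video: Sequence[Frame], column: int) -> List[List[int]]:
    """Stack one pixel column from every frame into a 2D image.

    The result has one row per frame row and one column per frame,
    so the horizontal axis of the image is time.
    """
    if not video:
        return []
    height = len(video[0])
    # rhythm[y][t] is the pixel at row y of the chosen column in frame t
    return [[frame[y][column] for frame in video] for y in range(height)]


def make_toy_video(n_frames: int, height: int, width: int) -> List[List[List[int]]]:
    """Hypothetical test clip: a bright dot moving down the central column."""
    video = []
    for t in range(n_frames):
        frame = [[0] * width for _ in range(height)]
        frame[t % height][width // 2] = 255
        video.append(frame)
    return video


video = make_toy_video(n_frames=4, height=4, width=5)
rhythm = visual_rhythm(video, column=2)
for row in rhythm:
    print(row)
```

    In this toy clip the moving dot appears as a diagonal streak in the resulting image, which is the kind of trace an image descriptor could then summarize.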

    Understanding Dance Understanding

    Action-based effects on music perception

    The classical, disembodied approach to music cognition conceptualizes action and perception as separate, peripheral processes. In contrast, embodied accounts of music cognition emphasize the central role of the close coupling of action and perception. It is a commonly established fact that perception spurs action tendencies. We present a theoretical framework that captures the ways in which the human motor system and its actions can reciprocally influence the perception of music. The cornerstone of this framework is the common coding theory, postulating a representational overlap in the brain between the planning, the execution, and the perception of movement. The integration of action and perception in so-called internal models is explained as a result of associative learning processes. Characteristic of internal models is that they allow intended or perceived sensory states to be transferred into corresponding motor commands (inverse modeling), and vice versa, to predict the sensory outcomes of planned actions (forward modeling). Embodied accounts typically refer to inverse modeling to explain action effects on music perception (Leman, 2007). We extend this account by pinpointing forward modeling as an alternative mechanism by which action can modulate perception. We provide an extensive overview of recent empirical evidence in support of this idea. Additionally, we demonstrate that motor dysfunctions can cause perceptual disabilities, supporting the main idea of the paper that the human motor system plays a functional role in auditory perception. The finding that music perception is shaped by the human motor system and its actions suggests that the musical mind is highly embodied. However, we advocate for a more radical approach to embodied (music) cognition in the sense that it needs to be considered as a dynamical process, in which aspects of action, perception, introspection, and social interaction are of crucial importance.

    Understanding Dance Understanding

    The Temporality and Rhythmicity of Lived Street Space

    This dissertation, in short, examines the temporalities and rhythmicities of day-to-day urban mobility practices on the city street. Streets, and other mobility-centred spaces of the city, are the main stages of public urban life: they are essential to how we (routinely) use and interact with the built environment, connect to our neighbourhoods, and encounter other city dwellers, and thus play a key part in the making of liveable, sustainable and just cities. Examining the street as a mobile assemblage, the study probes and conceptualizes some of the key rhythms that emerge from such daily mobility patterns of the street, aiming to draw a detailed picture of the recurring urban (micro)temporalities from a mobilities perspective that partially constitute the 'lived' aspects of day-to-day built environments. The theoretical framework on temporalities draws from various conceptual lineages, notably a Lefebvrian rhythmanalytical framework, and defines the studied mobility rhythms of the street as the inseparable relations between spaces, times and mobile embodied practices. The practical research focus is set on grassroots-level embodied mobilities. Here mobility practices are understood in a broad sense (following a new mobilities paradigm) as activities that, whilst physically moving people from place A to place B, also produce meanings, experiences, a sense of belonging, socio-material interactions, imageries, and (mobile) cultures in the process.
Utilizing various mobile research methods (in-depth go-along interviews, participant-produced photographs, route videos and route maps; extensive videoed site observations), and taking a postphenomenological research perspective, the dissertation examines recurring walking and driving routes, and the mobile event of day-to-day street space, in two major Finnish cities. The analysis of the data, presented in four research articles (#01-04), reveals, on the one hand, how people (inter)subjectively make sense of and modify the rhythmicities of the street (and the city in general) inside their own mobile daily routines, and, on the other, how people, through their (mobile) uses of the space, produce a temporal, or momentarily perceivable, architecture of the street by adapting to, or contesting, pre-set rhythmicities. The analysis further reveals different mediacies (#01) and processes of pacing (#02) of such rhythmicities, the role of urban morphologies in the formation of these rhythmicities (#03), and the time-sensitive rhythmic modes of appropriating the street through mobile uses (#04). The work proposes that the emerging rhythmanalytical research framework is an applicable and advantageous mode for approaching and mapping urban phenomena that are inherently caught in a continuous flux and flow. In the case of day-to-day street space, rhythmanalysis can be used to reveal micro-level (next to macro-level) temporalities that depict the street as a site of multiple heterogeneous and simultaneous temporalities and timings. Likewise, rhythmanalysis helps us to understand the complexity of urban mobilities and day-to-day routes beyond their strictly functional means, revealing the multiplicities of temporal relations in such recurring body-environment relations.
Together, they draw a nuanced picture of some of the key urban structures, mapping both formal (planned and designed, set from 'above') and informal (accidental and routine-like, set from 'below') mobility structures of the city. They highlight the continuous, rhythmic and arrhythmic, pulses of human activity in the city, the intensities of the urban fabric. In other words, they reveal the multiplicities of the beat of the city and its streets, both the planned and designed ones and the ones produced by their inhabitants on the move.

    The Psychophysics of Brain Rhythms

    It is becoming increasingly apparent that brain oscillations in various frequency bands play important roles in perceptual and attentional processes. Understandably, most of the associated experimental evidence comes from human or animal electrophysiological studies, allowing direct access to the oscillatory activities. However, such periodicities in perception and attention should, in theory, also be observable using the proper psychophysical tools. Here, we review a number of psychophysical techniques that have been used by us and other authors, in successful and sometimes unsuccessful attempts, to reveal the rhythmic nature of perceptual and attentional processes. We argue that the two existing and largely distinct debates about discrete vs. continuous perception and parallel vs. sequential attention should in fact be regarded as two facets of the same question: how do brain rhythms shape the psychological operations of perception and attention?

    Reconhecimento de padrões em expressões faciais: algoritmos e aplicações (Pattern recognition in facial expressions: algorithms and applications)

    Advisor: Hélio Pedrini. Thesis (doctorate), Universidade Estadual de Campinas, Instituto de Computação.
    Abstract: Emotion recognition has become a relevant research topic for the scientific community, since it plays an essential role in the continuous improvement of human-computer interaction systems. It can be applied in various areas, for instance, medicine, entertainment, surveillance, biometrics, education, social networks, and affective computing.
There are some open challenges related to the development of emotion systems based on facial expressions, such as data that reflect more spontaneous emotions and real scenarios. In this doctoral dissertation, we propose different methodologies for the development of emotion recognition systems based on facial expressions, as well as their applicability to solving other similar problems. The first is an emotion recognition methodology for occluded facial expressions based on the Census Transform Histogram (CENTRIST). Occluded facial expressions are reconstructed using an algorithm based on Robust Principal Component Analysis (RPCA). Extraction of facial expression features is then performed by CENTRIST, as well as Local Binary Patterns (LBP), Local Gradient Coding (LGC), and an LGC extension. The generated feature space is reduced by applying Principal Component Analysis (PCA) and Linear Discriminant Analysis (LDA). K-Nearest Neighbor (KNN) and Support Vector Machine (SVM) algorithms are used for classification. This method reached competitive accuracy rates for occluded and non-occluded facial expressions. The second proposes dynamic facial expression recognition based on Visual Rhythms (VR) and Motion History Images (MHI), such that a fusion of both encodes appearance, shape, and motion information of the video sequences. For feature extraction, Weber Local Descriptor (WLD), CENTRIST, Histogram of Oriented Gradients (HOG), and Gray-Level Co-occurrence Matrix (GLCM) are employed. This approach shows a new direction for performing dynamic facial expression recognition, and includes an analysis of the relevance of facial parts. The third is an effective method for audio-visual emotion recognition based on speech and facial expressions. The methodology involves a hybrid neural network to extract audio and visual features from videos.
For audio extraction, a Convolutional Neural Network (CNN) based on the log Mel-spectrogram is used, whereas a CNN built on the Census Transform is employed for visual extraction. The audio and visual features are reduced by PCA and LDA, and classified through KNN, SVM, Logistic Regression (LR), and Gaussian Naïve Bayes (GNB). This approach achieves competitive recognition rates, especially on a spontaneous data set. The second-to-last investigates the problem of detecting Down syndrome from photographs. A geometric descriptor is proposed to extract facial features. Experiments performed on a public data set show the effectiveness of the developed methodology. The last methodology is about recognizing genetic disorders in photos. This method focuses on extracting facial features using deep features and anthropometric measurements. Experiments are conducted on a public data set, achieving competitive recognition rates.
    Doutorado (doctorate), Computer Science; Doutora em Ciência da Computação; 140532/2019-6; CNPQ; CAPE
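
    As a rough illustration of the census-transform histogram named in the abstract above, the sketch below computes an 8-bit census code for each interior pixel and counts the codes in 256 bins. The bit convention (a bit is 1 when the center is greater than or equal to the neighbor), the neighbor ordering, and the function names `census_code` and `census_histogram` are assumptions made for this example; the thesis may differ in these details.

```python
from typing import List, Sequence

Image = Sequence[Sequence[int]]  # grayscale image: rows of pixel values

# Neighbor offsets in row-major order (top-left to bottom-right, center skipped).
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, -1), (0, 1), (1, -1), (1, 0), (1, 1)]


def census_code(image: Image, y: int, x: int) -> int:
    """8-bit census code of an interior pixel at row y, column x."""
    center = image[y][x]
    code = 0
    for dy, dx in OFFSETS:
        # Assumed convention: the bit is 1 when the center is >= the neighbor.
        bit = 1 if center >= image[y + dy][x + dx] else 0
        code = (code << 1) | bit
    return code


def census_histogram(image: Image) -> List[int]:
    """256-bin histogram of census codes over all interior pixels."""
    hist = [0] * 256
    for y in range(1, len(image) - 1):
        for x in range(1, len(image[0]) - 1):
            hist[census_code(image, y, x)] += 1
    return hist


flat = [[7] * 4 for _ in range(4)]  # constant patch: every comparison bit is 1
hist = census_histogram(flat)
print(hist[255], sum(hist))
```

    The histogram discards pixel positions and keeps only local structure counts, which is why a dimensionality-reduction step such as PCA can be applied to it as a fixed-length feature vector.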