Serving to secure "Global Korea": Gender, mobility, and flight attendant labor migrants
This dissertation is an ethnography of mobility and modernity in contemporary South Korea (the Republic of Korea) following neoliberal restructuring precipitated by the Asian Financial Crisis (1997). It focuses on how comparative “service,” “security,” and “safety” fashioned “Global Korea”: an ongoing state-sponsored project aimed at promoting the economic, political, and cultural maturation of South Korea from a once notoriously inhospitable, “backward” country (hujin’guk) to a now welcoming, “advanced country” (sŏnjin’guk). Through physical embodiments of the culturally-specific idiom of “superior” service (sŏbisŭ), I argue that aspiring, current, and former Korean flight attendants have driven the production and maintenance of this national project.
More broadly, as a driver of this national project, this occupation has emerged out of the country’s own aspirational flights from an earlier history of authoritarian rule, labor violence, and xenophobia. Against the backdrop of the Korean state’s aggressive neoliberal restructuring, globalization efforts, and current “Hell Chosun” (Helchosŏn) economy, a group of largely academically and/or class-disadvantaged young women have been able to secure individualized modes of pleasure, self-fulfillment, and class advancement via what I deem “service mobilities.” Service mobilities refers to the participation of mostly women in a traditionally devalued but growing sector of the global labor market, the “pink collar” economy centered around “feminine” care labor. Korean female flight attendants share labor skills resembling those of other foreign labor migrants (chiefly from the “Global South”), who perform care work deemed less desirable. Yet, Korean female flight attendants elude the stigmatizing, classed, and racialized category of “labor migrant.” Moreover, within the context of South Korea’s unique history of rapid modernization, the flight attendant occupation also commands considerable social prestige.
Based on ethnographic and archival research on aspiring, current, and former Korean flight attendants, this dissertation asks how these unique care laborers negotiate a metaphorical and literal series of sustained border crossings and inspections between their contingent status as lowly care-laboring migrants, on the one hand, and ostensibly glamorous, globetrotting elites, on the other. This study contends the following: first, the flight attendant occupation in South Korea represents a new politics of pleasure and pain in contemporary East Asia. Second, Korean female flight attendants’ enactments of soft, sanitized, and glamorous (hwaryŏhada) service help to purify South Korea’s less savory past. In so doing, Korean flight attendants reconstitute the historical role of female laborers as burden bearers and caretakers of the Korean state.
Multi-modal Facial Affective Analysis based on Masked Autoencoder
Human affective behavior analysis focuses on analyzing human expressions or
other behaviors to enhance the understanding of human psychology. The CVPR 2023
Competition on Affective Behavior Analysis in-the-wild (ABAW) is dedicated to
providing the high-quality, large-scale Aff-wild2 dataset for the recognition of
commonly used emotion representations, such as Action Units (AU), basic
expression categories (EXPR), and Valence-Arousal (VA). The competition is
committed to making significant strides in improving the accuracy and
practicality of affective analysis research in real-world scenarios. In this
paper, we introduce our submission to the CVPR 2023: ABAW5. Our approach
involves several key components. First, we utilize the visual information from
a Masked Autoencoder (MAE) model that has been pre-trained on a large-scale face
image dataset in a self-supervised manner. Next, we finetune the MAE encoder on
the image frames from Aff-wild2 for the AU, EXPR, and VA tasks, which can be
regarded as a static and uni-modal training. Additionally, we leverage the
multi-modal and temporal information from the videos and implement a
transformer-based framework to fuse the multi-modal features. Our approach
achieves impressive results in the ABAW5 competition, with average F1 scores
of 55.49% and 41.21% in the AU and EXPR tracks, respectively, and an average
CCC of 0.6372 in the VA track. Our approach ranks first in the EXPR and AU
tracks, and second in the VA track. Extensive quantitative experiments and
ablation studies demonstrate the effectiveness of our proposed method.
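The abstract describes fusing per-modality features with a transformer-based framework. The sketch below is not the authors' implementation; it is a minimal NumPy illustration of the underlying idea, a single cross-attention step in which toy visual (MAE-style) features query toy audio features. All shapes, names, and random features are invented for the example.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(query_feats, kv_feats, d_k):
    # query_feats: (T_q, d_k), e.g. visual (MAE) frame features
    # kv_feats:    (T_kv, d_k), e.g. audio frame features
    scores = query_feats @ kv_feats.T / np.sqrt(d_k)  # (T_q, T_kv)
    weights = softmax(scores, axis=-1)                # rows sum to 1
    return weights @ kv_feats                         # (T_q, d_k) fused features

rng = np.random.default_rng(0)
visual = rng.standard_normal((8, 16))   # 8 frames of 16-dim visual features
audio = rng.standard_normal((12, 16))   # 12 frames of audio features
fused = cross_attention(visual, audio, d_k=16)
print(fused.shape)  # (8, 16)
```

A real multi-modal transformer would stack several such layers with learned projections and feed-forward blocks; this sketch only shows the attention-weighted mixing that lets one modality attend to another.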
Examples of works for practicing staccato technique on the clarinet
The stages of strengthening the clarinet's staccato technique were applied through repertoire studies. Rhythm and nuance exercises designed to speed up staccato passages were included. The most important aim of the study was not staccato practice alone, but also an emphasis on the precision of simultaneous finger-tongue coordination. To make the staccato work more productive, étude practice was incorporated into the repertoire studies. Careful attention to these exercises, combined with the inspiring effect of staccato practice, added a new dimension to the player's musical identity. Each stage of eight original repertoire studies is described, with each stage intended to reinforce the next level of performance and technique. The study reports in which areas the staccato technique is used and what results were obtained, and plans how the notes are shaped through finger-tongue coordination and within what kind of practice discipline this takes place. Reed, notation, diaphragm, fingers, tongue, nuance, and discipline were found to form an inseparable whole in staccato technique. A literature review was conducted to survey existing work on staccato. The survey revealed that repertoire-based staccato studies for the clarinet are scarce, while the survey of method books found that études predominate. Accordingly, exercises for speeding up and strengthening the clarinet's staccato technique are presented here. It was observed that interspersing repertoire work among staccato études relaxes the mind and increases motivation. The choice of a suitable reed for staccato practice was also emphasized: a suitable reed was found to increase tongue speed, and the suitability of a reed depends on how easily it produces a sound. If the reed does not support the power of the tongue stroke, the necessity of choosing a better reed is stressed. Interpreting a piece from beginning to end while practicing staccato can be difficult; in this respect, the study showed that observing the given musical nuances eases tonguing performance. Passing on the knowledge and experience gained to future generations, and its formative value, is encouraged. The study explains how forthcoming pieces can be worked out and how the staccato technique can be mastered, with the aim of resolving the staccato technique in a shorter time. It is as important to commit the exercises to memory as it is to teach the fingers their places. The work that emerges as a result of such determination and patience will raise achievement to even higher levels.
XKD: Cross-modal Knowledge Distillation with Domain Alignment for Video Representation Learning
We present XKD, a novel self-supervised framework to learn meaningful
representations from unlabelled video clips. XKD is trained with two pseudo
tasks. First, masked data reconstruction is performed to learn individual
representations from audio and visual streams. Next, self-supervised
cross-modal knowledge distillation is performed between the two modalities
through teacher-student setups to learn complementary information. To identify
the most effective information to transfer and also to tackle the domain gap
between audio and visual modalities which could hinder knowledge transfer, we
introduce a domain alignment and feature refinement strategy for effective
cross-modal knowledge distillation. Lastly, to develop a general-purpose
network capable of handling both audio and visual streams, modality-agnostic
variants of our proposed framework are introduced, which use the same backbone
for both audio and visual modalities. Our proposed cross-modal knowledge
distillation improves linear evaluation top-1 accuracy of video action
classification by 8.6% on UCF101, 8.2% on HMDB51, 13.9% on Kinetics-Sound, and
15.7% on Kinetics400. Additionally, our modality-agnostic variant shows
promising results in developing a general-purpose network capable of learning
both data streams for solving different downstream tasks.
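The core of the cross-modal distillation described above is a teacher in one modality guiding a student in the other, with a projection bridging the domain gap between their feature spaces. The following is a hedged NumPy sketch of that loss computation only, not XKD's actual training code; the dimensions, projection, and random features are all assumptions made for illustration.

```python
import numpy as np

def l2_normalize(x, axis=-1, eps=1e-8):
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)

def distillation_loss(student_feats, teacher_feats, proj):
    # Project the student into the teacher's feature space (a stand-in for
    # the paper's domain alignment step), then penalize the cosine distance
    # between the aligned representations.
    projected = student_feats @ proj
    s = l2_normalize(projected)
    t = l2_normalize(teacher_feats)  # teacher features are treated as fixed targets
    return float(np.mean(1.0 - np.sum(s * t, axis=-1)))

rng = np.random.default_rng(1)
video_feats = rng.standard_normal((4, 32))  # student: visual stream, batch of 4
audio_feats = rng.standard_normal((4, 24))  # teacher: audio stream
proj = rng.standard_normal((32, 24)) * 0.1  # learnable alignment projection
loss = distillation_loss(video_feats, audio_feats, proj)
print(round(loss, 4))
```

In training, `proj` and the student network would be optimized by gradient descent while the teacher branch is periodically swapped, since XKD distills in both directions; the sketch shows only one direction of the exchange.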
Learning disentangled speech representations
A variety of informational factors are contained within the speech signal and a single short recording of speech reveals much more than the spoken words. The best method to extract and represent informational factors from the speech signal ultimately depends on which informational factors are desired and how they will be used. In addition, sometimes methods will capture more than one informational factor at the same time such as speaker identity, spoken content, and speaker prosody.
The goal of this dissertation is to explore different ways to deconstruct the speech signal into abstract representations that can be learned and later reused in various speech technology tasks. This task of deconstructing, also known as disentanglement, is a form of distributed representation learning. As a general approach to disentanglement, there are some guiding principles that elaborate what a learned representation should contain as well as how it should function. In particular, learned representations should contain all of the requisite information in a more compact manner, be interpretable, remove nuisance factors of irrelevant information, be useful in downstream tasks, and independent of the task at hand. The learned representations should also be able to answer counter-factual questions.
In some cases, learned speech representations can be re-assembled in different ways according to the requirements of downstream applications. For example, in a voice conversion task, the speech content is retained while the speaker identity is changed. And in a content-privacy task, some targeted content may be concealed without affecting how surrounding words sound. While there is no single-best method to disentangle all types of factors, some end-to-end approaches demonstrate a promising degree of generalization to diverse speech tasks.
This thesis explores a variety of use-cases for disentangled representations including phone recognition, speaker diarization, linguistic code-switching, voice conversion, and content-based privacy masking. Speech representations can also be utilised for automatically assessing the quality and authenticity of speech, such as automatic MOS ratings or detecting deep fakes. The meaning of the term "disentanglement" is not well defined in previous work, and it has acquired several meanings depending on the domain (e.g. image vs. speech). Sometimes the term "disentanglement" is used interchangeably with the term "factorization". This thesis proposes that disentanglement of speech is distinct, and offers a viewpoint of disentanglement that can be considered both theoretically and practically.
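The voice-conversion use-case above, keeping the content factor while swapping the speaker factor, can be sketched with a toy recombination step. This is purely illustrative: the "decoder" here is a trivial concatenation standing in for a learned synthesis network, and every shape and variable is an assumption, not the thesis's method.

```python
import numpy as np

def decode(content, speaker):
    # Toy "decoder": pairs each content frame with a broadcast speaker
    # embedding. A real system would feed this into a learned synthesizer.
    T = content.shape[0]
    return np.concatenate([content, np.tile(speaker, (T, 1))], axis=1)

rng = np.random.default_rng(2)
content_a = rng.standard_normal((5, 8))   # utterance A: what was said (5 frames)
speaker_a = rng.standard_normal(4)        # utterance A: who said it
speaker_b = rng.standard_normal(4)        # a different target speaker

reconstruction = decode(content_a, speaker_a)  # original voice
converted = decode(content_a, speaker_b)       # same words, new voice
print(converted.shape)  # (5, 12)
```

The point of the sketch is the factorization contract: because content and speaker live in separate representations, swapping one factor leaves the other untouched, which is exactly what a well-disentangled system should guarantee.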
How to Be a God
When it comes to questions concerning the nature of Reality, Philosophers and Theologians have the answers.
Philosophers have the answers that can’t be proven right. Theologians have the answers that can’t be proven wrong.
Today’s designers of Massively-Multiplayer Online Role-Playing Games create realities for a living. They can’t spend centuries mulling over the issues: they have to face them head-on. Their practical experiences can indicate which theoretical proposals actually work in practice.
That’s today’s designers. Tomorrow’s will have a whole new set of questions to answer.
The designers of virtual worlds are the literal gods of those realities. Suppose Artificial Intelligence comes through and allows us to create non-player characters as smart as us. What are our responsibilities as gods? How should we, as gods, conduct ourselves?
How should we be gods?
Brain signal recognition using deep learning
This thesis was submitted for the degree of Doctor of Philosophy and awarded by Brunel University.

A Brain-Computer Interface (BCI) has the potential to offer a new generation of applications independent of
muscular activity and controlled by the human brain. Brain imaging technologies are used to transfer the
cognitive tasks into control commands for a BCI system. The electroencephalography (EEG) technology
serves as the best available non-invasive solution for extracting signals from the brain. On the other hand,
speech is the primary means of communication, but for patients suffering from locked-in syndrome, there
is no easy way to communicate. Therefore, an ideal communication system for locked-in patients is a
thought-to-speech BCI system.
This research aims to investigate methods for the recognition of imagined speech from EEG signals
using deep learning techniques. In order to design an optimal imagined-speech recognition BCI, a variety
of issues had to be addressed. These include 1) proposing a new feature extraction and classification
framework for the recognition of imagined speech from EEG signals, 2) grammatical class recognition of
imagined words from EEG signals, 3) discriminating different cognitive tasks associated with speech in
the brain, such as overt speech, covert speech, and visual imagery. In this work, machine learning and
deep learning methods were used to analyze EEG signals.
For recognition of imagined speech from EEG signals, a new EEG database was collected while the
participants mentally spoke (imagined speech) the presented words. Along with imagined speech, EEG
data was recorded for visual imagery (imagining a scene or an image) and overt speech (verbal speech).
Spectro-temporal and spatio-temporal domain features were investigated for the classification of imagined
words from EEG signals. Further, a deep learning framework using the convolutional network
and attention mechanism was implemented for learning features in the spatial, temporal, and spectral
domains. The method achieved a recognition rate of 76.6% for three binary word pairs. These experiments
show that deep learning algorithms are ideal for imagined speech recognition from EEG signals
due to their ability to interpret features from non-linear and non-stationary signals. Grammatical classes
of imagined words from EEG signals were also recognized using a multi-channel convolution network
framework. This method was extended to a multi-level recognition system for multi-class classification
of imagined words, which achieved an accuracy of 52.9% for 10 words, a substantial improvement over
previous work.
In order to investigate the difference between imagined speech with verbal speech and visual imagery
from EEG signals, we used multivariate pattern analysis (MVPA). MVPA provided the time segments
when the neural oscillation for the different cognitive tasks was linearly separable. Further, frequencies
that result in most discrimination between the different cognitive tasks were also explored. A framework
was proposed to discriminate two cognitive tasks based on the spatio-temporal patterns in EEG signals.
The proposed method used the K-means clustering algorithm to find the best electrode combination and
convolutional-attention network for feature extraction and classification. The proposed method achieved
high recognition rates of 82.9% and 77.7%.
The results in this research suggest that a communication-based BCI system can be designed using
deep learning methods. Further, this work adds to the existing knowledge in the field of communication-based
BCI systems.
Linguistic- and Acoustic-based Automatic Dementia Detection using Deep Learning Methods
Dementia can affect a person's speech and language abilities, even in the early stages. Dementia is incurable, but early detection can enable treatment that slows its progression and helps maintain mental function. Therefore, early diagnosis of dementia is of great importance. However, current dementia detection procedures in clinical practice are expensive, invasive, and sometimes inaccurate. In comparison, computational tools based on the automatic analysis of spoken language have the potential to serve as a cheap, easy-to-use, and objective clinical assistance tool for dementia detection.
In recent years, several studies have shown promise in this area. However, most studies focus heavily on the machine learning aspects and, as a consequence, often lack sufficient incorporation of clinical knowledge. Many studies also concentrate on clinically less relevant tasks, such as the distinction between healthy controls (HC) and people with Alzheimer's disease (AD), which is relatively easy and therefore less interesting both in terms of machine learning and clinical application.
The studies in this thesis concentrate on automatically identifying signs of neurodegenerative dementia in the early stages and distinguishing them from other clinical, diagnostic categories related to memory problems: functional memory disorder (FMD), mild cognitive impairment (MCI), and healthy controls (HC). A key focus when designing the proposed systems has been to better consider (and incorporate) currently used clinical knowledge, and to bear in mind how these machine-learning-based systems could be translated for use in real clinical settings.
Firstly, a state-of-the-art end-to-end system is constructed for extracting linguistic information from automatically transcribed spontaneous speech. The system's architecture is based on hierarchical principles, thereby mimicking those used in clinical practice, where information at word, sentence, and paragraph level is used when extracting information for diagnosis. Secondly, hand-crafted features are designed based on clinical knowledge of the importance of pausing and rhythm. These are successfully combined with features extracted from the end-to-end system. Thirdly, different classification tasks are explored, each set up to represent the types of diagnostic decision-making that are relevant in clinical practice. Finally, experiments are conducted to explore how to better deal with the known problem of the confounding and overlapping effects of age and cognitive decline on speech and language. A multi-task system is constructed that takes age into account while predicting cognitive decline.

The studies use the publicly available DementiaBank dataset as well as the IVA dataset, which was collected by our collaborators at the Royal Hallamshire Hospital, UK. In conclusion, this thesis proposes multiple methods of using speech and language information for dementia detection with state-of-the-art deep learning technologies, confirming the automatic system's potential for dementia detection.
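The hierarchical principle described above, aggregating information at word, sentence, and paragraph level, can be illustrated with a minimal pooling sketch. This is not the thesis's end-to-end architecture (which uses learned encoders rather than averaging); the embedding dimension, sentence lengths, and mean-pooling are assumptions made for the example.

```python
import numpy as np

def hierarchical_encode(doc_word_embeddings):
    # doc_word_embeddings: list of sentences, each an (n_words, dim) array.
    # Pool words into sentence vectors, then sentences into one document
    # vector, mirroring the word -> sentence -> paragraph hierarchy.
    sentence_vecs = np.stack([s.mean(axis=0) for s in doc_word_embeddings])
    return sentence_vecs.mean(axis=0)

rng = np.random.default_rng(4)
dim = 16
doc = [rng.standard_normal((n, dim)) for n in (7, 4, 9)]  # 3 sentences
doc_vec = hierarchical_encode(doc)
print(doc_vec.shape)  # (16,)
```

In a real system each pooling step would be replaced by a learned encoder (e.g. a recurrent or attention layer), and the final document vector would feed a classifier over the diagnostic categories; the sketch only shows the level-by-level aggregation structure.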
Chinese Benteng Women’s Participation in Local Development Affairs in Indonesia: Appropriate means for struggle and a pathway to claim citizens’ rights?
More than two decades have passed since the devastating Asian Financial Crisis of 1997 and Suharto’s subsequent fall from the presidency he had occupied for more than three decades. The financial turmoil turned into a political disaster and led to massive looting that severely affected Indonesians of Chinese descent, including still-unresolved atrocities: sexual violence against women and the covert killing of students and democracy activists. Since May 1998, an episode publicly known as “Reformasi”, Indonesia has undergone political reform that eventually translated into positive macroeconomic growth. Twenty years later, in 2018, Indonesia captured worldwide attention by successfully hosting two internationally renowned events: the Asian Games 2018, the most prestigious sporting event in Asia, held in Jakarta and Palembang; and the IMF/World Bank Annual Meeting 2018 in Bali. The IMF/World Bank Annual Meeting in particular significantly elevated Indonesia’s credibility and international prestige in the global economic power play as a nation with promising growth and openness. However, narratives about poverty and inequality, including rising racial tension, religious conservatism, and sexual violence against women, have been superseded by a friendly climate for foreign investment and an excessive glorification of the nation’s economic growth. By projecting the image of a promising new economic power, as rhetorically promised by President Joko Widodo during his presidential terms, Indonesia has swept the growing inequality of this highly stratified society, historically compounded by religious and racial tension, under the carpet of the digital economy.