3,461 research outputs found

    Speech-based automatic depression detection via biomarkers identification and artificial intelligence approaches

    Depression has become one of the most prevalent mental health issues, affecting more than 300 million people all over the world. However, due to factors such as limited medical resources and limited access to health care, a large number of patients remain undiagnosed. In addition, the traditional approaches to depression diagnosis have limitations because they are usually time-consuming and depend on clinical experience that varies across different clinicians. From this perspective, the use of automatic depression detection can make the diagnosis process much faster and more accessible. In this thesis, we present the possibility of using speech for automatic depression detection. This is based on findings in neuroscience that depressed patients have abnormal cognitive mechanisms, which cause their speech to differ from that of healthy people. Therefore, in this thesis, we show two ways of advancing automatic depression detection, i.e., identifying speech markers of depression and constructing novel deep learning models to improve detection accuracy. The identification of speech markers tries to capture measurable depression traces left in speech. From this perspective, speech markers such as speech duration, pauses and correlation matrices are proposed. Speech duration and pauses take speech fluency into account, while correlation matrices represent the relationship between acoustic features and aim at capturing psychomotor retardation in depressed patients. Experimental results demonstrate that these proposed markers are effective at improving the performance in recognizing depressed speakers. In addition, such markers show statistically significant differences between depressed patients and non-depressed individuals, which explains why these markers can be used for depression detection and further confirms that depression leaves detectable traces in speech.
In addition to the above, we propose an attention mechanism, Multi-local Attention (MLA), to emphasize depression-relevant information locally. We then analyse the effect of MLA on performance and efficiency. According to the experimental results, such a model can significantly improve performance and confidence in the detection while reducing the time required for recognition. Furthermore, we propose Cross-Data Multilevel Attention (CDMA) to emphasize different types of depression-relevant information, i.e., specific to each type of speech and common to both, by using multiple attention mechanisms. Experimental results demonstrate that the proposed model is effective at integrating different types of depression-relevant information in speech, improving the performance significantly for depression detection.
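The abstract describes correlation matrices as a marker that represents the relationship between acoustic features. As a rough illustration only, the sketch below computes a Pearson correlation matrix over frame-level features. The array layout (frames in rows, features in columns), the function name `feature_correlation_matrix`, the choice of Pearson correlation, and the synthetic data are all assumptions made for this example; the thesis may define and extract its matrices differently.

```python
import numpy as np


def feature_correlation_matrix(frames: np.ndarray) -> np.ndarray:
    """Pearson correlation between the columns of a (n_frames, n_features) array.

    Hypothetical helper: the layout and the use of Pearson correlation are
    assumptions for illustration, not the thesis's stated procedure.
    """
    if frames.ndim != 2 or frames.shape[0] < 2:
        raise ValueError("expected a 2-D array with at least two frames")
    # rowvar=False treats each column as one variable (one acoustic feature).
    return np.corrcoef(frames, rowvar=False)


# Synthetic stand-in for frame-level acoustic features (not real speech data).
rng = np.random.default_rng(0)
f0 = rng.normal(size=200)
frames = np.column_stack([
    f0,                    # feature 0
    2.0 * f0 + 1.0,        # feature 1: a linear function of feature 0
    rng.normal(size=200),  # feature 2: independent noise
])

corr = feature_correlation_matrix(frames)
print(corr.shape)
```

The result is a symmetric matrix with ones on the diagonal, so one matrix per recording could in principle be passed to a classifier as a fixed-size input regardless of recording length.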

    Improving Cross-Lingual Transfer Learning for Event Detection

    The widespread adoption of applications powered by Artificial Intelligence (AI) backbones has unquestionably changed the way we interact with the world around us. Applications such as automated personal assistants, automatic question answering, and machine-based translation systems have become mainstays of modern culture thanks to the recent considerable advances in Natural Language Processing (NLP) research. Nonetheless, with over 7000 spoken languages in the world, there still remain a considerable number of marginalized communities that are unable to benefit from these technological advancements largely due to the language they speak. Cross-Lingual Learning (CLL) looks to address this issue by transferring the knowledge acquired from a popular, high-resource source language (e.g., English, Chinese, or Spanish) to a less favored, lower-resourced target language (e.g., Urdu or Swahili). This dissertation leverages the Event Detection (ED) sub-task of Information Extraction (IE) as a testbed and presents three novel approaches that improve cross-lingual transfer learning from distinct perspectives: (1) direct knowledge transfer, (2) hybrid knowledge transfer, and (3) few-shot learning.

    Self-supervised learning for transferable representations

    Machine learning has undeniably achieved remarkable advances thanks to large labelled datasets and supervised learning. However, this progress is constrained by the labour-intensive annotation process. It is not feasible to generate extensive labelled datasets for every problem we aim to address. Consequently, there has been a notable shift in recent times toward approaches that solely leverage raw data. Among these, self-supervised learning has emerged as a particularly powerful approach, offering scalability to massive datasets and showcasing considerable potential for effective knowledge transfer. This thesis investigates self-supervised representation learning with a strong focus on computer vision applications. We provide a comprehensive survey of self-supervised methods across various modalities, introducing a taxonomy that categorises them into four distinct families while also highlighting practical considerations for real-world implementation. Our focus thenceforth is on the computer vision modality, where we perform a comprehensive benchmark evaluation of state-of-the-art self-supervised models against many diverse downstream transfer tasks. Our findings reveal that self-supervised models often outperform supervised learning across a spectrum of tasks, albeit with correlations weakening as tasks transition beyond classification, particularly for datasets with distribution shifts. Digging deeper, we investigate the influence of data augmentation on the transferability of contrastive learners, uncovering a trade-off between spatial and appearance-based invariances that generalise to real-world transformations. This begins to explain the differing empirical performances achieved by self-supervised learners on different downstream tasks, and it showcases the advantages of specialised representations produced with tailored augmentation.
Finally, we introduce a novel self-supervised pre-training algorithm for object detection, aligning pre-training with downstream architecture and objectives, leading to reduced localisation errors and improved label efficiency. In conclusion, this thesis contributes a comprehensive understanding of self-supervised representation learning and its role in enabling effective transfer across computer vision tasks.

    Multidisciplinary perspectives on Artificial Intelligence and the law

    This open access book presents an interdisciplinary, multi-authored, edited collection of chapters on Artificial Intelligence (‘AI’) and the Law. AI technology has come to play a central role in the modern data economy. Through a combination of increased computing power, the growing availability of data and the advancement of algorithms, AI has now become an umbrella term for some of the most transformational technological breakthroughs of this age. The importance of AI stems from both the opportunities that it offers and the challenges that it entails. While AI applications hold the promise of economic growth and efficiency gains, they also create significant risks and uncertainty. The potential and perils of AI have thus come to dominate modern discussions of technology and ethics – and although AI was initially allowed to largely develop without guidelines or rules, few would deny that the law is set to play a fundamental role in shaping the future of AI. As the debate over AI is far from over, the need for rigorous analysis has never been greater. This book thus brings together contributors from different fields and backgrounds to explore how the law might provide answers to some of the most pressing questions raised by AI. An outcome of the Católica Research Centre for the Future of Law and its interdisciplinary working group on Law and Artificial Intelligence, it includes contributions by leading scholars in the fields of technology, ethics and the law.

    Neuroimaging investigations of cortical specialisation for different types of semantic knowledge

    Embodied theories propose that semantic knowledge is grounded in motor and perceptual experiences. This leads to two questions: (1) whether the neural underpinnings of perception are also necessary for semantic cognition; (2) how biases towards different sensorimotor experiences cause brain regions to specialise for particular types of semantic information. This thesis tackles these questions in a series of neuroimaging and behavioural investigations. Regarding question 1, strong embodiment theory holds that semantic representation is reenactment of corresponding experiences, and brain regions for perception are necessary for comprehending modality-specific concepts. However, the weak embodiment view argues that reenactment may not be necessary, and areas near perceptual regions may be sufficient to support semantic representation. In the particular case of motion concepts, lateral occipital temporal cortex (LOTC) has long been identified as an important area, but the roles of its different subregions are still uncertain. Chapter 3 examined how different parts of LOTC reacted to written descriptions of motion and static events, using multiple analysis methods. A series of anterior to posterior sub-regions were analyzed using univariate analysis, multivariate pattern analysis (MVPA), and psychophysiological interaction (PPI) analysis. MVPA revealed strongest decoding effects for motion vs. static events in the posterior parts of LOTC, including both visual motion area (V5) and posterior middle temporal gyrus (pMTG). In contrast, only the middle portion of LOTC showed increased activation for motion sentences in univariate analyses. PPI analyses showed increased functional connectivity between posterior LOTC and the multiple demand network for motion events.
These findings suggest that posterior LOTC, which overlapped with the motion perception V5 region, is selectively involved in comprehending motion events, while the anterior part of LOTC contributes to general semantic processing. Regarding question 2, the hub-and-spoke theory suggests that anterior temporal lobe (ATL) acts as a hub, using inputs from modality-specific regions to construct multimodal concepts. However, some researchers propose temporal parietal cortex (TPC) as an additional hub, specialised in processing and integrating interaction and contextual information (e.g., for actions and locations). These hypotheses are summarized as the "dual-hub theory" and different aspects of this theory were investigated in Chapters 4 and 5. Chapter 4 focuses on taxonomic and thematic relations. Taxonomic relations (or categorical relations) occur when two concepts belong to the same category (e.g., ‘dog’ and ‘wolf’ are both canines). In contrast, thematic relations (or associative relations) refer to situations in which two concepts co-occur in events or scenes (e.g., ‘dog’ and ‘bone’), focusing on the interaction or association between concepts. Some studies have indicated ATL specialization for taxonomic relations and TPC specialization for thematic relations, but others have reported inconsistent or even converse results. Thus Chapter 4 first conducted an activation likelihood estimation (ALE) meta-analysis of neuroimaging studies contrasting taxonomic and thematic relations. This found that thematic relations reliably engage action and location processing regions (left pMTG and SMG), while taxonomic relations only showed consistent effects in the right occipital lobe. A primed semantic judgement task was then used to test the dual-hub theory’s prediction that taxonomic relations are heavily reliant on colour and shape knowledge, while thematic relations rely on action and location knowledge.
This behavioural experiment revealed that action or location priming facilitated thematic relation processing, but colour and shape did not lead to priming effects for taxonomic relations. This indicates that thematic relations rely more on action and location knowledge, which may explain why they preferentially engage TPC, whereas taxonomic relations are not specifically linked to shape and colour features. This may explain why they did not preferentially engage left ATL. Chapter 5 concentrates on event and object concepts. Previous studies suggest ATL specialization for coding similarity of objects’ semantics, and angular gyrus (AG) specialization for sentence and event structure representation. In addition, in neuroimaging studies, event semantics are usually investigated using complex temporally extended stimuli, unlike the single-concept stimuli used to investigate object semantics. Thus chapter 5 used representational similarity analysis (RSA), univariate analysis, and PPI analysis to explore neural activation patterns for event and object concepts presented as static images. Bilateral AGs encoded semantic similarity for event concepts, with the left AG also coding object similarity. Bilateral ATLs encoded semantic similarity for object concepts but also for events. Left ATL exhibited stronger coding for events than objects. PPI analysis revealed stronger connections between left ATL and right pMTG, and between right AG and bilateral inferior temporal gyrus (ITG) and middle occipital gyrus, for event concepts compared to object concepts. Consistent with the meta-analysis in chapter 4, the results in chapter 5 support the idea of partial specialization in AG for event semantics but do not support ATL specialization for object semantics. In fact, both the meta-analysis and chapter 5 findings suggest greater ATL involvement in coding objects' associations compared to their similarity.
To conclude, the thesis provides support for the idea that perceptual brain regions are engaged in conceptual processing, in the case of motion concepts. It also provides evidence for a specialised role for TPC regions in processing thematic relations (pMTG) and event concepts (AG). There was mixed evidence for specialisation within the ATLs and this remains an important target for future research.

    Rules, frequency, and predictability in morphological generalization: behavioral and computational evidence from the German plural system

    Morphological generalization, or the task of mapping an unknown word (such as a novel noun Raun) to an inflected form (such as the plural Rauns), has historically proven a contested topic within computational linguistics and cognitive science, e.g. within the past tense debate (Rumelhart and McClelland, 1986; Pinker and Prince, 1988; Seidenberg and Plaut, 2014). Marcus et al. (1995) identified German plural inflection as a key challenge domain to evaluate two competing accounts of morphological generalization: a rule generation view focused on linguistic features of input words, and a type frequency view focused on the distribution of output inflected forms, thought to reflect more domain-general cognitive processes. More recent behavioral and computational research developments support a new view based on predictability, which integrates both input and output distributions. My research uses these methodological innovations to revisit a core dispute of the past tense debate: how do German speakers generalize plural inflection, and can computational learners generalize similarly? This dissertation evaluates the rule generation, type frequency, and predictability accounts of morphological generalization in a series of behavioral and computational experiments with the stimuli developed by Marcus et al. I assess predictions for three aspects of German plural generalization: distribution of infrequent plural classes, influence of grammatical gender, and within-item variability. Overall, I find that speaker behavior is best characterized as frequency-matching to a phonologically-conditioned lexical distribution. This result does not support the rule generation view, and qualifies the predictability view: speakers use some, but not all available information to reduce uncertainty in morphological generalization.
Neural and symbolic model predictions are typically overconfident relative to speakers; simple Bayesian models show somewhat higher speaker-like variability and accuracy. All computational models are outperformed by a static phonologically-conditioned lexical baseline, suggesting these models have not learned the selective feature preferences that inform speaker generalization.

    Mapping the Focal Points of WordPress: A Software and Critical Code Analysis

    Programming languages or code can be examined through numerous analytical lenses. This project is a critical analysis of WordPress, a prevalent web content management system, applying four modes of inquiry. The project draws on theoretical perspectives and areas of study in media, software, platforms, code, language, and power structures. The applied research is based on Critical Code Studies, an interdisciplinary field of study that holds potential as a theoretical lens and methodological toolkit to understand computational code beyond its function. The project begins with a critical code analysis of WordPress, examining its origins and source code and mapping selected vulnerabilities. An examination of the influence of digital and computational thinking follows this. The work also explores the intersection of code patching and vulnerability management and how code shapes our sense of control, trust, and empathy, ultimately arguing that a rhetorical-cultural lens can be used to better understand code's controlling influence. Recurring themes throughout these analyses and observations are the connections to power and vulnerability in WordPress' code and how cultural, processual, rhetorical, and ethical implications can be expressed through its code, creating a particular worldview. Code's emergent properties help illustrate how human values and practices (e.g., empathy, aesthetics, language, and trust) become encoded in software design and how people perceive the software through its worldview. These connected analyses reveal cultural, processual, and vulnerability focal points and the influence these entanglements have concerning WordPress as code, software, and platform. WordPress is a complex sociotechnical platform worthy of further study, as is the interdisciplinary merging of theoretical perspectives and disciplines to critically examine code.
Ultimately, this project helps further enrich the field by introducing focal points in code, examining sociocultural phenomena within the code, and offering techniques to apply critical code methods.

    Specialized translation at work for a small, growing business: my experience of internationalization into Chinese for Bioretics© S.r.l.

    Global markets are currently immersed in two all-encompassing and unstoppable processes: internationalization and globalization. While the former pushes companies to look beyond the borders of their country of origin to forge relationships with foreign trading partners, the latter fosters standardization across countries by reducing spatiotemporal distances and breaking down geographical, political, economic and socio-cultural barriers. In recent decades, another domain has appeared to propel these unifying drives: Artificial Intelligence, together with its high technologies aiming to implement human cognitive abilities in machinery. The “Language Toolkit – Le lingue straniere al servizio dell’internazionalizzazione dell’impresa” project, promoted by the Department of Interpreting and Translation (Forlì Campus) in collaboration with the Romagna Chamber of Commerce (Forlì-Cesena and Rimini), seeks to help Italian SMEs make their way into the global market. It is precisely within this project that this dissertation has been conceived. Indeed, its purpose is to present the translation and localization project from English into Chinese of a series of texts produced by Bioretics© S.r.l.: an investor deck, the company website and part of the installation and use manual of the Aliquis© framework software, its flagship product. This dissertation is structured as follows: Chapter 1 presents the project and the company in detail; Chapter 2 outlines the internationalization and globalization processes and the Artificial Intelligence market both in Italy and in China; Chapter 3 provides the theoretical foundations for every aspect related to Specialized Translation, including website localization; Chapter 4 describes the resources and tools used to perform the translations; Chapter 5 proposes an analysis of the source texts; Chapter 6 is a commentary on translation strategies and choices.

    Under construction: infrastructure and modern fiction

    In this dissertation, I argue that infrastructural development, with its technological promises but widening geographic disparities and social and environmental consequences, informs both the narrative content and aesthetic forms of modernist and contemporary Anglophone fiction. Despite its prevalent material forms—roads, rails, pipes, and wires—infrastructure poses particular formal and narrative problems, often receding into the background as mere setting. To address how literary fiction theorizes the experience of infrastructure requires reading “infrastructurally”: that is, paying attention to the seemingly mundane interactions between characters and their built environments. The writers central to this project—James Joyce, William Faulkner, Karen Tei Yamashita, and Mohsin Hamid—take up the representational challenges posed by infrastructure by bringing transit networks, sanitation systems, and electrical grids and the histories of their development and use into the foreground. These writers call attention to the political dimensions of built environments, revealing the ways infrastructures produce, reinforce, and perpetuate racial and socioeconomic fault lines. They also attempt to formalize the material relations of power inscribed by and within infrastructure; the novel itself becomes an imaginary counterpart to the technologies of infrastructure, a form that shapes and constrains what types of social action and affiliation are possible