
    Resource Allocation based on Federated Learning for Next Generation Wireless Communication

    Federated Learning (FL) stands out as a decentralized Machine Learning (ML) method that enables model training on distributed data while safeguarding data privacy. Its application in next-generation wireless communication, and especially in the Internet of Things (IoT) realm, has the potential to provide more intelligent and efficient solutions to the challenges posed by massive data and security concerns. However, the performance of wireless FL is often hampered by constraints on wireless communication resources and by participant mobility. To address this issue, this thesis proposes a scheduling strategy based on model quality and communication quality. The strategy employs interpretable ML to assess the contribution of each local model to the convergence of the global model and dynamically adjusts the weights given to model quality and to the communication quality of the training participants, a scheme termed dynamical balance quality, to achieve a more efficient and fair resource allocation strategy. Finally, the thesis compares the proposed strategy with traditional ones, and simulation results demonstrate that analyzing the value of local models using interpretable ML techniques can help maximize the overall learning efficiency of FL systems. Furthermore, this thesis delves into the practical applications of wireless IoT and federated systems. I have designed a distributed federated IoT system framework tailored to the healthcare sector, encompassing a contactless health self-check-in web application, thermal imaging cameras, and physical barriers. This integrated system streamlines the COVID-19 health screening and data recording process. Through collaboration with a major urban hospital in Australia, we implemented a pilot electronic gate solution.
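
    The scheduling idea can be made concrete with a small sketch. The snippet below is an illustrative assumption of how a quality-weighted scheduler might look, not the thesis's actual implementation: the Client fields, the blending weight alpha, and the top-k selection rule are all invented for illustration. Each round, clients are ranked by a blend of model quality (e.g., an interpretable-ML contribution estimate) and communication quality, with the weight adjustable per round.

        # Hypothetical sketch of quality-weighted client scheduling for FL.
        # All names and the scoring rule are illustrative assumptions.
        from dataclasses import dataclass

        @dataclass
        class Client:
            name: str
            model_quality: float  # estimated contribution to global-model convergence, in [0, 1]
            comm_quality: float   # normalized communication quality (e.g., SNR-based), in [0, 1]

        def schedule(clients: list[Client], alpha: float, k: int) -> list[Client]:
            """Select the k clients with the best blended score; a dynamic scheme
            could adjust alpha each round based on convergence progress."""
            score = lambda c: alpha * c.model_quality + (1 - alpha) * c.comm_quality
            return sorted(clients, key=score, reverse=True)[:k]

        clients = [
            Client("A", model_quality=0.9, comm_quality=0.2),
            Client("B", model_quality=0.4, comm_quality=0.9),
            Client("C", model_quality=0.7, comm_quality=0.6),
        ]
        print([c.name for c in schedule(clients, alpha=0.6, k=2)])  # ['C', 'A']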

    Rationality in Artificial Intelligence Decision-making

    Artificial intelligence (AI) has become increasingly ubiquitous in a variety of organizations for decision-making, and it promises competitive advantages to those who use it. However, with the novel insights and benefits of AI come unprecedented side-effects and externalities, which circle around a theme of rationality. The rationality of a decision comprises the reasons behind it, the relationships between those reasons, and the process of their emergence. Lack of access to the decision rationality of AI is held to cause issues with trust in AI due to a lack of fairness and accountability. Moreover, AI rationality in moral decisions is seen to pose threats to reflective moral capabilities. While rationality and agency are both fundamental to decision-making, agency has seen a shift toward more relational views in which the technical and the social are seen as inseparable and co-constitutive of each other. Discussions of AI rationality, however, are still heavily entrenched in a dualism that has already been overcome with respect to agency. This entrenchment can contribute to a variety of the issues noted around AI. Moreover, while the types of AI rationality have been considered theoretically, the field currently lacks empirical work to support the discussions revolving around AI rationality. This dissertation uses postphenomenology as a methodology to study empirically how AI in decision-making impacts rationality. Postphenomenology honours anti-dualistic agency: technology mediates and co-constitutes agency with people in intra-action. This dissertation uses the same approach to study the mediation of rationality, thereby helping views on rationality catch up with views on agency in overcoming unnecessary dualism. The posed research question is "How does AI mediate rationality in decision-making?" Postphenomenological analysis is meant to be applied at the level of the technological mediations of a specific technology, such as AI mediation of rationality in decision-making. Mediations can be considered along dimensions; this dissertation considers the revealing–concealing, enabling–constraining, and involving–alienating dimensions of mediation to answer the posed research question.
    In postphenomenology, a basis for analysis is provided by empirical works, which are typically case studies of concrete intra-actions between humans and technologies. Postphenomenology as a methodology allows secondary empirical work by others, primary self-conducted studies, and first-person reflection as bases for empirical case analysis. Thus, while the publications of this dissertation were not published as case studies, postphenomenology considers them as such, making this dissertation a multiple case study. The first four publications are empirical works of applied AI covering various combinations of human and AI decision-making tasks with different yet comparable data. Data and methodology remain similar across the empirical publications and are well suited to comparison as cases in a postphenomenological analysis. The last publication is a theoretical paper, which complements the empirical publications on the involving–alienating dimension. AI was found to conceal decision rationality at various stages of AI decision-making, while in some cases AI also revealed possibilities for specific, novel rationalities. Two levels of rationality concealment were discovered: the contents of a rationality could become concealed, but so could the very presence of a rationality in the first place. Rationality became more abstract and formalized regardless of whether the rationality was constructed with an AI or not. This formalization constrained rationality by ruling out other valid rationalities. Constraint also arose because rationalities necessarily took the specific form of similarities versus differences in the data. The results suggest that people can become involved in their own alienation from rationality in AI decision-making. Study of the relationships between the mediation dimensions suggests that the constraint of formalization was revealing when combined with involvement; otherwise, formalization was concealed because of, and resulted in, alienation from AI in decision-making. The results point in the direction that people may be involved in their own alienation via rationality concealment. This dissertation contributes new insights and levels of analysis for AI rationality in decision-making and its moral implications. It provides testable claims about technological mediations that can be used to develop theory, and posits that they can be useful in theorizing how to increase AI fairness, accountability, and transparency. Moreover, the dissertation contributes to the field of rationality in management and organizational decision-making by developing rationality beyond unnecessary dualism. For practitioners, the findings help identify the relevant AI mediations in decision-making that must be considered to ensure successful AI adoption and to mitigate its issues in their specific contexts.

    "Is There Choice in Non-Native Voice?" Linguistic Feature Engineering and a Variationist Perspective in Automatic Native Language Identification

    Is it possible to infer the native language of an author from a non-native text? Can we perform this task fully automatically? Interest in answers to these questions led to the emergence of a research field called Native Language Identification (NLI) in the first decade of this century. The requirement to automatically identify a particular property based on language data situates the task at the intersection of computer science and linguistics, that is, in the context of computational linguistics, which combines both disciplines. This thesis targets several relevant research questions in the context of NLI. In particular, what is the role of surface features and of more abstract linguistic cues? How can different sets of features be combined, and how can the resulting large models be optimized? Do the findings generalize across different data sets? Can we benefit from considering the task in the light of language variation theory? To approach these questions, we conduct a range of quantitative and qualitative explorations employing different machine learning techniques. We show how linguistic insight can advance technology, and how technology can advance linguistic insight, constituting a fruitful and promising interplay.
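
    The surface-feature side of NLI can be illustrated with a minimal sketch: character n-gram features feeding a linear classifier. The example sentences and L1 labels below are invented toy stand-ins, and this baseline is a generic one, not a model from the thesis.

        # Minimal NLI baseline: character n-gram features and a linear classifier.
        # Toy texts and native-language (L1) labels invented for illustration only.
        from sklearn.feature_extraction.text import TfidfVectorizer
        from sklearn.linear_model import LogisticRegression
        from sklearn.pipeline import make_pipeline

        texts = [
            "I am agree with this opinion because is very important.",
            "He said me that meeting will start lately.",
            "I look forward to hear from you soon.",
            "Despite of the rain, we have enjoyed very much the excursion.",
        ]
        native_languages = ["ES", "RU", "DE", "ES"]  # toy L1 labels

        model = make_pipeline(
            TfidfVectorizer(analyzer="char_wb", ngram_range=(2, 4)),
            LogisticRegression(max_iter=1000),
        )
        model.fit(texts, native_languages)
        print(model.predict(["I am agree that we should to go."]))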

    Matrix Decomposition Methods for Data Mining: Computational Complexity and Algorithms

    Matrix decompositions, where a given matrix is represented as a product of two other matrices, are regularly used in data mining. Most matrix decompositions have their roots in linear algebra, but the needs of data mining are not always those of linear algebra. In data mining one needs results that are interpretable, and what counts as interpretable in data mining can be very different from what counts as interpretable in linear algebra. The purpose of this thesis is to study matrix decompositions that directly address the issue of interpretability. An example is a decomposition of binary matrices where the factor matrices are assumed to be binary and the matrix multiplication is Boolean. The restriction to binary factor matrices increases interpretability, since the factor matrices are of the same type as the original matrix, and allows the use of Boolean matrix multiplication, which is often more intuitive than normal matrix multiplication with binary matrices. Several other decomposition methods are also described, and the computational complexity of computing them is studied together with the hardness of approximating the related optimization problems. Based on these studies, algorithms for constructing the decompositions are proposed. Constructing the decompositions turns out to be computationally hard, and the proposed algorithms are mostly based on various heuristics. Nevertheless, the algorithms are shown to be capable of finding good results in empirical experiments conducted with both synthetic and real-world data.
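
    The Boolean decomposition described above can be illustrated with a short sketch. The code below only demonstrates the Boolean matrix product and the reconstruction error that such algorithms minimize; it does not implement any of the thesis's heuristics, and the random matrices are toy stand-ins for real data.

        # Boolean matrix product of binary matrices: (U ∘ V)_ij = OR_k (U_ik AND V_kj).
        import numpy as np

        def boolean_product(U: np.ndarray, V: np.ndarray) -> np.ndarray:
            # Integer product followed by thresholding implements OR-of-ANDs.
            return ((U.astype(int) @ V.astype(int)) > 0).astype(int)

        rng = np.random.default_rng(0)
        U = rng.integers(0, 2, size=(6, 2))   # binary factor, 6 x 2
        V = rng.integers(0, 2, size=(2, 8))   # binary factor, 2 x 8
        B = boolean_product(U, V)             # data with exact Boolean rank at most 2

        noisy = B ^ (rng.random(B.shape) < 0.1)          # flip roughly 10% of the entries
        error = int(np.sum(noisy != boolean_product(U, V)))
        print(error)  # |noisy XOR (U ∘ V)|: the quantity a decomposition algorithm minimizes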

    The Role of Linguistics in Probing Task Design

    Over the past decades, natural language processing has evolved from a niche research area into a fast-paced and multi-faceted discipline that attracts thousands of contributions from academia and industry and feeds into real-world applications. Despite the recent successes, natural language processing models still struggle to generalize across domains, suffer from biases, and lack transparency. Aiming to get a better understanding of how and why modern NLP systems make their predictions for complex end tasks, a line of research in probing attempts to interpret the behavior of NLP models using basic probing tasks. Linguistic corpora are a natural source of such tasks, and linguistic phenomena like part of speech, syntax, and role semantics are often used in probing studies. The goal of probing is to find out what information can be easily extracted from a pre-trained NLP model or representation. To ensure that the information is extracted from the NLP model and not learned during the probing study itself, probing models are kept as simple and transparent as possible, exposing and augmenting conceptual inconsistencies between NLP models and linguistic resources. In this thesis we investigate how linguistic conceptualization can affect probing models, setups and results. In Chapter 2 we investigate the gap between the targets of classical type-level word embedding models like word2vec and the items of lexical resources and similarity benchmarks. We show that the lack of conceptual alignment between word embedding vocabularies and lexical resources penalizes the word embedding models in both benchmark-based and our novel resource-based evaluation scenario. We demonstrate that simple preprocessing techniques like lemmatization and POS tagging can partially mitigate the issue, leading to a better match between word embeddings and lexicons. Linguistics often has more than one way of describing a certain phenomenon. In Chapter 3 we conduct an extensive study of the effects of linguistic formalism on probing modern pre-trained contextualized encoders like BERT. We use role semantics as an excellent example of a data-rich multi-framework phenomenon. We show that the choice of linguistic formalism can affect the results of probing studies, and deliver additional insights on the impact of dataset size, domain, and task architecture on probing. Apart from mere labeling choices, linguistic theories might differ in the very way of conceptualizing the task. Whereas mainstream NLP has treated semantic roles as a categorical phenomenon, an alternative, prominence-based view opens new opportunities for probing. In Chapter 4 we investigate prominence-based probing models for role semantics, including semantic proto-roles and our novel regression-based role probe. Our results indicate that pre-trained language models like BERT might encode argument prominence. Finally, we propose an operationalization of thematic role hierarchy, a widely used linguistic tool for describing the syntactic behavior of verbs, and show that thematic role hierarchies can be extracted from text corpora and transfer cross-lingually. The results of our work demonstrate the importance of linguistic conceptualization for probing studies, and highlight the dangers and the opportunities associated with using linguistics as a meta-language for NLP model interpretation.
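
    The probing recipe itself is simple enough to sketch. In the snippet below, random vectors stand in for frozen encoder representations and the POS tags are toy labels; the point is the shape of the setup (frozen features, deliberately minimal classifier), not the numbers, and it is not code from the thesis.

        # Probing sketch: fit the simplest possible classifier on frozen representations.
        import numpy as np
        from sklearn.linear_model import LogisticRegression
        from sklearn.model_selection import train_test_split

        rng = np.random.default_rng(0)
        n_tokens, hidden_size = 500, 768
        frozen_reprs = rng.normal(size=(n_tokens, hidden_size))        # stand-in for frozen encoder output
        pos_tags = rng.choice(["NOUN", "VERB", "ADJ"], size=n_tokens)  # stand-in gold labels

        X_tr, X_te, y_tr, y_te = train_test_split(frozen_reprs, pos_tags, random_state=0)
        probe = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)     # simple, transparent probe
        print(f"probe accuracy: {probe.score(X_te, y_te):.2f}")       # near chance for random features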

    An Analysis on Adversarial Machine Learning: Methods and Applications

    Deep learning has witnessed astonishing advancement in the last decade and revolutionized many fields ranging from computer vision to natural language processing. A prominent field of research that enabled such achievements is adversarial learning, investigating the behavior and functionality of a learning model in the presence of an adversary. Adversarial learning consists of two major trends. The first trend analyzes the susceptibility of machine learning models to manipulation in the decision-making process and aims to improve the robustness to such manipulations. The second trend exploits adversarial games between components of the model to enhance the learning process. This dissertation aims to provide an analysis of these two sides of adversarial learning and harness their potential for improving the robustness and generalization of deep models. In the first part of the dissertation, we study the adversarial susceptibility of deep learning models. We provide an empirical analysis of the extent of vulnerability by proposing two adversarial attacks that exploit the geometric and frequency-domain characteristics of inputs to manipulate deep decisions. Afterward, we formalize the susceptibility of deep networks using the first-order approximation of the predictions and extend the theory to the ensemble classification scheme. Inspired by theoretical findings, we formalize a reliable and practical defense against adversarial examples to robustify ensembles. We extend this part by investigating the shortcomings of adversarial training, highlighting that the popular momentum stochastic gradient descent, developed essentially for natural training, is not well suited to optimization in adversarial training since it is not designed to be robust against the chaotic behavior of gradients in this setup. Motivated by these observations, we develop an optimization method that is more suitable for adversarial training. In the second part of the dissertation, we harness adversarial learning to enhance the generalization and performance of deep networks in discriminative and generative tasks. We develop several models for biometric identification, including fingerprint distortion rectification and latent fingerprint reconstruction. In particular, we develop a ridge reconstruction model based on generative adversarial networks that estimates the missing ridge information in latent fingerprints. We introduce a novel modification that enables the generator network to preserve the ID information during the reconstruction process. To address the scarcity of data, e.g., in latent fingerprint analysis, we develop a supervised augmentation technique that combines input examples based on their salient regions. Our findings advocate that adversarial learning improves the performance and reliability of deep networks in a wide range of applications.
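
    The first-order view of susceptibility mentioned above is captured by the classic fast gradient sign method (FGSM): linearize the loss around the input and step in the sign of its gradient. The sketch below applies this idea to a hand-written logistic model so the gradient is available in closed form; it is a generic illustration, not one of the dissertation's geometric or frequency-domain attacks.

        # FGSM-style attack on a logistic model, under an L-infinity budget eps.
        import numpy as np

        def fgsm_logistic(x: np.ndarray, y: int, w: np.ndarray, b: float, eps: float) -> np.ndarray:
            p = 1.0 / (1.0 + np.exp(-(w @ x + b)))  # predicted probability of class 1
            grad_x = (p - y) * w                    # d(cross-entropy)/dx in closed form
            return x + eps * np.sign(grad_x)        # maximal loss increase per unit L-inf budget

        w, b = np.array([1.5, -2.0, 0.5]), 0.0
        x = np.array([0.2, 0.1, -0.3])
        x_adv = fgsm_logistic(x, y=1, w=w, b=b, eps=0.1)
        print(x, "->", x_adv)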

    Machine learning approaches to identifying social determinants of health in electronic health record clinical notes

    Social determinants of health (SDH) represent the complex set of circumstances in which individuals are born, or with which they live, that impact health. Relatively little attention has been given to the processes needed to extract SDH data from electronic health records (EHRs). Despite their importance, SDH data in the EHR remain sparse, typically collected only in clinical notes and thus largely unavailable for clinical decision making. I focus on developing and validating more efficient information extraction approaches to identifying and classifying SDH in clinical notes. In this dissertation, I have three goals: First, I develop a word embedding model to expand SDH terminology in the context of identifying SDH in clinical text. Second, I examine the effectiveness of different machine learning algorithms and a neural network model in classifying the SDH characteristics of financial resource strain and poor social support. Third, I compare the highest-performing approaches with simpler text mining techniques and evaluate the models on performance, cost, and generalizability in the task of classifying SDH in two distinct data sources.
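
    The terminology-expansion step in the first goal can be sketched with an off-the-shelf word2vec model: train on clinical text, then take nearest neighbours of seed SDH terms as candidate lexicon entries for manual review. The three-sentence corpus and seed term below are toy stand-ins, not the dissertation's data or exact method.

        # Sketch of embedding-based SDH terminology expansion with gensim word2vec.
        from gensim.models import Word2Vec

        corpus = [
            ["patient", "reports", "homeless", "and", "food", "insecurity"],
            ["lives", "alone", "poor", "social", "support", "homeless", "shelter"],
            ["financial", "strain", "unable", "to", "afford", "medication"],
        ]
        model = Word2Vec(sentences=corpus, vector_size=50, window=3, min_count=1, epochs=50, seed=0)

        for term in ["homeless"]:  # seed SDH terms
            candidates = [w for w, _ in model.wv.most_similar(term, topn=3)]
            print(term, "->", candidates)  # candidate expansions for manual review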

    Unsupervised Automatic Detection Of Transient Phenomena In InSAR Time-Series using Machine Learning

    The detection and measurement of transient episodes of crustal deformation from global InSAR datasets are crucial for a wide range of solid earth and natural hazard applications. But the large volumes of unlabelled data captured by satellites preclude manual systematic analysis, and the small signal-to-noise ratio makes the task difficult. In this thesis, I present a state-of-the-art, unsupervised and event-agnostic deep-learning-based approach for the automatic identification of transient deformation events in noisy time-series of unwrapped InSAR images. I adopt an anomaly detection framework that learns the ‘normal’ spatio-temporal pattern of noise in the data and therefore identifies any transient deformation phenomena that deviate from this pattern as ‘anomalies’. The deep-learning model is built around a bespoke autoencoder that includes convolutional and LSTM layers, as well as a neural network that acts as a bridge between the encoder and decoder. I train the model on real InSAR data from northern Turkey and find it has an overall accuracy and true positive rate of around 85% when detecting synthetic deformation signals with length-scale > 350 m and magnitude > 4 cm. Furthermore, I show the method can detect (1) a real Mw 5.7 earthquake in InSAR data from an entirely different region (SW Turkey), (2) volcanic deformation at Domuyo, Argentina, (3) a synthetic slow-slip event, and (4) interseismic deformation around the North Anatolian Fault (NAF) in a descending frame in northern Turkey. Overall, I show that my method is suitable for automated analysis of large, global InSAR datasets, and for robust detection and separation of deformation signals from nuisance signals in InSAR data.
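
    The anomaly-detection recipe can be sketched compactly: train an autoencoder on ‘normal’ windows only, then flag windows whose reconstruction error is unusually high. The toy model below is a plain LSTM autoencoder on 1-D series under invented data; the thesis's model additionally uses convolutional layers and a bridging network, and operates on stacks of unwrapped InSAR images.

        # Reconstruction-error anomaly detection with a toy LSTM autoencoder (PyTorch).
        import torch
        import torch.nn as nn

        class LSTMAutoencoder(nn.Module):
            def __init__(self, n_features=1, hidden=16):
                super().__init__()
                self.encoder = nn.LSTM(n_features, hidden, batch_first=True)
                self.decoder = nn.LSTM(hidden, hidden, batch_first=True)
                self.out = nn.Linear(hidden, n_features)

            def forward(self, x):                        # x: (batch, time, features)
                _, (h, _) = self.encoder(x)
                z = h[-1].unsqueeze(1).repeat(1, x.size(1), 1)  # repeat code at every step
                dec, _ = self.decoder(z)
                return self.out(dec)

        torch.manual_seed(0)
        normal = torch.sin(torch.linspace(0, 20, 400)).reshape(-1, 20, 1)  # 'noise-only' windows
        model, loss_fn = LSTMAutoencoder(), nn.MSELoss()
        opt = torch.optim.Adam(model.parameters(), lr=1e-2)
        for _ in range(200):                             # learn the normal pattern only
            opt.zero_grad()
            loss_fn(model(normal), normal).backward()
            opt.step()

        test = normal.clone()
        test[0, 10:] += 1.0                              # inject a step-like 'transient'
        with torch.no_grad():
            err = ((model(test) - test) ** 2).mean(dim=(1, 2))  # per-window reconstruction error
        print((err > 2 * err.median()).nonzero().flatten())     # flagged windows; likely tensor([0])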