
    DeepOnto: A Python Package for Ontology Engineering with Deep Learning

    Applying deep learning techniques, particularly language models (LMs), to ontology engineering has attracted widespread attention. However, deep learning frameworks like PyTorch and TensorFlow are developed primarily for Python, while widely used ontology APIs, such as the OWL API and Jena, are primarily Java-based. To facilitate seamless integration of these frameworks and APIs, we present DeepOnto, a Python package designed for ontology engineering. The package encompasses a core ontology processing module founded on the widely recognised and reliable OWL API, encapsulating its fundamental features in a more "Pythonic" manner and extending its capabilities with other essential components, including reasoning, verbalisation, normalisation, and projection. Building on this module, DeepOnto offers a suite of tools, resources, and algorithms that support various ontology engineering tasks, such as ontology alignment and completion, by harnessing deep learning methodologies, primarily pre-trained LMs. In this paper, we also demonstrate the practical utility of DeepOnto through two use cases: Digital Health Coaching at Samsung Research UK and the Bio-ML track of the Ontology Alignment Evaluation Initiative (OAEI). Comment: under review at the Semantic Web Journal.
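
    As a flavour of the "Pythonic" wrapping described above, the sketch below loads an ontology with DeepOnto's Ontology class. The constructor follows the package's documented usage; the file name is a placeholder and the owl_classes accessor is an assumption made for illustration.

        # Minimal sketch: load an OWL ontology via DeepOnto's Python wrapper
        # around the Java OWL API. "doid.owl" is a placeholder file name and
        # `owl_classes` is an assumed accessor, used here only for illustration.
        from deeponto.onto import Ontology

        onto = Ontology("doid.owl")              # backed by the Java OWL API

        # Print a handful of class IRIs (assumed: mapping of IRI -> OWLClass).
        for iri in list(onto.owl_classes)[:5]:
            print(iri)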

    Modular lifelong machine learning

    Deep learning has drastically improved the state of the art in many important fields, including computer vision and natural language processing (LeCun et al., 2015). However, it is expensive to train a deep neural network on a machine learning problem, and the overall training cost increases further when one wants to solve additional problems. Lifelong machine learning (LML) develops algorithms that aim to efficiently learn to solve a sequence of problems which become available one at a time. New problems are solved with fewer resources by transferring previously learned knowledge. At the same time, an LML algorithm needs to retain good performance on all encountered problems, thus avoiding catastrophic forgetting. Current approaches do not possess all the desired properties of an LML algorithm. First, they primarily focus on preventing catastrophic forgetting (Diaz-Rodriguez et al., 2018; Delange et al., 2021) and, as a result, neglect some knowledge transfer properties. Furthermore, they assume that all problems in a sequence share the same input space. Finally, scaling these methods to a large sequence of problems remains a challenge.
    Modular approaches to deep learning decompose a deep neural network into sub-networks, referred to as modules. Each module can then be trained to perform an atomic transformation, specialised in processing a distinct subset of inputs. This modular approach to storing knowledge makes it easy to reuse only the subset of modules which are useful for the task at hand. This thesis introduces a line of research which demonstrates the merits of a modular approach to lifelong machine learning and its ability to address the aforementioned shortcomings of other methods. Compared to previous work, we show that a modular approach can be used to achieve more LML properties than previously demonstrated. Furthermore, we develop tools which allow modular LML algorithms to scale in order to retain said properties on longer sequences of problems.
    First, we introduce HOUDINI, a neurosymbolic framework for modular LML. HOUDINI represents modular deep neural networks as functional programs and accumulates a library of pre-trained modules over a sequence of problems. Given a new problem, we use program synthesis to select a suitable neural architecture, as well as a high-performing combination of pre-trained and new modules. We show that our approach has most of the properties desired of an LML algorithm; notably, it can perform forward transfer, avoid negative transfer, and prevent catastrophic forgetting, even across problems with disparate input domains and problems which require different neural architectures.
    Second, we produce a modular LML algorithm which retains the properties of HOUDINI but can also scale to longer sequences of problems. To this end, we fix the choice of neural architecture and introduce a probabilistic search framework, PICLE, for searching through different module combinations. To apply PICLE, we introduce two probabilistic models over neural modules which allow us to efficiently identify promising module combinations.
    Third, we phrase the search over module combinations in modular LML as black-box optimisation, which allows one to make use of methods from the setting of hyperparameter optimisation (HPO). We then develop a new HPO method which marries a multi-fidelity approach with model-based optimisation. We demonstrate that this leads to improved anytime performance in the HPO setting and discuss how this can in turn be used to augment modular LML methods.
    Overall, this thesis identifies a number of important LML properties, which have not all been attained in past methods, and presents an LML algorithm which can achieve all of them, apart from backward transfer.
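
    As a toy illustration of the modular idea (ours, not the HOUDINI or PICLE implementations), the sketch below keeps a library of pre-trained modules and assembles a network for a new problem by freezing reused library modules and training only a new head:

        # Toy sketch of modular lifelong learning: reuse frozen pre-trained
        # modules from a library and attach a fresh trainable head.
        import torch
        import torch.nn as nn

        library = {  # modules accumulated over previously solved problems
            "vision_encoder": nn.Sequential(nn.Flatten(), nn.Linear(784, 128), nn.ReLU()),
            "shared_trunk": nn.Sequential(nn.Linear(128, 128), nn.ReLU()),
        }

        def compose(reused_names, new_head):
            """Reuse frozen library modules; only the new head is trainable."""
            modules = []
            for name in reused_names:
                m = library[name]
                for p in m.parameters():
                    p.requires_grad = False   # frozen: avoids catastrophic forgetting
                modules.append(m)
            modules.append(new_head)          # fresh, trainable module
            return nn.Sequential(*modules)

        # New problem: same input space, new 10-way classification task.
        model = compose(["vision_encoder", "shared_trunk"], nn.Linear(128, 10))
        out = model(torch.randn(4, 1, 28, 28))
        print(out.shape)                      # torch.Size([4, 10])

    Searching over which library modules to reuse, rather than fixing them by hand as above, is exactly the combinatorial problem the thesis addresses with program synthesis and probabilistic search.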

    Fuzzy Natural Logic in IFSA-EUSFLAT 2021

    The present book contains five papers accepted and published in the Special Issue "Fuzzy Natural Logic in IFSA-EUSFLAT 2021" of the journal Mathematics (MDPI). These papers are extended versions of contributions presented at "The 19th World Congress of the International Fuzzy Systems Association and the 12th Conference of the European Society for Fuzzy Logic and Technology, held jointly with the AGOP, IJCRS, and FQAS conferences", which took place in Bratislava (Slovakia) from September 19 to September 24, 2021. Fuzzy Natural Logic (FNL) is a system of mathematical fuzzy logic theories that enables us to model natural language terms and rules while accounting for their inherent vagueness, and to reason and argue using the tools developed within them. FNL includes, among others, the theory of evaluative linguistic expressions (e.g., small, very large), the theory of fuzzy and intermediate quantifiers (e.g., most, few, many), and the theory of fuzzy/linguistic IF-THEN rules and logical inference. The papers in this Special Issue use the various aspects and concepts of FNL mentioned above and apply them to a wide range of problems, both theoretical and practical. This book will be of interest to researchers working in the areas of fuzzy logic, applied linguistics, generalized quantifiers, and their applications.
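
    As a rough illustration of the kind of object FNL formalises (our sketch, not taken from the book), an evaluative expression such as "small" can be modelled as a fuzzy set on a normalised universe, and a linguistic IF-THEN rule then becomes a fuzzy relation between such sets:

        % Illustrative only: a common trapezoidal membership function for the
        % evaluative expression "small" on the normalised universe U = [0, 1].
        \[
        \mu_{\mathrm{small}}(x) =
        \begin{cases}
        1, & 0 \le x \le 0.2,\\
        \frac{0.5 - x}{0.3}, & 0.2 < x < 0.5,\\
        0, & x \ge 0.5.
        \end{cases}
        \]
        % A rule "IF x is small THEN y is big" can then be interpreted via the
        % fuzzy relation R(x, y) = \min(\mu_{\mathrm{small}}(x), \mu_{\mathrm{big}}(y)).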

    A novel analysis of utility in privacy pipelines, using Kronecker products and quantitative information flow

    We combine Kronecker products and quantitative information flow to give a novel formal analysis for the fine-grained verification of utility in complex privacy pipelines. The combination explains a surprising anomaly in the behaviour of the utility of privacy-preserving pipelines: that a reduction in privacy sometimes also results in a decrease in utility. We use the standard measure of utility for Bayesian analysis, introduced by Ghosh et al., to produce tractable and rigorous proofs of the fine-grained statistical behaviour leading to the anomaly. More generally, we offer the prospect of formal-analysis tools for utility that complement extant formal analyses of privacy. We demonstrate our results on a number of common privacy-preserving designs.
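
    To make the two ingredients concrete (a minimal sketch of ours, not the paper's construction): quantitative information flow models a mechanism as a stochastic channel matrix; running two mechanisms on independent secrets corresponds to the Kronecker product of their matrices; and a Bayesian vulnerability-style measure then scores the composed channel under a prior. All names and numbers below are illustrative.

        # Sketch: channels as stochastic matrices C[x, y] = P(y | x),
        # composed with a Kronecker product and scored with posterior
        # Bayes vulnerability. Illustrative values, not from the paper.
        import numpy as np

        def posterior_bayes_vulnerability(prior, channel):
            """V(pi, C) = sum_y max_x pi_x * C[x, y]: the expected chance of
            correctly guessing the secret after observing the output."""
            joint = prior[:, None] * channel      # J[x, y] = pi_x * P(y | x)
            return joint.max(axis=0).sum()

        # Two independent noisy mechanisms over binary secrets.
        C1 = np.array([[0.9, 0.1],
                       [0.2, 0.8]])
        C2 = np.array([[0.7, 0.3],
                       [0.4, 0.6]])

        # Running both on an independent pair of secrets is their Kronecker product.
        C12 = np.kron(C1, C2)
        prior = np.full(4, 0.25)                  # uniform prior on the pair
        print(posterior_bayes_vulnerability(prior, C12))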

    Revisiting File Context for Source Code Summarization

    Source code summarization is the task of writing natural language descriptions of source code. A typical use case is generating short summaries of subroutines for use in API documentation. The heart of almost all current research into code summarization is the encoder-decoder neural architecture, and the encoder input is almost always a single subroutine or other short code snippet. The problem with this setup is that the information needed to describe the code is often not present in the code itself; that information often resides in other nearby code. In this paper, we revisit the idea of "file context" for code summarization. File context is the idea of encoding select information from other subroutines in the same file. We propose a novel modification of the Transformer architecture that is purpose-built to encode file context and demonstrate its improvement over several baselines. We find that file context helps on a subset of challenging examples where traditional approaches struggle. Comment: 27 pages plus references, under peer review.
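
    The abstract does not spell out the architecture, so the following is only a hedged PyTorch sketch of the general idea of file context: encode each sibling subroutine from the same file into a single vector and append those vectors to the memory the decoder attends over. All class and variable names are our own illustration.

        # Illustrative sketch of encoding file context, not the paper's model.
        import torch
        import torch.nn as nn

        class FileContextEncoder(nn.Module):
            def __init__(self, vocab_size, d_model=256, nhead=4, nlayers=2):
                super().__init__()
                self.embed = nn.Embedding(vocab_size, d_model)
                layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
                self.encoder = nn.TransformerEncoder(layer, nlayers)

            def forward(self, target_tokens, sibling_tokens):
                # target_tokens: (B, T) token ids of the subroutine to summarize
                # sibling_tokens: (B, S, T) token ids of other subroutines in the file
                tgt_states = self.encoder(self.embed(target_tokens))       # (B, T, D)
                B, S, T = sibling_tokens.shape
                sib = self.encoder(self.embed(sibling_tokens.reshape(B * S, T)))
                sib_summary = sib.mean(dim=1).reshape(B, S, -1)            # (B, S, D)
                # Decoder memory: target token states plus one vector per sibling.
                return torch.cat([tgt_states, sib_summary], dim=1)         # (B, T+S, D)

        enc = FileContextEncoder(vocab_size=1000)
        memory = enc(torch.randint(0, 1000, (2, 50)),
                     torch.randint(0, 1000, (2, 6, 50)))
        print(memory.shape)                        # torch.Size([2, 56, 256])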

    2023-2024 Boise State University Undergraduate Catalog

    This catalog is intended primarily for students, though it serves many audiences, such as high school counselors, academic advisors, and the public. In it you will find an overview of Boise State University and information on admission, registration, grades, tuition and fees, financial aid, housing, student services, and other important policies and procedures. Most of the catalog, however, is devoted to describing the various programs and courses offered at Boise State.

    Integration of heterogeneous data sources and automated reasoning in healthcare and domotic IoT systems

    In recent years, IoT technology has radically transformed many crucial industrial and service sectors, such as healthcare. The multi-faceted heterogeneity of the devices and of the collected information provides important opportunities to develop innovative systems and services. However, the ubiquitous presence of data silos and the poor semantic interoperability in the IoT landscape constitute a significant obstacle to this goal. Moreover, deriving actionable knowledge from the collected data requires IoT information sources to be analysed using appropriate artificial intelligence techniques, such as automated reasoning. In this thesis, Semantic Web technologies are investigated as an approach to address both the data integration and the reasoning aspects of modern IoT systems. In particular, the contributions presented in this thesis are the following: (1) the IoT Fitness Ontology, an OWL ontology developed to overcome the issue of data silos and enable semantic interoperability in the IoT fitness domain; (2) a Linked Open Data web portal for collecting and sharing IoT health datasets with the research community; (3) a novel methodology for embedding knowledge in rule-defined IoT smart home scenarios; and (4) a knowledge-based IoT home automation system that supports seamless integration of heterogeneous devices and data sources.
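
    A minimal sketch of the Semantic Web approach described above, using rdflib; the ontology IRI, class name, and property names are hypothetical placeholders, not the thesis's actual IoT Fitness Ontology terms:

        # Sketch: query heterogeneous IoT data once it has been mapped into a
        # shared ontology. All IRIs below are hypothetical placeholders.
        import rdflib

        g = rdflib.Graph()
        g.parse("iot_fitness.owl", format="xml")   # placeholder local ontology file

        # Ask for heart-rate observations above a threshold, regardless of
        # which heterogeneous device produced them.
        query = """
        PREFIX ex: <http://example.org/iot-fitness#>
        SELECT ?device ?value WHERE {
            ?obs a ex:HeartRateObservation ;
                 ex:observedBy ?device ;
                 ex:hasValue ?value .
            FILTER (?value > 100)
        }
        """
        for device, value in g.query(query):
            print(device, value)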

    Understanding the Code of Life: Holistic Conceptual Modeling of the Genome

    Over the last few decades, advances in sequencing technology have produced significant amounts of genomic data, which has revolutionised our understanding of biology. However, the amount of data generated has far exceeded our ability to interpret it. Deciphering the code of life is a grand challenge. Despite our progress, our understanding of it remains minimal, and we are just beginning to uncover its full potential, for instance in areas such as precision medicine or pharmacogenomics. The main objective of this thesis is to advance our understanding of life by proposing a holistic, model-based approach consisting of three artifacts: i) a conceptual schema of the genome, ii) a method for its application in the real world, and iii) the use of foundational ontologies to represent domain knowledge in a more unambiguous and explicit way. The first two contributions have been validated by implementing genome information systems based on conceptual models. The third contribution has been validated by empirical experiments assessing whether using foundational ontologies leads to a better understanding of the genomic domain. The artifacts generated offer significant benefits. First, more efficient data management processes were produced, leading to better knowledge extraction processes. Second, a better understanding and communication of the domain was achieved. The fruitful discussions and results derived from the projects INNEST2021/57, MICIN/AEI/10.13039/501100011033, PID2021-123824OB-I00, CIPROM/2021/023, and PDC2021-121243-I00 have contributed greatly to the final quality of this thesis. García Simón, A. (2022). Understanding the Code of Life: Holistic Conceptual Modeling of the Genome [Doctoral thesis]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/19143
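
    As a toy illustration of how a conceptual schema of the genome can be turned into code (our sketch; the entity and attribute names are illustrative placeholders, not the thesis's actual schema):

        # Toy sketch: two entities of a genome conceptual schema as dataclasses.
        # Names and coordinates are illustrative only.
        from dataclasses import dataclass

        @dataclass
        class Gene:
            symbol: str                  # e.g. "BRCA2"
            chromosome: str
            start: int                   # genomic start coordinate
            end: int                     # genomic end coordinate

        @dataclass
        class Variation:
            gene: Gene
            position: int                # absolute genomic coordinate
            reference: str               # reference allele
            alternate: str               # observed allele

            def is_within_gene(self) -> bool:
                return self.gene.start <= self.position <= self.gene.end

        brca2 = Gene("BRCA2", "13", 32315086, 32400268)  # approximate coordinates
        v = Variation(brca2, 32340000, "A", "G")
        print(v.is_within_gene())        # True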

    Blockchain Technology: Disruptor or Enhancer of the Accounting and Auditing Profession

    The unique features of blockchain technology (BCT) - peer-to-peer networking, a distributed ledger, consensus decision-making, transparency, immutability, auditability, and cryptographic security - coupled with the success enjoyed by Bitcoin and other cryptocurrencies have encouraged many to assume that the technology will revolutionise virtually all aspects of business. A growing body of scholarship suggests that BCT will disrupt the accounting and auditing fields by changing accounting practices, disintermediating auditors, and eliminating financial fraud: BCT disrupts audits (Lombardi et al., 2021), reduces the role of audit firms (Yermack, 2017), undermines accountants' roles with software developers and miners (Fortin & Pimentel, 2022), eliminates many management functions and transforms businesses (Tapscott & Tapscott, 2017), facilitates a triple-entry accounting system (Cai, 2021), and prevents fraudulent transactions (Dai et al., 2017; Rakshit et al., 2022). Despite these speculations, scholars have acknowledged that the application of BCT in the accounting and assurance industry is underexplored, and many existing studies are said to lack engagement with practitioners (Dai & Vasarhelyi, 2017; Lombardi et al., 2021; Schmitz & Leoni, 2019).
    This study empirically explored whether BCT disrupts or enhances the accounting and auditing fields. It also explored the relevance of audit in a BCT environment and the effectiveness of the BCT mechanism for fraud prevention and detection. The study further examined which technical skillsets accountants and auditors require in a BCT environment, and explored the incentives, barriers, and unintended consequences of the adoption of BCT in the accounting and auditing professions. It also investigated whether the COVID-19 pandemic has accelerated BCT adoption. A qualitative exploratory study used semi-structured interviews to engage practitioners from blockchain start-ups, IT experts, financial analysts, accountants, auditors, academics, organisational leaders, consultants, and editors who understood the technology. With the aid of NVivo qualitative analysis software, the views of 44 participants from 13 countries (New Zealand, Australia, the United States, the United Kingdom, Canada, Germany, Italy, Ireland, Hong Kong, India, Pakistan, the United Arab Emirates, and South Africa) were analysed. The Technological, Organisational, and Environmental (TOE) framework, extended with a consequences-of-innovation context, was adopted as the theoretical lens to understand the disruption of BCT and its adoption in the accounting and auditing fields.
    Four clear patterns emerged. First, BCT is an emerging tool that accountants and auditors use mainly to analyse financial records, because the technology cannot disintermediate auditors from the financial system. Second, the technology can detect anomalies but cannot prevent financial fraud. Third, BCT has not been adopted by any organisation for financial reporting and accounting purposes, and accountants and auditors do not require new skillsets or an understanding of a BCT programming language to operate in a BCT domain. Fourth, the advent of COVID-19 has not substantially enhanced the adoption of BCT. Additionally, this study highlights the incentives, barriers, and unintended consequences of adopting BCT as financial technology (FinTech).
    These findings shed light on important questions about BCT disrupting and disintermediating auditors, the extent of its adoption in the accounting industry, and its capacity to prevent fraud and anomalies, and they underscore the notion that blockchain, as an emerging technology, does not currently appear to be substantially disrupting the accounting and auditing profession. This study makes methodological, theoretical, and practical contributions. At the methodological level, the study adopted the social constructivist-interpretivist paradigm with an exploratory qualitative method to engage with and understand BCT as a disruptive innovation in the accounting industry. The engagement with practitioners from diverse fields, professions, and countries provides a distinctive and innovative contribution to methodological and practical knowledge. At the theoretical level, the findings contribute to the literature by offering an integrated conceptual TOE framework, which offers a reference for practitioners, academics, and policymakers seeking to appraise the comprehensive factors influencing BCT adoption and its likely unintended consequences. The findings suggest that, at present, no organisations are using BCT for financial reporting and accounting systems. This study contributes to practice by highlighting the differences between initial expectations and the practical applications of BCT in the accounting and auditing fields. The study could not find any empirical evidence that BCT will disrupt audits, eliminate the roles of auditors in a financial system, or prevent and detect financial fraud. Nor was there significant evidence that accountants and auditors require higher-level skillsets or an understanding of a BCT programming language to use the technology. Future research should consider the implications for internal audit functions of an external audit firm acting as a node in a BCT network. It is equally important to critically examine the relevance of including programming languages or code in the curriculum of undergraduate accounting students. Future research could also empirically evaluate whether a BCT-enabled triple-entry system could prevent financial statement and management fraud.