10,550 research outputs found

    UMSL Bulletin 2023-2024

    Get PDF
    The 2023-2024 Bulletin and Course Catalog for the University of Missouri St. Louis.https://irl.umsl.edu/bulletin/1088/thumbnail.jp

    MolFM: A Multimodal Molecular Foundation Model

    Full text link
    Molecular knowledge resides within three different modalities of information sources: molecular structures, biomedical documents, and knowledge bases. Effective incorporation of molecular knowledge from these modalities holds paramount significance in facilitating biomedical research. However, existing multimodal molecular foundation models exhibit limitations in capturing intricate connections between molecular structures and texts, and more importantly, none of them attempt to leverage a wealth of molecular expertise derived from knowledge graphs. In this study, we introduce MolFM, a multimodal molecular foundation model designed to facilitate joint representation learning from molecular structures, biomedical texts, and knowledge graphs. We propose cross-modal attention between atoms of molecular structures, neighbors of molecule entities and semantically related texts to facilitate cross-modal comprehension. We provide theoretical analysis that our cross-modal pre-training captures local and global molecular knowledge by minimizing the distance in the feature space between different modalities of the same molecule, as well as molecules sharing similar structures or functions. MolFM achieves state-of-the-art performance on various downstream tasks. On cross-modal retrieval, MolFM outperforms existing models with 12.13% and 5.04% absolute gains under the zero-shot and fine-tuning settings, respectively. Furthermore, qualitative analysis showcases MolFM's implicit ability to provide grounding from molecular substructures and knowledge graphs. Code and models are available on https://github.com/BioFM/OpenBioMed.Comment: 31 pages, 15 figures, and 15 table

    Question Decomposition Tree for Answering Complex Questions over Knowledge Bases

    Full text link
    Knowledge base question answering (KBQA) has attracted a lot of interest in recent years, especially for complex questions which require multiple facts to answer. Question decomposition is a promising way to answer complex questions. Existing decomposition methods split the question into sub-questions according to a single compositionality type, which is not sufficient for questions involving multiple compositionality types. In this paper, we propose Question Decomposition Tree (QDT) to represent the structure of complex questions. Inspired by recent advances in natural language generation (NLG), we present a two-staged method called Clue-Decipher to generate QDT. It can leverage the strong ability of NLG model and simultaneously preserve the original questions. To verify that QDT can enhance KBQA task, we design a decomposition-based KBQA system called QDTQA. Extensive experiments show that QDTQA outperforms previous state-of-the-art methods on ComplexWebQuestions dataset. Besides, our decomposition method improves an existing KBQA system by 12% and sets a new state-of-the-art on LC-QuAD 1.0.Comment: Accepted by AAAI202

    Endogenous measures for contextualising large-scale social phenomena: a corpus-based method for mediated public discourse

    Get PDF
    This work presents an interdisciplinary methodology for developing endogenous measures of group membership through analysis of pervasive linguistic patterns in public discourse. Focusing on political discourse, this work critiques the conventional approach to the study of political participation, which is premised on decontextualised, exogenous measures to characterise groups. Considering the theoretical and empirical weaknesses of decontextualised approaches to large-scale social phenomena, this work suggests that contextualisation using endogenous measures might provide a complementary perspective to mitigate such weaknesses. This work develops a sociomaterial perspective on political participation in mediated discourse as affiliatory action performed through language. While the affiliatory function of language is often performed consciously (such as statements of identity), this work is concerned with unconscious features (such as patterns in lexis and grammar). This work argues that pervasive patterns in such features that emerge through socialisation are resistant to change and manipulation, and thus might serve as endogenous measures of sociopolitical contexts, and thus of groups. In terms of method, the work takes a corpus-based approach to the analysis of data from the Twitter messaging service whereby patterns in usersā€™ speech are examined statistically in order to trace potential community membership. The method is applied in the US state of Michigan during the second half of 2018ā€”6 November having been the date of midterm (i.e. non-Presidential) elections in the United States. The corpus is assembled from the original posts of 5,889 users, who are nominally geolocalised to 417 municipalities. These users are clustered according to pervasive language features. Comparing the linguistic clusters according to the municipalities they represent finds that there are regular sociodemographic differentials across clusters. This is understood as an indication of social structure, suggesting that endogenous measures derived from pervasive patterns in language may indeed offer a complementary, contextualised perspective on large-scale social phenomena

    Genomic Analysis of Antibiotics Resistance in Pathogens

    Get PDF
    The emergence of antibiotic-resistant pathogens currently represents a serious threat to public health and the economy. Due to antibiotic treatments in humans and veterinary medicine, prophylactic use and environmental contamination, bacteria are today more frequently exposed to unnatural doses of antibiotics and their selective effect.Antibiotic resistance can be encoded on chromosomes, plasmids, or other mobile genetic elements in bacteria. It may also result from mutations that lead to changes in the affinity of antibiotics for their targets or in the ability of antibiotics to act on bacterial growth or death. Exposure of bacteria, bacterial populations, and microbial communities to antibiotics at different concentrations shapes their genomic dynamics, as does the mobilisation and spread of resistance determinants. It is, therefore, essential to understand the dynamics and mobilisation of genes encoding antibiotic resistance, in human, animal, plant, and environmental microbiomes, through genomic and metagenomic approaches and bioinformatics analyses.This Special Issue gathers research publications on the horizontal transfer of antibiotic-resistance genes, their dissemination and epidemiology, their association with bacterial virulence, between bacterial genotypes and their phenotypes, and other related research topics

    Prompting GPT-3 To Be Reliable

    Full text link
    Large language models (LLMs) show impressive abilities via few-shot prompting. Commercialized APIs such as OpenAI GPT-3 further increase their use in real-world language applications. However, the crucial problem of how to improve the reliability of GPT-3 is still under-explored. While reliability is a broad and vaguely defined term, we decompose reliability into four main facets that correspond to the existing framework of ML safety and are well-recognized to be important: generalizability, social biases, calibration, and factuality. Our core contribution is to establish simple and effective prompts that improve GPT-3's reliability as it: 1) generalizes out-of-distribution, 2) balances demographic distribution and uses natural language instructions to reduce social biases, 3) calibrates output probabilities, and 4) updates the LLM's factual knowledge and reasoning chains. With appropriate prompts, GPT-3 is more reliable than smaller-scale supervised models on all these facets. We release all processed datasets, evaluation scripts, and model predictions. Our systematic empirical study not only sheds new insights on the reliability of prompting LLMs, but more importantly, our prompting strategies can help practitioners more reliably use LLMs like GPT-3.Comment: ICLR 202

    Blockchain Technology: Disruptor or Enhnancer to the Accounting and Auditing Profession

    Get PDF
    The unique features of blockchain technology (BCT) - peer-to-peer network, distribution ledger, consensus decision-making, transparency, immutability, auditability, and cryptographic security - coupled with the success enjoyed by Bitcoin and other cryptocurrencies have encouraged many to assume that the technology would revolutionise virtually all aspects of business. A growing body of scholarship suggests that BCT would disrupt the accounting and auditing fields by changing accounting practices, disintermediating auditors, and eliminating financial fraud. BCT disrupts audits (Lombard et al.,2021), reduces the role of audit firms (Yermack 2017), undermines accountants' roles with software developers and miners (Fortin & Pimentel 2022); eliminates many management functions, transforms businesses (Tapscott & Tapscott, 2017), facilitates a triple-entry accounting system (Cai, 2021), and prevents fraudulent transactions (Dai, et al., 2017; Rakshit et al., 2022). Despite these speculations, scholars have acknowledged that the application of BCT in the accounting and assurance industry is underexplored and many existing studies are said to lack engagement with practitioners (Dai & Vasarhelyi, 2017; Lombardi et al., 2021; Schmitz & Leoni, 2019). This study empirically explored whether BCT disrupts or enhances accounting and auditing fields. It also explored the relevance of audit in a BCT environment and the effectiveness of the BCT mechanism for fraud prevention and detection. The study further examined which technical skillsets accountants and auditors require in a BCT environment, and explored the incentives, barriers, and unintended consequences of the adoption of BCT in the accounting and auditing professions. The current COVID-19 environment was also investigated in terms of whether the pandemic has improved BCT adoption or not. A qualitative exploratory study used semi-structured interviews to engage practitioners from blockchain start-ups, IT experts, financial analysts, accountants, auditors, academics, organisational leaders, consultants, and editors who understood the technology. With the aid of NVIVO qualitative analysis software, the views of 44 participants from 13 countries: New Zealand, Australia, United States, United Kingdom, Canada, Germany, Italy, Ireland, Hong Kong, India, Pakistan, United Arab Emirates, and South Africa were analysed. The Technological, Organisational, and Environmental (TOE) framework with consequences of innovation context was adopted for this study. This expanded TOE framework was used as the theoretical lens to understand the disruption of BCT and its adoption in the accounting and auditing fields. Four clear patterns emerged. First, BCT is an emerging tool that accountants and auditors use mainly to analyse financial records because technology cannot disintermediate auditors from the financial system. Second, the technology can detect anomalies but cannot prevent financial fraud. Third, BCT has not been adopted by any organisation for financial reporting and accounting purposes, and accountants and auditors do not require new skillsets or an understanding of the BCT programming language to be able to operate in a BCT domain. Fourth, the advent of COVID-19 has not substantially enhanced the adoption of BCT. Additionally, this study highlights the incentives, barriers, and unintended consequences of adopting BCT as financial technology (FinTech). These findings shed light on important questions about BCT disrupting and disintermediating auditors, the extent of adoption in the accounting industry, preventing fraud and anomalies, and underscores the notion that blockchain, as an emerging technology, currently does not appear to be substantially disrupting the accounting and auditing profession. This study makes methodological, theoretical, and practical contributions. At the methodological level, the study adopted the social constructivist-interpretivism paradigm with an exploratory qualitative method to engage and understand BCT as a disruptive innovation in the accounting industry. The engagement with practitioners from diverse fields, professions, and different countries provides a distinctive and innovative contribution to methodological and practical knowledge. At the theoretical level, the findings contribute to the literature by offering an integrated conceptual TOE framework. The framework offers a reference for practitioners, academics and policymakers seeking to appraise comprehensive factors influencing BCT adoption and its likely unintended consequences. The findings suggest that, at present, no organisations are using BCT for financial reporting and accounting systems. This study contributes to practice by highlighting the differences between initial expectations and practical applications of what BCT can do in the accounting and auditing fields. The study could not find any empirical evidence that BCT will disrupt audits, eliminate the roles of auditors in a financial system, and prevent and detect financial fraud. Also, there was no significant evidence that accountants and auditors required higher-level skillsets and an understanding of BCT programming language to be able to use the technology. Future research should consider the implications of an external audit firm as a node in a BCT network on the internal audit functions. It is equally important to critically examine the relevance of including programming languages or codes in the curriculum of undergraduate accounting students. Future research could also empirically evaluate if a BCT-enabled triple-entry system could prevent financial statements and management fraud

    Improving Dual-Encoder Training through Dynamic Indexes for Negative Mining

    Full text link
    Dual encoder models are ubiquitous in modern classification and retrieval. Crucial for training such dual encoders is an accurate estimation of gradients from the partition function of the softmax over the large output space; this requires finding negative targets that contribute most significantly ("hard negatives"). Since dual encoder model parameters change during training, the use of traditional static nearest neighbor indexes can be sub-optimal. These static indexes (1) periodically require expensive re-building of the index, which in turn requires (2) expensive re-encoding of all targets using updated model parameters. This paper addresses both of these challenges. First, we introduce an algorithm that uses a tree structure to approximate the softmax with provable bounds and that dynamically maintains the tree. Second, we approximate the effect of a gradient update on target encodings with an efficient Nystrom low-rank approximation. In our empirical study on datasets with over twenty million targets, our approach cuts error by half in relation to oracle brute-force negative mining. Furthermore, our method surpasses prior state-of-the-art while using 150x less accelerator memory.Comment: To appear at AISTATS 202

    ON EXPRESSIVENESS, INFERENCE, AND PARAMETER ESTIMATION OF DISCRETE SEQUENCE MODELS

    Get PDF
    Huge neural autoregressive sequence models have achieved impressive performance across different applications, such as NLP, reinforcement learning, and bioinformatics. However, some lingering problems (e.g., consistency and coherency of generated texts) continue to exist, regardless of the parameter count. In the first part of this thesis, we chart a taxonomy of the expressiveness of various sequence model families (Ch 3). In particular, we put forth complexity-theoretic proofs that string latent-variable sequence models are strictly more expressive than energy-based sequence models, which in turn are more expressive than autoregressive sequence models. Based on these findings, we introduce residual energy-based sequence models, a family of energy-based sequence models (Ch 4) whose sequence weights can be evaluated efficiently, and also perform competitively against autoregressive models. However, we show how unrestricted energy-based sequence models can suffer from uncomputability; and how such a problem is generally unfixable without knowledge of the true sequence distribution (Ch 5). In the second part of the thesis, we study practical sequence model families and algorithms based on theoretical findings in the first part of the thesis. We introduce neural particle smoothing (Ch 6), a family of approximate sampling methods that work with conditional latent variable models. We also introduce neural finite-state transducers (Ch 7), which extend weighted finite state transducers with the introduction of mark strings, allowing scoring transduction paths in a finite state transducer with a neural network. Finally, we propose neural regular expressions (Ch 8), a family of neural sequence models that are easy to engineer, allowing a user to design flexible weighted relations using Marked FSTs, and combine these weighted relations together with various operations
    • ā€¦
    corecore