11,956 research outputs found

    Converging organoids and extracellular matrix: New insights into liver cancer biology

    The value and structuring role of web APIs in digital innovation ecosystems: the case of the online travel ecosystem

    Interfaces play a key role in facilitating the integration of external sources of innovation and in structuring ecosystems. They have been conceptualized as design rules that ensure the interoperability of independently produced modules, with important strategic value for lead firms seeking to attract and control access to complementary assets in platform ecosystems. While meaningful, these theorizations do not fully capture the value and structuring role of web APIs in digital innovation ecosystems. We show this with an empirical study of the online travel ecosystem over the 26 years (1995–2021) following the launch of the first Online Travel Agencies (OTAs). Our findings reveal that web APIs foster a dynamic digital innovation ecosystem with a distributed networked structure in which multiple actors design and use them. We provide evidence of an ecosystem where decentralized interfaces enable decentralized governance and where interfaces establish not only cooperative relationships but also competitive ones. Instead of locking in complementors, web APIs enable the integration of capabilities from multiple organizations for the co-production of services and products by interfacing their information systems. Web APIs are thus important sources of value creation and capture: they are increasingly used to offer or sell services and constitute a significant source of revenue.
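
    To make the structuring role of web APIs concrete, the following is a minimal, hypothetical sketch of how one travel firm's system might consume another provider's web API to co-produce an offer. The endpoint URL, parameters, response fields, and credential are all invented for illustration; no real OTA API is being described.

```python
# Hypothetical sketch of cross-organization service co-production via a web
# API. The endpoint, fields, and auth scheme are invented for illustration.
import requests

def search_flights(origin: str, destination: str, date: str) -> list:
    """Query a (hypothetical) flight-availability API and return offers."""
    response = requests.get(
        "https://api.example-travel.com/v1/flights",    # illustrative URL only
        params={"origin": origin, "destination": destination, "date": date},
        headers={"Authorization": "Bearer <api-key>"},  # placeholder credential
        timeout=10,
    )
    response.raise_for_status()
    return response.json().get("offers", [])

if __name__ == "__main__":
    for offer in search_flights("LHR", "JFK", "2021-06-01"):
        print(offer)
```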

    A cost focused framework for optimizing collection and annotation of ultrasound datasets

    Machine learning for medical ultrasound imaging faces a major challenge: the prohibitive cost of producing and annotating clinical data. The trade-off between cost and sample size is well understood in the context of clinical trials, and the same methods can be applied to optimize the data collection and annotation process, ultimately reducing the cost and duration of machine learning feasibility studies. This paper presents a two-phase framework for quantifying the cost of data collection, using iterative accuracy/sample-size predictions and active learning to guide and optimize full human annotation of medical ultrasound images for machine learning purposes. The paper demonstrates potential cost reductions on public breast, fetal, and lung ultrasound datasets and in a practical case study on breast ultrasound. The results show that, just as with clinical trials, the relationship between dataset size and final accuracy can be predicted, with the majority of accuracy improvements occurring with only 40-50% of the data, depending on the tolerance measure. Manual annotation can be reduced further using active learning, yielding a representative cost reduction of 66% with a tolerance of around a 4% accuracy drop from the theoretical maximum. The significance of this work lies in its ability to quantify how much additional data and annotation will be required to achieve a specific research objective. These methods are already well understood by clinical funders and so provide a valuable and effective framework for feasibility and pilot studies in which machine learning will be applied within a fixed budget to maximize predictive gains, informing resourcing and further clinical study.
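
    As an illustration of the predictive idea (a sketch under stated assumptions, not the paper's exact procedure), a saturating power law can be fit to pilot accuracy measurements and inverted to estimate the sample size needed for a target accuracy; all numbers below are made up.

```python
# Illustrative sketch: fit a saturating power-law learning curve
# acc(n) = a - b * n**(-c) to pilot results, then invert it to predict the
# dataset size needed for a target accuracy. All measurements are invented.
import numpy as np
from scipy.optimize import curve_fit

def learning_curve(n, a, b, c):
    # Common parametric form for accuracy as a function of sample size.
    return a - b * np.power(n, -c)

# Pilot measurements: (dataset size, validation accuracy).
sizes = np.array([50, 100, 200, 400, 800])
accs = np.array([0.62, 0.70, 0.76, 0.80, 0.82])

(a, b, c), _ = curve_fit(learning_curve, sizes, accs, p0=[0.9, 1.0, 0.5], maxfev=10000)

# Tolerate a ~4% accuracy drop from the predicted ceiling, as in the abstract.
target = a - 0.04
needed = (b / (a - target)) ** (1.0 / c)  # solve a - b*n**(-c) = target for n
print(f"predicted ceiling {a:.3f}; ~{needed:.0f} samples reach {target:.3f}")
```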

    HOFA: Twitter Bot Detection with Homophily-Oriented Augmentation and Frequency Adaptive Attention

    Twitter bot detection has become an increasingly important and challenging task for combating online misinformation, facilitating social content moderation, and safeguarding the integrity of social platforms. Although existing graph-based Twitter bot detection methods achieve state-of-the-art performance, they all rest on the homophily assumption, i.e. that users with the same label are more likely to be connected, which makes it easy for Twitter bots to disguise themselves by following a large number of genuine users. To address this issue, we propose HOFA, a novel graph-based Twitter bot detection framework that combats the heterophilous disguise challenge with a homophily-oriented graph augmentation module (Homo-Aug) and a frequency adaptive attention module (FaAt). Specifically, Homo-Aug extracts user representations with an MLP, computes a k-NN graph from them, and improves the homophily of the Twitter graph by injecting the k-NN edges. For FaAt, we propose an attention mechanism that adaptively acts as a low-pass filter along homophilic edges and a high-pass filter along heterophilic edges, preventing user features from being over-smoothed by their neighborhood. We also introduce a weight guidance loss to guide the frequency adaptive attention module. Our experiments demonstrate that HOFA achieves state-of-the-art performance on three widely acknowledged Twitter bot detection benchmarks, significantly outperforming vanilla graph-based bot detection techniques and strong heterophilic baselines. Furthermore, extensive studies confirm the effectiveness of the Homo-Aug and FaAt modules and HOFA's ability to demystify the heterophilous disguise challenge.
    Comment: 11 pages, 7 figures
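
    The augmentation step can be sketched in a few lines (a minimal reconstruction of the idea, with random stand-ins for the learned user embeddings and the follow graph): build a k-NN graph in embedding space and inject its edges into the original graph.

```python
# Sketch of homophily-oriented augmentation (Homo-Aug): connect each user to
# its nearest neighbours in embedding space and merge those edges into the
# follow graph. Embeddings and the follow graph are random stand-ins here.
import numpy as np
import scipy.sparse as sp
from sklearn.neighbors import kneighbors_graph

rng = np.random.default_rng(0)
n_users = 1000
user_embed = rng.normal(size=(n_users, 32))  # stand-in for MLP user embeddings

# k-NN edges in embedding space tend to link similar users, raising homophily
# even when bots follow many genuine accounts to disguise themselves.
knn_adj = kneighbors_graph(user_embed, n_neighbors=5, mode="connectivity")

# Stand-in follow graph: a sparse random adjacency matrix.
follow_adj = (sp.random(n_users, n_users, density=0.005,
                        random_state=0, format="csr") > 0).astype(int)

# Inject the k-NN edges: the augmented graph is the union of both edge sets.
augmented_adj = ((follow_adj + knn_adj) > 0).astype(int)
print("edges before:", follow_adj.nnz, "after:", augmented_adj.nnz)
```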

    Machine learning in solar physics

    The application of machine learning in solar physics has the potential to greatly enhance our understanding of the complex processes that take place in the atmosphere of the Sun. Using techniques such as deep learning, we are now in a position to analyze large amounts of data from solar observations and identify patterns and trends that may not have been apparent with traditional methods. This can improve our understanding of explosive events like solar flares, which can have a strong effect on Earth's environment; predicting such hazardous events is crucial for our technological society. Machine learning can also improve our understanding of the inner workings of the Sun itself by allowing us to go deeper into the data and to propose more complex models to explain them. Additionally, machine learning can help automate the analysis of solar data, reducing the need for manual labor and increasing the efficiency of research in this field.
    Comment: 100 pages, 13 figures, 286 references; accepted for publication as a Living Review in Solar Physics (LRSP)

    SWARM Parallelism: Training Large Models Can Be Surprisingly Communication-Efficient

    Many deep learning applications benefit from using large models with billions of parameters. Training these models is notoriously expensive due to the need for specialized HPC clusters. In this work, we consider alternative setups for training large models: using cheap "preemptible" instances or pooling existing resources from multiple regions. We analyze the performance of existing model-parallel algorithms in these conditions and find configurations in which training larger models becomes less communication-intensive. Based on these findings, we propose SWARM parallelism, a model-parallel training algorithm designed for poorly connected, heterogeneous, and unreliable devices. SWARM creates temporary randomized pipelines between nodes that are rebalanced in case of failure. We empirically validate our findings and compare SWARM parallelism with existing large-scale training approaches. Finally, we combine our insights with compression strategies to train a large Transformer language model with 1B shared parameters (approximately 13B before sharing) on preemptible T4 GPUs with less than 200 Mb/s of network bandwidth.
    Comment: Accepted to the International Conference on Machine Learning (ICML) 2023. 25 pages, 8 figures
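
    The routing idea can be illustrated with a toy simulation (a sketch, not the authors' implementation): each pipeline stage is served by a pool of peers, every microbatch takes a freshly randomized route through live peers, and a failed peer is simply excluded from future routes.

```python
# Toy illustration of SWARM-style randomized pipelines over unreliable nodes.
# Peers, stages, and the failure model are all invented for this sketch.
import random

random.seed(0)
stages = [[f"stage{i}-peer{j}" for j in range(3)] for i in range(4)]
alive = {peer: True for pool in stages for peer in pool}

def route_microbatch():
    """Pick one live peer per stage, forming a temporary randomized pipeline."""
    route = []
    for pool in stages:
        candidates = [p for p in pool if alive[p]]
        if not candidates:
            raise RuntimeError("an entire stage failed; training must stall")
        route.append(random.choice(candidates))
    return route

print("route 1:", route_microbatch())
alive["stage1-peer0"] = False           # a preemptible instance disappears
print("route 2:", route_microbatch())   # later routes avoid the dead peer
```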

    An investigation of entorhinal spatial representations in self-localisation behaviours

    Spatially modulated cells of the medial entorhinal cortex (MEC) and neighbouring cortices are thought to provide the neural substrate for self-localisation behaviours. These cells include the grid cells of the MEC, which are thought to compute path integration operations to update self-location estimates. To read this grid code, downstream cells are thought to reconstruct a positional estimate as a simple rate-coded representation of space. Here, I characterise the coding schemes of grid cells and putative readout cells recorded from mice performing a virtual reality (VR) linear location task that engaged the mice in both beaconing and path integration behaviours. I found that grid cells can encode two distinct coding schemes on the linear track: a position code, which reflects periodic grid fields anchored to salient features of the track, and a distance code, which reflects periodic grid fields without this anchoring. Grid cells were found to switch between these coding schemes within sessions. When grid cells were encoding position, mice performed better on trials that required path integration but not on trials that required beaconing. This result provides the first mechanistic evidence linking grid cell activity to path-integration-dependent behaviour. Putative readout cells were found in the form of ramp cells, which fire proportionally as a function of location in defined regions of the linear track. This ramping activity was primarily explained by track position rather than by other kinematic variables such as speed and acceleration. These representations were maintained across both trial types and outcomes, indicating that they likely result from recall of the track structure. Together, these results support the functional importance of grid and ramp cells for self-localisation behaviours. Future investigations will look into the coherence between these two neural populations, which may together form a complete neural system for coding and decoding self-location in the brain.
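
    The two coding schemes can be illustrated with a toy rate model (a sketch, not the thesis code): a position code re-anchors periodic fields to the track on every trial, while a distance code stays periodic in cumulative distance, so its fields drift relative to the track across trials.

```python
# Toy model of the two grid coding schemes on a linear track. Track length,
# grid spacing, and the cosine rate profile are illustrative choices.
import numpy as np

track_len, n_trials, spacing = 200.0, 3, 60.0   # cm, trials, grid spacing (cm)

pos = np.tile(np.linspace(0.0, track_len, 400), n_trials)      # position per trial
dist = np.linspace(0.0, track_len * n_trials, 400 * n_trials)  # cumulative distance

position_code = np.cos(2 * np.pi * pos / spacing)    # fields anchored to the track
distance_code = np.cos(2 * np.pi * dist / spacing)   # fields drift across trials

# By trial 2 the two schemes already disagree about where the fields fall.
trial2 = slice(400, 800)
print(np.corrcoef(position_code[trial2], distance_code[trial2])[0, 1])
```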

    Towards A Practical High-Assurance Systems Programming Language

    Writing correct and performant low-level systems code is a notoriously demanding job, even for experienced developers. To make matters worse, formally reasoning about its correctness properties introduces yet another level of complexity. Doing so requires considerable expertise in both systems programming and formal verification, and development can be extremely costly given the sheer complexity of such systems and their nuances, unless it is assisted by appropriate tools that provide abstraction and automation. Cogent is designed to alleviate the burden on developers when writing and verifying systems code. It is a high-level functional language with a certifying compiler, which automatically proves the correctness of the compiled code and also provides a purely functional abstraction of the low-level program to the developer. Equational reasoning techniques can then be used to prove functional correctness properties of the program on top of this abstract semantics, which is notably less laborious than directly verifying the C code. To make Cogent a more approachable and effective tool for developing real-world systems, we further strengthen the framework by extending the core language and its ecosystem. Specifically, we enrich the language to allow users to control the memory representation of algebraic data types, while retaining the automatic proof via a data layout refinement calculus. We repurpose existing tools in a novel way and develop an intuitive foreign function interface, which provides users with a seamless experience when using Cogent in conjunction with native C. We augment the Cogent ecosystem with a property-based testing framework, which helps developers better understand the impact formal verification has on their programs and enables a progressive approach to producing high-assurance systems. Finally, we explore refinement type systems, which we plan to incorporate into Cogent for more expressiveness and better integration of systems programmers with the verification process.
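
    The property-based testing idea can be conveyed with a small example transposed to Python's hypothesis library (an analogy, not part of the Cogent ecosystem): a low-level implementation is checked against an executable functional specification on randomly generated inputs, mirroring how Cogent relates C code to its purely functional abstraction.

```python
# Sketch of property-based refinement testing: random inputs must show the
# implementation agreeing with its functional specification. Example is
# invented for illustration and is not Cogent code.
from hypothesis import given, strategies as st

def spec_word_count(text: str) -> int:
    """Executable functional specification."""
    return len(text.split())

def impl_word_count(text: str) -> int:
    """Hand-rolled, systems-style implementation we want to validate."""
    count, in_word = 0, False
    for ch in text:
        if ch.isspace():
            in_word = False
        elif not in_word:
            in_word = True
            count += 1
    return count

@given(st.text())
def test_impl_refines_spec(text):
    # Property: the implementation agrees with the specification everywhere.
    assert impl_word_count(text) == spec_word_count(text)

if __name__ == "__main__":
    test_impl_refines_spec()  # hypothesis runs many random cases
```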

    Latent-OFER: Detect, Mask, and Reconstruct with Latent Vectors for Occluded Facial Expression Recognition

    Most research on facial expression recognition (FER) is conducted in highly controlled environments, but performance is often unacceptable in real-world situations: when unexpected objects occlude the face, the FER network struggles to extract facial features and to predict facial expressions accurately. Occluded FER (OFER) is therefore a challenging problem. Previous studies on occlusion-aware FER have typically required fully annotated facial images for training, but collecting facial images with various occlusions and expression annotations is time-consuming and expensive. Latent-OFER, the proposed method, can detect occlusions, restore the occluded parts of the face as if they were unoccluded, and recognize the expression, improving FER accuracy. The approach involves three steps. First, a vision transformer (ViT)-based occlusion patch detector masks the occluded positions, trained only on latent vectors from unoccluded patches using the support vector data description (SVDD) algorithm. Second, a hybrid reconstruction network inpaints the masked positions to produce a complete image, using the ViT and a convolutional neural network (CNN). Last, an expression-relevant latent vector extractor retrieves and uses expression-related information from all latent vectors by applying a CNN-based class activation map. This mechanism has a significant advantage in preventing performance degradation caused by occlusion from unseen objects. Experimental results on several databases demonstrate the superiority of the proposed method over state-of-the-art methods.
    Comment: 11 pages, 8 figures
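
    The first step can be sketched with a generic one-class detector standing in for support vector data description (SVDD); the latent vectors below are random, whereas in the paper they come from a ViT encoder over image patches.

```python
# Sketch of the occlusion-patch-detection step: fit a one-class model on
# latent vectors of unoccluded patches, then flag outliers as occluded.
# OneClassSVM is used as a stand-in for SVDD; all data here is synthetic.
import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)
clean_latents = rng.normal(0.0, 1.0, size=(500, 64))   # unoccluded patches

test_latents = np.vstack([
    rng.normal(0.0, 1.0, size=(10, 64)),               # unoccluded patches
    rng.normal(4.0, 1.0, size=(10, 64)),               # "occluded" outliers
])

detector = OneClassSVM(kernel="rbf", nu=0.05).fit(clean_latents)
is_occluded = detector.predict(test_latents) == -1     # -1 marks outliers
print(is_occluded.astype(int))                         # expect ten 0s, ten 1s
```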

    The Globalization of Artificial Intelligence: African Imaginaries of Technoscientific Futures

    Imaginaries of artificial intelligence (AI) have transcended the geographies of the Global North and become increasingly entangled with narratives of economic growth, progress, and modernity in Africa. This raises several issues, such as the entanglement of AI with global technoscientific capitalism and its impact on the dissemination of AI in Africa. The lack of African perspectives on the development of AI exacerbates concerns about raciality and inclusion in the scientific research, circulation, and adoption of AI. My argument in this dissertation is that innovation in AI, in both its sociotechnical imaginaries and its political economies, excludes marginalized countries, nations, and communities in ways that bar their participation not only in the reception of AI but also in its creation. Underpinned by decolonial thinking and by perspectives from science and technology studies and African studies, this dissertation looks at how AI is reconfiguring the debate about development and modernization in Africa and the implications for local sociotechnical practices of AI innovation and governance. I examined AI in international development and industry across Kenya, Ghana, and Nigeria by tracing Canada's AI4D Africa program and following AI start-ups at AfriLabs. I used multi-sited case studies and discourse analysis to examine the data collected from interviews, participant observations, and documents. In the empirical chapters, I first examine how local actors understand the notion of decolonizing AI and show that it has become a sociotechnical imaginary. I then investigate the political economy of AI in Africa and argue that, despite Western efforts to integrate the African AI ecosystem globally, the continent's AI epistemic communities continue to be excluded from dominant AI innovation spaces. Finally, I examine the emergence of a Pan-African AI imaginary and argue that AI governance can be understood as a state-building experiment in post-colonial Africa. The main issue at stake is that the lack of African perspectives in AI leads to negative impacts on innovation and limits the fair distribution of AI's benefits across nations, countries, and communities, while at the same time excluding globally marginalized epistemic communities from the imagination and creation of AI.