Streaming Matching and Edge Cover in Practice
Graph algorithms with polynomial space and time requirements often become infeasible for massive graphs with billions of edges or more. State-of-the-art approaches therefore employ approximate serial, parallel, and distributed algorithms to tackle these challenges. However, such approaches require storing the entire graph in memory and thus need access to costly computing resources such as clusters and supercomputers. In this paper, we present practical streaming approaches for solving massive graph problems using limited memory for two prototypical graph problems: maximum weighted matching and minimum weighted edge cover. For matching, we conduct a thorough computational study on two of the semi-streaming algorithms including a recent breakthrough result that achieves a 1/(2+ε)-approximation of the weight while using O(n log W /ε) memory (here n is the number of vertices and W is the maximum edge weight), designed by Paz and Schwartzman [SODA, 2017]. Empirically, we show that the semi-streaming algorithms produce matchings whose weight is close to the best 1/2-approximate offline algorithm while requiring less time and an order-of-magnitude less memory.
For minimum weighted edge cover, we develop three novel semi-streaming algorithms. Two of these algorithms require a single pass through the input graph, require O(n log n) memory, and provide a 2-approximation guarantee on the objective. We also leverage a relationship between approximate maximum weighted matching and approximate minimum weighted edge cover to develop a two-pass 3/2+ε-approximate algorithm with the memory requirement of Paz and Schwartzman’s semi-streaming matching algorithm. These streaming approaches are compared against the state-of-the-art 3/2-approximate offline algorithm.
The semi-streaming matching and the novel edge cover algorithms proposed in this paper can process graphs with several billions of edges in under 30 minutes using 6 GB of memory, which is at least an order-of-magnitude improvement over the offline (non-streaming) algorithms. For the largest graph, the best alternative offline parallel approximation algorithm (GPA+ROMA) could not finish in three hours even while employing hundreds of processors and 1 TB of memory. We also demonstrate an application of the semi-streaming algorithms by computing a matching using linearly bounded memory on intersection graphs derived from three machine learning datasets, whereas the existing offline algorithms could not complete on one of these datasets because their memory requirement exceeded 1 TB.
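To make the single-pass idea concrete, the following is a minimal Python sketch of a local-ratio style semi-streaming matching routine in the spirit of the Paz and Schwartzman approach. It is a simplified illustration, not the implementation evaluated in the paper: the function name `streaming_matching`, the `(u, v, w)` edge format, and the exact form of the `(1 + eps)` threshold test are assumptions made here for illustration.

```python
from collections import defaultdict


def streaming_matching(edge_stream, eps=0.1):
    """Simplified local-ratio sketch (assumed variant, not the paper's code).

    edge_stream yields (u, v, w) tuples once, in arbitrary order.
    Only per-vertex potentials and a stack of retained edges are stored.
    """
    phi = defaultdict(float)   # vertex potentials, all start at 0
    stack = []                 # edges that passed the threshold test

    for u, v, w in edge_stream:
        if u == v:
            continue           # ignore self-loops
        slack = phi[u] + phi[v]
        # Discard edges that do not beat the current potentials by (1 + eps).
        if w <= (1 + eps) * slack:
            continue
        gain = w - slack
        phi[u] += gain
        phi[v] += gain
        stack.append((u, v, w))

    # Unwind the stack: the most recently retained edges get priority.
    matched = set()
    matching = []
    while stack:
        u, v, w = stack.pop()
        if u not in matched and v not in matched:
            matched.update((u, v))
            matching.append((u, v, w))
    return matching


if __name__ == "__main__":
    path = [("a", "b", 1.0), ("b", "c", 10.0), ("c", "d", 1.0)]
    print(streaming_matching(path))
```

On the small path in the demo, the light edge (c, d) is discarded during the pass and the heavy edge (b, c) wins during the unwinding step. The stack is the only edge storage, which is what keeps memory far below the size of the graph.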
Approximate Bipartite b-Matching using Multiplicative Auction
Given a bipartite graph G = (V, E) with n vertices and m edges and a function b: V → ℤ₊, a b-matching is a subset of edges such that every vertex v is incident to at most b(v) edges in the subset. When we are also given edge weights, the Max Weight b-Matching problem is to find a b-matching of maximum weight, which is a fundamental combinatorial optimization problem with many applications. Extending the recent work of Zheng and Henzinger (IPCO, 2023) on standard bipartite matching problems, we develop a simple auction algorithm to approximately solve Max Weight b-Matching. Specifically, we present a multiplicative auction algorithm that gives a (1 − ε)-approximation in O(m ε^{-1} log ε^{-1} log β) worst-case time, where β is the maximum b-value. Although this is a log β factor greater than the current best approximation algorithm by Huang and Pettie (Algorithmica, 2022), it is considerably simpler to present, analyze, and implement.
Comment: 14 pages; Accepted as a refereed paper in the 2024 INFORMS Optimization Society conference.
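The degree constraint in the definition above can be illustrated with a short Python sketch. It checks whether an edge set satisfies the per-vertex capacity bound and builds a feasible solution with a plain heaviest-edge-first greedy pass. This greedy routine is a standard baseline swapped in for illustration; it is not the multiplicative auction algorithm of the paper. The names `is_b_matching` and `greedy_b_matching`, the `(u, v, w)` edge format, and the capacity dictionary `b` are assumptions.

```python
from collections import Counter


def is_b_matching(edges, b):
    """Return True if every vertex is incident to at most b[vertex] edges."""
    load = Counter()
    for u, v, _w in edges:
        load[u] += 1
        load[v] += 1
    return all(load[x] <= b.get(x, 0) for x in load)


def greedy_b_matching(edges, b):
    """Greedy baseline (not the auction algorithm): scan edges by
    decreasing weight and keep an edge if both endpoints have capacity."""
    remaining = dict(b)
    chosen = []
    for u, v, w in sorted(edges, key=lambda e: e[2], reverse=True):
        if remaining.get(u, 0) > 0 and remaining.get(v, 0) > 0:
            remaining[u] -= 1
            remaining[v] -= 1
            chosen.append((u, v, w))
    return chosen


if __name__ == "__main__":
    edges = [("a1", "b1", 5), ("a1", "b2", 4), ("a2", "b1", 3), ("a2", "b2", 1)]
    b = {"a1": 2, "a2": 1, "b1": 1, "b2": 1}
    solution = greedy_b_matching(edges, b)
    print(solution, is_b_matching(solution, b))
```

With every capacity set to 1, the same check reduces to testing for an ordinary matching.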
Semi-Streaming Algorithms for Weighted k-Disjoint Matchings
We design and implement two single-pass semi-streaming algorithms for the maximum weight k-disjoint matching (k-DM) problem. Given an integer k, the k-DM problem is to find k pairwise edge-disjoint matchings such that the sum of the weights of the matchings is maximized. For k ≥ 2, this problem is NP-hard. Our first algorithm is based on the primal-dual framework of a linear programming relaxation of the problem and is 1/(3+ε)-approximate. We also develop an approximation preserving reduction from k-DM to the maximum weight b-matching problem. Leveraging this reduction and an existing semi-streaming b-matching algorithm, we design a (1/(2+ε))(1 - 1/(k+1))-approximate semi-streaming algorithm for k-DM. For any constant ε > 0, both of these algorithms require O(nk log_{1+ε}² n) bits of space. To the best of our knowledge, this is the first study of semi-streaming algorithms for the k-DM problem.
We compare our two algorithms to state-of-the-art offline algorithms on 95 real-world and synthetic test problems, including thirteen graphs generated from data center network traces. On these instances, our streaming algorithms used significantly less memory (ranging from 6× to 512× less) and were faster in runtime than the offline algorithms. Our solutions were often within 5% of the best weights from the offline algorithms. We highlight that the existing offline algorithms exhaust 1 TB of memory on most of the large instances (> 1 billion edges), whereas our streaming algorithms can solve these problems using only 100 GB of memory for k = 8.
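As a small illustration of the k-DM problem statement, the Python sketch below checks whether a candidate solution consists of at most k matchings that are pairwise edge-disjoint, and computes the objective value. It only verifies feasibility and is not either of the paper's algorithms. The function names, the `(u, v, w)` edge format, and the assumption of a simple undirected graph with comparable vertex labels are introduced here for illustration.

```python
def edge_key(u, v):
    """Canonical key for an undirected edge (assumes comparable labels)."""
    return (u, v) if u <= v else (v, u)


def is_matching(edges):
    """True if no two edges in the list share an endpoint."""
    seen = set()
    for u, v, _w in edges:
        if u == v or u in seen or v in seen:
            return False
        seen.update((u, v))
    return True


def is_k_disjoint_matching(matchings, k):
    """True if there are at most k matchings and no edge is reused."""
    if len(matchings) > k:
        return False
    used = set()
    for m in matchings:
        if not is_matching(m):
            return False
        for u, v, _w in m:
            key = edge_key(u, v)
            if key in used:
                return False
            used.add(key)
    return True


def total_weight(matchings):
    """Objective value: sum of the weights of all selected edges."""
    return sum(w for m in matchings for _u, _v, w in m)


if __name__ == "__main__":
    m1 = [(1, 2, 5), (3, 4, 2)]
    m2 = [(2, 3, 4), (1, 4, 1)]
    print(is_k_disjoint_matching([m1, m2], k=2), total_weight([m1, m2]))
```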
AGS-GNN: Attribute-guided Sampling for Graph Neural Networks
We propose AGS-GNN, a novel attribute-guided sampling algorithm for Graph
Neural Networks (GNNs) that exploits node features and connectivity structure
of a graph while simultaneously adapting for both homophily and heterophily in
graphs. (In homophilic graphs vertices of the same class are more likely to be
connected, and vertices of different classes tend to be linked in heterophilic
graphs.) While GNNs have been successfully applied to homophilic graphs, their
application to heterophilic graphs remains challenging. The best-performing
GNNs for heterophilic graphs do not fit the sampling paradigm, suffer high
computational costs, and are not inductive. We employ samplers based on
feature-similarity and feature-diversity to select subsets of neighbors for a
node, and adaptively capture information from homophilic and heterophilic
neighborhoods using dual channels. Currently, AGS-GNN is the only algorithm
that we know of that explicitly controls homophily in the sampled subgraph
through similar and diverse neighborhood samples. For diverse neighborhood
sampling, we employ submodularity, which was not used in this context prior to
our work. The sampling distribution is pre-computed and highly parallel,
achieving the desired scalability. Using an extensive dataset consisting of 35
small (≤100K nodes) and large (>100K nodes) homophilic and heterophilic
graphs, we demonstrate the superiority of AGS-GNN compared to the current
approaches in the literature. AGS-GNN achieves comparable test accuracy to the
best-performing heterophilic GNNs, even outperforming methods using the entire
graph for node classification. AGS-GNN also converges faster compared to
methods that sample neighborhoods randomly, and can be incorporated into
existing GNN models that employ node or graph sampling.
Comment: The paper has been accepted to KDD'24 in the research track.
Transcriptomic Analysis Reveals Novel Mechanistic Insight into Murine Biological Responses to Multi-Walled Carbon Nanotubes in Lungs and Cultured Lung Epithelial Cells
There is great interest in substituting animal work with in vitro experimentation in human health risk assessment; however, there are only a few comparisons of in vitro and in vivo biological responses to engineered nanomaterials. We used high-content genomics tools to compare in vivo pulmonary responses to multiwalled carbon nanotubes (MWCNT) with those in vitro in cultured lung epithelial cells (FE1) at the global transcriptomic level. Primary size, surface area and other properties of MWCNT-XNRI-7 (Mitsui7) were characterized using DLS, SEM and TEM. Mice were exposed via a single intratracheal instillation to 18, 54, or 162 μg of Mitsui7/mouse. FE1 cells were incubated with 12.5, 25 and 100 μg/ml of Mitsui7. Tissue and cell samples were collected at 24 hours post-exposure. DNA microarrays were employed to establish mechanistic differences and similarities between the two models. Microarray results were confirmed using gene-specific RT-qPCR. Bronchoalveolar lavage (BAL) fluid was assessed for indications of inflammation in vivo. A strong dose-dependent activation of acute phase and inflammation response was observed in mouse lungs, reflective mainly of an inflammatory response as observed in BAL. In vitro, a wide variety of core cellular functions were affected, including transcription, cell cycle, and cellular growth and proliferation. Oxidative stress, fibrosis and inflammation processes were altered in both models. Although there were similarities observed between the two models at the pathway level, the specific genes altered under these pathways were different, suggesting that the underlying mechanisms of responses are different in cells in culture and the lung tissue. Our results suggest that careful consideration should be given to selecting relevant endpoints when substituting animal testing with in vitro testing.
Risk Governance of Emerging Technologies Demonstrated in Terms of its Applicability to Nanomaterials
Nanotechnologies have reached a level of maturity and market penetration that requires nano-specific changes in legislation and harmonization among legislation domains, such as the amendments to REACH for nanomaterials (NMs) which came into force in 2020. Thus, an assessment of the components and regulatory boundaries of NMs risk governance is timely, alongside related methods and tools, as part of the global efforts to optimise nanosafety and integrate it into product design processes, via Safe(r)-by-Design (SbD) concepts. This paper provides an overview of the state-of-the-art regarding risk governance of NMs and lays out the theoretical basis for the development and implementation of an effective, trustworthy and transparent risk governance framework for NMs. The proposed framework enables continuous integration of the evolving state of the science, leverages best practice from contiguous disciplines and facilitates responsive re-thinking of nanosafety governance to meet future needs. To achieve and operationalise such a framework, a science-based Risk Governance Council (RGC) for NMs is being developed. The framework will provide a toolkit for independent NMs' risk governance and integrate the needs and views of stakeholders. An extension of this framework to relevant advanced materials and emerging technologies is also envisaged, in view of future foundations of risk research in Europe and globally.
Induction of the interleukin 6/ signal transducer and activator of transcription pathway in the lungs of mice sub-chronically exposed to mainstream tobacco smoke
Background: Tobacco smoking is associated with lung cancer and other respiratory diseases. However, little is known about the global molecular changes that precede the appearance of clinically detectable symptoms. In this study, the effects of mainstream tobacco smoke (MTS) on global transcription in the mouse lung were investigated.
Methods: Male C57B1/CBA mice were exposed to MTS from two cigarettes daily, 5 days/week for 6 or 12 weeks. Mice were sacrificed immediately, or 6 weeks following the last cigarette. High density DNA microarrays were used to characterize global gene expression changes in whole lung. Microarray results were validated by quantitative real-time RT-PCR. Further analysis of protein synthesis and function was carried out for a select set of genes by ELISA and Western blotting.
Results: Globally, seventy-nine genes were significantly differentially expressed following the exposure to MTS. These genes were associated with a number of biological processes including xenobiotic metabolism, redox balance, oxidative stress and inflammation. There was no differential gene expression in mice exposed to smoke and sampled 6 weeks following the last cigarette. Moreover, cluster analysis demonstrated that these samples clustered alongside their respective controls. We observed simultaneous up-regulation of interleukin 6 (IL-6) and its antagonist, suppressor of cytokine signalling (SOCS3) mRNA following 12 weeks of MTS exposure. Analysis by ELISA and Western blotting revealed a concomitant increase in total IL-6 antigen levels and its downstream targets, including phosphorylated signal transducer and activator of transcription 3 (Stat3), basal cell-lymphoma extra large (BCL-XL) and myeloid cell leukemia 1 (MCL-1) protein, in total lung tissue extracts. However, in contrast to gene expression, a subtle decrease in total SOCS3 protein was observed after 12 weeks of MTS exposure.
Conclusion: Global transcriptional analysis identified a set of genes responding to MTS exposure in mouse lung. These genes returned to basal levels following smoking cessation, providing evidence to support the benefits of smoking cessation. Detailed analyses were undertaken for IL-6 and its associated pathways. Our results provide further insight into the role of these pathways in lung injury and inflammation induced by MTS.
Building an Adverse Outcome Pathway network for COVID-19
Data availability statement: The original contributions presented in the study are included in the article/Supplementary Material; further inquiries can be directed to the corresponding authors.
Supplementary material: The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fsysb.2024.1384481/full#supplementary-material
The COVID-19 pandemic generated large amounts of data on the disease pathogenesis, leading to a need for organizing the vast knowledge in a succinct manner. Between April 2020 and February 2023, the CIAO consortium exploited the Adverse Outcome Pathway (AOP) framework to comprehensively gather and systematically organize published scientific literature on COVID-19 pathology. The project considered 24 pathways relevant for COVID-19 by identifying essential key events (KEs) leading to 19 adverse outcomes observed in patients. While an individual AOP defines causally linked perturbed KEs towards an outcome, building an AOP network visually reflects the interrelatedness of the various pathways and outcomes. In this study, 17 of those COVID-19 AOPs were selected based on quality criteria to computationally derive an AOP network. This primary network highlighted the need to consider tissue specificity and helped to identify missing or redundant elements, which were then manually implemented in the final network. Such a network enabled visualization of the complex interactions of the KEs leading to the various outcomes of the multifaceted COVID-19 and confirmed the central role of the inflammatory response in the disease. In addition, this study disclosed the importance of terminology harmonization and of tissue/organ specificity for network building. Furthermore, the unequal completeness and quality of information contained in the AOPs highlighted the need for tighter implementation of the FAIR principles to improve AOP findability, accessibility, interoperability and re-usability. Finally, the study underlined that describing KEs specific to SARS-CoV-2 replication and discriminating physiological from pathological inflammation is necessary but requires adaptations to the framework. Hence, based on the challenges encountered, we proposed recommendations relevant for ongoing and future AOP-aligned consortia aiming to computationally build biologically meaningful AOP networks in the context of, but not limited to, viral diseases.
This work was supported by funding from the JRC Exploratory project CIAO (Modelling the Pathogenesis of COVID-19 using the Adverse Outcome Pathway Framework, https://www.ciao-covid.net/). DJ would like to acknowledge funding from the US National Science Foundation (EF-2133763). PN also acknowledges funding from The Swedish Fund for Research without Animals (grants F2021-0005 and F2022-0003). SH acknowledges funding received from Health Canada’s Genomics Research and Development Initiative. ST acknowledges funding received from the Japan Agency for Medical Research and Development (AMED), Grant Numbers JP21mk0101216, JP22mk0101216, and JP23mk0101216, and from the Japan Society for the Promotion of Science (JSPS), KAKENHI Grant Number 21K12133.
Polymorphism of SERPINE2 gene is associated with pulmonary emphysema in consecutive autopsy cases
Background: The SERPINA1, SERPINA3, and SERPINE2 genes, which encode antiproteases, have been proposed to be susceptibility genes for chronic obstructive pulmonary disease (COPD) and related phenotypes. Whether they are associated with emphysema is not known.
Methods: Twelve previously reported single nucleotide polymorphisms (SNPs) in SERPINA1 (rs8004738, rs17751769, rs709932, rs11832, rs1303, rs28929474, and rs17580), SERPINA3 (rs4934, rs17473, and rs1800463), and SERPINE2 (rs840088 and rs975278) were genotyped in samples obtained from 1,335 consecutive autopsies of elderly Japanese people. The association between these SNPs and the severity of emphysema, as assessed using macroscopic scores, was determined.
Results: Emphysema of more than moderate degree was detected in 189 subjects (14.1%) and showed a significant gender difference (males, 20.5% and females, 7.0%; p < 0.0001). Among the 12 examined SNPs, only rs975278 in the SERPINE2 gene was positively associated with emphysema. Unlike the major alleles, homozygous minor alleles of rs975278 were associated with emphysema (odds ratio (OR) = 1.54; 95% confidence interval (CI) = 1.02-2.30; p = 0.037) and the association was very prominent in smokers (OR = 2.02; 95% CI = 1.29-3.15; p = 0.002).
Conclusions: SERPINE2 may be a risk factor for the development of emphysema, and its association with emphysema may be stronger in smokers.
