Machine Learning and Integrative Analysis of Biomedical Big Data.
Recent developments in high-throughput technologies have accelerated the accumulation of massive amounts of omics data from multiple sources: genome, epigenome, transcriptome, proteome, metabolome, etc. Traditionally, data from each source (e.g., genome) are analyzed in isolation using statistical and machine learning (ML) methods. Integrative analysis of multi-omics and clinical data is key to new biomedical discoveries and advancements in precision medicine. However, data integration poses new computational challenges and exacerbates those associated with single-omics studies. Specialized computational approaches are required to effectively and efficiently perform integrative analysis of biomedical data acquired from diverse modalities. In this review, we discuss state-of-the-art ML-based approaches for tackling five specific computational challenges associated with integrative analysis: curse of dimensionality, data heterogeneity, missing data, class imbalance and scalability issues.
Community landscapes: an integrative approach to determine overlapping network module hierarchy, identify key nodes and predict network dynamics
Background: Network communities help the functional organization and
evolution of complex networks. However, developing a method that is both fast
and accurate and that provides modular overlaps and partitions of a
heterogeneous network has proven rather difficult. Methodology/Principal
Findings: Here we introduce the novel concept of ModuLand, an integrative
method family determining overlapping network modules as hills of an influence
function-based, centrality-type community landscape, and including several
widely used modularization methods as special cases. As various adaptations of
the method family, we developed several algorithms, which provide an efficient
analysis of weighted and directed networks, and (1) determine pervasively
overlapping modules with high resolution; (2) uncover a detailed hierarchical
network structure allowing an efficient, zoom-in analysis of large networks;
(3) allow the determination of key network nodes and (4) help to predict
network dynamics. Conclusions/Significance: The concept opens a wide range of
possibilities to develop new approaches and applications including network
routing, classification, comparison and prediction.
Comment: 25 pages with 6 figures and a Glossary + Supporting Information
containing pseudo-codes of all algorithms used, 14 Figures, 5 Tables (with 18
module definitions, 129 different modularization methods, 13 module
comparison methods) and 396 references. All algorithms can be downloaded
from this web-site: http://www.linkgroup.hu/modules.ph
A Comprehensive Bibliometric Analysis on Social Network Anonymization: Current Approaches and Future Directions
In recent decades, social network anonymization has become a crucial research
field due to its pivotal role in preserving users' privacy. However, the high
diversity of approaches introduced in relevant studies poses a challenge to
gaining a profound understanding of the field. In response to this, the current
study presents an exhaustive and well-structured bibliometric analysis of the
social network anonymization field. To begin our research, related studies from
the period of 2007-2022 were collected from the Scopus database and then
pre-processed. Following this, VOSviewer was used to visualize the network
of authors' keywords. Subsequently, extensive statistical and network analyses
were performed to identify the most prominent keywords and trending topics.
Additionally, the application of co-word analysis through SciMAT and the
Alluvial diagram allowed us to explore the themes of social network
anonymization and scrutinize their evolution over time. These analyses
culminated in an innovative taxonomy of the existing approaches and
anticipation of potential trends in this domain. To the best of our knowledge,
this is the first bibliometric analysis in the social network anonymization
field, which offers a deeper understanding of the current state and an
insightful roadmap for future research in this domain.
Comment: 73 pages, 28 figure
Representation Learning for Attributed Multiplex Heterogeneous Network
Network embedding (or graph embedding) has been widely used in many
real-world applications. However, existing methods mainly focus on networks
with single-typed nodes/edges and cannot scale well to handle large networks.
Many real-world networks consist of billions of nodes and edges of multiple
types, and each node is associated with different attributes. In this paper, we
formalize the problem of embedding learning for the Attributed Multiplex
Heterogeneous Network and propose a unified framework to address this problem.
The framework supports both transductive and inductive learning. We also give
the theoretical analysis of the proposed framework, showing its connection with
previous work and proving its better expressiveness. We conduct systematic
evaluations for the proposed framework on four different genres of challenging
datasets: Amazon, YouTube, Twitter, and Alibaba. Experimental results
demonstrate that with the learned embeddings from the proposed framework, we
can achieve statistically significant improvements (e.g., 5.99-28.23% lift by
F1 scores; p<<0.01, t-test) over previous state-of-the-art methods for link
prediction. The framework has also been successfully deployed on the
recommendation system of a worldwide leading e-commerce company, Alibaba Group.
Results of the offline A/B tests on product recommendation further confirm the
effectiveness and efficiency of the framework in practice.
Comment: Accepted to KDD 2019. Website: https://sites.google.com/view/gatn
Human Factors in Agile Software Development
Through four years of experiments on students' Scrum-based agile software
development (ASD) process, we have gained a deep understanding of the human
factors of agile methodology. We designed an agile project management tool,
the HASE collaboration development platform, to support more than 400 students
self-organized into 80 teams to practice ASD. In this thesis, based on our
experiments, simulations and analysis, we contributed a series of solutions and
insights in this research, including 1) a Goal Net based method to enhance
goal and requirement management for ASD process, 2) a novel Simple Multi-Agent
Real-Time (SMART) approach to enhance intelligent task allocation for ASD
process, 3) a Fuzzy Cognitive Maps (FCMs) based method to enhance emotion and
morale management for ASD process, 4) the first large-scale, in-depth empirical
insights on human factors in ASD process, which have not yet been well studied
by existing research, and 5) the first identification of the ASD process as a
human-computation system that exploits human effort to perform tasks that
computers are not good at solving. Conversely, computers can assist
human decision making in the ASD process.
Comment: Book Draf