Block Belief Propagation for Parameter Learning in Markov Random Fields
Traditional learning methods for training Markov random fields require doing
inference over all variables to compute the likelihood gradient. The iteration
complexity for those methods therefore scales with the size of the graphical
models. In this paper, we propose \emph{block belief propagation learning}
(BBPL), which uses block-coordinate updates of approximate marginals to compute
approximate gradients, removing the need to run inference on the entire
graphical model. Thus, the iteration complexity of BBPL does not scale with the
size of the graphs. We prove that the method converges to the same solution as
that obtained by using full inference per iteration, despite these
approximations, and we empirically demonstrate its scalability improvements
over standard training methods.
Comment: Accepted to AAAI 201
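The block-coordinate idea above can be illustrated with a toy sketch (a hypothetical simplification with unary parameters only, not the authors' code): each iteration refreshes the approximate marginals of one block of variables and forms the gradient from those block-local expectations, so per-iteration cost does not grow with the number of variables.

```python
import numpy as np

# Toy sketch of BBPL-style learning: refresh one block's approximate
# marginals per iteration instead of running inference over all variables.
# Pairwise potentials are omitted for brevity, so "inference" on a block
# reduces to a softmax over that block's unary scores.

rng = np.random.default_rng(0)
n_vars, n_states, n_blocks = 12, 3, 4
blocks = np.array_split(np.arange(n_vars), n_blocks)

theta = np.zeros((n_vars, n_states))                      # unary parameters
empirical = rng.dirichlet(np.ones(n_states) * 5, n_vars)  # data expectations
marginals = np.full((n_vars, n_states), 1.0 / n_states)   # stale approximations

step = 0.5
for it in range(400):
    block = blocks[it % n_blocks]
    # stand-in for one round of belief propagation restricted to the block
    scores = np.exp(theta[block])
    marginals[block] = scores / scores.sum(axis=1, keepdims=True)
    # approximate likelihood gradient: empirical minus model expectations
    theta[block] += step * (empirical[block] - marginals[block])

print(np.abs(marginals - empirical).max())  # shrinks toward zero
```

Each variable still receives regular gradient updates (once per sweep over the blocks), which is why the method can converge to the same solution as full-inference training.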
MEMORY OPTIMIZATIONS FOR HIGH-THROUGHPUT COMPUTER SYSTEMS
The emergence of new non-volatile memory (NVM) technology and deep neural network (DNN) inference brings challenges related to off-chip memory access. Ensuring crash consistency requires additional memory operations and exposes memory update operations on the critical execution path. DNN inference on some accelerators suffers from intensive off-chip memory access. The focus of this dissertation is to tackle the issues related to off-chip memory in these high-performance computing systems.
The logging operations required by crash consistency impose a significant performance overhead due to the extra memory accesses. To mitigate the persistence time of log requests, we introduce a load-aware log-entry allocation scheme that assigns each log request to an address whose bank has the lightest workload. To address the problem of intra-record ordering, we propose buffering log metadata in a non-volatile ADR buffer until the corresponding log can be removed. Moreover, the recently proposed LAD introduces unnecessary logging operations on multicore CPUs. To reduce these unnecessary operations, we devise two-stage transaction execution and virtual ADR buffers.
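The load-aware allocation policy can be sketched as follows (a hypothetical minimal model, not the dissertation's implementation): each bank tracks its pending workload, and every incoming log request is placed in the currently lightest-loaded bank, which a min-heap selects in logarithmic time.

```python
import heapq

# Minimal sketch of a load-aware log-entry allocator: pick the NVM bank
# with the lightest pending workload for each new log request.

class LoadAwareAllocator:
    def __init__(self, n_banks):
        # heap of (pending_work, bank_id); ties break by bank id
        self.heap = [(0, b) for b in range(n_banks)]
        heapq.heapify(self.heap)

    def allocate(self, log_size):
        load, bank = heapq.heappop(self.heap)
        heapq.heappush(self.heap, (load + log_size, bank))
        return bank

alloc = LoadAwareAllocator(n_banks=4)
placements = [alloc.allocate(log_size=1) for _ in range(8)]
print(placements)  # equal-sized requests spread evenly across the four banks
```

With variable `log_size` values the same policy naturally steers new requests away from banks that are still persisting large earlier logs.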
To tackle the low response time and high computational intensity associated with DNN inference, these computations are often executed on customized accelerators. However, loading data from off-chip memory typically takes longer than computation, thereby reducing performance in some scenarios, especially on edge devices. To address this issue, we propose an optimization of the widely adopted Weight Stationary dataflow that removes redundant accesses to the IFMAP in off-chip memory by reordering the loops of the standard convolution operation. Furthermore, to enhance off-chip memory throughput, we introduce load-aware placement of data tiles in off-chip memory, which reduces the intra- and inter-tile contention caused by concurrent accesses and improves off-chip memory device parallelism.
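The loop-reordering idea can be demonstrated on a direct convolution (an illustrative sketch, not the dissertation's exact dataflow): hoisting the IFMAP indices to the outer loops lets each IFMAP element be fetched once and reused by every filter and output that needs it, without changing the result.

```python
import numpy as np

def conv_filter_outer(ifmap, weights):            # filter-major loop order
    K, R, S = weights.shape
    H, W = ifmap.shape
    out = np.zeros((K, H - R + 1, W - S + 1))
    for k in range(K):
        for y in range(H - R + 1):
            for x in range(W - S + 1):
                for r in range(R):
                    for s in range(S):
                        out[k, y, x] += ifmap[y + r, x + s] * weights[k, r, s]
    return out

def conv_ifmap_outer(ifmap, weights):             # ifmap-major loop order
    K, R, S = weights.shape
    H, W = ifmap.shape
    out = np.zeros((K, H - R + 1, W - S + 1))
    for y in range(H):                            # each ifmap element read once
        for x in range(W):
            v = ifmap[y, x]
            for k in range(K):
                for r in range(R):
                    for s in range(S):
                        oy, ox = y - r, x - s     # outputs this element feeds
                        if 0 <= oy <= H - R and 0 <= ox <= W - S:
                            out[k, oy, ox] += v * weights[k, r, s]
    return out

ifmap = np.arange(25.0).reshape(5, 5)
weights = np.arange(18.0).reshape(2, 3, 3)
print(np.allclose(conv_filter_outer(ifmap, weights), conv_ifmap_outer(ifmap, weights)))
```

In the second ordering the IFMAP is streamed exactly once from off-chip memory, trading repeated IFMAP fetches for scattered accumulations into the (smaller, typically on-chip) output buffer.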
Relation Structure-Aware Heterogeneous Information Network Embedding
Heterogeneous information network (HIN) embedding aims to embed multiple
types of nodes into a low-dimensional space. Although most existing HIN
embedding methods consider heterogeneous relations in HINs, they usually employ
a single model for all relations without distinction, which inevitably
restricts the capability of network embedding. In this paper, we take the
structural characteristics of heterogeneous relations into consideration and
propose a novel Relation structure-aware Heterogeneous Information Network
Embedding model (RHINE). By exploring real-world networks with thorough
mathematical analysis, we present two structure-related measures which can
consistently distinguish heterogeneous relations into two categories:
Affiliation Relations (ARs) and Interaction Relations (IRs). To respect the
distinctive characteristics of relations, in our RHINE, we propose different
models specifically tailored to handle ARs and IRs, which can better capture
the structures and semantics of the networks. Finally, we combine and optimize
these models in a unified and elegant manner. Extensive experiments on three
real-world datasets demonstrate that our model significantly outperforms
state-of-the-art methods on various tasks, including node clustering, link
prediction, and node classification.
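The split into two relation categories can be sketched with two simple scoring functions (a hedged simplification of the paper's models, with made-up node and relation names): affiliation relations pull the two endpoint embeddings directly together, while interaction relations are scored translation-style, with the relation acting as a vector between the endpoints.

```python
import numpy as np

# Hedged sketch of relation-specific scoring: Euclidean proximity for
# affiliation relations (ARs), translation-based scoring for interaction
# relations (IRs). Lower score = more plausible link.

def ar_score(u, v):
    # affiliation: a node should sit close to its "center" node
    return float(np.sum((u - v) ** 2))

def ir_score(u, r, v):
    # interaction: the relation translates one embedding to the other,
    # u + r ≈ v, as in translation-based knowledge-graph models
    return float(np.sum((u + r - v) ** 2))

rng = np.random.default_rng(1)
paper, conf = rng.normal(size=8), rng.normal(size=8)      # AR: paper–conference
author, writes = rng.normal(size=8), rng.normal(size=8)   # IR: author–paper
print(ar_score(paper, conf), ir_score(author, writes, paper))
```

Training would minimize these scores over observed links (with negative sampling), and a joint objective sums the AR and IR losses, which is one way the two tailored models can be "combined and optimized in a unified manner."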
An Ontology-Based Artificial Intelligence Model for Medicine Side-Effect Prediction: Taking Traditional Chinese Medicine as An Example
In this work, an ontology-based model for AI-assisted medicine side-effect
(SE) prediction is developed, and its three main components, the drug model,
the treatment model, and the AI-assisted prediction model, are presented. To
validate the proposed model, an ANN structure is established and trained on
242 TCM prescriptions. These data are gathered and classified from the most
famous ancient TCM book and more than one thousand SE reports, and two
ontology-based attributes, hot and cold, are introduced to evaluate whether a
prescription will cause SEs. The results preliminarily reveal that there is a
relationship between the ontology-based attributes and the corresponding
predicted indicator that can be learned by AI to predict SEs, which suggests
the proposed model has potential for AI-assisted SE prediction. It should be
noted, however, that the proposed model depends heavily on sufficient clinical
data, and hence deeper exploration is important for enhancing the accuracy of
the prediction.
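A prediction component of this kind can be sketched as a small feedforward network mapping the two ontology-based attributes to an SE indicator. The sketch below uses synthetic data with an invented labeling rule, since the 242 real prescriptions are not reproduced here; it illustrates only the learning setup, not the paper's trained model.

```python
import numpy as np

# Minimal ANN sketch: [hot, cold] attribute scores -> side-effect indicator.
# Data and labeling rule are synthetic stand-ins for the real TCM dataset.

rng = np.random.default_rng(0)
X = rng.uniform(0, 1, size=(242, 2))              # [hot, cold] per prescription
y = (X[:, 0] - X[:, 1] > 0.2).astype(float)       # hypothetical SE rule

W1, b1 = rng.normal(0, 0.5, (2, 8)), np.zeros(8)  # one hidden layer, 8 units
W2, b2 = rng.normal(0, 0.5, (8, 1)), np.zeros(1)
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

for _ in range(2000):                             # full-batch gradient descent
    h = np.tanh(X @ W1 + b1)
    p = sigmoid(h @ W2 + b2).ravel()
    g = (p - y)[:, None] / len(y)                 # dLoss/dlogit, cross-entropy
    gh = g @ W2.T * (1 - h ** 2)                  # backprop through tanh
    W2 -= 1.0 * (h.T @ g); b2 -= 1.0 * g.sum(0)
    W1 -= 1.0 * (X.T @ gh); b1 -= 1.0 * gh.sum(0)

acc = ((p > 0.5) == y).mean()
print(f"training accuracy: {acc:.2f}")
```

With real prescription data the inputs would be the ontology-derived hot/cold attributions and the labels would come from the SE reports; held-out evaluation would then be essential.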
Large Data Approaches to Thresholding Problems
Statistical models with discontinuities have seen much use in a variety of situations, in practical fields such as statistical process control, gene data processing, and econometrics. The study of such models is usually concerned with locating these discontinuities, which methodologically causes various issues, as estimation requires solving nonstandard optimization problems. With the contemporary increase in computer power and memory, it becomes more relevant to view these problems in the context of very large datasets, a context which introduces further complications for estimation. In this thesis, we study two major topics in threshold estimation, with models, methodology, and results motivated by the concern towards handling big data.
Our first topic focuses on the change point problem, which involves detecting the locations where a change in distribution occurs within a data sequence. A variety of methods have been proposed and studied in this area, with novel approaches in the case where the number of change points is unknown and could be greater than one, making exhaustive search methods infeasible.
Our contribution to this problem is motivated by the principle that only the data points close to the change points are useful for their estimation, while other points are extraneous. From this observation we propose a zoom-in estimation method which efficiently subsamples the data for estimation without compromising accuracy. The resulting method runs in sublinear time, while existing methods all run in linear time or above. Furthermore, the nature of this new methodology allows us to characterize the asymptotic distribution even in the case where the number of change point parameters increases without bound, a type of result not previously obtained in this field.
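The zoom-in principle can be illustrated with a hypothetical two-stage estimator for a single mean change (an illustration of the subsampling idea, not the thesis's exact procedure): a coarse subsample localizes the change point, and estimation is then refined using only the data in a small window around the coarse estimate, so most of the sequence is never touched.

```python
import numpy as np

def cusum_argmax(x):
    # index of the first post-change point under a single mean-change model,
    # found by maximizing the standardized CUSUM statistic
    n = len(x)
    s = np.cumsum(x)
    k = np.arange(1, n)
    stat = np.abs(s[:-1] / k - (s[-1] - s[:-1]) / (n - k)) * np.sqrt(k * (n - k) / n)
    return int(np.argmax(stat)) + 1

rng = np.random.default_rng(42)
n, tau = 100_000, 61_234
x = np.concatenate([rng.normal(0, 1, tau), rng.normal(1.5, 1, n - tau)])

coarse_idx = np.arange(0, n, 200)                 # sublinear coarse subsample
coarse = coarse_idx[cusum_argmax(x[coarse_idx])]  # rough location
lo, hi = max(coarse - 1000, 0), min(coarse + 1000, n)
refined = lo + cusum_argmax(x[lo:hi])             # zoom in on a small window
print(coarse, refined)                            # refined is near tau
```

Only the 500 subsampled points plus a 2,000-point window are examined, rather than all 100,000 observations, which is the source of the sublinear running time.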
The second topic regards the change plane model, which involves a real-valued signal over a multidimensional space with a discontinuity delineated by a hyperplane. In practice, the change plane model combines regression of a response on covariates with unsupervised classification of the covariates. As change-plane models in growing dimensions have not been studied in the literature, we confine ourselves to canonical models in this dissertation, as a first approach to these problems. In terms of details, we establish fundamental convergence and support selection properties (the latter for the high-dimensional case) and present some simulation results.
PHD
Statistics
University of Michigan, Horace H. Rackham School of Graduate Studies
https://deepblue.lib.umich.edu/bitstream/2027.42/153384/1/jlnlu_1.pd
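A canonical change-plane regression of the kind described above can be written as follows (notation assumed for illustration, not necessarily the thesis's):

```latex
% response Y, covariates X \in \mathbb{R}^p, plane parameters (\omega, \gamma)
Y = \bigl(\alpha_1 + X^\top \beta_1\bigr)\,\mathbf{1}\{X^\top \omega \le \gamma\}
  + \bigl(\alpha_2 + X^\top \beta_2\bigr)\,\mathbf{1}\{X^\top \omega > \gamma\}
  + \varepsilon .
```

Estimating $(\omega, \gamma)$ classifies the covariates into the two regimes, while $(\alpha_j, \beta_j)$ are the regime-wise regression coefficients, which is the interplay of regression and unsupervised classification described above.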
Graduate Recital: Zhiyuan Gao, Horn; Lu Witzig, Piano; Joohee Jeong, Piano; Jinyu Zhang, Piano; April 7, 2024
Kemp Recital Hall
April 7, 2024
Sunday Evening, 6:30 p.m.