93 research outputs found
Adjusted chi-square test for degree-corrected block models
We propose a goodness-of-fit test for degree-corrected stochastic block
models (DCSBM). The test is based on an adjusted chi-square statistic for
measuring equality of means among groups of $n$ multinomial distributions with
$d_1,\dots,d_n$ observations. In the context of network models, the number of
multinomials, $n$, grows much faster than the number of observations, $d_i$,
hence the setting deviates from classical asymptotics. We show that a simple
adjustment allows the statistic to converge in distribution, under null, as
long as the harmonic mean of $\{d_i\}$ grows to infinity. This result applies
to large sparse networks where the role of $d_i$ is played by the degree of
node $i$. Our distributional results are nonasymptotic, with explicit
constants, providing finite-sample bounds on the Kolmogorov-Smirnov distance to
the target distribution. When applied sequentially, the test can also be used
to determine the number of communities. The test operates on a (row) compressed
version of the adjacency matrix, conditional on the degrees, and as a result is
highly scalable to large sparse networks. We incorporate a novel idea of
compressing the columns based on a $(K+1)$-community assignment when testing
for $K$ communities. This approach increases the power in sequential
applications without sacrificing computational efficiency, and we prove its
consistency in recovering the number of communities. Since the test statistic
does not rely on a specific alternative, its utility goes beyond sequential
testing and can be used to simultaneously test against a wide range of
alternatives outside the DCSBM family. We show the effectiveness of the
approach by extensive numerical experiments with simulated and real data. In
particular, applying the test to the Facebook-100 dataset, we find that a DCSBM
with a small number of communities is far from a good fit in almost all cases.
Inference And Learning: Computational Difficulty And Efficiency
In this thesis, we mainly investigate two collections of problems: statistical network inference and model selection in regression. The common feature shared by these two types of problems is that they typically exhibit an interesting phenomenon in terms of computational difficulty and efficiency.
For statistical network inference, our goal is to infer the network structure based on a noisy observation of the network. Statistically, we model the network as generated from the structural information in the presence of noise, for example, the planted submatrix model (for bipartite weighted graphs), the stochastic block model, and the Watts-Strogatz model. As the relative amount of "signal-to-noise" varies, the problems exhibit different stages of computational difficulty. On the theoretical side, we investigate these stages by characterizing the transition thresholds on the "signal-to-noise" ratio for the aforementioned models. On the methodological side, we provide new computationally efficient procedures to reconstruct the network structure for each model.
For model selection in regression, our goal is to learn a "good" model based on a certain model class from the observed data sequences (feature and response pairs), when the model can be misspecified. More concretely, we study two model selection problems: to learn from general classes of functions based on i.i.d. data with minimal assumptions, and to select from the sparse linear model class based on possibly adversarially chosen data in a sequential fashion. We develop new theoretical and algorithmic tools beyond empirical risk minimization to study these problems from a learning theory point of view.
Bayesian approach for the spectrum sensing mimo-cognitive radio network with presence of the uncertainty
A cognitive radio system has the ability to learn. Such a system can not only observe the surrounding environment and adapt to its conditions, but also use the radio spectrum efficiently. This technique allows the secondary users (SUs) to employ the primary users' (PUs) spectrum while the band is not being used by the primary user. Cognitive radio has three main steps: sensing the spectrum, deciding, and acting. In the sensing step, channel occupancy is determined with a spectrum sensing approach in order to detect unused spectrum. In the decision step, the sensing results are evaluated and a decision is made based on them. In the final step, called the acting process, the transmission parameters are adjusted to achieve good performance for the cognitive radio network.
Adaptive Design and Flexible Approval of Clinical Trials
Dose-finding clinical trials are among the most critical cornerstones of the healthcare system. In this broad research area, there are many decision-making problems that are extremely challenging to address, yet even a small improvement may result in significant benefits to society. Dose-finding clinical trials are extremely expensive and require multiple time-consuming and complicated R&D phases. Despite all the costs and the long time these trials need to conclude (on average over ten years for each new drug/technology), fewer than 15% of these trials end in a newly approved drug entering the market. This problem is exacerbated for drugs that target rare diseases, where the costs of testing subjects are higher, sampling budgets are more restricted by the number of available patients, and the chances of success and expected payoffs are much lower. In this dissertation, we first propose a new information-based objective function to guide adaptive dose allocation as part of Phase II clinical studies, and we show its merit under small sampling budgets. Then, we redesign Phase II clinical trials with the goal of personalizing this process and finding different target doses, with certain efficacy levels, for different groups of patients. Finally, we move on to the FDA aspect of the approval process and analyze flexible approval policies that the FDA can apply to Phase III clinical trials. The motivation for this work comes from recent studies suggesting that a flexible approval process can incentivize the research and development of new drugs for rare diseases. We hope that the results of our research help clinicians, pharmaceutical companies, and the FDA better understand the consequences of their decisions, while leading to potentially more effective treatment/dose specification for patients who are in need of new drugs.
Machine Learning in Sensors and Imaging
Machine learning is extending its applications in various fields, such as image processing, the Internet of Things, user interfaces, big data, manufacturing, and management. As data are required to build machine learning networks, sensors are one of the most important technologies. In addition, machine learning networks can contribute to the improvement in sensor performance and the creation of new sensor applications. This Special Issue addresses all types of machine learning applications related to sensors and imaging. It covers computer vision-based control, activity recognition, fuzzy label classification, failure classification, motor temperature estimation, the camera calibration of intelligent vehicles, error detection, color prior models, compressive sensing, wildfire risk assessment, shelf auditing, forest-growing stem volume estimation, road management, image denoising, and touchscreens.
Natural Language Processing: Emerging Neural Approaches and Applications
This Special Issue highlights the most recent research being carried out in the NLP field and discusses related open issues, with a particular focus both on emerging approaches for language learning, understanding, production, and grounding, interactively or autonomously from data in cognitive and neural systems, and on their potential or real applications in different domains.
- …