Paradoxes and resolutions for semiparametric fusion of individual and summary data
Suppose we have available individual data from an internal study and various
types of summary statistics from relevant external studies. External summary
statistics have been used as constraints on the internal data distribution,
which promised to improve the statistical inference in the internal data;
however, the additional use of external summary data may lead to paradoxical
results: efficiency loss may occur if the uncertainty of summary statistics is
not negligible and large estimation bias can emerge even if the bias of
external summary statistics is small. We investigate these paradoxical results
in a semiparametric framework. We establish the semiparametric efficiency bound
for estimating a general functional of the internal data distribution, which is
shown to be no larger than that using only internal data. We propose a
data-fused efficient estimator that achieves this bound so that the efficiency
paradox is resolved. We further propose a debiased estimator that uses an
adaptive lasso penalty to achieve selection consistency, so that the resulting
estimator attains the same asymptotic distribution as the oracle estimator
that uses only unbiased summary statistics, which resolves the bias paradox.
Simulations and an application to a Helicobacter pylori infection dataset
illustrate the proposed methods. Comment: 16 pages, 3 figures
Sharp bounds for variance of treatment effect estimators in the finite population in the presence of covariates
In a completely randomized experiment, the variances of treatment effect
estimators in the finite population are usually not identifiable and hence not
estimable. Although some estimable bounds of the variances have been
established in the literature, few of them are derived in the presence of
covariates.
In this paper, the difference-in-means estimator and the Wald estimator are
considered in the completely randomized experiment with perfect compliance and
noncompliance, respectively. Sharp bounds for the variances of these two
estimators are established when covariates are available.
Furthermore, consistent estimators for such bounds are obtained, which can be
used to shorten the confidence intervals and improve the power of tests.
Confidence intervals are constructed based on the consistent estimators of the
upper bounds, whose coverage rates are uniformly asymptotically guaranteed.
Simulations were conducted to evaluate the proposed methods. The proposed
methods are also illustrated with two real data analyses. Comment: Accepted by Statistica Sinica
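For orientation, the classical baseline that such bounds sharpen can be sketched in a few lines. This is the textbook Neyman conservative variance estimator for the difference-in-means estimator, not the covariate-adjusted sharp bound from the paper. The function names are illustrative assumptions.

```python
from statistics import mean, variance


def diff_in_means(y_treated, y_control):
    """Difference-in-means estimate of the average treatment effect."""
    return mean(y_treated) - mean(y_control)


def neyman_variance_upper_bound(y_treated, y_control):
    """Classical Neyman estimator s1^2/n1 + s0^2/n0.

    It is conservative because the true finite-population variance also
    subtracts a term involving the variance of unit-level effects, which
    is not identifiable. This is NOT the paper's sharp bound.
    """
    n1, n0 = len(y_treated), len(y_control)
    return variance(y_treated) / n1 + variance(y_control) / n0


# Toy outcomes, invented for illustration only.
treated = [3.0, 5.0, 7.0]
control = [1.0, 2.0, 3.0]
estimate = diff_in_means(treated, control)
upper_bound = neyman_variance_upper_bound(treated, control)
```

A sharper bound, as the abstract describes, would yield a smaller variance estimate and therefore a shorter confidence interval than this baseline.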
ILCAS: Imitation Learning-Based Configuration-Adaptive Streaming for Live Video Analytics with Cross-Camera Collaboration
The high-accuracy and resource-intensive deep neural networks (DNNs) have
been widely adopted by live video analytics (VA), where camera videos are
streamed over the network to resource-rich edge/cloud servers for DNN
inference. Common video encoding configurations (e.g., resolution and frame
rate) have been identified with significant impacts on striking the balance
between bandwidth consumption and inference accuracy and therefore their
adaption scheme has been a focus of optimization. However, previous
profiling-based solutions suffer from high profiling cost, while existing deep
reinforcement learning (DRL) based solutions may achieve poor performance due
to the usage of fixed reward function for training the agent, which fails to
craft the application goals in various scenarios. In this paper, we propose
ILCAS, the first imitation learning (IL) based configuration-adaptive VA
streaming system. Unlike DRL-based solutions, ILCAS trains the agent with
demonstrations collected from the expert which is designed as an offline
optimal policy that solves the configuration adaption problem through dynamic
programming. To tackle the challenge of video content dynamics, ILCAS derives
motion feature maps based on motion vectors which allow ILCAS to visually
``perceive'' video content changes. Moreover, ILCAS incorporates a cross-camera
collaboration scheme to exploit the spatio-temporal correlations of cameras for
more proper configuration selection. Extensive experiments confirm the
superiority of ILCAS compared with state-of-the-art solutions, with 2-20.9%
improvement of mean accuracy and 19.9-85.3% reduction of chunk upload lag. Comment: This work has been submitted to the IEEE Transactions on Mobile
Computing for possible publication. Copyright may be transferred without
notice, after which this version may no longer be accessible.
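The idea of an offline expert that picks configurations by dynamic programming can be illustrated with a minimal sketch. The abstract does not give the actual objective or constraints used by ILCAS, so everything here is an assumption: per-chunk accuracy and integer upload cost are known in advance, and the goal is to maximize total accuracy under a total cost budget. The function name and toy numbers are hypothetical.

```python
def offline_optimal_configs(accuracy, cost, budget):
    """Choose one configuration per chunk to maximize total accuracy.

    accuracy[t][c]: accuracy of config c on chunk t (assumed known offline)
    cost[t][c]:     integer upload cost of config c on chunk t
    budget:         integer cap on the summed cost
    Returns (best_total_accuracy, chosen_config_indices).
    """
    neg_inf = float("-inf")
    # best[b] = best accuracy over processed chunks using exactly b cost units
    best = [neg_inf] * (budget + 1)
    best[0] = 0.0
    back = []  # back[t][b] = (config, previous_b) that produced state b
    for acc_t, cost_t in zip(accuracy, cost):
        new_best = [neg_inf] * (budget + 1)
        choice = [None] * (budget + 1)
        for b, value in enumerate(best):
            if value == neg_inf:
                continue
            for c, (a, k) in enumerate(zip(acc_t, cost_t)):
                nb = b + k
                if nb <= budget and value + a > new_best[nb]:
                    new_best[nb] = value + a
                    choice[nb] = (c, b)
        best = new_best
        back.append(choice)
    end = max(range(budget + 1), key=lambda b: best[b])
    if best[end] == neg_inf:
        raise ValueError("no feasible configuration sequence within budget")
    total = best[end]
    # Walk the back-pointers from the last chunk to recover the choices.
    configs = []
    b = end
    for choice in reversed(back):
        c, b = choice[b]
        configs.append(c)
    configs.reverse()
    return total, configs


# Two chunks, two configs each (low/high quality); numbers are invented.
total, configs = offline_optimal_configs(
    accuracy=[[0.5, 0.9], [0.6, 0.8]],
    cost=[[1, 3], [1, 2]],
    budget=4,
)
```

In an imitation learning setup, the sequence returned by such an expert would serve as the demonstration that the agent is trained to reproduce.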
Adopting a QCA Approach to Investigating the Risks Involved in Mega projects from Auditing Perspective
Megaproject construction is increasing worldwide, and the risks involved in megaprojects have become a wide concern. Extending macro-level qualitative analyses that focus on complexity, politics, and morality, this research conducted a micro-level empirical analysis of twenty-two typical cases by adopting qualitative comparative analysis (QCA) from the auditing perspective. Unlike traditional analysis methods that treat each cause as an independent variable, the results revealed complex multiple concurrent causation among eight conditions. The configurations of these conditions fall into six types, three of which (project management risk, preliminary and construction risk, and tendering and contract management related risk) account for almost eighty percent of coverage. Finally, megaproject risks in China were caused by complicated and changeable combinations of conditions; this finding offers a new avenue for analyzing megaproject risks with this quantitative analysis method and points researchers and practitioners toward controlling megaproject risks in a more systematic way.
Prognostic Factors in Dedifferentiated Chondrosarcoma: A Retrospective Analysis of a Large Series Treated at a Single Institution.
Background: Dedifferentiated chondrosarcomas (DDCSs) are highly malignant tumors with a dismal prognosis and present a significant challenge in clinical management. Methods: In an IRB-approved retrospective protocol, we identified 72 patients with DDCS treated at our institution between 1993 and 2017 and reviewed clinicopathological characteristics, treatment modalities, and outcomes to analyze prognostic factors. Results: Femur (44.4%), pelvis (22.2%), and humerus (12.5%) were the most commonly involved sites. Twenty-three patients (31.9%) presented with distant metastasis, and 3 (4.2%) of them also had regional lymph node involvement. The median overall survival (OS) was 13.9 months. On multivariate analysis, pathological fracture, larger tumor size, lymph node involvement, metastasis at diagnosis, extraosseous extension, and undifferentiated pleomorphic sarcoma component correlated with worse OS, whereas surgical resection and chemotherapy were associated with improved OS. For progression-free survival (PFS), pathological fracture and metastasis at diagnosis showed increased risk, while chemotherapy was associated with decreased risk. Among patients who received chemotherapy, doxorubicin and cisplatin were significantly associated with improved PFS but not OS. Among patients without metastasis at diagnosis, 17 (34.7%) developed local recurrence and 31 (63.3%) developed distant metastases at a median interval of 18.1 months. On multivariate analysis, R1/R2 resection was related to local recurrence, while a macroscopic dedifferentiated component was associated with distant metastasis. Conclusions: The prognosis of DDCS is poor. Complete resection remains a significant prognostic factor for local control. Chemotherapy with doxorubicin and cisplatin seems to be associated with better PFS. More prognostic, multicenter trials are warranted to further explore the effectiveness of chemotherapy in selected DDCS patients.
Large Language Models for Code Analysis: Do LLMs Really Do Their Job?
Large language models (LLMs) have demonstrated significant potential in the
realm of natural language understanding and programming code processing tasks.
Their capacity to comprehend and generate human-like code has spurred research
into harnessing LLMs for code analysis purposes. However, the existing body of
literature falls short in delivering a systematic evaluation and assessment of
LLMs' effectiveness in code analysis, particularly in the context of obfuscated
code.
This paper seeks to bridge this gap by offering a comprehensive evaluation of
LLMs' capabilities in performing code analysis tasks. Additionally, it presents
real-world case studies that employ LLMs for the analysis of malicious code.
Our findings indicate that LLMs can indeed serve as valuable tools for
automating code analysis, albeit with certain limitations. Through meticulous
exploration, this research contributes to a deeper understanding of the
potential and constraints associated with utilizing LLMs in code analysis,
paving the way for enhanced applications in this critical domain.
Large-scale prediction of long non-coding RNA functions in a coding–non-coding gene co-expression network
Although accumulating evidence has provided insight into the various functions of long-non-coding RNAs (lncRNAs), the exact functions of the majority of such transcripts are still unknown. Here, we report the first computational annotation of lncRNA functions based on public microarray expression profiles. A coding–non-coding gene co-expression (CNC) network was constructed from re-annotated Affymetrix Mouse Genome Array data. Probable functions for altogether 340 lncRNAs were predicted based on topological or other network characteristics, such as module sharing, association with network hubs and combinations of co-expression and genomic adjacency. The functions annotated to the lncRNAs mainly involve organ or tissue development (e.g. neuron, eye and muscle development), cellular transport (e.g. neuronal transport and sodium ion, acid or lipid transport) or metabolic processes (e.g. involving macromolecules, phosphocreatine and tyrosine).
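As a rough illustration of how a co-expression network can be built from expression profiles, the sketch below links two genes when the absolute Pearson correlation of their profiles reaches a threshold. The abstract does not state the correlation measure or cutoff used for the CNC network, so both are assumptions here, as are the gene names and expression values.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # a constant profile carries no co-expression signal
    return cov / (sx * sy)


def coexpression_edges(profiles, threshold=0.9):
    """Return (gene_a, gene_b, r) for every pair with |r| >= threshold."""
    edges = []
    for g1, g2 in combinations(sorted(profiles), 2):
        r = pearson(profiles[g1], profiles[g2])
        if abs(r) >= threshold:
            edges.append((g1, g2, r))
    return edges


# Hypothetical profiles across four samples.
profiles = {
    "lncA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.0],
    "geneC": [1.0, -1.0, 1.0, -1.0],
}
edges = coexpression_edges(profiles)
```

In a network built this way, an unannotated lncRNA linked to well-annotated coding genes could be assigned a probable function from its neighbors, which is the general idea behind the topological predictions the abstract mentions.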
Robust estimation of bacterial cell count from optical density
Optical density (OD) is widely used to estimate the density of cells in liquid culture, but cannot be compared between instruments without a standardized calibration protocol and is challenging to relate to actual cell count. We address this with an interlaboratory study comparing three simple, low-cost, and highly accessible OD calibration protocols across 244 laboratories, applied to eight strains of constitutive GFP-expressing E. coli. Based on our results, we recommend calibrating OD to estimated cell count using serial dilution of silica microspheres, which produces highly precise calibration (95.5% of residuals <1.2-fold), is easily assessed for quality control, also assesses instrument effective linear range, and can be combined with fluorescence calibration to obtain units of Molecules of Equivalent Fluorescein (MEFL) per cell, allowing direct comparison and data fusion with flow cytometry measurements: in our study, fluorescence per cell measurements showed only a 1.07-fold mean difference between plate reader and flow cytometry data.
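The calibration idea can be sketched as fitting a single conversion factor from OD to particle count using a serial dilution with known particle numbers. This is a simplified illustration, not the study's protocol. It assumes blank-subtracted OD readings that all fall within the instrument's linear range, a proportional model fitted by least squares through the origin, and invented stock counts and readings.

```python
def serial_dilution_counts(stock_count, steps, factor=2.0):
    """Known particle counts for each well of a serial dilution."""
    return [stock_count / factor ** i for i in range(steps)]


def fit_particles_per_od(od_values, particle_counts):
    """Least-squares slope through the origin: count ~= k * OD.

    Assumes od_values are blank-subtracted and within the linear range.
    """
    numerator = sum(o * p for o, p in zip(od_values, particle_counts))
    denominator = sum(o * o for o in od_values)
    if denominator == 0:
        raise ValueError("all OD readings are zero; cannot calibrate")
    return numerator / denominator


def estimate_cell_count(od, particles_per_od):
    """Convert a blank-subtracted OD reading to an estimated count."""
    return od * particles_per_od


# Hypothetical stock of 8e8 microspheres, diluted 2-fold over four wells.
counts = serial_dilution_counts(8e8, steps=4)
od_readings = [0.8, 0.4, 0.2, 0.1]
k = fit_particles_per_od(od_readings, counts)
cells = estimate_cell_count(0.5, k)
```

Residuals between the fitted line and the measured wells give a simple quality-control check, and wells that deviate at the high or low end indicate where the instrument leaves its linear range.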