97 research outputs found
Local causal effects with continuous exposures: A matching estimator for the average causal derivative effect
The estimation of causal effects is a fundamental goal in the field of causal
inference. However, it is challenging for various reasons. One reason is that
the exposure (or treatment) is naturally continuous in many real-world
scenarios. When dealing with continuous exposure, dichotomizing the exposure
variable based on a pre-defined threshold may result in a biased understanding
of causal relationships. In this paper, we propose a novel causal inference
framework that can measure the causal effect of continuous exposure. We define
the expectation of a derivative of potential outcomes at a specific exposure
level as the average causal derivative effect. Additionally, we propose a
matching estimator for this quantity and a permutation approach to test
the hypothesis of no local causal effect. We also investigate the asymptotic
properties of the proposed estimator and examine its performance through
simulation studies. Finally, we apply this causal framework in a real data
example of Chronic Obstructive Pulmonary Disease (COPD) patients.
Comment: 27 pages, 2 figures, 4 tables. Supplementary materials are available.
The R files are available at
https://github.com/suhwanbong121/average_causal_derivative_effec
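As a rough illustration of the idea (a hypothetical Python sketch, not the authors' R implementation; the function name, the matching rule, and the minimum-gap heuristic are my assumptions), the derivative at an exposure level t0 can be approximated by averaging finite-difference slopes between covariate-matched units whose exposures differ:

```python
import numpy as np

def matched_derivative_estimate(X, T, Y, t0, bandwidth=0.5, min_gap=0.2):
    """Average finite-difference slopes between matched pairs near t0."""
    near = np.where(np.abs(T - t0) < bandwidth)[0]
    slopes = []
    for i in near:
        # candidate matches: units near t0 whose exposure differs enough
        # from unit i's to give a stable difference quotient
        cand = near[np.abs(T[near] - T[i]) >= min_gap]
        if len(cand) == 0:
            continue
        # match on covariates: pick the candidate closest to unit i in X
        j = cand[np.argmin(np.linalg.norm(X[cand] - X[i], axis=1))]
        slopes.append((Y[j] - Y[i]) / (T[j] - T[i]))
    return float(np.mean(slopes))

rng = np.random.default_rng(0)
n = 2000
X = rng.normal(size=(n, 2))
T = rng.uniform(0.0, 4.0, size=n)                        # continuous exposure
Y = 2.0 * T + X[:, 0] + rng.normal(scale=0.1, size=n)    # true derivative is 2
print(round(matched_derivative_estimate(X, T, Y, t0=2.0), 2))
```

On this synthetic data the estimate should land close to the true derivative of 2; the permutation test and asymptotic analysis from the abstract are not reproduced here.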
Interpretable Process Outcome Prediction Using Association Rule Mining
Department of Industrial Engineering
The development of models for process outcome prediction using event logs has evolved with a clear focus on performance improvement. In this thesis we take a different perspective, focusing on obtaining interpretable predictive models for outcome prediction. In particular, we propose a method based on association rule mining, which results in inherently interpretable classification models. While association rule mining has been used with event logs for process model approximation and anomaly detection in the past, its application to outcome-based predictive models is novel. The proposed method defines how to pre-process logs, obtain the rules, prune the rules to a limited number that can be handled by human decision makers, and use the rules to predict process outcomes. The experimental results on real-world event logs show that in most cases the performance of the proposed method is aligned with that of traditional approaches, with only a slight decrease in some cases. We argue that such a decrease in performance is an acceptable trade-off in return for a predictive model that is interpretable by design.
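The rule-mining step can be sketched in a few lines (a minimal hypothetical example, not the thesis' actual pipeline; `mine_outcome_rules`, its thresholds, and the toy event log are illustrative): for small sets of activities, compute support (fraction of traces containing the set) and confidence (fraction of those traces with a positive outcome), and keep rules that clear both thresholds.

```python
from itertools import combinations

def mine_outcome_rules(traces, outcomes, min_support=0.3, min_confidence=0.8):
    """Keep activity sets whose presence in a trace predicts a positive
    outcome with high support and confidence (illustrative sketch)."""
    n = len(traces)
    rules = []
    for size in (1, 2):  # small rules stay readable for decision makers
        candidates = set()
        for trace in traces:
            candidates.update(combinations(sorted(set(trace)), size))
        for items in sorted(candidates):
            covered = [o for t, o in zip(traces, outcomes) if set(items) <= set(t)]
            if covered and len(covered) / n >= min_support:
                confidence = sum(covered) / len(covered)
                if confidence >= min_confidence:
                    rules.append((items, len(covered) / n, confidence))
    return rules

traces = [["register", "check", "approve"],
          ["register", "check", "reject"],
          ["register", "approve"],
          ["register", "check", "approve"]]
outcomes = [1, 0, 1, 1]  # 1 = positive process outcome
rules = mine_outcome_rules(traces, outcomes)
print(("approve",) in [r[0] for r in rules])  # → True
```

Pruning to "a limited number that can be handled by human decision makers" would then amount to ranking these rules and keeping the top few.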
Domain Alignment and Temporal Aggregation for Unsupervised Video Object Segmentation
Unsupervised video object segmentation aims at detecting and segmenting the
most salient object in videos. In recent times, two-stream approaches that
collaboratively leverage appearance cues and motion cues have attracted
extensive attention thanks to their powerful performance. However, there are
two limitations faced by those methods: 1) the domain gap between appearance
and motion information is not well considered; and 2) long-term temporal
coherence within a video sequence is not exploited. To overcome these
limitations, we propose a domain alignment module (DAM) and a temporal
aggregation module (TAM). DAM resolves the domain gap between two modalities by
forcing the values to be in the same range using a cross-correlation mechanism.
TAM captures long-term coherence by extracting and leveraging global cues of a
video. On public benchmark datasets, our proposed approach demonstrates its
effectiveness, outperforming all existing methods by a substantial margin.
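The range-alignment idea behind DAM can be illustrated loosely (an assumption-laden sketch, not the paper's module; `domain_align` and the standardization choice are mine): bring both modalities to a common value range, then measure channel-wise cross-correlation between the aligned feature maps.

```python
import numpy as np

def domain_align(appearance, motion):
    """Standardize each modality so its values lie in a common range,
    then compute channel-wise cross-correlation (illustrative sketch)."""
    def standardize(f):  # per-channel zero mean, unit variance
        return (f - f.mean(axis=-1, keepdims=True)) / (f.std(axis=-1, keepdims=True) + 1e-8)
    a, m = standardize(appearance), standardize(motion)  # shape (C, H*W)
    corr = a @ m.T / a.shape[-1]                         # (C, C) cross-correlation
    return a, m, corr

rng = np.random.default_rng(1)
app = rng.normal(size=(8, 64))
mot = app * 3.0 + 1.0  # same structure, different scale/offset (domain gap)
a, m, corr = domain_align(app, mot)
print(np.allclose(a, m))                 # → True: gap removed by standardization
print(np.allclose(np.diag(corr), 1.0))   # → True: matching channels correlate
```

A learned module would replace the fixed standardization with trainable parameters, but the sketch shows why putting both streams in the same range makes their correlation meaningful.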
Tsanet: Temporal and Scale Alignment for Unsupervised Video Object Segmentation
Unsupervised Video Object Segmentation (UVOS) refers to the challenging task
of segmenting the prominent object in videos without manual guidance. Recent
UVOS methods fall into two categories, appearance-based and
appearance-motion-based, each with its own limitations. Appearance-based
methods do not consider the motion of the target object because they exploit
correlation information between randomly paired frames. Appearance-motion-based
methods are dominated by their dependency on optical flow, since they fuse
appearance with motion. In this
paper, we propose a novel framework for UVOS that can address the
aforementioned limitations of the two approaches in terms of both time and
scale. Temporal Alignment Fusion aligns the saliency information of adjacent
frames with the target frame to leverage the information of adjacent frames.
Scale Alignment Decoder predicts the target object mask by aggregating
multi-scale feature maps via continuous mapping with implicit neural
representation. We present experimental results on public benchmark datasets,
DAVIS 2016 and FBMS, which demonstrate the effectiveness of our method.
Furthermore, we outperform the state-of-the-art methods on DAVIS 2016.
Comment: Accepted to ICIP 202
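The continuous mapping behind the Scale Alignment Decoder can be hinted at with plain bilinear interpolation at real-valued coordinates (an illustrative stand-in: the paper's implicit neural representation is learned, and `query_continuous` is a name I introduce). Querying features at arbitrary continuous positions is what lets a decoder predict masks at any scale.

```python
import numpy as np

def query_continuous(feat, y, x):
    """Bilinearly interpolate a 2D feature map at real-valued coordinates,
    so values can be read off at any continuous position (sketch)."""
    y0, x0 = int(np.floor(y)), int(np.floor(x))
    y1 = min(y0 + 1, feat.shape[0] - 1)
    x1 = min(x0 + 1, feat.shape[1] - 1)
    wy, wx = y - y0, x - x0
    return ((1 - wy) * (1 - wx) * feat[y0, x0] + (1 - wy) * wx * feat[y0, x1]
            + wy * (1 - wx) * feat[y1, x0] + wy * wx * feat[y1, x1])

feat = np.array([[0.0, 1.0],
                 [2.0, 3.0]])
print(query_continuous(feat, 0.5, 0.5))  # → 1.5 (average of the four corners)
```

An implicit decoder would feed such interpolated features, together with the coordinate, through a small MLP instead of returning them directly.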
Treating Motion as Option with Output Selection for Unsupervised Video Object Segmentation
Unsupervised video object segmentation (VOS) is a task that aims to detect
the most salient object in a video without external guidance about the object.
To leverage the property that salient objects usually have distinctive
movements compared to the background, recent methods collaboratively use motion
cues extracted from optical flow maps with appearance cues extracted from RGB
images. However, as optical flow maps usually correlate strongly with
segmentation masks, the network easily becomes overly dependent on the motion
cues during training. As a result, such two-stream approaches are vulnerable to
confusing motion cues, which makes their predictions unstable. To relieve this
issue, we design a novel motion-as-option network by treating motion cues as
optional. During network training, RGB images are randomly provided to the
motion encoder instead of optical flow maps, to implicitly reduce motion
dependency of the network. As the learned motion encoder can deal with both RGB
images and optical flow maps, two different predictions can be generated
depending on which source information is used as motion input. In order to
fully exploit this property, we also propose an adaptive output selection
algorithm to adopt the optimal prediction result at test time. Our proposed
approach achieves state-of-the-art performance on all public benchmark
datasets while maintaining real-time inference speed.
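The core training trick, randomly substituting the RGB frame for the optical flow map at the motion encoder's input, can be sketched in a few framework-agnostic lines (hypothetical; `pick_motion_input` and the 0.5 probability are my assumptions):

```python
import random

def pick_motion_input(rgb_frame, flow_map, p_rgb=0.5, rng=random):
    """With probability p_rgb, feed the RGB frame to the motion encoder
    instead of the optical flow map, so the encoder cannot assume flow
    is always present (sketch of the motion-as-option idea)."""
    return rgb_frame if rng.random() < p_rgb else flow_map

rng = random.Random(0)
batch = [pick_motion_input("rgb", "flow", p_rgb=0.5, rng=rng) for _ in range(1000)]
print(0.4 < batch.count("rgb") / 1000 < 0.6)  # → True: roughly half use RGB
```

At test time the same encoder can then be run twice, once on RGB and once on flow, and the adaptive output selection step picks whichever prediction looks more reliable.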
Global-Local Aggregation with Deformable Point Sampling for Camouflaged Object Detection
The camouflaged object detection (COD) task aims to find and segment objects
that have a color or texture that is very similar to that of the background.
Despite the difficulties of the task, COD is attracting attention in medical,
lifesaving, and anti-military fields. To overcome the difficulties of COD, we
propose a novel global-local aggregation architecture with a deformable point
sampling method. Further, we propose a global-local aggregation transformer
that integrates an object's global information, background, and boundary local
information, which is important in COD tasks. The proposed transformer obtains
global information from feature channels and effectively extracts important
local information from the subdivided patch using the deformable point sampling
method. Accordingly, the model effectively integrates global and local
information for camouflaged objects and also shows that important boundary
information in COD can be efficiently utilized. Our method is evaluated on
three popular datasets and achieves state-of-the-art performance. We prove the
effectiveness of the proposed method through comparative experiments.
Leveraging Spatio-Temporal Dependency for Skeleton-Based Action Recognition
Skeleton-based action recognition has attracted considerable attention due to
the compact skeletal representation of the human body. Many recent methods have
achieved remarkable performance using graph convolutional networks (GCNs) and
convolutional neural networks (CNNs), which extract spatial and temporal
features, respectively. Although spatial and temporal dependencies in the human
skeleton have been explored, spatio-temporal dependency is rarely considered.
In this paper, we propose the Inter-Frame Curve Network (IFC-Net) to
effectively leverage the spatio-temporal dependency of the human skeleton. Our
proposed network consists of two novel elements: 1) The Inter-Frame Curve (IFC)
module; and 2) Dilated Graph Convolution (D-GC). The IFC module increases the
spatio-temporal receptive field by identifying meaningful node connections
between every adjacent frame and generating spatio-temporal curves based on the
identified node connections. The D-GC allows the network to have a large
spatial receptive field, which specifically focuses on the spatial domain. The
kernels of D-GC are computed from the given adjacency matrices of the graph and
reflect a large receptive field in a way similar to dilated CNNs. Our IFC-Net
combines these two modules and achieves state-of-the-art performance on three
skeleton-based action recognition benchmarks: NTU-RGB+D 60, NTU-RGB+D 120, and
Northwestern-UCLA.
Comment: 12 pages, 5 figures
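One plausible reading of how dilated graph kernels follow from adjacency matrices (my sketch under assumptions, not the paper's exact D-GC) is to connect nodes at an exact hop distance d, mirroring the gaps a dilated CNN kernel skips:

```python
import numpy as np

def dilated_adjacency(A, d):
    """Connect nodes whose shortest-path distance is exactly d, the graph
    analogue of the spacing in a dilated convolution kernel (sketch)."""
    n = len(A)
    within_d = np.linalg.matrix_power(A + np.eye(n), d) > 0        # <= d hops
    within_dm1 = np.linalg.matrix_power(A + np.eye(n), d - 1) > 0  # <= d-1 hops
    return (within_d & ~within_dm1).astype(int)                    # exactly d hops

# path graph 0-1-2-3: with dilation 2, node 0 connects only to node 2
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]])
print(dilated_adjacency(A, 2)[0].tolist())  # → [0, 0, 1, 0]
```

A graph convolution using this matrix in place of A would then aggregate from 2-hop neighbors, enlarging the spatial receptive field without extra layers.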
Occluded Person Re-Identification via Relational Adaptive Feature Correction Learning
Occluded person re-identification (Re-ID) in images captured by multiple
cameras is challenging because the target person is occluded by pedestrians or
objects, especially in crowded scenes. In addition to the processes performed
during holistic person Re-ID, occluded person Re-ID involves the removal of
obstacles and the detection of partially visible body parts. Most existing
methods utilize the off-the-shelf pose or parsing networks as pseudo labels,
which are prone to error. To address these issues, we propose a novel Occlusion
Correction Network (OCNet) that corrects features through relational-weight
learning and obtains diverse and representative features without using external
networks. In addition, we present a simple concept of a center feature in order
to provide an intuitive solution to pedestrian occlusion scenarios.
Furthermore, we suggest the idea of Separation Loss (SL) for focusing on
different parts between global features and part features. We conduct extensive
experiments on five challenging benchmark datasets for occluded and holistic
Re-ID tasks to demonstrate that our method achieves superior performance to
state-of-the-art methods, especially on occluded scenes.
Comment: ICASSP 202
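One plausible form of such a separation objective (a guess for illustration; the paper's SL may be defined differently, and `separation_loss` is a name I introduce) penalizes cosine similarity between the global feature and each part feature, so parts are pushed to encode information the global vector does not already hold:

```python
import numpy as np

def separation_loss(global_feat, part_feats):
    """Mean squared cosine similarity between the global feature and each
    part feature; zero when every part is orthogonal to the global vector
    (illustrative sketch of a separation-style loss)."""
    g = global_feat / np.linalg.norm(global_feat)
    sims = [g @ (p / np.linalg.norm(p)) for p in part_feats]
    return float(np.mean(np.square(sims)))

g = np.array([1.0, 0.0, 0.0])
orthogonal_parts = [np.array([0.0, 1.0, 0.0]), np.array([0.0, 0.0, 1.0])]
aligned_parts = [np.array([1.0, 0.0, 0.0]), np.array([2.0, 0.0, 0.0])]
print(separation_loss(g, orthogonal_parts))  # → 0.0
print(separation_loss(g, aligned_parts))     # → 1.0
```

Minimizing this term alongside the usual Re-ID losses would drive part features apart from the global feature, matching the stated goal of focusing on different parts.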