18 research outputs found
The Impact of Beam Variations on Power Spectrum Estimation for 21 cm Cosmology II: Mitigation of Foreground Systematics for HERA
One key challenge in detecting the 21 cm cosmological signal at z > 6 is
separating it from foreground emission. This separation can be studied
in power spectrum space, where the foreground is confined to low delay modes
whereas the cosmological signal spreads out to high delay modes. When there
are calibration errors, however, the chromaticity of the gain errors propagates
into the power spectrum estimate and contaminates the modes needed for a
cosmological detection.
The Hydrogen Epoch of Reionization Array (HERA) employs a high-precision
calibration scheme that exploits redundancy in its measurements. In this study,
we focus on the gain errors induced by nonredundancies arising from feed offsets
relative to HERA's 14 m parabolic dish elements, and investigate how to mitigate
these chromatic gain errors using three different methods: restricting the
baseline lengths used for calibration, smoothing the antenna gains, and applying a temporal
filter prior to calibration. With 2 cm translation and 2 degree tilt
perturbations, a level achievable under normal HERA operating
conditions, the combination of the baseline cut and temporal filtering
significantly reduces the spurious gain features due to nonredundancies,
and the power spectrum recovers a clean
foreground-free region. We find that the mitigation techniques work even for
large feed motions, but to keep the calibration process stable, the feed
positions need to be constrained to within 2 cm for translation and 2 degrees
for tilt offsets relative to the dish's vertex.
Comment: Accepted for publication in Ap
Direct Optimal Mapping Image Power Spectrum and its Window Functions
The key to detecting neutral hydrogen during the epoch of reionization (EoR)
is to separate the cosmological signal from the dominating foreground
radiation. We developed direct optimal mapping (Xu et al. 2022) to map
interferometric visibilities; it contains only linear operations, with full
knowledge of point spread functions from visibilities to images. Here we
present an FFT-based image power spectrum and its window functions based on
direct optimal mapping. We use a noiseless simulation, based on the Hydrogen
Epoch of Reionization Array (HERA) Phase I configuration, to study the image
power spectrum properties. The window functions show power leakage
from the foreground-dominated region into the EoR window; the 2D and 1D power
spectra also verify the separation between the foregrounds and the EoR.
Furthermore, we simulated visibilities from a -complete array and
calculated its image power spectrum. The result shows that the foreground--EoR
leakage is further suppressed below , dominated by the tapering
function sidelobes; the 2D power spectrum does not show signs of the horizon
wedge. The -complete result provides a reference case for future 21cm
cosmology array designs.
Comment: Submitted to Ap
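As an illustration of what a power-spectrum window function encodes, the toy sketch below builds the mode-mixing matrix of a tapered FFT estimator. The Blackman taper and mode count are assumptions standing in for the tapering function mentioned above; the construction is generic, not the paper's optimal-mapping version.

```python
import numpy as np

# Toy mode-mixing ("window function") matrix of a tapered FFT power spectrum
# estimator. Row k shows how much each true Fourier mode k' leaks into the
# band power estimated at k; the taper's sidelobes set the leakage floor.
n = 64
taper = np.blackman(n)
T = np.fft.fft(taper)                               # taper kernel in mode space
W = np.array([np.abs(np.roll(T, k)) ** 2 for k in range(n)])
W /= W.sum(axis=1, keepdims=True)                   # normalize each window to unity
# Each window peaks on its own mode, with sidelobe leakage into the rest.
```

Plotting one row of `W` against mode separation shows the familiar mainlobe-plus-sidelobe leakage pattern that determines how foreground power spills into the EoR window.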
Direct Optimal Mapping for 21cm Cosmology: A Demonstration with the Hydrogen Epoch of Reionization Array
Motivated by the desire for wide-field images with well-defined statistical
properties for 21cm cosmology, we implement an optimal mapping pipeline that
computes a maximum likelihood estimator for the sky using the interferometric
measurement equation. We demonstrate this direct optimal mapping with data from
the Hydrogen Epoch of Reionization Array (HERA) Phase I observations. After
validating the pipeline with simulated data, we develop a maximum likelihood
figure-of-merit for comparing four sky models at 166 MHz with a bandwidth of
100 kHz. The HERA data agree with the GLEAM catalogs to <10%. After subtracting
the GLEAM point sources, the HERA data discriminate between the different
continuum sky models, providing most support for the model of Byrne et al.
2021. We report the computational cost of mapping the HERA Phase I data and
project the cost for the full HERA 320-antenna array; both are feasible with a
modern server. The algorithm is broadly applicable to other interferometers and
is valid for wide-field and non-coplanar arrays.
Comment: 16 pages, 10 figures, 2 tables, published in Ap
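The maximum likelihood estimator named above has the standard closed form m̂ = (AᴴN⁻¹A)⁻¹AᴴN⁻¹v, where A maps sky pixels to visibilities and N is the noise covariance. The toy sketch below applies it to a random, well-conditioned measurement matrix; all dimensions and values are illustrative, not HERA's.

```python
import numpy as np

# Toy sketch of the maximum likelihood map estimator behind direct optimal
# mapping: m_hat = (A^H N^-1 A)^-1 A^H N^-1 v. Dimensions and values are
# illustrative stand-ins, not the HERA configuration.
rng = np.random.default_rng(0)
n_vis, n_pix = 50, 10
A = rng.normal(size=(n_vis, n_pix)) + 1j * rng.normal(size=(n_vis, n_pix))
m_true = rng.normal(size=n_pix)                        # real-valued toy sky
N_inv = np.diag(1.0 / rng.uniform(0.5, 2.0, n_vis))    # inverse noise covariance
v = A @ m_true                                         # noiseless simulated visibilities

normal_matrix = A.conj().T @ N_inv @ A                 # A^H N^-1 A
m_hat = np.linalg.solve(normal_matrix, A.conj().T @ N_inv @ v)
# With noiseless data the estimator recovers the toy sky exactly.
```

Because every step is a linear operation on the visibilities, the point spread functions (the rows of the implied visibility-to-image map) are known exactly, which is the property the abstract highlights.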
Classifying apartment defect repair tasks in South Korea: a machine learning approach
Managing building defects in the residential environment is an important social issue in South Korea. Therefore, most South Korean construction companies devote substantial human resources and economic costs to managing such defects. This paper proposes a machine learning approach for investigating whether a specific defect can be autonomously categorized into one of the categories of repair tasks. To this end, we employed a dataset of 310,044 defect cases (from 656,266 validated cases of 717,550 total collected cases). Three machine learning classifiers (support vector machine, random forest, and logistic regression) with three word embedding methods (bag-of-words, term frequency-inverse document frequency, and Word2Vec) were employed for the classification tasks. The best results showed more than 99% accuracy, precision, recall, and F1-score for the random forest classifier with the Word2Vec embedding. Finally, based on these findings, the implications and limitations of this study are discussed. Notably, the findings of this research can improve the defect management effectiveness of the apartment construction industry in South Korea. Moreover, to contribute to future research, we have made the dataset publicly available.
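As a toy illustration of one embedding-plus-classifier combination, the sketch below pairs a hand-rolled TF-IDF embedding with a nearest-centroid classifier in pure Python. The defect texts, labels, and the nearest-centroid choice are invented for illustration; the study's classifiers were SVM, random forest, and logistic regression on real defect records.

```python
import math
from collections import Counter

# Invented toy corpus: defect descriptions labeled with a repair task.
docs = [
    ("crack in bathroom wall tile", "tiling"),
    ("tile grout crack near sink", "tiling"),
    ("paint peeling on bedroom wall", "painting"),
    ("wall paint stain and peeling", "painting"),
]
vocab = sorted({w for text, _ in docs for w in text.split()})

def tfidf(text):
    """L2-normalized TF-IDF vector of `text` over the corpus vocabulary."""
    tf = Counter(text.split())
    vec = []
    for w in vocab:
        df = sum(w in t.split() for t, _ in docs)
        idf = math.log((1 + len(docs)) / (1 + df)) + 1
        vec.append(tf[w] * idf)
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]

# Mean TF-IDF vector per repair task.
centroids = {
    label: [sum(col) / len(col)
            for col in zip(*(tfidf(t) for t, lab in docs if lab == label))]
    for label in {lab for _, lab in docs}
}

def predict(text):
    """Assign `text` to the repair task whose centroid scores highest."""
    v = tfidf(text)
    return max(centroids, key=lambda lab: sum(a * b for a, b in zip(v, centroids[lab])))
```

A production pipeline would swap the centroid step for one of the paper's classifiers (e.g. scikit-learn's `RandomForestClassifier`) trained on the full 310,044-case dataset.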
Deep learning model based on expectation-confirmation theory to predict customer satisfaction in hospitality service
Customer satisfaction is one of the most important measures in the hospitality
industry. Therefore, several psychological and cognitive theories have been utilized
to provide appropriate explanations of customer perception. Owing to recent rapid
developments in artificial intelligence and big data, novel methodologies have been presented to examine several psychological theories applied in the hospitality industry. Within this framework, this study combines deep learning techniques with
expectation-confirmation theory to elucidate customer satisfaction in hospitality
services. Customer hotel review comments, hotel information, and images were
employed to predict customer satisfaction with hotel service. The results show that
the proposed fused model achieved an accuracy of 83.54%. In addition, the recall
for predicting dissatisfaction improved from 16.46% to 33.41%. Based on the findings of this study, both academic and managerial implications for the hospitality
industry are presented.
Impacts of Soil Properties, Topography, and Environmental Features on Soil Water Holding Capacities (SWHCs) and Their Interrelationships
Soil water holding capacities (SWHCs) are among the most important factors for understanding the water cycle in forested catchments because they control the plant-available water that supports evapotranspiration. The direct determination of SWHCs, however, is time consuming and expensive, so many pedotransfer functions (PTFs) and digital soil mapping (DSM) models have been developed for predicting SWHCs. It is therefore important to select the correct soil properties, topographies, and environmental features when developing a prediction model, and to understand the interrelationships among variables. In this study, we collected soil samples at 971 forest sites and developed PTF and DSM models for predicting three kinds of SWHCs: saturated water content (θS) and water content at pF1.8 and pF2.7 (θ1.8 and θ2.7). Important explanatory variables for SWHC prediction were selected from two variable importance analyses. Correlation matrices and sensitivity analyses based on the developed models showed that, as the matric suction changed, the soil physical and chemical properties that influence the SWHCs changed: soil structure rather than soil particle distribution at θS, coarse soil particles at θ1.8, and finer soil particles at θ2.7. In addition, organic matter had a considerable influence on all SWHCs. Among the topographic features, elevation was the most influential, and it was closely related to the geological variability of bedrock and soil properties. Aspect was highly related to vegetation, confirming that it is an important variable for DSM modeling. Information about important variables and their interrelationships can be used to strengthen PTF and DSM models in future research.
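One common way to perform the kind of variable importance analysis mentioned above is permutation importance: permute one feature and measure how much the model's error grows. The sketch below applies this to a synthetic linear "PTF"; the data, the model, and the reading of feature 0 as a strong predictor (think: organic matter) are assumptions for illustration.

```python
import numpy as np

# Permutation variable importance on a synthetic linear "PTF".
# y depends strongly on feature 0, weakly on feature 1, not at all on 2.
rng = np.random.default_rng(1)
X = rng.normal(size=(500, 3))
y = 3.0 * X[:, 0] + 0.3 * X[:, 1] + rng.normal(scale=0.1, size=500)

coef, *_ = np.linalg.lstsq(X, y, rcond=None)      # fitted linear model

def mse(Xm):
    return float(np.mean((Xm @ coef - y) ** 2))

base = mse(X)
importance = []
for j in range(3):
    Xp = X.copy()
    Xp[:, j] = rng.permutation(Xp[:, j])          # break feature j's link to y
    importance.append(mse(Xp) - base)             # error increase = importance
# Feature 0 dominates; the irrelevant feature's importance is near zero.
```

The same recipe works unchanged for the tree-based models typically used in DSM, which is why it is a popular model-agnostic importance measure.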
A deep hybrid learning model for customer repurchase behavior
Smartphones have become an integral part of our daily lives, which has led to the rapid growth of the smartphone market. As the global smartphone market tends to remain stable, retaining existing customers has become a challenge for smartphone manufacturers. This study investigates whether a deep hybrid learning approach with various customer-oriented types of data can be useful in exploring customer repurchase behavior for same-brand smartphones. Considering data from more than 74,000 customers, the proposed deep learning approach showed a prediction accuracy higher than 90%. Based on the results of the deep hybrid learning models, we aim to provide a better understanding of customer behavior, which could serve as a valuable asset for innovating future marketing strategies.
Simple Optimal Sampling Algorithm to Strengthen Digital Soil Mapping Using the Spatial Distribution of Machine Learning Predictive Uncertainty: A Case Study for Field Capacity Prediction
Machine learning models are now capable of delivering coveted digital soil mapping (DSM) benefits (e.g., field capacity (FC) prediction); therefore, determining the optimal sample sites and sample size is essential to maximize the training efficacy. We solve this with a novel optimal sampling algorithm that allows the authentic augmentation of insufficient soil features using machine learning predictive uncertainty. Nine hundred and fifty-three forest soil samples and geographically referenced forest information were used to develop predictive models, and FCs in South Korea were estimated with six predictor set hierarchies. Random forest and gradient boosting models were used for estimation since tree-based models had better predictive performance than other machine learning algorithms. There was a significant relationship between model predictive uncertainties and training data distribution, with higher uncertainties distributed in areas of data scarcity. Further, we confirmed that the predictive uncertainties decreased when additional sample sites were added to the training data. Environmental covariate information for each grid cell in South Korea was then used to select the sampling sites. Optimal sites were located at the cells with the highest predictive uncertainty, and the sample size was determined using the predictable rate. This intuitive method can be generalized to improve global DSM.
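The uncertainty-guided site selection described above can be sketched with a bootstrap ensemble: the spread of ensemble predictions serves as the predictive uncertainty, and the candidate cell where it peaks is proposed as the next sample site. Everything below (the 1-D domain, cubic fits, and ensemble size) is an illustrative assumption, not the paper's setup.

```python
import numpy as np

# Uncertainty-guided sampling sketch: existing samples cover only part of
# the domain; a bootstrap ensemble's prediction spread is highest where
# training data are scarce, and that cell becomes the next sample site.
rng = np.random.default_rng(2)
X_train = rng.uniform(0.0, 0.5, size=40)          # existing sites cover only [0, 0.5]
y_train = np.sin(2 * np.pi * X_train)
x_grid = np.linspace(0.0, 1.0, 101)               # candidate grid cells on [0, 1]

preds = []
for _ in range(30):                               # bootstrap ensemble of cubic fits
    idx = rng.integers(0, X_train.size, X_train.size)
    c = np.polyfit(X_train[idx], y_train[idx], deg=3)
    preds.append(np.polyval(c, x_grid))
uncertainty = np.std(preds, axis=0)               # ensemble spread per cell

next_site = x_grid[np.argmax(uncertainty)]        # most uncertain candidate cell
```

In the paper's setting the ensemble spread would come from the trees of a random forest over environmental covariates per grid cell, but the selection rule, sample where the model is least sure, is the same.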