Target Detection in a Known Number of Intervals Based on Cooperative Search Technique
Finding hidden or lost targets in a broad region costs strenuous effort and takes a long time. From a practical point of view, it is convenient to analyze the available data so as to exclude some parts of the search region. This paper discusses a coordinated search technique for a one-dimensional problem in which the search region consists of several separate intervals; in other words, if the lost target has a probability of existing in a given bounded interval, the probability that it lies in the successive bounded interval is remote. Moreover, the search domain is swept by two searchers moving in opposite directions, leading to three categories of target-distribution truncation: commensurate, uneven, and symmetric. The truncated probability distributions are defined and applied, according to the proposed classification, to calculate the expected value of the time elapsed until the hidden object is found. Furthermore, the optimization of the associated expected time values in the various cases is investigated using Newton's method. Several examples are presented to discuss the behavior of various distributions under each case of truncation, and the associated minimum expected time values are obtained.
Comment: 32 pages, 11 figures
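To make the quantities in this abstract concrete, the following sketch (not the paper's formulation; the interval endpoints and the normal prior are hypothetical) truncates a distribution to a union of separate intervals and numerically minimizes the expected detection time of two searchers leaving a common start point in opposite directions at unit speed. A bounded scalar search from SciPy stands in for the Newton iteration used in the paper.

```python
# A minimal sketch: the target position X follows a normal distribution truncated
# to a union of separate intervals; two searchers start at a common point s and
# sweep in opposite directions at unit speed, so a target at x is reached after
# |x - s| time units.  The start point s is chosen to minimize the expected time.
import numpy as np
from scipy import stats, integrate, optimize

intervals = [(-4.0, -1.0), (1.0, 3.0)]     # hypothetical disjoint search intervals
base = stats.norm(loc=0.0, scale=2.0)      # hypothetical prior on the target location

# Normalizing constant of the distribution truncated to the union of intervals.
mass = sum(base.cdf(b) - base.cdf(a) for a, b in intervals)

def truncated_pdf(x):
    inside = any(a <= x <= b for a, b in intervals)
    return base.pdf(x) / mass if inside else 0.0

def expected_time(s):
    # E|X - s| under the truncated distribution, integrated interval by interval.
    total = 0.0
    for a, b in intervals:
        val, _ = integrate.quad(lambda x: abs(x - s) * truncated_pdf(x), a, b)
        total += val
    return total

# Bounded minimization of the expected detection time over the start point
# (standing in for the Newton-based optimization described in the abstract).
res = optimize.minimize_scalar(expected_time, bounds=(-4.0, 3.0), method="bounded")
print(f"best start point ~ {res.x:.3f}, expected time ~ {res.fun:.3f}")
```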
The Second Competition on Spatial Statistics for Large Datasets
In the last few decades, the size of spatial and spatio-temporal datasets in
many research areas has rapidly increased with the development of data
collection technologies. As a result, classical statistical methods in spatial
statistics are facing computational challenges. For example, the kriging
predictor in geostatistics becomes prohibitive on traditional hardware architectures for large datasets, as it requires high computing power and a large memory footprint to handle large dense matrix operations. Over the years, various approximation methods have been proposed to address such computational issues; however, the community lacks a holistic process to assess their approximation efficiency. To provide a fair assessment, in 2021 we organized the first competition on spatial statistics for large datasets, generated by our ExaGeoStat software, and asked participants to report the results of
estimation and prediction. Thanks to its widely acknowledged success and at the
request of many participants, we organized the second competition in 2022
focusing on predictions for more complex spatial and spatio-temporal processes,
including univariate nonstationary spatial processes, univariate stationary
space-time processes, and bivariate stationary spatial processes. In this
paper, we describe in detail the data generation procedure and make the
valuable datasets publicly available for wider adoption. Then, we review the submitted methods from fourteen teams worldwide, analyze the competition outcomes, and assess the performance of each team.
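As background for the computational bottleneck described above, the following sketch (a generic illustration, not the competition's data-generation setup; the exponential covariance and all parameter values are assumptions) shows why exact kriging requires factorizing a dense n x n covariance matrix, which is precisely the step the approximation methods under comparison try to avoid.

```python
# Minimal exact (simple) kriging sketch with an exponential covariance.  The
# O(n^3) Cholesky factorization of the dense n x n covariance matrix is what
# becomes prohibitive for large n, motivating approximation methods.
import numpy as np

rng = np.random.default_rng(0)
n = 500                                     # small n; competition datasets are far larger
locs = rng.uniform(0.0, 1.0, size=(n, 2))   # observation locations on the unit square
sigma2, beta = 1.0, 0.1                     # assumed variance and range parameters

def exp_cov(a, b):
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    return sigma2 * np.exp(-d / beta)

C = exp_cov(locs, locs) + 1e-8 * np.eye(n)      # dense n x n covariance matrix
L = np.linalg.cholesky(C)                       # the expensive step
z = L @ rng.standard_normal(n)                  # one simulated zero-mean field

# Simple kriging prediction at new locations: c0 @ C^{-1} z via the Cholesky factor.
new_locs = rng.uniform(0.0, 1.0, size=(10, 2))
c0 = exp_cov(new_locs, locs)
weights = np.linalg.solve(L.T, np.linalg.solve(L, z))
pred = c0 @ weights
print(pred.round(3))
```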
Novel Bayesian Techniques for Dose Response Data
All forms of life are exposed to different levels of harmful chemicals that can cause various serious side effects. Toxicological experiments enable researchers to test these chemicals on animals to determine their major effects. Although optimal and acceptable dose levels have been investigated, researchers continue to strive to minimize both side effects and chemical dosages. Dose-response models and other benchmark approaches play a role in determining acceptable exposure levels for hazardous chemicals. Parametric techniques for determining tolerable dosages, based on ANOVA and non-linear regression models, are well represented in the literature. We determined the benchmark-dose tolerable region for multiple chemicals and multiple endpoints using a Bayesian approach. We then considered improving the tolerable region, which contains the safest dosage, by using a sequential Bayesian design. A sequential Bayesian design uses criteria to determine the optimal follow-up experimental design step, given the parametric dose-response model. Using our developed criterion, our goal is to define the tolerable region and, hence, the tolerable dosage that results in the fewest adverse side effects.
The biggest drawback of parametric approaches is the need to specify the "correct" model, which can be difficult depending on the nature of the data. Recently, there has been interest in nonparametric approaches to tolerable dosage estimation, since they do not depend on parametric assumptions or a predefined distribution. We focused on a monotonically decreasing dose-response model, where the response is a percent of control. This imposes two constraints on the nonparametric approach: the dose-response function must be monotonic and always positive. We propose a Bayesian solution to this problem using a novel class of nonparametric models built from new basis functions, the Alamri Monotonic spline (AM-spline). Our approach is illustrated using two simulated datasets and two experimental datasets from pesticide-related research at the US Environmental Protection Agency.
Toxicology experiments that consider the combined effect of multiple chemicals require a higher-dimensional dose-response model. Furthermore, multivariate parametric and nonparametric models have difficulty fitting rough data, which motivated us to extend the AM-spline to this setting. The new model, the Alamri Monotonic K-Dimensional spline (AMKD-spline), is a development of the univariate AM-spline model. Our approach is illustrated using three simulated datasets and one experimental dataset from pesticide-related research at the US Environmental Protection Agency.
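To illustrate the two constraints discussed above (a monotonically decreasing and strictly positive dose-response curve), here is a minimal sketch. It is NOT the AM-spline or AMKD-spline; it uses a simple stand-in, isotonic regression fitted on the log scale via scikit-learn, and the dose-response data are simulated.

```python
# Generic illustration of a monotone-decreasing, strictly positive fit: isotonic
# regression on log(response), so the back-transformed curve is positive and
# non-increasing in dose.  All data below are simulated placeholders.
import numpy as np
from sklearn.isotonic import IsotonicRegression

rng = np.random.default_rng(1)
dose = np.sort(rng.uniform(0.0, 10.0, 60))
true_curve = 100.0 * np.exp(-0.3 * dose)                       # percent-of-control style response
response = true_curve * np.exp(0.1 * rng.standard_normal(60))  # multiplicative noise keeps y > 0

iso = IsotonicRegression(increasing=False)       # enforce a non-increasing fit
log_fit = iso.fit_transform(dose, np.log(response))
fitted = np.exp(log_fit)                         # positive, monotonically decreasing

print(fitted[:5].round(2), fitted[-5:].round(2))
```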
Data analysis for vague contingency data
Abstract The existing Fisher's exact test has been widely applied to investigate whether the difference between observed frequencies is significant. However, it can be applied only when the observed frequencies are in determinate form and carry no vague information. In practice, owing to the complexity of the production process, it is not always possible to obtain observed frequencies in determinate form; therefore, the use of the existing Fisher's exact test may mislead industrial engineers. This paper presents a modification of Fisher's exact test using neutrosophic statistics. The operational process, a simulation study, and an application using production data are given in the paper. From the analysis of industrial data, it can be concluded that the proposed Fisher's exact test performs better than the existing Fisher's exact test.
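A simplified stand-in for the idea above (not the paper's proposed neutrosophic test): when a cell count is only known as an interval [lower, upper], the classical Fisher's exact test can be evaluated at the interval endpoints, yielding a range of p-values instead of a single determinate value. The 2x2 counts below are hypothetical.

```python
# Evaluate the classical Fisher's exact test over all endpoint combinations of
# interval-valued (indeterminate) cell counts to obtain a p-value range.
from itertools import product
from scipy.stats import fisher_exact

# Each cell is (lower, upper); determinate cells have lower == upper.
cells = [(12, 15), (8, 8), (5, 5), (20, 23)]

p_values = []
for a, b, c, d in product(*[(lo, hi) if lo != hi else (lo,) for lo, hi in cells]):
    _, p = fisher_exact([[a, b], [c, d]])
    p_values.append(p)

print(f"p-value ranges from {min(p_values):.4f} to {max(p_values):.4f}")
```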
Novel Analysis between Two-Unit Hot and Cold Standby Redundant Systems with Varied Demand
Decisive applications, such as control systems and aerial navigation, require a standby system to meet stringent safety, availability, and reliability requirements. The paper evaluates the availability, reliability, and other measures of system effectiveness for two stochastic models, treated symmetrically under varying demand: Model 1 (a two-unit cold standby system) and Model 2 (a two-unit hot standby system). In Model 1, the standby unit must be activated before it can begin to function; in Model 2, the standby unit is always operational unless it fails. The current study demonstrates that the hot standby system is more expensive than the cold standby system under two circumstances: a decrease in demand, or the hot standby unit's failure rate exceeding a predetermined threshold. Likewise, if the cold standby system's activation time is at most a certain threshold and both units must be switched on simultaneously to handle the increased demand, the hot standby system will again be more expensive than the cold standby system. The authors used semi-Markov and regenerative point techniques to analyze both models and collected actual data from a cable manufacturing plant to illustrate the findings. Plotting several graphs and obtaining cut-off points makes it easier to choose which standby configuration to employ.
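For orientation, the following is a textbook continuous-time Markov chain sketch of the hot-versus-cold comparison, NOT the paper's semi-Markov/regenerative-point model: activation time, demand variation, and cost are all ignored, and the failure and repair rates are hypothetical. It only illustrates why a cold standby unit, which cannot fail while idle, tends to yield higher steady-state availability.

```python
# States count failed units (0, 1, 2); availability = P(at least one unit up).
import numpy as np

lam, mu = 0.02, 0.5   # assumed per-unit failure and repair rates

def steady_state_availability(hot: bool) -> float:
    # Hot standby: both units can fail while the system is up (rate 2*lam from
    # state 0).  Cold standby: only the operating unit can fail (rate lam).
    r0 = 2 * lam if hot else lam
    Q = np.array([[-r0,        r0,  0.0],
                  [ mu, -(mu+lam),  lam],
                  [0.0,        mu,  -mu]])
    # Solve pi Q = 0 together with sum(pi) = 1.
    A = np.vstack([Q.T, np.ones(3)])
    b = np.array([0.0, 0.0, 0.0, 1.0])
    pi, *_ = np.linalg.lstsq(A, b, rcond=None)
    return pi[0] + pi[1]

print(f"hot standby availability  ~ {steady_state_availability(True):.6f}")
print(f"cold standby availability ~ {steady_state_availability(False):.6f}")
```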
Comparing the efficacy of coefficient of variation control charts using generalized multiple dependent state sampling with various run-rule control charts
Abstract This paper develops a coefficient of variation (CV) control chart utilizing the generalized multiple dependent state (GMDS) sampling approach for CV monitoring. We conducted a comprehensive examination of the designed control chart in comparison to existing control charts based on multiple dependent state (MDS) sampling and the Shewhart-type CV control chart, with a focus on average run lengths. The results were then compared to run-rule control charts available in the existing literature. Additionally, we elucidated the implementation of the proposed control chart through concrete examples and a simulation study. The findings clearly demonstrate that the GMDS sampling control chart achieves significantly superior accuracy in detecting process shifts compared to the MDS sampling control chart. As a result, the control chart approach presented in this paper holds significant potential for applications in the textile and medical industries, particularly when researchers seek to identify minor to moderate shifts in the CV, contributing to enhanced quality control and process monitoring in these domains.
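For readers unfamiliar with CV monitoring, here is a plain Shewhart-type CV chart sketch, NOT the GMDS chart proposed above: the in-control distribution of the subgroup coefficient of variation is approximated by simulation and used to set probability limits. The subgroup size and process parameters are assumptions.

```python
# Set CV control limits from the simulated in-control distribution, then flag
# new subgroups whose sample CV falls outside the limits.
import numpy as np

rng = np.random.default_rng(2)
n, mu, sigma = 5, 50.0, 5.0              # assumed subgroup size and in-control process

def sample_cv(x):
    return x.std(ddof=1) / x.mean()

# Simulate the in-control distribution of the subgroup CV to obtain limits.
sim = np.array([sample_cv(rng.normal(mu, sigma, n)) for _ in range(100_000)])
lcl, ucl = np.quantile(sim, [0.00135, 0.99865])   # ~3-sigma-equivalent probability limits

# Monitor new subgroups with inflated variability: flag out-of-control CVs.
new_data = rng.normal(mu, sigma * 1.5, size=(20, n))
signals = [i for i, cv in enumerate(map(sample_cv, new_data)) if not lcl <= cv <= ucl]
print(f"LCL={lcl:.4f}, UCL={ucl:.4f}, signals at subgroups {signals}")
```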
Classical and Bayesian Inference of a Progressive-Stress Model for the Nadarajah–Haghighi Distribution with Type II Progressive Censoring and Different Loss Functions
Accelerated life testing (ALT) is a time-saving technology used in a variety of fields to obtain failure-time data for test units in a fraction of the time required to test them under normal operating conditions. This study investigated progressive-stress ALT with progressive type II censoring, with the lifetimes of the test units following a Nadarajah–Haghighi (NH) distribution. It is assumed that the scale parameter of the distribution obeys the inverse power law. The maximum likelihood estimates and approximate confidence intervals for the model parameters were obtained first. The Metropolis–Hastings (MH) algorithm was then used to build Bayes estimators under various loss functions, including squared error loss. We also computed the highest posterior density (HPD) credible intervals for the model parameters. Monte Carlo simulations were used to compare the outcomes of the various estimation methods proposed. Finally, one data set was analyzed for validation purposes.
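As a minimal illustration of the Bayesian machinery mentioned above, the sketch below runs a random-walk Metropolis-Hastings sampler for the two NH parameters on complete (uncensored) simulated data with flat priors on the log scale. The paper's setting additionally involves progressive type II censoring, the inverse power law for the scale parameter, and several loss functions, all of which are omitted here; the true parameter values and tuning constants are assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)

def nh_loglik(alpha, lam, t):
    # Nadarajah-Haghighi log-density: f(t) = a*l*(1+l*t)^(a-1) * exp(1-(1+l*t)^a)
    z = 1.0 + lam * t
    return np.sum(np.log(alpha * lam) + (alpha - 1) * np.log(z) + 1.0 - z**alpha)

# Simulated data from assumed true parameters, via inversion of the NH CDF.
alpha_true, lam_true = 1.5, 0.8
u = rng.uniform(size=200)
data = ((1.0 - np.log(1.0 - u)) ** (1.0 / alpha_true) - 1.0) / lam_true

theta = np.log([1.0, 1.0])          # chain state: (log alpha, log lambda)
samples, step = [], 0.1
for _ in range(20_000):
    prop = theta + step * rng.standard_normal(2)
    log_ratio = nh_loglik(*np.exp(prop), data) - nh_loglik(*np.exp(theta), data)
    if np.log(rng.uniform()) < log_ratio:   # accept/reject (flat prior on log scale)
        theta = prop
    samples.append(np.exp(theta))

post = np.array(samples[5_000:])    # discard burn-in
print("posterior means (alpha, lambda):", post.mean(axis=0).round(3))
```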
RDET stacking classifier: a novel machine learning based approach for stroke prediction using imbalance data
The main cause of stroke is the unexpected blockage of blood flow to the brain. Brain cells die if blood is not supplied to them, resulting in disability. The timely identification of medical conditions ensures patients receive the necessary treatments and assistance; this early diagnosis plays a crucial role in managing symptoms effectively and enhancing the overall quality of life of individuals affected by stroke. This research proposes an ensemble machine learning (ML) model that predicts brain stroke while reducing parameters and computational complexity. The dataset was obtained from the open-source website Kaggle, and the total number of participants is 3,254. However, this dataset has a significant class imbalance problem. To address this issue, we utilized the Synthetic Minority Over-sampling Technique (SMOTE) and Adaptive Synthetic Sampling (ADASYN), two oversampling techniques. The primary focus of this study is developing a stacking and voting approach that exhibits exceptional performance. We propose a stacking ensemble classifier that is more accurate and effective in predicting stroke disease, in order to improve the classifier's performance and minimize overfitting. To create a final, stronger classifier, the study used three tree-based ML classifiers. Hyperparameters were used to train and fine-tune the random forest (RF), decision tree (DT), and extra trees classifier (ETC), after which they were combined using a stacking classifier and a k-fold cross-validation technique. The effectiveness of this method is verified using metrics such as accuracy, precision, recall, and F1-score. In addition, we utilized nine ML classifiers with hyperparameter tuning to predict stroke and compared their effectiveness with the proposed approach. The experimental outcomes demonstrated the superior performance of the stacking classification method compared to other approaches: the stacking method achieved a remarkable accuracy of 100% as well as exceptional F1-score, precision, and recall scores. The proposed approach demonstrates a higher rate of accurate predictions compared to previous techniques.
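A hedged sketch of the pipeline described above (the dataset, hyperparameters, and final estimator are placeholders, not the study's exact configuration): SMOTE oversampling followed by a stacking ensemble of random forest, decision tree, and extra trees, evaluated with stratified k-fold cross-validation.

```python
import numpy as np
from imblearn.over_sampling import SMOTE
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, ExtraTreesClassifier, StackingClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score

# Synthetic imbalanced data standing in for the stroke dataset.
X, y = make_classification(n_samples=3000, n_features=10, weights=[0.95, 0.05], random_state=0)
X_res, y_res = SMOTE(random_state=0).fit_resample(X, y)

# Stacking ensemble of the three tree-based learners with a simple meta-learner.
stack = StackingClassifier(
    estimators=[
        ("rf", RandomForestClassifier(n_estimators=200, random_state=0)),
        ("dt", DecisionTreeClassifier(max_depth=8, random_state=0)),
        ("etc", ExtraTreesClassifier(n_estimators=200, random_state=0)),
    ],
    final_estimator=LogisticRegression(max_iter=1000),
    cv=5,
)

cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
scores = cross_val_score(stack, X_res, y_res, cv=cv, scoring="f1")
print(f"cross-validated F1: {scores.mean():.3f} +/- {scores.std():.3f}")
```

In practice, the resampling step is usually applied inside each training fold rather than before the split, to avoid optimistic performance estimates; the sketch simply mirrors the pipeline as described in the abstract.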
ANN: adversarial news net for robust fake news classification
Abstract With easy access to social media platforms, the spread of fake news has become a growing concern. Classifying fake news is essential, as it can help prevent its negative impact on individuals and society. In this regard, an end-to-end framework for fake news detection is developed that utilizes the power of adversarial training to make the model more robust and resilient. The framework is named "ANN: Adversarial News Net". Emoticons are extracted from the datasets to understand their meaning in relation to fake news, and this information is fed into the model, which helps improve its performance in classifying fake news. The performance of the ANN framework is evaluated on four publicly available datasets and is found to outperform baseline methods and previous studies after adversarial training. Experiments show that adversarial training improved accuracy by 2.1% over the Random Forest baseline and 2.4% over the BERT baseline. The proposed framework can be used to detect fake news in real time, thereby mitigating its harmful effects on society.
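To make the adversarial-training idea concrete, here is a hedged PyTorch illustration. The tiny model, batch, and FGSM-style perturbation in embedding space are generic assumptions, not the ANN framework itself: word embeddings are perturbed along the gradient sign of the clean loss, and the classifier is trained on the clean and adversarial losses together.

```python
import torch
import torch.nn as nn

class TinyNewsClassifier(nn.Module):
    def __init__(self, vocab=5000, dim=64, classes=2):
        super().__init__()
        self.embedding = nn.Embedding(vocab, dim)
        self.head = nn.Linear(dim, classes)

    def forward(self, emb):                       # operates on embeddings directly
        return self.head(emb.mean(dim=1))         # mean-pooled bag of embeddings

model = TinyNewsClassifier()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()
eps = 0.05                                        # assumed perturbation budget

tokens = torch.randint(0, 5000, (32, 40))         # placeholder batch of token ids
labels = torch.randint(0, 2, (32,))               # placeholder fake/real labels

emb = model.embedding(tokens)
clean_loss = loss_fn(model(emb), labels)

# FGSM-style adversarial example in embedding space: step along the gradient sign.
emb_grad, = torch.autograd.grad(clean_loss, emb, retain_graph=True)
adv_emb = (emb + eps * emb_grad.sign()).detach()
adv_loss = loss_fn(model(adv_emb), labels)

opt.zero_grad()
(clean_loss + adv_loss).backward()                # train on clean + adversarial loss
opt.step()
```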