    Meta-optimizations for Cluster Analysis

    This dissertation thesis deals with advances in the automation of cluster analysis.

    The Impact of COVID-19 on Packaging Design and Production: A Case Study

    The COVID-19 pandemic was one of the biggest challenges in recent history, affecting all aspects of socio-economic life as well as the ecological environment of our planet. Before the pandemic, the packaging industry had captured the interest of governments because of its commitments regarding design for sustainability. These commitments were intended to reduce single-use plastic packaging, increase the use of recyclable materials, and promote eco-friendly materials. The crisis, however, has negatively affected and changed these priorities. The aim of this study is to provide an overview of the changes in packaging design and production during the COVID-19 pandemic, focusing on a case study in the food packaging industry. A secondary data analysis was conducted to collect information for comparing and evaluating the factors affecting change in this industry, and a case study was conducted for the food packaging sector. The results showed that these companies were able to improvise in the context of the COVID-19 pandemic, and suggest a potential model for solving urgent problems in the new context. Keywords: packaging design, design for sustainability, COVID-19 pandemic, case study analysis

    Addressing the challenges of uncertainty in regression models for high dimensional and heterogeneous data from observational studies

    The lack of replicability in research findings from different scientific disciplines has gained wide attention in the last few years and led to extensive discussions. In this `replication crisis', different types of uncertainty play an important role, which occur at different points of data collection and statistical analysis. Nevertheless, the consequences are often ignored in current research practices with the risk of low credibility and reliability of research findings. For the analysis and the development of solutions to this problem, we define measurement uncertainty, sampling uncertainty, data pre-processing uncertainty, method uncertainty, and model uncertainty, and investigate them in particular in the context of regression analyses. Therefore, we consider data from observational studies with the focus on high dimensionality and heterogeneous variables, which are characteristics of growing importance. High dimensional data, i.e., data with more variables than observations, play an important role in the area of medical research, where large amounts of molecular data (omics data) can be collected with ever decreasing expense and effort. Where several types of omics data are available, we are additionally faced with heterogeneity. Moreover, heterogeneous data can be found in many observational studies, where data originate from different sources, or where variables of different types are collected. This work comprises four contributions with different approaches to this topic and a different focus of investigation. Contribution 1 can be considered as a practical example to illustrate data pre-processing and method uncertainty in the context of prediction and variable selection from high dimensional and heterogeneous data. In the first part of this paper, we introduce the development of priority-Lasso, a hierarchical method for prediction using multi-omics data. 
Priority-Lasso is based on standard Lasso and assumes a pre-specified priority order of blocks of data. The idea is to successively fit Lasso models on these blocks of data and to take the linear predictor from every fit as an offset in the fit of the block with next lowest priority. In the second part, we apply this method in a current study of acute myeloid leukemia (AML) and compare its performance to standard Lasso. We illustrate data pre-processing and method uncertainty, caused by different choices of variable definitions and specifications of settings in the application of the method. These choices result in different effect estimates and thus different prediction performances and selected variables. In the second contribution, we compare method uncertainty with sampling uncertainty in the context of variable selection and ranking of omics biomarkers. For this purpose, we develop a user-friendly and versatile framework. We apply this framework on data from AML patients with high dimensional and heterogeneous characteristics and explore three different scenarios: First, variable selection in multivariable regression based on multi-omics data, second, variable ranking based on variable importance measures from random forests, and, third, identification of genes based on differential gene expression analysis. In contributions 3 and 4, we apply the vibration of effects framework, which was initially used to analyze model uncertainty in a large epidemiological study (NHANES), to assess and compare different types of uncertainty. The two contributions intensively address the methodological extension of this framework to different types of uncertainty. In contribution 3, we describe the extension of the vibration of effects framework to sampling and data pre-processing uncertainty. 
As a practical illustration, we take a large data set from psychological research with heterogeneous variable structure (SAPA-project), and examine sampling, model and data pre-processing uncertainty in the context of logistic regression for varying sample sizes. Beyond the comparison of single types of uncertainty, we introduce a strategy which allows quantifying cumulative model and data pre-processing uncertainty and analyzing their relative contributions to the total uncertainty with a variance decomposition. Finally, we extend the vibration of effects framework to measurement uncertainty in contribution 4. In a practical example, we conduct a comparison study between sampling, model and measurement uncertainty on the NHANES data set in the context of survival analysis. We focus on different scenarios of measurement uncertainty which differ in the choice of variables considered to be measured with error. Moreover, we analyze the behavior of different types of uncertainty with increasing sample sizes in a large simulation study
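
The block-wise fitting idea behind priority-Lasso described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes a Gaussian response (so that using the previous linear predictor as an offset is the same as fitting the residual), a single fixed penalty `lam` for every block, no intercept, and no cross-validation. The names `soft_threshold`, `lasso_cd`, and `priority_lasso` are hypothetical.

```python
def soft_threshold(z, t):
    """Soft-thresholding operator used in the Lasso coordinate update."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def lasso_cd(X, target, lam, n_sweeps=200):
    """Coordinate descent for (1/2n)*||target - X b||^2 + lam*||b||_1, no intercept."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(target)  # residual when all coefficients are zero
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_sweeps):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Correlation of column j with the partial residual (column j added back).
            rho = sum(X[i][j] * resid[i] for i in range(n)) / n + col_sq[j] * beta[j]
            new_bj = soft_threshold(rho, lam) / col_sq[j]
            delta = new_bj - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                beta[j] = new_bj
    return beta


def priority_lasso(blocks, y, lam):
    """Fit Lasso on each block in priority order, passing the linear predictor on as offset."""
    n = len(y)
    offset = [0.0] * n
    coefs = []
    for X in blocks:
        # Assumption: Gaussian response, so an offset equals fitting y - offset.
        target = [y[i] - offset[i] for i in range(n)]
        beta = lasso_cd(X, target, lam)
        coefs.append(beta)
        offset = [offset[i] + sum(X[i][j] * beta[j] for j in range(len(beta)))
                  for i in range(n)]
    return coefs, offset


# Toy data (made up): the high-priority block explains y exactly.
block_high = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 1.0]]
block_low = [[1.0], [2.0], [0.0], [1.0]]
y = [2.0, -1.0, 1.0, 3.0]  # equals 2*x1 - x2 from block_high
coefs, linear_predictor = priority_lasso([block_high, block_low], y, lam=0.1)
```

Because each block only sees what higher-priority blocks left unexplained, lower-priority blocks contribute variables only when they add information beyond the earlier fits.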

    Novel methods of measuring the similarity and distance between complex fuzzy sets

    This thesis develops measures that enable comparisons of subjective information that is represented through fuzzy sets. Many applications rely on information that is subjective and imprecise due to varying contexts and so fuzzy sets were developed as a method of modelling uncertain data. However, making relative comparisons between data-driven fuzzy sets can be challenging. For example, when data sets are ambiguous or contradictory, then the fuzzy set models often become non-normal or non-convex, making them difficult to compare. This thesis presents methods of comparing data that may be represented by such (complex) non-normal or non-convex fuzzy sets. The developed approaches for calculating relative comparisons also enable fusing methods of measuring similarity and distance between fuzzy sets. By using multiple methods, more meaningful comparisons of fuzzy sets are possible. Whereas if only a single type of measure is used, ambiguous results are more likely to occur. This thesis provides a series of advances around the measuring of similarity and distance. Based on them, novel applications are possible, such as personalised and crowd-driven product recommendations. To demonstrate the value of the proposed methods, a recommendation system is developed that enables a person to describe their desired product in relation to one or more other known products. Relative comparisons are then used to find and recommend something that matches a person's subjective preferences. Demonstrations illustrate that the proposed method is useful for comparing complex, non-normal and non-convex fuzzy sets. In addition, the recommendation system is effective at using this approach to find products that match a given query
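
For orientation, a common baseline for comparing fuzzy sets is the Jaccard similarity, sketched below on discretised membership functions. This is the standard textbook measure, not one of the novel measures the thesis proposes; the function name `fuzzy_jaccard`, the sample membership values, and the convention for two empty sets are assumptions made for illustration.

```python
def fuzzy_jaccard(mu_a, mu_b):
    """Jaccard similarity of two fuzzy sets sampled on the same grid of points."""
    if len(mu_a) != len(mu_b):
        raise ValueError("membership vectors must be sampled on the same grid")
    intersection = sum(min(a, b) for a, b in zip(mu_a, mu_b))
    union = sum(max(a, b) for a, b in zip(mu_a, mu_b))
    # Assumed convention: two empty sets are treated as identical.
    return intersection / union if union > 0 else 1.0


# A non-normal (peak below 1), non-convex (two peaks) set versus a triangular set.
two_peaks = [0.0, 0.6, 0.2, 0.6, 0.0]
triangle = [0.0, 0.5, 1.0, 0.5, 0.0]
similarity = fuzzy_jaccard(two_peaks, triangle)
distance = 1.0 - similarity  # Jaccard distance derived from the similarity
```

A single number like this can hide where two non-convex sets differ, which is one way to read the abstract's argument for combining several similarity and distance measures.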

    A Quantitative Methodology for Vetting Dark Network Intelligence Sources for Social Network Analysis

    Social network analysis (SNA) is used by the DoD to describe and analyze social networks, leading to recommendations for operational decisions. However, social network models are constructed from various information sources of indeterminate reliability. Inclusion of unreliable information can lead to incorrect models resulting in flawed analysis and decisions. This research develops a methodology to assist the analyst by quantitatively identifying and categorizing information sources so that determinations on including or excluding provided data can be made. This research pursued three main thrusts. It consolidated binary similarity measures to determine social network information sources' concordance and developed a methodology to select suitable measures dependent upon application considerations. A methodology was developed to assess the validity of individual sources of social network data. This methodology utilized source pairwise comparisons to measure information sources' concordance and a weighting schema to account for sources' unique perspectives of the underlying social network. Finally, the developed methodology was tested over a variety of generated networks with varying parameters in a design of experiments (DOE) paradigm. Various factors relevant to conditions faced by SNA analysts potentially employing this methodology were examined. The DOE comprised a 2^4 full factorial design augmented with a nearly orthogonal Latin hypercube. A linear model was constructed using quantile regression to mitigate the non-normality of the error terms
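
The pairwise concordance idea in this abstract can be illustrated with two widely used binary similarity measures. This is only a sketch under assumptions: the abstract does not say which measures were selected, the weighting schema is not reproduced, and the source names and tie vectors below are invented. Each source is represented as a 0/1 vector over the same ordered list of candidate ties.

```python
from itertools import combinations


def match_counts(x, y):
    """Return (a, b, c, d): ties reported by both, only x, only y, neither."""
    a = sum(1 for i, j in zip(x, y) if i == 1 and j == 1)
    b = sum(1 for i, j in zip(x, y) if i == 1 and j == 0)
    c = sum(1 for i, j in zip(x, y) if i == 0 and j == 1)
    d = sum(1 for i, j in zip(x, y) if i == 0 and j == 0)
    return a, b, c, d


def jaccard(x, y):
    """Jaccard coefficient: ignores ties that neither source reports."""
    a, b, c, _ = match_counts(x, y)
    return a / (a + b + c) if (a + b + c) > 0 else 1.0


def simple_matching(x, y):
    """Simple matching coefficient: counts joint absences as agreement."""
    a, b, c, d = match_counts(x, y)
    return (a + d) / (a + b + c + d)


def pairwise_concordance(sources, measure):
    """Apply a similarity measure to every unordered pair of sources."""
    return {(s, t): measure(sources[s], sources[t])
            for s, t in combinations(sorted(sources), 2)}


# Hypothetical reports from three sources over five candidate ties.
reports = {
    "source_A": [1, 1, 0, 0, 1],
    "source_B": [1, 0, 0, 1, 1],
    "source_C": [0, 0, 1, 0, 0],
}
concordance = pairwise_concordance(reports, jaccard)
```

The choice between the two measures matters in sparse networks: the simple matching coefficient rewards agreement on absent ties, while Jaccard does not.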

    Algebraic structures of neutrosophic triplets, neutrosophic duplets, or neutrosophic multisets. Volume II

    The topics approached in this collection of papers are: neutrosophic sets; neutrosophic logic; generalized neutrosophic set; neutrosophic rough set; multigranulation neutrosophic rough set (MNRS); neutrosophic cubic sets; triangular fuzzy neutrosophic sets (TFNSs); probabilistic single-valued (interval) neutrosophic hesitant fuzzy set; neutro-homomorphism; neutrosophic computation; quantum computation; neutrosophic association rule; data mining; big data; oracle Turing machines; recursive enumerability; oracle computation; interval number; dependent degree; possibility degree; power aggregation operators; multi-criteria group decision-making (MCGDM); expert set; soft sets; LA-semihypergroups; single valued trapezoidal neutrosophic number; inclusion relation; Q-linguistic neutrosophic variable set; vector similarity measure; fundamental neutro-homomorphism theorem; neutro-isomorphism theorem; quasi neutrosophic triplet loop; quasi neutrosophic triplet group; BE-algebra; cloud model; fuzzy measure; clustering algorithm; and many more

    SemEval-2018 Task 1: Affect in Tweets

    We present the SemEval-2018 Task 1: Affect in Tweets, which includes an array of subtasks on inferring the affectual state of a person from their tweet. For each task, we created labeled data from English, Arabic, and Spanish tweets. The individual tasks are: 1. emotion intensity regression, 2. emotion intensity ordinal classification, 3. valence (sentiment) regression, 4. valence ordinal classification, and 5. emotion classification. Seventy-five teams (about 200 team members) participated in the shared task. We summarize the methods, resources, and tools used by the participating teams, with a focus on the techniques and resources that are particularly useful. We also analyze systems for consistent bias towards a particular race or gender. The data is made freely available to further improve our understanding of how people convey emotions through language

    Collected Papers (on Physics, Artificial Intelligence, Health Issues, Decision Making, Economics, Statistics), Volume XI

    This eleventh volume of Collected Papers includes 90 papers comprising 988 pages on Physics, Artificial Intelligence, Health Issues, Decision Making, Economics, Statistics, written between 2001 and 2022 by the author alone or in collaboration with the following 84 co-authors (alphabetically ordered) from 19 countries: Abhijit Saha, Abu Sufian, Jack Allen, Shahbaz Ali, Ali Safaa Sadiq, Aliya Fahmi, Atiqa Fakhar, Atiqa Firdous, Sukanto Bhattacharya, Robert N. Boyd, Victor Chang, Victor Christianto, V. Christy, Dao The Son, Debjit Dutta, Azeddine Elhassouny, Fazal Ghani, Fazli Amin, Anirudha Ghosha, Nasruddin Hassan, Hoang Viet Long, Jhulaneswar Baidya, Jin Kim, Jun Ye, Darjan Karabašević, Vasilios N. Katsikis, Ieva Meidutė-Kavaliauskienė, F. Kaymarm, Nour Eldeen M. Khalifa, Madad Khan, Qaisar Khan, M. Khoshnevisan, Kifayat Ullah, Volodymyr Krasnoholovets, Mukesh Kumar, Le Hoang Son, Luong Thi Hong Lan, Tahir Mahmood, Mahmoud Ismail, Mohamed Abdel-Basset, Siti Nurul Fitriah Mohamad, Mohamed Loey, Mai Mohamed, K. Mohana, Kalyan Mondal, Muhammad Gulfam, Muhammad Khalid Mahmood, Muhammad Jamil, Muhammad Yaqub Khan, Muhammad Riaz, Nguyen Dinh Hoa, Cu Nguyen Giap, Nguyen Tho Thong, Peide Liu, Pham Huy Thong, Gabrijela Popović, Surapati Pramanik, Dmitri Rabounski, Roslan Hasni, Rumi Roy, Tapan Kumar Roy, Said Broumi, Saleem Abdullah, Muzafer Saračević, Ganeshsree Selvachandran, Shariful Alam, Shyamal Dalapati, Housila P. Singh, R. Singh, Rajesh Singh, Predrag S. Stanimirović, Kasan Susilo, Dragiša Stanujkić, Alexandra Şandru, Ovidiu Ilie Şandru, Zenonas Turskis, Yunita Umniyati, Alptekin Ulutaș, Maikel Yelandi Leyva Vázquez, Binyamin Yusoff, Edmundas Kazimieras Zavadskas, Zhao Loon Wang.