152 research outputs found
MVMR-FS: Non-parametric feature selection algorithm based on Maximum inter-class Variation and Minimum Redundancy
How to accurately measure the relevance and redundancy of features is an
age-old challenge in the field of feature selection. However, existing
filter-based feature selection methods cannot directly measure redundancy for
continuous data. In addition, most methods rely on manually specifying the
number of features, which may introduce errors in the absence of expert
knowledge. In this paper, we propose a non-parametric feature selection
algorithm based on maximum inter-class variation and minimum redundancy,
abbreviated as MVMR-FS. We first introduce supervised and unsupervised kernel
density estimation on the features to capture their similarities and
differences in inter-class and overall distributions. Subsequently, we present
the criteria for maximum inter-class variation and minimum redundancy (MVMR),
wherein the inter-class probability distributions are employed to reflect
feature relevance and the distances between overall probability distributions
are used to quantify redundancy. Finally, we employ an AGA to search for the
feature subset that minimizes the MVMR. Compared with ten state-of-the-art
methods, MVMR-FS achieves the highest average accuracy and improves the
accuracy by 5% to 11%
LIPIcs, Volume 261, ICALP 2023, Complete Volume
Finding Optimal Diverse Feature Sets with Alternative Feature Selection
Feature selection is popular for obtaining small, interpretable, yet highly
accurate prediction models. Conventional feature-selection methods typically
yield one feature set only, which might not suffice in some scenarios. For
example, users might be interested in finding alternative feature sets with
similar prediction quality, offering different explanations of the data. In
this article, we introduce alternative feature selection and formalize it as an
optimization problem. In particular, we define alternatives via constraints and
enable users to control the number and dissimilarity of alternatives. Next, we
analyze the complexity of this optimization problem and show NP-hardness.
Further, we discuss how to integrate conventional feature-selection methods as
objectives. Finally, we evaluate alternative feature selection with 30
classification datasets. We observe that alternative feature sets may indeed
have high prediction quality, and we analyze several factors influencing this
outcome
Z-Numbers-Based Approach to Hotel Service Quality Assessment
In this study, we analyze the possibility of using Z-numbers for
measuring service quality and for decision-making on quality improvement in the
hotel industry. Techniques used for these purposes are based on consumer
evaluations: expectations and perceptions. As a rule, these evaluations are expressed
in crisp numbers (Likert scale) or fuzzy estimates. However, descriptions of
respondent opinions based on the crisp or fuzzy number formalism are not relevant
in all cases. Existing methods do not take into account the degree of
confidence of respondents in their assessments. A fuzzy approach better describes
the uncertainties associated with human perceptions and expectations.
Linguistic values are more acceptable than crisp numbers. To consider the subjective
nature of both the service quality estimates and the degree of confidence in them,
two-component Z-numbers Z = (A, B) were used. Z-numbers express the opinion of
consumers more adequately. The proposed, computationally efficient approach
(Z-SERVQUAL, Z-IPA) makes it possible to determine the quality of services and
identify the factors that require improvement and the areas for further development.
The suggested method was applied to evaluate the service quality in small and
medium-sized hotels in Turkey and Azerbaijan, illustrated by the example
De l'apprentissage faiblement supervisé au catalogage en ligne
Applied mathematics and machine computations have raised a lot of hope since the recent success of supervised learning. Many practitioners in industry have been trying to switch from their old paradigms to machine learning. Interestingly, those data scientists spend more time scraping, annotating and cleaning data than fine-tuning models. This thesis is motivated by the following question: can we derive a more generic framework than that of supervised learning in order to learn from cluttered data? This question is approached through the lens of weakly supervised learning, assuming that the bottleneck of data collection lies in annotation. We model weak supervision as giving, rather than a unique target, a set of target candidates. We argue that one should look for an "optimistic" function that matches most of the observations. This allows us to derive a principle to disambiguate partial labels. We also discuss the advantage of incorporating unsupervised learning techniques into our framework, in particular manifold regularization approached through diffusion techniques, for which we derive a new algorithm that scales better with input dimension than the baseline method. Finally, we switch from passive to active weakly supervised learning, introducing the "active labeling" framework, in which a practitioner can query weak information about chosen data. Among others, we leverage the fact that one does not need full information to access stochastic gradients and perform stochastic gradient descent
Methods in Contemporary Linguistics
The present volume is a broad overview of methods and methodologies in linguistics, illustrated with examples from concrete research. It collects insights gained from a broad range of linguistic sub-disciplines, ranging from core disciplines to topics in cross-linguistic and language-internal diversity or to contributions towards language, space and society. Given its critical and innovative nature, the volume is a valuable source for students and researchers of a broad range of linguistic interests
Essays in Behavioral Economics and Microeconomic Theory
Chapter 1: I derive a theoretical
model of choice bracketing from two behavioral axioms in an expected utility
framework. The first behavioral axiom establishes a direct link between narrow
bracketing and correlation neglect. The second behavioral axiom identifies
the reference point as the place where broad and narrow preferences are
connected. In my model, the narrow bracketer is characterized by an inability
to process changes from the reference point in different dimensions simultaneously.
Chapter 2: Why do people give when asked, but
prefer not to be asked, and even take when possible? We show that standard
behavioral axioms including separability, narrow bracketing, and scaling invariance
predict these seemingly inconsistent observations. Specifically, these
axioms imply that interdependence of preferences ("altruism") results from
concerns for the welfare of others, as opposed to their mere payoffs, where
individual welfares are captured by the reference-dependent value functions
known from prospect theory. The resulting preferences are non-convex, which
captures giving, sorting, and taking directly.
Chapter 3: We present a theoretical
model to investigate how the presence of fake news affects information
transmission from media outlets to economic agents. In a standard cheap talk
framework, we introduce uncertainty about the sender's (media outlet's) preferences.
There are two types of media outlets. A fake news outlet wants to
push the agent's belief to the maximum irrespective of the state of the world.
A legitimate outlet wants to reveal the true state to the agent. We show that
any informative perfect Bayesian equilibrium of our game is characterized
by a threshold value. While the agent can perfectly separate amongst states
below the threshold value, there is no separation amongst states above the
threshold value. We determine the unique most informative threshold value
for a general class of equilibria
Legal methods for resolving apparent conflicts between fundamental rights
The subject of this thesis is apparent conflicts between fundamental rights, which represent one of the most important problems contemporary legal systems face. More specifically, this thesis presents and analyses different legal methods that have been suggested as answers to the problem. Contemporary constitutions usually contain provisions protecting certain fundamental rights, such as the right to life, the right to privacy, the right to freedom of expression, personality rights, the right to health, etc. The problem can arise when two (or more) provisions protecting fundamental rights are relevant to a specific situation. The question then arises: should the behaviour be permitted or prohibited? Judges may then have to decide the case without any explicit or clear guidance on how to do so. In such situations, lex superior, lex posterior and lex specialis are usually inapplicable, because the provisions regulating fundamental rights are usually on the same hierarchical level, were enacted at the same time, and no general-special relationship can be established between them. The problem is further complicated by the fact that the norms expressing fundamental rights are generally understood as legal principles, supposedly different from legal rules. These cases are commonly known in the literature as hard cases. Various legal methods have been proposed in order to decide such cases. These methods represent possible answers to the problem of resolving apparent conflicts between fundamental rights. The term "apparent" is used because there is a debate regarding the existence of "real" conflicts between fundamental rights. The objective of the thesis is to provide an answer to the research question: What are the legal methods of resolving apparent conflicts between fundamental rights and what are their merits in comparison to each other?
In order to answer the research question, different legal methods that have been suggested as an answer to the problem of apparent conflicts between fundamental rights are presented, analysed and compared. In this way, the thesis aims to contribute to the understanding of the strengths and weaknesses of the different legal methods that have been proposed to solve the problem. To achieve this, the thesis is divided into three main chapters, each of which presents and analyses different legal methods on apparent conflicts between fundamental rights. In Chapter I and Chapter II, the main legal method proposed to resolve apparent conflicts between fundamental rights (judicial balancing) is presented and analysed. In Chapter III, alternative, non-balancing legal methods for resolving apparent conflicts between fundamental rights are presented and analysed.
Chapter I presents and analyses the Alexyan theory of judicial balancing, developed by Robert Alexy and further refined by his disciples Jan-Reinard Sieckmann, Martin Borowski and Matthias Klatt. Chapter II presents and analyses approaches from Aharon Barak, Manuel Atienza, José Juan Moreso, Riccardo Guastini and Susan Lynn Hurley. The authors whose approaches are presented and analysed in Chapter III are, in this order: Ronald Dworkin, Luigi Ferrajoli, Juan Antonio García Amado, Lorenzo Zucca and Ruth Chang