A Response-Time Analysis for Non-Preemptive Job Sets under Global Scheduling
An effective way to increase the timing predictability of multicore platforms is to use non-preemptive scheduling. It reduces preemption and job migration overheads, avoids intra-core cache interference, and improves the accuracy of worst-case execution time (WCET) estimates. However, existing schedulability tests for global non-preemptive multiprocessor scheduling are pessimistic, especially when applied to periodic workloads. This paper reduces this pessimism by introducing a new type of sufficient schedulability analysis that is based on an exploration of the space of possible schedules using concise abstractions and state-pruning techniques. Specifically, we analyze the schedulability of non-preemptive job sets (with bounded release jitter and execution time variation) scheduled by a global job-level fixed-priority (JLFP) scheduling algorithm upon an identical multicore platform. The analysis yields a lower bound on the best-case response time (BCRT) and an upper bound on the worst-case response time (WCRT) of the jobs. In an empirical evaluation with randomly generated workloads, we show that the method scales to 30 tasks, a hundred thousand jobs (per hyperperiod), and up to 9 cores.
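For intuition about the quantities being bounded, a job's response time is its completion time minus its release time. The following is a minimal, hypothetical sketch (not the paper's analysis) that simulates a single concrete schedule under work-conserving global non-preemptive job-level fixed-priority dispatching; the paper's method instead bounds the BCRT and WCRT over all release-jitter and execution-time scenarios.

```python
import heapq

def simulate_np_jlfp(jobs, m):
    """Simulate one schedule of global non-preemptive job-level
    fixed-priority (JLFP) scheduling on m identical cores.
    jobs: list of (release_time, exec_time, priority); a smaller priority
    value means higher priority. Returns per-job response times
    (finish time minus release time)."""
    n = len(jobs)
    order = sorted(range(n), key=lambda j: jobs[j][0])  # by release time
    cores = [0] * m              # time at which each core becomes free
    ready = []                   # heap of (priority, release, job_index)
    finish = [0] * n
    i = 0
    while i < n or ready:
        t = min(cores)                        # next time a core is free
        if not ready and jobs[order[i]][0] > t:
            t = jobs[order[i]][0]             # idle interval: jump ahead
        while i < n and jobs[order[i]][0] <= t:
            j = order[i]
            heapq.heappush(ready, (jobs[j][2], jobs[j][0], j))
            i += 1
        # dispatch the highest-priority ready job; it runs to completion
        _, _, j = heapq.heappop(ready)
        k = cores.index(min(cores))
        cores[k] = max(cores[k], t) + jobs[j][1]
        finish[j] = cores[k]
    return [finish[j] - jobs[j][0] for j in range(n)]
```

Note how non-preemptive dispatch lets a lower-priority job block a later-released higher-priority one: on one core, a job released at time 1 with top priority still waits for the running job to finish.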
Bag-of-words representations for computer audition
Computer audition is omnipresent in everyday life, in applications ranging from personalised virtual agents to health care. From a technical point of view, the goal is to robustly classify the content of an audio signal in terms of a defined set of labels, such as the acoustic scene, a medical diagnosis, or, in the case of speech, what is said or how it is said. Typical approaches employ machine learning (ML), which means that task-specific models are trained by means of examples. Despite recent successes in neural network-based end-to-end learning, taking the raw audio signal as input, models relying on hand-crafted acoustic features are still superior in some domains, especially for tasks where data is scarce. One major issue is nevertheless that a sequence of acoustic low-level descriptors (LLDs) cannot be fed directly into many ML algorithms, as they require a static and fixed-length input. Moreover, even for dynamic classifiers, compressing the information of the LLDs over a temporal block by summarising them can be beneficial. However, the type of instance-level representation has a fundamental impact on the performance of the model. In this thesis, the so-called bag-of-audio-words (BoAW) representation is investigated as an alternative to the standard approach of statistical functionals. BoAW is an unsupervised method of representation learning, inspired by the bag-of-words method in natural language processing, which forms a histogram of the terms present in a document. The toolkit openXBOW is introduced, enabling systematic learning and optimisation of these feature representations, unified across arbitrary modalities of numeric or symbolic descriptors. A number of experiments on BoAW are presented and discussed, focussing on a large number of potential applications and corresponding databases, ranging from emotion recognition in speech to medical diagnosis.
The evaluations include a comparison of different acoustic LLD sets and configurations of the BoAW generation process. The key findings are that BoAW features are a meaningful alternative to statistical functionals, offering certain benefits while preserving the advantages of functionals, such as data-independence. Furthermore, it is shown that the two representations are complementary and that their fusion improves the performance of a machine listening system.
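The BoAW idea above can be illustrated end to end: learn a codebook over LLD frames, assign each frame to its nearest codeword, and summarise the variable-length sequence as a fixed-length normalised histogram. A minimal numpy sketch with plain k-means as a stand-in for openXBOW's codebook learning (all function names and parameters here are illustrative, not openXBOW's API):

```python
import numpy as np

def learn_codebook(frames, k, iters=10, seed=0):
    """Learn an audio-word codebook with plain k-means (a minimal stand-in
    for codebook generation in a BoAW pipeline). frames: (n, d) LLD array."""
    rng = np.random.default_rng(seed)
    centers = frames[rng.choice(len(frames), k, replace=False)].astype(float)
    for _ in range(iters):
        # assign each frame to its nearest center, then recompute means
        d = ((frames[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        labels = d.argmin(1)
        for j in range(k):
            if (labels == j).any():
                centers[j] = frames[labels == j].mean(0)
    return centers

def bag_of_audio_words(frames, centers):
    """Quantise an LLD sequence against the codebook and return a
    fixed-length, normalised histogram (the BoAW vector)."""
    d = ((frames[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
    hist = np.bincount(d.argmin(1), minlength=len(centers)).astype(float)
    return hist / hist.sum()
```

The resulting fixed-length vector can be fed to any static classifier, regardless of how long the original audio sequence was.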
Spatio-Temporal Reasoning About Agent Behavior
There are many applications where we wish to reason about spatio-temporal aspects of an agent's behavior. This dissertation examines several facets of this type of reasoning. First, given a model of past agent behavior, we wish to reason about the probability that an agent takes a given action at a certain time. Previous work combining temporal and probabilistic reasoning has made either independence or Markov assumptions. This work introduces Annotated Probabilistic Temporal (APT) logic, which makes neither assumption. Statements in APT logic consist of rules of the form "Formula G becomes true with a probability [L,U] within T time units after formula F becomes true" and can be written by experts or extracted automatically. We explore the problem of entailment - finding the probability that an agent performs a given action at a certain time based on such a model. We study this problem's complexity and develop a sound but incomplete fixpoint operator as a heuristic, implementing it and testing it on automatically generated models from several datasets.
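As a toy illustration of what such a rule asserts, one can count how often it holds on a single discrete trace. This is only an empirical frequency check, not APT entailment or the fixpoint operator; the trace encoding is made up for the example.

```python
def rule_frequency(trace, f, g, t_window):
    """Empirical frequency of an APT-style rule 'g becomes true within
    t_window time units after f becomes true' on a discrete trace,
    where trace[i] is the set of atoms true at time i. Returns None
    if f never fires (the rule is never triggered)."""
    triggers = [i for i, atoms in enumerate(trace) if f in atoms]
    if not triggers:
        return None
    hits = sum(
        1 for i in triggers
        if any(g in trace[j]
               for j in range(i + 1, min(i + 1 + t_window, len(trace))))
    )
    return hits / len(triggers)
```

Under an APT rule with probability bounds [L,U], a model is consistent with the rule when this kind of frequency falls inside [L,U].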
Second, agent behavior often results in "observations" at geospatial locations that imply the existence of other, unobserved, locations we wish to find ("partners"). In this dissertation, we formalize this notion with "geospatial abduction problems" (GAPs). GAPs try to infer a set of partner locations for a set of observations and a model representing the relationship between observations and partners for a given agent. This dissertation presents exact and approximate algorithms for solving GAPs as well as an implemented software package for addressing these problems called SCARE (the Spatio-Cultural Abductive Reasoning Engine). We tested SCARE on counter-insurgency data from Iraq and obtained good results. We then provide an adversarial extension to GAPs as follows: given a fixed set of observations, if an adversary has probabilistic knowledge of how an agent would find a corresponding set of partners, he would place the partners in locations that minimize the expected number of partners found by the agent. We examine this problem, along with its complement, by studying their computational complexity, developing algorithms, and implementing approximation schemes.
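A simplified GAP can be phrased as a covering problem: each candidate partner location "explains" the observations lying within some feasible distance band. The sketch below is a generic greedy set-cover heuristic under that simplification, not SCARE's algorithms; the `alpha`/`beta` band and the candidate set are illustrative assumptions.

```python
from math import dist

def greedy_gap(observations, candidates, alpha, beta):
    """Greedy cover for a simplified geospatial abduction problem:
    choose partner locations from `candidates` so that every observation
    has a partner at distance in [alpha, beta]. Points are (x, y) tuples.
    Returns (chosen partners, observations left uncovered)."""
    covers = {c: {o for o in observations if alpha <= dist(c, o) <= beta}
              for c in candidates}
    uncovered = set(observations)
    chosen = []
    while uncovered:
        # pick the candidate explaining the most still-uncovered observations
        best = max(candidates, key=lambda c: len(covers[c] & uncovered))
        gained = covers[best] & uncovered
        if not gained:
            break        # some observation has no feasible partner at all
        chosen.append(best)
        uncovered -= gained
    return chosen, uncovered
```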
We also introduce a class of problems called geospatial optimization problems (GOPs). Here the agent has a set of actions that modify attributes of a geospatial region, and he wishes to select a limited number of such actions (with respect to some budget and other constraints) in a manner that maximizes a benefit function. We study the complexity of this problem and develop exact methods. We then develop an approximation algorithm with a guarantee. For some real-world applications, such as epidemiology, there is an underlying diffusion process that also affects geospatial properties. We address this with social network optimization problems (SNOPs): given a weighted, labeled, directed graph, we seek to find a set of vertices that, if given some initial property, optimize an aggregate property with respect to such diffusion. We develop and implement a heuristic that obtains a guarantee for a large class of such problems.
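The greedy shape of such seed-selection heuristics can be sketched on a deliberately simple diffusion model, plain reachability, whereas the dissertation handles much richer, weighted diffusion processes; graph encoding and function names here are illustrative only.

```python
from collections import deque

def spread(graph, seeds):
    """Deterministic toy diffusion: a vertex acquires the property if it
    is reachable from a seed. graph: dict vertex -> list of out-neighbours."""
    seen, q = set(seeds), deque(seeds)
    while q:
        v = q.popleft()
        for u in graph.get(v, []):
            if u not in seen:
                seen.add(u)
                q.append(u)
    return seen

def greedy_snop(graph, budget):
    """Greedy seed selection for a toy SNOP: repeatedly add the vertex
    that most increases the number of vertices reached."""
    seeds = set()
    for _ in range(budget):
        best = max(graph, key=lambda v: len(spread(graph, seeds | {v})))
        seeds.add(best)
    return seeds
```

For submodular spread functions, this greedy pattern is the standard way to obtain a constant-factor guarantee.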
A modified multi-class association rule for text mining
Classification and association rule mining are significant tasks in data mining. Integrating association rule discovery and classification in data mining yields an approach known as associative classification. One common shortcoming of existing Association Classifiers is the huge number of rules produced in order to obtain high classification accuracy. This study proposes a Modified Multi-class Association Rule Mining (mMCAR) that consists of three procedures: rule discovery, rule pruning and group-based class assignment. The rule discovery and rule pruning procedures are designed to reduce the number of classification rules. On the other hand, the group-based class assignment procedure contributes to improving the classification accuracy. Experiments on structured and unstructured text datasets obtained from the UCI and Reuters repositories are performed in order to evaluate the proposed Association Classifier. The proposed mMCAR classifier is benchmarked against traditional classifiers and existing Association Classifiers.
Experimental results indicate that the proposed Association Classifier, mMCAR, produced high accuracy with a smaller number of classification rules. For the structured dataset, mMCAR produces an average of 84.24% accuracy, as compared to MCAR, which obtains 84.23%. Even though the difference in classification accuracy is small, the proposed mMCAR uses only 50 rules for the classification while its benchmark method involves 60 rules. On the other hand, mMCAR is on par with MCAR when the unstructured dataset is utilized. Both classifiers produce 89% accuracy, but mMCAR uses fewer rules for the classification. This study contributes to the text mining domain, as automatic classification of huge and widely distributed textual data could facilitate the text representation and retrieval processes.
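To make the rule discovery and pruning pipeline concrete, here is a toy associative-classification miner restricted to single-term antecedents (term -> class). It illustrates support/confidence filtering and one simple pruning pass only; it is not the actual mMCAR procedure, whose pruning and group-based class assignment are more involved.

```python
from collections import Counter

def mine_car_rules(transactions, min_sup, min_conf):
    """Mine single-antecedent class association rules (term -> class).
    transactions: list of (set_of_terms, class_label).
    Returns sorted (term, label, support, confidence) tuples."""
    n = len(transactions)
    item_class = Counter()   # occurrences of (term, label)
    item_count = Counter()   # occurrences of term overall
    for terms, label in transactions:
        for term in terms:
            item_count[term] += 1
            item_class[(term, label)] += 1
    rules = []
    for (term, label), k in item_class.items():
        support = k / n
        confidence = k / item_count[term]
        if support >= min_sup and confidence >= min_conf:
            rules.append((term, label, support, confidence))
    # pruning pass: keep only the highest-confidence rule per antecedent
    best = {}
    for term, label, s, c in rules:
        if term not in best or c > best[term][3]:
            best[term] = (term, label, s, c)
    return sorted(best.values())
```

A test document would then be classified by aggregating the classes of the rules whose antecedents it matches, which is where a group-based assignment scheme comes in.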
Efficient algorithms for optimal matching problems under preferences
In this thesis we consider efficient algorithms for matching problems involving preferences,
i.e., problems where agents may be required to list other agents that they find
acceptable in order of preference. In particular we mainly study the Stable Marriage
problem (SM), the Hospitals / Residents problem (HR) and the Student / Project Allocation
problem (SPA), and some of their variants. In some of these problems the aim
is to find a stable matching, which is one that admits no blocking pair. A blocking
pair with respect to a matching is a pair of agents who each prefer the other to
their assigned partner in the matching, if any.
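The blocking-pair definition translates directly into a checker. A small sketch for the SMI setting (the dictionary-and-ranked-list encoding is an illustrative convention, not taken from the thesis):

```python
def blocking_pair(matching, men_pref, women_pref):
    """Return a blocking pair for `matching` in the Stable Marriage
    problem with Incomplete lists (SMI), or None if it is stable.
    matching: dict man -> woman (matched pairs only); preference lists
    contain only acceptable partners, most preferred first."""
    husband = {w: m for m, w in matching.items()}

    def rank(pref, x):
        # unmatched / unacceptable partners rank worse than anyone listed
        return pref.index(x) if x in pref else len(pref)

    for m, prefs in men_pref.items():
        for w in prefs:
            if matching.get(m) == w:
                break          # m prefers no one below his current partner
            # m prefers w to his partner (or is unmatched); does w
            # prefer m to her partner (or is she unmatched)?
            wp = women_pref.get(w, [])
            if m in wp and rank(wp, m) < rank(wp, husband.get(w)):
                return (m, w)
    return None
```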
We present an Integer Programming (IP) model for the Hospitals / Residents problem
with Ties (HRT) and use it to find a maximum cardinality stable matching. We also
present results from an empirical evaluation of our model which show it to be scalable
with respect to real-world HRT instance sizes.
Motivated by the observation that not all blocking pairs that exist in theory will lead
to a matching being undermined in practice, we investigate a relaxed stability criterion
called social stability where only pairs of agents with a social relationship have the
ability to undermine a matching. This stability concept is studied in instances of
the Stable Marriage problem with Incomplete lists (SMI) and in instances of HR. We
show that, in the SMI and HR contexts, socially stable matchings can be of varying
sizes and the problem of finding a maximum socially stable matching (MAX SMISS and
MAX HRSS respectively) is NP-hard, though approximable within 3/2. Furthermore we
give polynomial time algorithms for three special cases of the problem arising from
restrictions on the social network graph and the lengths of agents’ preference lists.
We also consider other optimality criteria with respect to social stability and establish
inapproximability bounds for the problems of finding an egalitarian, minimum regret
and sex-equal socially stable matching in the SM context.
We extend our study of social stability by considering other variants and restrictions
of MAX SMISS and MAX HRSS. We present NP-hardness results for MAX SMISS even
under certain restrictions on the degree and structure of the social network graph as
well as the presence of master lists. Other NP-hardness results presented relate to the
problem of determining whether a given man-woman pair belongs to a socially stable
matching and the problem of determining whether a given man (or woman) is part of
at least one socially stable matching. We also consider the Stable Roommates problem
with Incomplete lists under Social Stability (a non-bipartite generalisation of SMI under
social stability). We observe that the problem of finding a maximum socially stable
matching in this context is also NP-hard. We present efficient algorithms for three
special cases of the problem arising from restrictions on the social network graph and
the lengths of agents’ preference lists. These are the cases where (i) there exists a
constant number of acquainted pairs (ii) or a constant number of unacquainted pairs
or (iii) each preference list is of length at most 2.
We also present algorithmic results for finding matchings in the SPA context that are
optimal with respect to profile, which is the vector whose ith component is the number
of students assigned to their ith-choice project. We present an efficient algorithm for
finding a greedy maximum matching in the SPA context — this is a maximum matching
whose profile is lexicographically maximum. We then show how to adapt this algorithm
to find a generous maximum matching — this is a matching whose reverse profile is
lexicographically minimum. We demonstrate how this approach can allow additional
constraints, such as lecturer lower quotas, to be handled flexibly. We also present
results of empirical evaluations carried out on both real world and randomly generated
datasets. These results demonstrate the scalability of our algorithms as well as some
interesting properties of these profile-based optimality criteria.
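The profile-based criteria can be stated compactly in code: the greedy criterion compares profiles lexicographically from the best rank down, while the generous criterion compares reversed profiles from the worst rank up. A small sketch (the actual algorithms additionally enforce maximum cardinality and quota constraints):

```python
def profile(matching, prefs):
    """Profile of a matching: entry i counts students assigned to their
    (i+1)-th choice. matching: dict student -> project; prefs: dict
    student -> ordered list of acceptable projects."""
    p = [0] * max(len(lst) for lst in prefs.values())
    for student, project in matching.items():
        p[prefs[student].index(project)] += 1
    return p

def greedy_better(p, q):
    """Greedy criterion: p beats q if p is lexicographically greater
    (more first choices, then more second choices, ...)."""
    return p > q

def generous_better(p, q):
    """Generous criterion: p beats q if p's reverse profile is
    lexicographically smaller (fewer worst-ranked assignments first)."""
    return p[::-1] < q[::-1]
```

Profile [2, 0] beats [1, 1] under both criteria: it has more first choices, and it also has fewer second (here, worst) choices.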
Practical applications of SPA motivate the investigation of certain special cases of the
problem. For instance, it is often desired that the workload on lecturers is evenly distributed
(i.e. load balanced). We enforce this by either adding lower quota constraints
on the lecturers (which leads to the potential for infeasible problem instances) or adding
a load balancing optimisation criterion. We present efficient algorithms in both cases.
Another consideration is the fact that certain projects may require a minimum number
of students to become viable. This can be handled by enforcing lower quota constraints
on the projects (which also leads to the possibility of infeasible problem instances). A
technique of handling this infeasibility is the idea of closing projects that do not meet
their lower quotas (i.e. leaving such project completely unassigned). We show that the
problem of finding a maximum matching subject to project lower quotas where projects
can be closed is NP-hard even under severe restrictions on preference lists lengths and
project upper and lower quotas. To offset this hardness, we present polynomial time
heuristics that find large feasible matchings in practice. We also present IP models
for the SPA variants discussed and show results obtained from an empirical evaluation
carried out on both real and randomly generated datasets. These results show that
our algorithms and heuristics are scalable and provide good matchings with respect to
profile-based optimality criteria.
Understanding and Enhancing the Use of Context for Machine Translation
To understand and infer meaning in language, neural models have to learn
complicated nuances. Discovering distinctive linguistic phenomena from data is
not an easy task. For instance, lexical ambiguity is a fundamental feature of
language which is challenging to learn. Even more prominently, inferring the
meaning of rare and unseen lexical units is difficult with neural networks.
Meaning is often determined from context. With context, languages allow meaning
to be conveyed even when the specific words used are not known by the reader.
To model this learning process, a system has to learn from a few instances in
context and be able to generalize well to unseen cases. The learning process is
hindered when training data is scarce for a task. Even with sufficient data,
learning patterns for the long tail of the lexical distribution is challenging.
In this thesis, we focus on understanding certain potentials of contexts in
neural models and design augmentation models to benefit from them. We focus on
machine translation as an important instance of the more general language
understanding problem. To translate from a source language to a target
language, a neural model has to understand the meaning of constituents in the
provided context and generate constituents with the same meanings in the target
language. This task accentuates the value of capturing nuances of language and
the necessity of generalization from few observations. The main problem we
study in this thesis is what neural machine translation models learn from data
and how we can devise more focused contexts to enhance this learning. Looking
more in-depth into the role of context and the impact of data on learning
models is essential to advance the NLP field. Moreover, it helps highlight the
vulnerabilities of current neural networks and provides insights into designing
more robust models.
Comment: PhD dissertation defended on November 10th, 202