1,291 research outputs found
Selecting and Generating Computational Meaning Representations for Short Texts
Language conveys meaning, so natural language processing (NLP) requires representations of meaning. This work addresses two broad questions: (1) What meaning representation should we use? and (2) How can we transform text to our chosen meaning representation? In the first part, we explore different meaning representations (MRs) of short texts, ranging from surface forms to deep-learning-based models. We show the advantages and disadvantages of a variety of MRs for summarization, paraphrase detection, and clustering. In the second part, we use SQL as a running example for an in-depth look at how we can parse text into our chosen MR. We examine the text-to-SQL problem from three perspectives (methodology, systems, and applications) and show how each contributes to a fuller understanding of the task.
PHD
Computer Science & Engineering
University of Michigan, Horace H. Rackham School of Graduate Studies
https://deepblue.lib.umich.edu/bitstream/2027.42/143967/1/cfdollak_1.pd
Towards Massive Machine Type Communications in Ultra-Dense Cellular IoT Networks: Current Issues and Machine Learning-Assisted Solutions
The ever-increasing number of resource-constrained Machine-Type Communication
(MTC) devices is leading to the critical challenge of fulfilling diverse
communication requirements in dynamic and ultra-dense wireless environments.
Among different application scenarios that the upcoming 5G and beyond cellular
networks are expected to support, such as eMBB, mMTC and URLLC, mMTC brings the
unique technical challenge of supporting a huge number of MTC devices, which is
the main focus of this paper. The related challenges include QoS provisioning,
handling highly dynamic and sporadic MTC traffic, huge signalling overhead and
Radio Access Network (RAN) congestion. In this regard, this paper aims to
identify and analyze the involved technical issues, to review recent advances,
to highlight potential solutions and to propose new research directions. First,
starting with an overview of mMTC features and QoS provisioning issues, we
present the key enablers for mMTC in cellular networks. Along with the
highlights on the inefficiency of the legacy Random Access (RA) procedure in
the mMTC scenario, we then present the key features and channel access
mechanisms in the emerging cellular IoT standards, namely, LTE-M and NB-IoT.
Subsequently, we present a framework for the performance analysis of
transmission scheduling with the QoS support along with the issues involved in
short data packet transmission. Next, we provide a detailed overview of the
existing and emerging solutions towards addressing RAN congestion problem, and
then identify potential advantages, challenges and use cases for the
applications of emerging Machine Learning (ML) techniques in ultra-dense
cellular networks. Out of several ML techniques, we focus on the application of
low-complexity Q-learning approach in the mMTC scenarios. Finally, we discuss
some open research challenges and promising future research directions.
Comment: 37 pages, 8 figures, 7 tables, submitted for a possible future
publication in IEEE Communications Surveys and Tutorial
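The abstract above mentions a low-complexity Q-learning approach for mMTC access but gives no algorithmic detail. As a rough illustration only, the sketch below assumes a stateless, per-device Q-table over the slots of a frame, a reward of +1 for a collision-free transmission and -1 for a collision, and epsilon-greedy exploration. The function name `simulate_q_learning_access` and all parameter values are hypothetical and are not taken from the paper.

```python
import random


def simulate_q_learning_access(num_devices=8, num_slots=8, frames=2000,
                               alpha=0.1, epsilon=0.05, seed=0):
    """Toy slot-selection simulation (assumed setup, not the paper's model).

    Each device keeps one Q-value per slot, transmits once per frame in the
    slot it currently values most, and updates that value from the outcome.
    """
    rng = random.Random(seed)
    # q[d][s]: device d's learned value for transmitting in slot s
    q = [[0.0] * num_slots for _ in range(num_devices)]
    success_history = []

    for _ in range(frames):
        # Epsilon-greedy slot choice, with random tie-breaking among best slots
        choices = []
        for d in range(num_devices):
            if rng.random() < epsilon:
                choices.append(rng.randrange(num_slots))
            else:
                best = max(q[d])
                candidates = [s for s in range(num_slots) if q[d][s] == best]
                choices.append(rng.choice(candidates))

        # Count how many devices picked each slot to detect collisions
        counts = [0] * num_slots
        for s in choices:
            counts[s] += 1

        successes = 0
        for d, s in enumerate(choices):
            collision_free = counts[s] == 1
            reward = 1.0 if collision_free else -1.0
            # Stateless Q-learning update: move the estimate toward the reward
            q[d][s] += alpha * (reward - q[d][s])
            if collision_free:
                successes += 1
        success_history.append(successes / num_devices)

    return q, success_history
```

Under these assumptions each device only stores `num_slots` numbers and performs one update per frame, which is one way to read "low-complexity"; the per-frame success fraction in `success_history` can be plotted to inspect convergence.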
AI Solutions for MDS: Artificial Intelligence Techniques for Misuse Detection and Localisation in Telecommunication Environments
This report considers the application of Artificial Intelligence (AI) techniques to
the problem of misuse detection and misuse localisation within telecommunications
environments. A broad survey of techniques is provided, that covers inter alia
rule based systems, model-based systems, case based reasoning, pattern matching,
clustering and feature extraction, artificial neural networks, genetic algorithms,
artificial immune systems, agent based systems, data mining and a variety of hybrid
approaches. The report then considers the central issue of event correlation, that
is at the heart of many misuse detection and localisation systems. The notion of
being able to infer misuse by the correlation of individual temporally distributed
events within a multiple data stream environment is explored, and a range of techniques
is surveyed, covering model based approaches, `programmed' AI and machine learning
paradigms. It is found that, in general, correlation is best achieved via rule based approaches,
but that these suffer from a number of drawbacks, such as the difficulty of
developing and maintaining an appropriate knowledge base, and the lack of ability
to generalise from known misuses to new unseen misuses. Two distinct approaches
are evident. One attempts to encode knowledge of known misuses, typically within
rules, and use this to screen events. This approach cannot generally detect misuses
for which it has not been programmed, i.e. it is prone to issuing false negatives.
The other attempts to `learn' the features of event patterns that constitute normal
behaviour, and, by observing patterns that do not match expected behaviour, detect
when a misuse has occurred. This approach is prone to issuing false positives,
i.e. inferring misuse from innocent patterns of behaviour that the system was not
trained to recognise. Contemporary approaches are seen to favour hybridisation,
often combining detection or localisation mechanisms for both abnormal and normal
behaviour, the former to capture known cases of misuse, the latter to capture
unknown cases. In some systems, these mechanisms even work together to update
each other to increase detection rates and lower false positive rates. It is concluded
that hybridisation offers the most promising future direction, but that a rule or state
based component is likely to remain, being the most natural approach to the correlation
of complex events. The challenge, then, is to mitigate the weaknesses of
canonical programmed systems such that learning, generalisation and adaptation
are more readily facilitated
FedComm: Federated Learning as a Medium for Covert Communication
Proposed as a solution to mitigate the privacy implications related to the
adoption of deep learning, Federated Learning (FL) enables large numbers of
participants to successfully train deep neural networks without having to
reveal the actual private training data. To date, a substantial amount of
research has investigated the security and privacy properties of FL, resulting
in a plethora of innovative attack and defense strategies. This paper
thoroughly investigates the communication capabilities of an FL scheme. In
particular, we show that a party involved in the FL learning process can use FL
as a covert communication medium to send an arbitrary message. We introduce
FedComm, a novel multi-system covert-communication technique that enables
robust sharing and transfer of targeted payloads within the FL framework. Our
extensive theoretical and empirical evaluations show that FedComm provides a
stealthy communication channel, with minimal disruptions to the training
process. Our experiments show that FedComm successfully delivers 100% of a
payload in the order of kilobits before the FL procedure converges. Our
evaluation also shows that FedComm is independent of the application domain and
the neural network architecture used by the underlying FL scheme.
Comment: 18 page
The Allies Of Others: How Stakeholders’ Relationships Shape Non-Market Strategy
This dissertation shifts analytic focus from firm, stakeholder and institutional characteristics as drivers of a firm’s non-market strategy to the fields in which stakeholders are embedded, which are characterized by their own social relationships, norms and identities. In so doing, I strive to develop a more socialized view of non-market strategy. The first chapter provides evidence that the identity of stakeholders in their fields and the structure of relations between them can circumscribe firms’ strategic responses to stakeholder conflict that require stakeholder cooperation. The second chapter explores the pathways by which firms attenuate stakeholder threats through an understudied phenomenon: cooperative non-market strategy, or when firms establish formal cooperative relationships with stakeholders. I find that cooperative non-market strategy is an effective way for firms to allay threats from a broad swathe of stakeholders by exploiting the social networks and identity of an allied stakeholder. The first two chapters draw on a unique, self-constructed 25-year panel of all contentious and collaborative interactions between 118 environmental movement organizations and Fortune 500 firms, complemented by multiplex network data on movements and firms. While the first two chapters explore cooperative non-market strategy, the last chapter demonstrates the utility of taking account of stakeholder fields in unilateral non-market strategy, in this case, improvements in corporate social and environmental performance. Drawing on a dataset of 250 million media-reported events to construct comprehensive socio-political networks and stakeholder fields across 42 countries, I find that stakeholder ties to country-level socio-political networks and to each other, and who participates in stakeholder fields and mobilizes against firms, manifest in observable differences in corporate social and environmental performance across countries.
In addition to establishing that stakeholder fields are central to explanations of non-market strategy, this dissertation finds that the mechanisms underlying their impact are multi-faceted, and consistently operate through two characteristics of stakeholder fields: the relational ties of stakeholders, and the identity of stakeholders within their field. Stakeholder fields are central to understanding firms’ strategic management of stakeholders because fields constrain stakeholder agency, are susceptible to influence through their relational structures and member identities, and in turn, influence issue salience for outsiders
The Art of Repression: Digital Dissent and Power Consolidation in El-Sisi’s Egypt
Imprecise measurement tools impede the study of protest mobilization. Mobilization proxies, such as counting protesters and protest events, result in significant outliers and variance while ignoring sociocultural, cybernetic, economic, legal, and other features that relevant academic literature considers essential to understanding mobilization dynamics. Without accurate empirical models, researchers’ and policymakers’ investigations of autocratic repression have little explanatory power. This thesis proposes a methodological addition to the mobilization literature: Two three-level scales distinguish an event’s potential to attract an audience from the protest’s actual output relative to similar episodes. I employ the Armed Conflict Location and Event Data (ACLED) project to demonstrate the measurement’s utility. Afterwards, I apply these models to conduct an impact assessment of recent Egyptian cyberregulatory laws. Controlling for the grievances of protesters and performing other robustness checks, the time series demonstrates a strong, statistically significant relationship between the policies and the reduction of low-level potential mobilizational capacity of Egyptian dissidents, but fails to identify an expected relationship between police pressure and the decline of mobilizational capacity. These findings contribute to the theoretical frameworks of mobilization scholars and policymaker discussions regarding the value of internet censorship tools for curtailing oppositional political action
Towards Optimized Traffic Provisioning and Adaptive Cache Management for Content Delivery
Content delivery networks (CDNs) deploy hundreds of thousands of servers around the world to cache and serve trillions of user requests every day for a diverse set of content such as web pages, videos, software downloads and images. In this dissertation, we propose algorithms to provision traffic across cache servers and manage the content they host to achieve performance objectives such as maximizing the cache hit rate, minimizing the bandwidth cost of the network and minimizing the energy consumption of the servers.
Traffic provisioning is the process of determining the set of content domains hosted on the servers. We propose footprint descriptors that effectively capture the popularity characteristics and caching performance of different content classes. We also propose a footprint descriptor calculus that can be used to decide how content should be mixed or partitioned to efficiently provision traffic. To automate traffic provisioning, we propose optimization models to provision traffic such that the cache miss traffic from the network is minimized without overloading the servers. We find that such optimization models produce significant reductions in the cache miss traffic when compared with traffic provisioning algorithms in use today.
Cache management is the process of deciding how content is cached in the servers of a CDN. We propose TTL-based caching algorithms that provably achieve performance targets specified by a CDN operator. We show that the proposed algorithms converge to the target hit rate and target cache size with low error. Finally, we propose cache management algorithms to make the servers energy-efficient using disk shutdown. We find that disk shutdown is well suited for CDN servers and provides energy savings without significantly impacting cache hit rates
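The abstract states that the TTL-based algorithms converge to a target hit rate but does not give the update rule. One common way to obtain this kind of behaviour is a stochastic-approximation update of a single TTL value. The sketch below assumes that approach, one shared TTL for all objects, one request per unit of time, and a timer reset on every request. The function name `adaptive_ttl_cache` and the step size are hypothetical, and this should not be read as the dissertation's algorithm.

```python
def adaptive_ttl_cache(requests, target_hit_rate, step=0.5, initial_ttl=1.0):
    """Replay a request sequence against a TTL cache with a self-tuning TTL.

    requests: list of object ids; request i is assumed to arrive at time i.
    Returns (observed hit rate, final TTL).
    """
    ttl = initial_ttl
    if not requests:
        return 0.0, ttl

    expiry = {}  # object id -> time at which its cached copy expires
    hits = 0
    for now, obj in enumerate(requests):
        # A request is a hit only if the object's timer has not yet expired
        hit = expiry.get(obj, -1.0) > now
        if hit:
            hits += 1
        # Assumed update rule: a miss pushes the TTL up by step * target,
        # a hit pulls it down by step * (1 - target); the TTL is kept >= 0
        ttl = max(0.0, ttl + step * (target_hit_rate - (1.0 if hit else 0.0)))
        # Reset the object's timer on every request, hit or miss
        expiry[obj] = now + ttl
    return hits / len(requests), ttl
```

With this rule the expected change in the TTL is zero exactly when the long-run hit probability equals `target_hit_rate`, which is the intuition behind using such an update to steer toward a target. Whether and how fast it converges depends on the request process and the step size, which this sketch does not analyse.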
Dealings on the Dark Web: An Examination of the Trust, Consumer Satisfaction, and the Efficacy of Interventions Against a Dark Web Cryptomarket
Abstract
Objective. The overarching goal of this thesis is to better understand not only the network dynamics which undergird the function and operation of cryptomarkets but also the nature of consumer satisfaction and trust on these platforms. More specifically, I endeavour to push the cryptomarket literature beyond its current theoretical and methodological limits by documenting the network structure of a cryptomarket, the factors which predict vendor trust, the efficacy of targeted strategies on the transactional network of a cryptomarket, and the dynamics which facilitate consumer satisfaction despite information asymmetry. Moreover, I also aim to test the generalizability of findings made in prior cryptomarket studies (Duxbury and Haynie, 2017; 2020; Norbutas, 2018).
Methods. I realize the aims of this research by using a buyer-seller dataset from the Abraxas cryptomarket (Branwen et al., 2015). Given the differences between the topics and the research questions featured, this thesis employs a variety of methodological techniques. Chapter two uses a combination of descriptive network analysis, community detection analysis, statistical modelling, and trajectory modelling. Chapter three utilizes three text analytic strategies: descriptive text analysis, sentiment analysis, and textual feature extraction. Finally, chapter four employs sequential node deletion pursuant to six law enforcement strategies: lead k (degree centrality), eccentricity, unique items bought/sold, cumulative reputation score, total purchase price, and random targeting.
Results. Social network analysis of the Abraxas cryptomarket revealed a large and diffuse network where the majority of buyers purchased from a small cohort of vendors. This theme of preferential selection of vendors on the part of buyers is repeated in other findings within this study. More generally, the Abraxas transactional network can then be viewed as a set of transactional islands as opposed to a large, densely connected conglomeration of vendors and buyers. With regard to buyer feedback, buyers are generally pleased with their transactions on Abraxas as long as the product arrives on time and is as advertised. In general, vendors have a relatively low bar to achieve when it comes to satisfying their customers. Based on the results of the sequential node deletion, random targeting was found to be ineffective across the five outcome measures, producing a minimal and slow disruptive effect. Finally, these strategies are based on a power law where a small percentage of deleted nodes is responsible for an outsized proportion of the disruptive impact.
Conclusion. As with all applied research examining emergent phenomena, this thesis lends itself to a more refined understanding of dark web cryptomarkets. While the results and conclusions drawn from these results are not perfectly generalizable to all cryptomarkets, they should serve to inform law enforcement on the dynamics which undergird these markets. To this extent, a sombre consideration of trust, consumer satisfaction, and tactical effectiveness of interventions is a necessary step towards the development of more effective countermeasures against these illicit online marketplaces. For law enforcement to be more effective against cryptomarkets, it is advised that an evidence-based approach be taken
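The sequential node deletion described in the Methods can be illustrated with a small sketch. It covers only one of the six strategies (targeting by degree) and assumes that degree is recomputed after every removal and that disruption is tracked as the size of the largest connected component. The abstract does not name its five outcome measures, so that measure, the function names, and the toy edge list in the usage note are all hypothetical.

```python
from collections import deque


def largest_component_size(adj):
    """Size of the largest connected component, found by breadth-first search."""
    seen = set()
    best = 0
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        size = 0
        while queue:
            node = queue.popleft()
            size += 1
            for neighbour in adj[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    queue.append(neighbour)
        best = max(best, size)
    return best


def sequential_degree_deletion(edges, num_removals):
    """Repeatedly delete the current highest-degree node.

    edges: iterable of (buyer, vendor) pairs, treated as undirected ties.
    Returns the largest-component size before any removal and after each one.
    """
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    trace = [largest_component_size(adj)]
    for _ in range(num_removals):
        if not adj:
            break
        # Highest current degree first; ties broken by node label for determinism
        target = max(adj, key=lambda n: (len(adj[n]), str(n)))
        for neighbour in adj.pop(target):
            if neighbour != target:  # guard against self-loops
                adj[neighbour].discard(target)
        trace.append(largest_component_size(adj))
    return trace
```

For a toy network such as `[("v1", "b1"), ("v1", "b2"), ("v1", "b3"), ("v2", "b4")]`, removing the single best-connected vendor already splits the largest island, which is the kind of outsized effect of a few deletions that the Results describe. Other strategies could be sketched by swapping the `key` used to pick `target`.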
Towards Massive Machine Type Communications in Ultra-Dense Cellular IoT Networks: Current Issues and Machine Learning-Assisted Solutions
The ever-increasing number of resource-constrained
Machine-Type Communication (MTC) devices is leading to the
critical challenge of fulfilling diverse communication requirements
in dynamic and ultra-dense wireless environments. Among
different application scenarios that the upcoming 5G and beyond
cellular networks are expected to support, such as enhanced Mobile
Broadband (eMBB), massive Machine Type Communications
(mMTC) and Ultra-Reliable and Low Latency Communications
(URLLC), the mMTC brings the unique technical challenge of
supporting a huge number of MTC devices in cellular networks,
which is the main focus of this paper. The related challenges
include Quality of Service (QoS) provisioning, handling highly
dynamic and sporadic MTC traffic, huge signalling overhead and
Radio Access Network (RAN) congestion. In this regard, this
paper aims to identify and analyze the involved technical issues,
to review recent advances, to highlight potential solutions and to
propose new research directions. First, starting with an overview
of mMTC features and QoS provisioning issues, we present
the key enablers for mMTC in cellular networks. Along with
the highlights on the inefficiency of the legacy Random Access
(RA) procedure in the mMTC scenario, we then present the key
features and channel access mechanisms in the emerging cellular
IoT standards, namely, LTE-M and Narrowband IoT (NB-IoT).
Subsequently, we present a framework for the performance
analysis of transmission scheduling with the QoS support along
with the issues involved in short data packet transmission. Next,
we provide a detailed overview of the existing and emerging
solutions towards addressing RAN congestion problem, and then
identify potential advantages, challenges and use cases for the
applications of emerging Machine Learning (ML) techniques in
ultra-dense cellular networks. Out of several ML techniques, we
focus on the application of low-complexity Q-learning approach
in the mMTC scenario along with the recent advances towards
enhancing its learning performance and convergence. Finally,
we discuss some open research challenges and promising future
research directions