Unsupervised learning in high-dimensional space
Thesis (Ph.D.)--Boston University
In machine learning, the problem of unsupervised learning is that of trying to explain key features and find hidden structures in unlabeled data. In this thesis we focus on three unsupervised learning scenarios: graph-based clustering with imbalanced data, point-wise anomaly detection, and anomalous cluster detection on graphs.
In the first part we study spectral clustering, a popular graph-based clustering technique. We investigate why spectral clustering performs poorly on imbalanced and proximal data. We then propose the partition constrained minimum cut (PCut) framework, based on a novel parametric graph construction method that is shown to adapt to different degrees of imbalance in the data. We analyze the limit cut behavior of our approach and demonstrate significant performance improvements in clustering and semi-supervised learning experiments on imbalanced data. [TRUNCATED]
Monitoring the health and integrity of Wireless Sensor Networks
Wireless Sensor Networks (WSNs) will play a major role in the Internet of Things collecting the data that will support decision-making and enable the automation of many applications. Nevertheless, the introduction of these devices into our daily life raises serious concerns about their integrity. Therefore, at any given point, one must be able to tell whether or not a node has been compromised. Moreover, it is crucial to understand how the compromise of a particular node or set of nodes may affect the network operation.
In this thesis, we present a framework to monitor the health and integrity of WSNs that allows us to detect compromised devices and understand how they might affect a network's performance. We start by investigating the use of attestation to identify malicious nodes and advance the state of the art by addressing limitations of existing mechanisms. Firstly, we tackle effectiveness and scalability by combining attestation with measurement inspection and show that the right combination of both schemes can achieve high accuracy whilst significantly reducing power consumption. Secondly, we propose a novel stochastic software-based attestation approach that relaxes a fundamental yet overlooked assumption made in the literature, significantly reducing time and energy consumption while improving the detection rate of honest devices.
Lastly, we propose a mathematical model to represent the health of a WSN according to its ability to perform its functions. Our model combines knowledge of compromised nodes with additional information that quantifies the importance of each node. In this context, we propose a new centrality measure and analyse how well existing metrics rank the importance of each sensor node to network connectivity. We demonstrate that while no measure is invariably better, our proposed metric outperforms the others in the vast majority of cases.
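The abstract does not define the proposed centrality measure, so the following is only a minimal sketch of the general idea of ranking nodes by their importance to connectivity. It scores each node by the fraction of the remaining nodes that fall outside the largest connected component once that node is removed. The function names and the toy topology are hypothetical and are not taken from the thesis.

```python
from collections import deque


def largest_component_size(adj, removed=frozenset()):
    """Size of the largest connected component, ignoring nodes in `removed`."""
    seen = set(removed)
    best = 0
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        size = 0
        while queue:  # breadth-first search over one component
            node = queue.popleft()
            size += 1
            for neighbour in adj[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    queue.append(neighbour)
        best = max(best, size)
    return best


def connectivity_impact(adj):
    """Hypothetical importance score (an assumption, not the thesis metric):
    share of the remaining nodes cut off from the largest component
    when a single node is removed."""
    scores = {}
    remaining = len(adj) - 1
    for node in adj:
        if remaining == 0:
            scores[node] = 0.0
            continue
        scores[node] = 1 - largest_component_size(adj, {node}) / remaining
    return scores


# Toy undirected topology: a sink, one relay, and three leaf sensors.
topology = {
    "sink": ["r1", "s3"],
    "r1": ["sink", "s1", "s2"],
    "s1": ["r1"],
    "s2": ["r1"],
    "s3": ["sink"],
}
scores = connectivity_impact(topology)
ranking = sorted(scores, key=scores.get, reverse=True)
```

In this toy network the relay `r1` ranks first (score 0.5), ahead of the sink (0.25), because removing it isolates two sensors from the rest of the network. A health model of the kind described could weight compromised nodes by such a score, though how the thesis combines the two is not stated here.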
Unified architecture of mobile ad hoc network security (MANS) system
In this dissertation, a unified architecture for a Mobile Ad-hoc Network Security (MANS) system is proposed, under which the IDS agent, authentication, recovery policy, and other policies can be defined formally and explicitly and are enforced by a uniform architecture. A new authentication model for high-value transactions in cluster-based MANETs is also designed within the MANS system. This model builds on previous work, retaining its strengths while avoiding its shortcomings: it uses threshold sharing of the certificate signing key within each cluster to distribute the certificate services, and a certificate chain and certificate repository to achieve better scalability, lower overhead, and better security performance. An Intrusion Detection System (IDS) is installed in every node and is responsible for collecting local data from its host node and from neighbor nodes within its communication range, pre-processing the raw data and periodically broadcasting it to its neighborhood, and classifying behavior as normal or abnormal based on the pre-processed data from its host node and neighbor nodes. The security recovery policy in ad hoc networks is the procedure of making a global decision according to messages received from the distributed IDS and restoring the whole system to operational health if any user or host conducts inappropriate, incorrect, or anomalous activities that threaten the connectivity or reliability of the network or the authenticity of its data traffic. Finally, a quantitative risk assessment model is proposed to numerically evaluate MANS security.
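The abstract mentions threshold sharing of the certificate signing key but does not name a scheme. As an illustration only, the sketch below uses Shamir's secret sharing, a standard k-of-n construction, as an assumed stand-in. The field size, function names, and the integer used as the "key" are hypothetical.

```python
import secrets

# Assumed field modulus for illustration: the Mersenne prime 2^127 - 1.
PRIME = 2**127 - 1


def split_secret(secret, threshold, num_shares):
    """Split `secret` into `num_shares` points; any `threshold` recover it."""
    if not 0 <= secret < PRIME:
        raise ValueError("secret must lie in [0, PRIME)")
    if not 1 <= threshold <= num_shares:
        raise ValueError("need 1 <= threshold <= num_shares")
    # Random polynomial of degree threshold - 1 with the secret as constant term.
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(threshold - 1)]
    shares = []
    for x in range(1, num_shares + 1):
        y = 0
        for c in reversed(coeffs):  # Horner evaluation modulo PRIME
            y = (y * x + c) % PRIME
        shares.append((x, y))
    return shares


def recover_secret(shares):
    """Lagrange interpolation at x = 0 over the prime field."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        # pow(den, -1, PRIME) is the modular inverse (Python 3.8+).
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret


# Hypothetical cluster of 5 nodes where any 3 can act together.
signing_key = 123456789
shares = split_secret(signing_key, threshold=3, num_shares=5)
recovered = recover_secret(shares[:3])
```

Note that this sketch reconstructs the key in one place, which is the simplest way to show the threshold property. A deployed threshold certificate service would more plausibly have nodes produce partial signatures so that the key is never reassembled; whether the dissertation does so is not stated in the abstract.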
Cellular, Wide-Area, and Non-Terrestrial IoT: A Survey on 5G Advances and the Road Towards 6G
The next wave of wireless technologies is proliferating in connecting things
among themselves as well as to humans. In the era of the Internet of things
(IoT), billions of sensors, machines, vehicles, drones, and robots will be
connected, making the world around us smarter. The IoT will encompass devices
that must wirelessly communicate a diverse set of data gathered from the
environment for myriad new applications. The ultimate goal is to extract
insights from this data and develop solutions that improve quality of life and
generate new revenue. Providing large-scale, long-lasting, reliable, and near
real-time connectivity is the major challenge in enabling a smart connected
world. This paper provides a comprehensive survey on existing and emerging
communication solutions for serving IoT applications in the context of
cellular, wide-area, as well as non-terrestrial networks. Specifically,
wireless technology enhancements for providing IoT access in fifth-generation
(5G) and beyond cellular networks, and communication networks over the
unlicensed spectrum are presented. Aligned with the main key performance
indicators of 5G and beyond 5G networks, we investigate solutions and standards
that enable energy efficiency, reliability, low latency, and scalability
(connection density) of current and future IoT networks. The solutions include
grant-free access and channel coding for short-packet communications,
non-orthogonal multiple access, and on-device intelligence. Further, a vision
of new paradigm shifts in communication networks in the 2030s is provided, and
the integration of the associated new technologies like artificial
intelligence, non-terrestrial networks, and new spectra is elaborated. Finally,
future research directions toward beyond 5G IoT networks are pointed out.
Comment: Submitted for review to IEEE CS&
A Machine Learning Enhanced Scheme for Intelligent Network Management
Versatile networking services have a huge influence on daily life, while the number and diversity of services make network systems highly complex. Network scale and complexity grow with the increasing number of infrastructure devices, network functions, and network slices, and with the evolution of the underlying architecture. Large and complex platforms are conventionally maintained through manual administration, which makes effective and insightful management difficult. A feasible and promising alternative is to extract insightful information from the large volume of network data produced. The goal of this thesis is to use learning-based algorithms from the machine learning community to discover valuable knowledge in substantial network data, which directly promotes intelligent management and maintenance. In this thesis, management and maintenance focus on two schemes: network anomaly detection and root cause localization, and critical traffic resource control and optimization. Firstly, abundant network data contain informative messages, but their heterogeneity and complexity make diagnosis challenging. For unstructured logs, abstract and formatted log templates are extracted to regularize log records. An in-depth analysis framework based on heterogeneous data is proposed to detect the occurrence of faults and anomalies. It employs representation learning methods to map unstructured data into numerical features and fuses the extracted features for network anomaly and fault detection. The representation learning uses word2vec-based embedding techniques for semantic expression. Next, fault and anomaly detection only reveals that events have occurred and does not identify their root causes, so fault localization is needed to narrow down the source of systematic anomalies.
The extracted features are formed into an anomaly degree, which is coupled with an importance ranking method to highlight the locations of anomalies in network systems. Two ranking modes, instantiated by PageRank and operation errors, jointly highlight latent problem locations. Besides fault and anomaly detection, network traffic engineering manages communication and computation resources to optimize data transfer efficiency. Especially when network traffic is constrained by communication conditions, a proactive path planning scheme helps control traffic efficiently. A learning-based traffic planning algorithm, based on a sequence-to-sequence model, is therefore proposed to discover reasonable hidden paths from abundant traffic history data over the Software Defined Network architecture. Finally, traffic engineering based merely on empirical data is likely to produce stale and sub-optimal solutions, or even to make the situation worse. A resilient mechanism is required to adapt network flows to a dynamic environment based on context. Thus, a reinforcement learning-based scheme that considers network resource status is put forward for dynamic data forwarding, and it shows a promising performance improvement. In the end, the proposed anomaly processing framework strengthens analysis and diagnosis for network system administrators through combined fault detection and root cause localization. The learning-based traffic engineering supports network flow management using historical data and points to a promising direction for flexible traffic adjustment in ever-changing environments.
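The abstract says one ranking mode is instantiated by PageRank but gives no details of the graph or weighting. The following is a minimal sketch of plain PageRank by power iteration on a hypothetical component dependency graph. The edge direction (caller to callee), the component names, and the parameter values are assumptions for illustration, not the thesis's configuration.

```python
def pagerank(out_links, damping=0.85, iterations=100):
    """Plain PageRank by power iteration.

    `out_links` maps each node to the list of nodes it points to.
    Every target must also appear as a key. Rank held by nodes with
    no outgoing links is redistributed uniformly at each step.
    """
    nodes = list(out_links)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iterations):
        dangling = sum(rank[v] for v in nodes if not out_links[v])
        new_rank = {v: (1 - damping) / n + damping * dangling / n for v in nodes}
        for v in nodes:
            for w in out_links[v]:
                new_rank[w] += damping * rank[v] / len(out_links[v])
        rank = new_rank
    return rank


# Hypothetical dependency graph: an edge points from a caller to its callee,
# so rank accumulates on components that many others depend on.
dependencies = {
    "web": ["api"],
    "api": ["db"],
    "batch": ["db"],
    "db": [],
}
ranks = pagerank(dependencies)
most_central = max(ranks, key=ranks.get)
```

Here the shared dependency `db` receives the highest rank. A localization scheme along the lines described could combine such a score with a per-component anomaly degree, but the exact combination is not given in the abstract.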