14 research outputs found

    CBRS Spectrum Sharing between LTE-U and WiFi: A Multiarmed Bandit Approach

    Get PDF
    The surge of mobile devices such as smartphone and tablets requires additional capacity. To achieve ubiquitous and high data rate Internet connectivity, effective spectrum sharing and utilization of the wireless spectrum carry critical importance. In this paper, we consider the use of unlicensed LTE (LTE-U) technology in the 3.5 GHz Citizens Broadband Radio Service (CBRS) band and develop a multiarmed bandit (MAB) based spectrum sharing technique for a smooth coexistence with WiFi. In particular, we consider LTE-U to operate as a General Authorized Access (GAA) user; hereby MAB is used to adaptively optimize the transmission duty cycle of LTE-U transmissions. Additionally, we incorporate downlink power control which yields a high energy efficiency and interference suppression. Simulation results demonstrate a significant improvement in the aggregate capacity (approximately 33%) and cell-edge throughput of coexisting LTE-U and WiFi networks for different base station densities and user densities

    Bayesian Learning of Asymmetric Gaussian-Based Statistical Models using Markov Chain Monte Carlo Techniques

    Get PDF
    A novel unsupervised Bayesian learning framework based on asymmetric Gaussian mixture (AGM) statistical model is proposed since AGM is shown to be more effective compared to the classic Gaussian mixture. The Bayesian learning framework is developed by adopting sampling-based Markov chain Monte Carlo (MCMC) methodology. More precisely, the fundamental learning algorithm is a hybrid Metropolis-Hastings within Gibbs sampling solution which is integrated within a reversible jump MCMC (RJMCMC) learning framework, a self-adapted sampling-based MCMC implementation, that enables model transfer throughout the mixture parameters learning process, therefore, automatically converges to the optimal number of data groups. Furthermore, a feature selection technique is included to tackle the irrelevant and unneeded information from datasets. The performance comparison between AGM and other popular solutions is given and both synthetic and real data sets extracted from challenging applications such as intrusion detection, spam filtering and image categorization are evaluated to show the merits of the proposed approach

    From Data to Actions in Intelligent Transportation Systems: A Prescription of Functional Requirements for Model Actionability

    Get PDF
    Advances in Data Science permeate every field of Transportation Science and Engineering, resulting in developments in the transportation sector that are data-driven. Nowadays, Intelligent Transportation Systems (ITS) could be arguably approached as a “story” intensively producing and consuming large amounts of data. A diversity of sensing devices densely spread over the infrastructure, vehicles or the travelers’ personal devices act as sources of data flows that are eventually fed into software running on automatic devices, actuators or control systems producing, in turn, complex information flows among users, traffic managers, data analysts, traffic modeling scientists, etc. These information flows provide enormous opportunities to improve model development and decision-making. This work aims to describe how data, coming from diverse ITS sources, can be used to learn and adapt data-driven models for efficiently operating ITS assets, systems and processes; in other words, for data-based models to fully become actionable. Grounded in this described data modeling pipeline for ITS, we define the characteristics, engineering requisites and challenges intrinsic to its three compounding stages, namely, data fusion, adaptive learning and model evaluation. We deliberately generalize model learning to be adaptive, since, in the core of our paper is the firm conviction that most learners will have to adapt to the ever-changing phenomenon scenario underlying the majority of ITS applications. Finally, we provide a prospect of current research lines within Data Science that can bring notable advances to data-based ITS modeling, which will eventually bridge the gap towards the practicality and actionability of such models.This work was supported in part by the Basque Government for its funding support through the EMAITEK program (3KIA, ref. KK-2020/00049). It has also received funding support from the Consolidated Research Group MATHMODE (IT1294-19) granted by the Department of Education of the Basque Government

    Stock Price Change Rate Prediction by Utilizing Social Network Activities

    Get PDF
    Predicting stock price change rates for providing valuable information to investors is a challenging task. Individual participants may express their opinions in social network service (SNS) before or after their transactions in the market; we hypothesize that stock price change rate is better predicted by a function of social network service activities and technical indicators than by a function of just stock market activities. The hypothesis is tested by accuracy of predictions as well as performance of simulated trading because success or failure of prediction is better measured by profits or losses the investors gain or suffer. In this paper, we propose a hybrid model that combines multiple kernel learning (MKL) and genetic algorithm (GA). MKL is adopted to optimize the stock price change rate prediction models that are expressed in a multiple kernel linear function of different types of features extracted from different sources. GA is used to optimize the trading rules used in the simulated trading by fusing the return predictions and values of three well-known overbought and oversold technical indicators. Accumulated return and Sharpe ratio were used to test the goodness of performance of the simulated trading. Experimental results show that our proposed model performed better than other models including ones using state of the art techniques

    Networks, Communication, and Computing Vol. 2

    Get PDF
    Networks, communications, and computing have become ubiquitous and inseparable parts of everyday life. This book is based on a Special Issue of the Algorithms journal, and it is devoted to the exploration of the many-faceted relationship of networks, communications, and computing. The included papers explore the current state-of-the-art research in these areas, with a particular interest in the interactions among the fields

    A Study on Anomaly Detection Using Mixture Models

    Get PDF
    With the increase in networks capacities and number of online users, threats of different cyber attacks on computer networks also increased significantly, causing the loss of a vast amount of money every year to various organizations. This requires the need to identify and group these threats according to different attack types. Many anomaly detection systems have been introduced over the years based on different machine learning algorithms. More precisely, unsupervised learning algorithms have proven to be very effective. In many research studies, to build an effective ADS system, finite mixture models have been widely accepted as an essential clustering method. In this thesis, we deploy different non-Gaussian mixture models that have been proven to model well bounded and semi-bounded data. These models are based on the Dirichlet family of distributions. The deployed models are tested with Geometric Area Analysis Technique (GAA) and with an adversarial learning framework. Moreover, we build an effective hybrid anomaly detection system with finite and in-finite mixture models. In addition, we propose a feature selection approach based on the highest vote obtained. We evaluated the performance of mixture models with Geometric Area Analysis technique based on Trapezoidal Area Estimation (TAE) and the effect of adversarial learning on ADS performance via extensive experiments based on well-known data sets

    Workflow models for heterogeneous distributed systems

    Get PDF
    The role of data in modern scientific workflows becomes more and more crucial. The unprecedented amount of data available in the digital era, combined with the recent advancements in Machine Learning and High-Performance Computing (HPC), let computers surpass human performances in a wide range of fields, such as Computer Vision, Natural Language Processing and Bioinformatics. However, a solid data management strategy becomes crucial for key aspects like performance optimisation, privacy preservation and security. Most modern programming paradigms for Big Data analysis adhere to the principle of data locality: moving computation closer to the data to remove transfer-related overheads and risks. Still, there are scenarios in which it is worth, or even unavoidable, to transfer data between different steps of a complex workflow. The contribution of this dissertation is twofold. First, it defines a novel methodology for distributed modular applications, allowing topology-aware scheduling and data management while separating business logic, data dependencies, parallel patterns and execution environments. In addition, it introduces computational notebooks as a high-level and user-friendly interface to this new kind of workflow, aiming to flatten the learning curve and improve the adoption of such methodology. Each of these contributions is accompanied by a full-fledged, Open Source implementation, which has been used for evaluation purposes and allows the interested reader to experience the related methodology first-hand. The validity of the proposed approaches has been demonstrated on a total of five real scientific applications in the domains of Deep Learning, Bioinformatics and Molecular Dynamics Simulation, executing them on large-scale mixed cloud-High-Performance Computing (HPC) infrastructures

    Sentiment Analysis for Social Media

    Get PDF
    Sentiment analysis is a branch of natural language processing concerned with the study of the intensity of the emotions expressed in a piece of text. The automated analysis of the multitude of messages delivered through social media is one of the hottest research fields, both in academy and in industry, due to its extremely high potential applicability in many different domains. This Special Issue describes both technological contributions to the field, mostly based on deep learning techniques, and specific applications in areas like health insurance, gender classification, recommender systems, and cyber aggression detection

    A submodular optimization framework for never-ending learning : semi-supervised, online, and active learning.

    Get PDF
    The revolution in information technology and the explosion in the use of computing devices in people\u27s everyday activities has forever changed the perspective of the data mining and machine learning fields. The enormous amounts of easily accessible, information rich data is pushing the data analysis community in general towards a shift of paradigm. In the new paradigm, data comes in the form a stream of billions of records received everyday. The dynamic nature of the data and its sheer size makes it impossible to use the traditional notion of offline learning where the whole data is accessible at any time point. Moreover, no amount of human resources is enough to get expert feedback on the data. In this work we have developed a unified optimization based learning framework that approaches many of the challenges mentioned earlier. Specifically, we developed a Never-Ending Learning framework which combines incremental/online, semi-supervised, and active learning under a unified optimization framework. The established framework is based on the class of submodular optimization methods. At the core of this work we provide a novel formulation of the Semi-Supervised Support Vector Machines (S3VM) in terms of submodular set functions. The new formulation overcomes the non-convexity issues of the S3VM and provides a state of the art solution that is orders of magnitude faster than the cutting edge algorithms in the literature. Next, we provide a stream summarization technique via exemplar selection. This technique makes it possible to keep a fixed size exemplar representation of a data stream that can be used by any label propagation based semi-supervised learning technique. The compact data steam representation allows a wide range of algorithms to be extended to incremental/online learning scenario. Under the same optimization framework, we provide an active learning algorithm that constitute the feedback between the learning machine and an oracle. Finally, the developed Never-Ending Learning framework is essentially transductive in nature. Therefore, our last contribution is an inductive incremental learning technique for incremental training of SVM using the properties of local kernels. We demonstrated through this work the importance and wide applicability of the proposed methodologies

    Information Retrieval Based on DOM Trees

    Full text link
    [ES] Desde hace varios años, la cantidad de información disponible en la web crece de manera exponencial. Cada día se genera una gran cantidad de información que prácticamente de inmediato está disponible en la web. Los buscadores e indexadores recorren diariamente la web para encontrar toda esa información que se ha ido añadiendo y así, ponerla a disposición del usuario devolviéndola en los resultados de las búsquedas. Sin embargo, la cantidad de información es tan grande que debe ser preprocesada con anterioridad. Dado que el usuario que realiza una búsqueda de información solamente está interesado en la información relevante, no tiene sentido que los buscadores e indexadores procesen el resto de elementos de las páginas web. El procesado de elementos irrelevantes de páginas web supone un gasto de recursos innecesario, como por ejemplo espacio de almacenamiento, tiempo de procesamiento, uso de ancho de banda, etc. Se estima que entre el 40% y el 50% del contenido de las páginas web son elementos irrelevantes. Por eso, en los últimos 20 años se han desarrollado técnicas para la detección de elementos tanto relevantes como irrelevantes de páginas web. Este objetivo se puede abordar de diversas maneras, por lo que existen técnicas diametralmente distintas para afrontar el problema. Esta tesis se centra en el desarrollo de técnicas basadas en árboles DOM para la detección de diversas partes de las páginas web, como son el contenido principal, la plantilla, y el menú. La mayoría de técnicas existentes se centran en la detección de texto dentro del contenido principal de las páginas web, ya sea eliminando la plantilla de dichas páginas o detectando directamente el contenido principal. Las técnicas que proponemos no sólo son capaces de realizar la extracción de texto, sino que, bien por eliminación de plantilla o bien por detección del contenido principal, son capaces de aislar cualquier elemento relevante de las páginas web, como por ejemplo imágenes, animaciones, videos, etc. Dichas técnicas no sólo son útiles para buscadores y rastreadores, sino que también pueden ser útiles directamente para el usuario que navega por la web. Por ejemplo, en el caso de usuarios con diversidad funcional (como sería una ceguera) puede ser interesante la eliminación de elementos irrelevantes para facilitar la lectura (o escucha) de las páginas web. Para hacer las técnicas accesibles a todo el mundo, las hemos implementado como extensiones del navegador, y son compatibles con navegadores basados en Mozilla o en Chromium. Además, estas herramientas están públicamente disponibles para que cualquier persona interesada pueda acceder a ellas y continuar con la investigación si así lo deseara.[CA] Des de fa diversos anys, la quantitat d'informació disponible en la web creix de manera exponencial. Cada dia es genera una gran quantitat d'informació que immediatament es posa disponible en la web. Els cercadors i indexadors recorren diàriament la web per a trobar tota aqueixa informació que s'ha anat afegint i així, posar-la a la disposició de l'usuari retornant-la en els resultats de les cerques. No obstant això, la quantitat d'informació és tan gran que aquesta ha de ser preprocessada. Atés que l'usuari que realitza una cerca d'informació solament es troba interessat en la informació rellevant, no té sentit que els cercadors i indexadors processen la resta d'elements de les pàgines web. El processament d'elements irrellevants de pàgines web suposa una despesa de recursos innecessària, com per exemple espai d'emmagatzematge, temps de processament, ús d'amplada de banda, etc. S'estima que entre el 40% i el 50% del contingut de les pàgines web són elements irrellevants. Precisament per això, en els últims 20 anys s'han desenvolupat tècniques per a la detecció d'elements tant rellevants com irrellevants de pàgines web. Aquest objectiu es pot afrontar de diverses maneres, per la qual cosa existeixen tècniques diametralment diferents per a afrontar el problema. Aquesta tesi se centra en el desenvolupament de tècniques basades en arbres DOM per a la detecció de diverses parts de les pàgines web, com són el contingut principal, la plantilla, i el menú. La majoria de tècniques existents se centren en la detecció de text dins del contingut principal de les pàgines web, ja siga eliminant la plantilla d'aquestes pàgines o detectant directament el contingut principal. Les tècniques que hi proposem no sols són capaces de realitzar l'extracció de text, sinó que, bé per eliminació de plantilla o bé per detecció del contingut principal, són capaços d'aïllar qualsevol element rellevant de les pàgines web, com per exemple imatges, animacions, vídeos, etc. Aquestes tècniques no sols són útils per a cercadors i rastrejadors, sinó també poden ser útils directament per a l'usuari que navega per la web. Per exemple, en el cas d'usuaris amb diversitat funcional (com ara una ceguera) pot ser interessant l'eliminació d'elements irrellevants per a facilitar-ne la lectura (o l'escolta) de les pàgines web. Per a fer les tècniques accessibles a tothom, les hem implementades com a extensions del navegador, i són compatibles amb navegadors basats en Mozilla i en Chromium. A més, aquestes eines estan públicament disponibles perquè qualsevol persona interessada puga accedir a elles i continuar amb la investigació si així ho desitjara.[EN] For several years, the amount of information available on the Web has been growing exponentially. Every day, a huge amount of data is generated and it is made immediately available on the Web. Indexers and crawlers browse the Web daily to find the new information that has been added, and they make it available to answer the users' search queries. However, the amount of information is so huge that it must be preprocessed. Given that users are only interested in the relevant information, it is not necessary for indexers and crawlers to process other boilerplate, redundant or useless elements of the web pages. Processing such irrelevant elements lead to an unnecessary waste of resources, such as storage space, runtime, bandwidth, etc. Different studies have shown that between 40% and 50% of the data on the Web are noisy elements. For this reason, several techniques focused on the detection of both, relevant and irrelevant data, have been developed over the last 20 years. The problems of identifying the relevant content of a web page, its template, its menu, etc. can be faced in various ways, and for this reason, there exist completely different techniques to address those problems. This thesis is focused on the development of information retrieval techniques based on DOM trees. Its goal is to detect different parts of a web page, such as the main content, the template, and the main menu. Most of the existing techniques are focused on the detection of text inside the main content of the web pages, mainly by removing the template of the web page or by inferring the main content. The techniques proposed in this thesis do not only extract text by eliminating the template or inferring the main content, but also extract any other relevant information from web pages such as images, animations, videos, etc. Our techniques are not only useful for indexers and crawlers but also for the user browsing the Web. For instance, in the case of users with functional diversity problems (such as blindness), removing noisy elements can facilitate them to read (or listen to) the web pages. To make the techniques broadly accessible to everybody, we have implemented them as browser extensions, which are compatible with Mozilla-based and Chromium-based browsers. In addition, these tools are publicly available, so any interested person can access them and continue with the research if they wish to do so.Alarte Aleixandre, J. (2023). Information Retrieval Based on DOM Trees [Tesis doctoral]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/19667
    corecore