18 research outputs found

    Web pre-fetching schemes using Machine Learning for Mobile Cloud Computing

    Get PDF
    Pre-fetching is one of the technologies used in reducing latency on network traffic on the Internet. We propose this technology to utilise Mobile Cloud Computing (MCC) environment to handle latency issues in context of data management. However, overaggressive use of the pre-fetching technique causes overhead and slows down the system performance since pre-fetching the wrong objects data wastes the storage capacity of a mobile device. Many studies have been using Machine Learning (ML) to solve such issues. However, in MCC environment, the pre-fetching using ML is not widely used. Therefore, this research aims to implement ML techniques to classify the web objects that require decision rules. These decision rules are generated using few ML algorithms such as J48, Random Tree (RT), Naive Bayes (NB) and Rough Set (RS).These rules represent the characteristics of the input data accordingly. The experimental results reveal that J48 performs well in classifying the web objects for all three different datasets with testing accuracy of 95.49%, 98.28% and 97.9% for the UTM blog data, IRCache, and Proxy Cloud Computing (CC) datasets respectively. It shows that J48 algorithm is capable to handle better cloud data management with good recommendation to users with or without the cloud storage

    Sistemas interativos e distribuídos para telemedicina

    Get PDF
    doutoramento Ciências da ComputaçãoDurante as últimas décadas, as organizações de saúde têm vindo a adotar continuadamente as tecnologias de informação para melhorar o funcionamento dos seus serviços. Recentemente, em parte devido à crise financeira, algumas reformas no sector de saúde incentivaram o aparecimento de novas soluções de telemedicina para otimizar a utilização de recursos humanos e de equipamentos. Algumas tecnologias como a computação em nuvem, a computação móvel e os sistemas Web, têm sido importantes para o sucesso destas novas aplicações de telemedicina. As funcionalidades emergentes de computação distribuída facilitam a ligação de comunidades médicas, promovem serviços de telemedicina e a colaboração em tempo real. Também são evidentes algumas vantagens que os dispositivos móveis podem introduzir, tais como facilitar o trabalho remoto a qualquer hora e em qualquer lugar. Por outro lado, muitas funcionalidades que se tornaram comuns nas redes sociais, tais como a partilha de dados, a troca de mensagens, os fóruns de discussão e a videoconferência, têm o potencial para promover a colaboração no sector da saúde. Esta tese teve como objetivo principal investigar soluções computacionais mais ágeis que permitam promover a partilha de dados clínicos e facilitar a criação de fluxos de trabalho colaborativos em radiologia. Através da exploração das atuais tecnologias Web e de computação móvel, concebemos uma solução ubíqua para a visualização de imagens médicas e desenvolvemos um sistema colaborativo para a área de radiologia, baseado na tecnologia da computação em nuvem. Neste percurso, foram investigadas metodologias de mineração de texto, de representação semântica e de recuperação de informação baseada no conteúdo da imagem. Para garantir a privacidade dos pacientes e agilizar o processo de partilha de dados em ambientes colaborativos, propomos ainda uma metodologia que usa aprendizagem automática para anonimizar as imagens médicasDuring the last decades, healthcare organizations have been increasingly relying on information technologies to improve their services. At the same time, the optimization of resources, both professionals and equipment, have promoted the emergence of telemedicine solutions. Some technologies including cloud computing, mobile computing, web systems and distributed computing can be used to facilitate the creation of medical communities, and the promotion of telemedicine services and real-time collaboration. On the other hand, many features that have become commonplace in social networks, such as data sharing, message exchange, discussion forums, and a videoconference, have also the potential to foster collaboration in the health sector. The main objective of this research work was to investigate computational solutions that allow us to promote the sharing of clinical data and to facilitate the creation of collaborative workflows in radiology. By exploring computing and mobile computing technologies, we have designed a solution for medical imaging visualization, and developed a collaborative system for radiology, based on cloud computing technology. To extract more information from data, we investigated several methodologies such as text mining, semantic representation, content-based information retrieval. Finally, to ensure patient privacy and to streamline the data sharing in collaborative environments, we propose a machine learning methodology to anonymize medical images

    Transaction-filtering data mining and a predictive model for intelligent data management

    Get PDF
    This thesis, first of all, proposes a new data mining paradigm (transaction-filtering association rule mining) addressing a time consumption issue caused by the repeated scans of original transaction databases in conventional associate rule mining algorithms. An in-memory transaction filter is designed to discard those infrequent items in the pruning steps. This filter is a data structure to be updated at the end of each iteration. The results based on an IBM benchmark show that an execution time reduction of 10% - 19% is achieved compared with the base case. Next, a data mining-based predictive model is then established contributing to intelligent data management within the context of Centre for Grid Computing. The capability of discovering unseen rules, patterns and correlations enables data mining techniques favourable in areas where massive amounts of data are generated. The past behaviours of two typical scenarios (network file systems and Data Grids) have been analyzed to build the model. The future popularity of files can be forecasted with an accuracy of 90% by deploying the above predictor based on the given real system traces. A further step towards intelligent policy design is achieved by analyzing the prediction results of files’ future popularity. The real system trace-based simulations have shown improvements of 2-4 times in terms of data response time in network file system scenario and 24% mean job time reduction in Data Grids compared with conventional cases.EThOS - Electronic Theses Online ServiceGBUnited Kingdo

    Human Mobility and Application Usage Prediction Algorithms for Mobile Devices

    Get PDF
    Mobile devices such as smartphones and smart watches are ubiquitous companions of humans’ daily life. Since 2014, there are more mobile devices on Earth than humans. Mobile applications utilize sensors and actuators of these devices to support individuals in their daily life. In particular, 24% of the Android applications leverage users’ mobility data. For instance, this data allows applications to understand which places an individual typically visits. This allows providing her with transportation information, location-based advertisements, or to enable smart home heating systems. These and similar scenarios require the possibility to access the Internet from everywhere and at any time. To realize these scenarios 83% of the applications available in the Android Play Store require the Internet to operate properly and therefore access it from everywhere and at any time. Mobile applications such as Google Now or Apple Siri utilize human mobility data to anticipate where a user will go next or which information she is likely to access en route to her destination. However, predicting human mobility is a challenging task. Existing mobility prediction solutions are typically optimized a priori for a particular application scenario and mobility prediction task. There is no approach that allows for automatically composing a mobility prediction solution depending on the underlying prediction task and other parameters. This approach is required to allow mobile devices to support a plethora of mobile applications running on them, while each of the applications support its users by leveraging mobility predictions in a distinct application scenario. Mobile applications rely strongly on the availability of the Internet to work properly. However, mobile cellular network providers are struggling to provide necessary cellular resources. Mobile applications generate a monthly average mobile traffic volume that ranged between 1 GB in Asia and 3.7 GB in North America in 2015. The Ericsson Mobility Report Q1 2016 predicts that by the end of 2021 this mobile traffic volume will experience a 12-fold increase. The consequences are higher costs for both providers and consumers and a reduced quality of service due to congested mobile cellular networks. Several countermeasures can be applied to cope with these problems. For instance, mobile applications apply caching strategies to prefetch application content by predicting which applications will be used next. However, existing solutions suffer from two major shortcomings. They either (1) do not incorporate traffic volume information into their prefetching decisions and thus generate a substantial amount of cellular traffic or (2) require a modification of mobile application code. In this thesis, we present novel human mobility and application usage prediction algorithms for mobile devices. These two major contributions address the aforementioned problems of (1) selecting a human mobility prediction model and (2) prefetching of mobile application content to reduce cellular traffic. First, we address the selection of human mobility prediction models. We report on an extensive analysis of the influence of temporal, spatial, and phone context data on the performance of mobility prediction algorithms. Building upon our analysis results, we present (1) SELECTOR – a novel algorithm for selecting individual human mobility prediction models and (2) MAJOR – an ensemble learning approach for human mobility prediction. Furthermore, we introduce population mobility models and demonstrate their practical applicability. In particular, we analyze techniques that focus on detection of wrong human mobility predictions. Among these techniques, an ensemble learning algorithm, called LOTUS, is designed and evaluated. Second, we present EBC – a novel algorithm for prefetching mobile application content. EBC’s goal is to reduce cellular traffic consumption to improve application content freshness. With respect to existing solutions, EBC presents novel techniques (1) to incorporate different strategies for prefetching mobile applications depending on the available network type and (2) to incorporate application traffic volume predictions into the prefetching decisions. EBC also achieves a reduction in application launch time to the cost of a negligible increase in energy consumption. Developing human mobility and application usage prediction algorithms requires access to human mobility and application usage data. To this end, we leverage in this thesis three publicly available data set. Furthermore, we address the shortcomings of these data sets, namely, (1) the lack of ground-truth mobility data and (2) the lack of human mobility data at short-term events like conferences. We contribute with JK2013 and UbiComp Data Collection Campaign (UbiDCC) two human mobility data sets that address these shortcomings. We also develop and make publicly available a mobile application called LOCATOR, which was used to collect our data sets. In summary, the contributions of this thesis provide a step further towards supporting mobile applications and their users. With SELECTOR, we contribute an algorithm that allows optimizing the quality of human mobility predictions by appropriately selecting parameters. To reduce the cellular traffic footprint of mobile applications, we contribute with EBC a novel approach for prefetching of mobile application content by leveraging application usage predictions. Furthermore, we provide insights about how and to what extent wrong and uncertain human mobility predictions can be detected. Lastly, with our mobile application LOCATOR and two human mobility data sets, we contribute practical tools for researchers in the human mobility prediction domain

    Web Usage Mining: Algorithms and results

    Get PDF

    Combination of web usage, content and structure information for diverse web mining applications in the tourism context and the context of users with disabilities

    Get PDF
    188 p.This PhD focuses on the application of machine learning techniques for behaviourmodelling in different types of websites. Using data mining techniques two aspects whichare problematic and difficult to solve have been addressed: getting the system todynamically adapt to possible changes of user preferences, and to try to extract theinformation necessary to ensure the adaptation in a transparent manner for the users,without infringing on their privacy. The work in question combines information of differentnature such as usage information, content information and website structure and usesappropriate web mining techniques to extract as much knowledge as possible from thewebsites. The extracted knowledge is used for different purposes such as adaptingwebsites to the users through proposals of interesting links, so that the users can get therelevant information more easily and comfortably; for discovering interests or needs ofusers accessing the website and to inform the service providers about it; or detectingproblems during navigation.Systems have been successfully generated for two completely different fields: thefield of tourism, working with the website of bidasoa turismo (www.bidasoaturismo.com)and, the field of disabled people, working with discapnet website (www.discapnet.com)from ONCE/Tecnosite foundation

    Practical Analysis of Encrypted Network Traffic

    Get PDF
    The growing use of encryption in network communications is an undoubted boon for user privacy. However, the limitations of real-world encryption schemes are still not well understood, and new side-channel attacks against encrypted communications are disclosed every year. Furthermore, encrypted network communications, by preventing inspection of packet contents, represent a significant challenge from a network security perspective: our existing infrastructure relies on such inspection for threat detection. Both problems are exacerbated by the increasing prevalence of encrypted traffic: recent estimates suggest that 65% or more of downstream Internet traffic will be encrypted by the end of 2016. This work addresses these problems by expanding our understanding of the properties and characteristics of encrypted network traffic and exploring new, specialized techniques for the handling of encrypted traffic by network monitoring systems. We first demonstrate that opaque traffic, of which encrypted traffic is a subset, can be identified in real-time and how this ability can be leveraged to improve the capabilities of existing IDS systems. To do so, we evaluate and compare multiple methods for rapid identification of opaque packets, ultimately pinpointing a simple hypothesis test (which can be implemented on an FPGA) as an efficient and effective detector of such traffic. In our experiments, using this technique to “winnow”, or filter, opaque packets from the traffic load presented to an IDS system significantly increased the throughput of the system, allowing the identification of many more potential threats than the same system without winnowing. Second, we show that side channels in encrypted VoIP traffic enable the reconstruction of approximate transcripts of conversations. Our approach leverages techniques from linguistics, machine learning, natural language processing, and machine translation to accomplish this task despite the limited information leaked by such side channels. Our ability to do so underscores both the potential threat to user privacy which such side channels represent and the degree to which this threat has been underestimated. Finally, we propose and demonstrate the effectiveness of a new paradigm for identifying HTTP resources retrieved over encrypted connections. Our experiments demonstrate how the predominant paradigm from prior work fails to accurately represent real-world situations and how our proposed approach offers significant advantages, including the ability to infer partial information, in comparison. We believe these results represent both an enhanced threat to user privacy and an opportunity for network monitors and analysts to improve their own capabilities with respect to encrypted traffic.Doctor of Philosoph

    Towards Closing the Programmability-Efficiency Gap using Software-Defined Hardware

    Full text link
    The past decade has seen the breakdown of two important trends in the computing industry: Moore’s law, an observation that the number of transistors in a chip roughly doubles every eighteen months, and Dennard scaling, that enabled the use of these transistors within a constant power budget. This has caused a surge in domain-specific accelerators, i.e. specialized hardware that deliver significantly better energy efficiency than general-purpose processors, such as CPUs. While the performance and efficiency of such accelerators are highly desirable, the fast pace of algorithmic innovation and non-recurring engineering costs have deterred their widespread use, since they are only programmable across a narrow set of applications. This has engendered a programmability-efficiency gap across contemporary platforms. A practical solution that can close this gap is thus lucrative and is likely to engender broad impact in both academic research and the industry. This dissertation proposes such a solution with a reconfigurable Software-Defined Hardware (SDH) system that morphs parts of the hardware on-the-fly to tailor to the requirements of each application phase. This system is designed to deliver near-accelerator-level efficiency across a broad set of applications, while retaining CPU-like programmability. The dissertation first presents a fixed-function solution to accelerate sparse matrix multiplication, which forms the basis of many applications in graph analytics and scientific computing. The solution consists of a tiled hardware architecture, co-designed with the outer product algorithm for Sparse Matrix-Matrix multiplication (SpMM), that uses on-chip memory reconfiguration to accelerate each phase of the algorithm. A proof-of-concept is then presented in the form of a prototyped 40 nm Complimentary Metal-Oxide Semiconductor (CMOS) chip that demonstrates energy efficiency and performance per die area improvements of 12.6x and 17.1x over a high-end CPU, and serves as a stepping stone towards a full SDH system. The next piece of the dissertation enhances the proposed hardware with reconfigurability of the dataflow and resource sharing modes, in order to extend acceleration support to a set of common parallelizable workloads. This reconfigurability lends the system the ability to cater to discrete data access and compute patterns, such as workloads with extensive data sharing and reuse, workloads with limited reuse and streaming access patterns, among others. Moreover, this system incorporates commercial cores and a prototyped software stack for CPU-level programmability. The proposed system is evaluated on a diverse set of compute-bound and memory-bound kernels that compose applications in the domains of graph analytics, machine learning, image and language processing. The evaluation shows average performance and energy-efficiency gains of 5.0x and 18.4x over the CPU. The final part of the dissertation proposes a runtime control framework that uses low-cost monitoring of hardware performance counters to predict the next best configuration and reconfigure the hardware, upon detecting a change in phase or nature of data within the application. In comparison to prior work, this contribution targets multicore CGRAs, uses low-overhead decision tree based predictive models, and incorporates reconfiguration cost-awareness into its policies. Compared to the best-average static (non-reconfiguring) configuration, the dynamically reconfigurable system achieves a 1.6x improvement in performance-per-Watt in the Energy-Efficient mode of operation, or the same performance with 23% lower energy in the Power-Performance mode, for SpMM across a suite of real-world inputs. The proposed reconfiguration mechanism itself outperforms the state-of-the-art approach for dynamic runtime control by up to 2.9x in terms of energy-efficiency.PHDComputer Science & EngineeringUniversity of Michigan, Horace H. Rackham School of Graduate Studieshttp://deepblue.lib.umich.edu/bitstream/2027.42/169859/1/subh_1.pd

    Advances in knowledge discovery and data mining Part II

    Get PDF
    19th Pacific-Asia Conference, PAKDD 2015, Ho Chi Minh City, Vietnam, May 19-22, 2015, Proceedings, Part II</p

    High-Performance Modelling and Simulation for Big Data Applications

    Get PDF
    This open access book was prepared as a Final Publication of the COST Action IC1406 “High-Performance Modelling and Simulation for Big Data Applications (cHiPSet)“ project. Long considered important pillars of the scientific method, Modelling and Simulation have evolved from traditional discrete numerical methods to complex data-intensive continuous analytical optimisations. Resolution, scale, and accuracy have become essential to predict and analyse natural and complex systems in science and engineering. When their level of abstraction raises to have a better discernment of the domain at hand, their representation gets increasingly demanding for computational and data resources. On the other hand, High Performance Computing typically entails the effective use of parallel and distributed processing units coupled with efficient storage, communication and visualisation systems to underpin complex data-intensive applications in distinct scientific and technical domains. It is then arguably required to have a seamless interaction of High Performance Computing with Modelling and Simulation in order to store, compute, analyse, and visualise large data sets in science and engineering. Funded by the European Commission, cHiPSet has provided a dynamic trans-European forum for their members and distinguished guests to openly discuss novel perspectives and topics of interests for these two communities. This cHiPSet compendium presents a set of selected case studies related to healthcare, biological data, computational advertising, multimedia, finance, bioinformatics, and telecommunications
    corecore