9 research outputs found

    Collaborative QoS Prediction for Mobile Service with Data Filtering and SlopeOne Model


    Privacy preserving recommender systems

    Recommender systems help users find suitable and interesting products and content within the huge amount of information available on the internet. Various types of recommender systems provide recommendation services to users, for example Collaborative Filtering (CF) based, Content-Based (CB), and context-aware recommendation. Although these systems are very useful for solving the information overload problem by filtering interesting information, they suffer from serious privacy issues. To generate personalized recommendations, recommendation service providers need to acquire information about users' attributes, preferences, experiences, and demands, all of which touch users' confidential information. Usually, the more information is available to the service provider, the more accurate the recommendations it can generate. However, service providers cannot always be trusted with personal information, since they may threaten users' privacy by leaking it to other parties or by providing false recommendations. User information must therefore be protected before it is shared with any third-party service provider.

    Several techniques have been proposed to overcome the privacy issues of recommender systems; they can be categorized into decentralization-, randomization-, and secure-computation-based approaches. In decentralization-based approaches, the central service provider is removed and the main control of the recommendation service is given to the participating users. The main issue with this kind of approach is that generating recommendations depends on other users' online availability: if a user goes offline, her information cannot be used by the system. Randomization-based techniques add noise to users' data to prevent others from learning the true information, but the added noise hurts recommendation accuracy. Secure computation, by contrast, preserves user information while providing accurate recommendations.

    In this thesis we preserve user privacy by encrypting user information, specifically ratings and other related information, with homomorphic encryption techniques, and we provide recommendations based on the encrypted data. The main advantage of homomorphic encryption is that it is semantically secure: it is computationally hard to learn the true information from a given ciphertext. Using homomorphic encryption tools and techniques, we build privacy-preserving protocols for different types of recommendation approaches by analyzing their privacy requirements and challenges. More specifically, we focus on key recommendation techniques and divide them into centralized and partitioned-dataset-based techniques. Among the available techniques, we found that some existing and popular ones, such as user-based, item-based, and context-aware recommendation, can be grouped into the centralized approach. In partitioned-dataset-based recommendation, user information is partitioned across different organizations, which collaborate by gathering sufficient information to provide accurate recommendations without revealing their own confidential information. After categorizing the recommendation techniques, we analyze the problems and requirements in terms of privacy preservation, and then, for each type of recommendation approach, we develop privacy-preserving protocols that take the specific privacy requirements and challenges into consideration. We also investigate the problems and limitations of existing privacy-preserving recommenders, find that current solutions suffer from large computation and communication overheads as well as privacy weaknesses, and address these issues with our proposed protocols.

    Overall, our proposed recommendation protocols work as follows. Users encrypt their ratings using homomorphic encryption and send them to the service provider. We assume the service provider is semi-honest (honest but curious): it follows the protocol but at the same time tries to derive new information from the available data. The service provider performs homomorphic operations over the encrypted data without learning any true information and returns the results to the query users who ask for recommendations. The system models of our protocols differ from one recommendation technique to another because of their different privacy requirements. The proposed protocols are tested on various real-world datasets; matching the application areas of the different recommendation approaches, these datasets range over movie ratings, social networks, check-in information for different locations, and quality of service of web services. For each proposed protocol we also present a privacy analysis and describe how the system performs its computations without leaking users' private information. The experimental and privacy analyses of our protocols for the different recommendation techniques show that they are both private and practical.
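
    As a concrete illustration of the primitive these protocols build on, the following is a minimal sketch of homomorphically aggregating encrypted ratings using the third-party python-paillier package; the ratings, key size, and plain averaging step are illustrative assumptions, not details from the thesis.

```python
# Minimal sketch: a semi-honest server averages ratings it can never read.
# Uses the third-party python-paillier package (`pip install phe`); the
# ratings and key size below are illustrative, not from the thesis.
from functools import reduce

from phe import paillier

# The query user generates a key pair; only she holds the private key.
public_key, private_key = paillier.generate_paillier_keypair(n_length=2048)

# Each user encrypts her rating under the query user's public key
# before sending it to the service provider.
ratings = [4, 5, 3, 4]
ciphertexts = [public_key.encrypt(r) for r in ratings]

# The service provider adds ciphertexts homomorphically, i.e. without
# decrypting and without learning any individual rating.
encrypted_sum = reduce(lambda a, b: a + b, ciphertexts)

# Only the query user can decrypt the aggregate result.
print(private_key.decrypt(encrypted_sum) / len(ratings))  # 4.0
```

    The actual protocols differ per recommendation technique, as noted above; the sketch only shows the additively homomorphic aggregation step they have in common.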

    Explainable, Security-Aware and Dependency-Aware Framework for Intelligent Software Refactoring

    As software systems continue to grow in size and complexity, their maintenance becomes ever more challenging and costly. Even for the most technologically sophisticated and competent organizations, building and maintaining high-performing software applications with high-quality code is an extremely challenging and expensive endeavor. Software refactoring is widely recognized as the key component of maintaining high-quality software, restructuring existing code and reducing technical debt. However, refactoring is difficult to achieve and often neglected because of several limitations of existing refactoring techniques that reduce their effectiveness. These limitations include, but are not limited to, detecting refactoring opportunities, recommending specific refactoring activities, and explaining the recommended changes. Existing techniques mainly rely on quality metrics such as coupling, cohesion, and the Quality Metrics for Object-Oriented Design (QMOOD), whereas this work identifies many other factors that assist and facilitate different maintenance activities for developers:

    1. To structure the refactoring field and existing research results, this dissertation provides the most scalable and comprehensive systematic literature review on refactoring, analyzing the results of 3183 research papers covering the last three decades. Based on this survey, we created a taxonomy to classify the existing research, identified research trends, and highlighted gaps in the literature for further research.

    2. To draw attention to what the current refactoring research focus should be from the developers' perspective, we carried out the first large-scale refactoring study on the most popular online Q&A forum for developers, Stack Overflow. We collected and analyzed posts to identify what developers ask about refactoring and which challenges practitioners face when refactoring software systems.

    3. To improve the detection of refactoring opportunities in terms of quality and security in the context of mobile apps, we designed a framework that recommends the files to be refactored based on user reviews. We also considered the detection of refactoring opportunities in the context of web services, proposing a machine learning-based approach that helps service providers and subscribers predict quality of service at the lowest cost. Furthermore, to help developers accurately assess the quality of their software systems and decide whether the code should be refactored, we propose a clustering-based approach that automatically identifies the preferred benchmark to use for the quality assessment of a project.

    4. Regarding the refactoring generation process, we proposed different techniques to enhance the change operators and seeding mechanism by using the history of applied refactorings and incorporating refactoring dependencies, in order to improve the quality of the refactoring solutions. We also introduced the security aspect into the generation of refactoring recommendations by investigating the possible impact of improving different quality attributes on a set of security metrics and finding the best trade-off between them (see the sketch after this list). In another approach, we recommend refactorings that prioritize fixing quality issues in security-critical files, improve quality attributes, and remove code smells.

    All the above contributions were validated at large scale on thousands of open-source and industry projects, in collaboration with industry partners and the open-source community. The contributions of this dissertation are integrated into a cloud-based refactoring framework that is currently used by practitioners. (Ph.D. dissertation, College of Engineering & Computer Science, University of Michigan-Dearborn: http://deepblue.lib.umich.edu/bitstream/2027.42/171082/1/Chaima Abid Final Dissertation.pdf)
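
    To picture the quality/security trade-off mentioned in point 4, here is a hypothetical sketch that scores candidate refactoring solutions by a weighted sum of quality and security deltas. The metric names, weights, and the weighted-sum form are illustrative assumptions; the dissertation itself describes search-based techniques rather than this particular scoring.

```python
# Hypothetical scoring of candidate refactoring solutions by a weighted
# trade-off between quality gain and security impact. Metric names and
# weights are illustrative; they are not the dissertation's actual model.
from dataclasses import dataclass

@dataclass
class MetricDelta:
    coupling: float        # change in coupling (negative = improved)
    cohesion: float        # change in cohesion (positive = improved)
    attack_surface: float  # change in a security metric (negative = improved)

def fitness(delta: MetricDelta, w_quality: float = 0.7, w_security: float = 0.3) -> float:
    """Higher is better: quality gain traded off against security impact."""
    quality_gain = delta.cohesion - delta.coupling
    security_gain = -delta.attack_surface
    return w_quality * quality_gain + w_security * security_gain

# Pick the better of two candidate solutions.
a = MetricDelta(coupling=-0.2, cohesion=0.3, attack_surface=0.1)
b = MetricDelta(coupling=-0.1, cohesion=0.1, attack_surface=-0.2)
best = max((a, b), key=fitness)
print(best)
```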

    Hybrid intelligence for data mining

    Today, enormous amounts of data are recorded in all kinds of activities. This sheer size provides an excellent opportunity for data scientists to retrieve valuable information using data mining techniques. Due to the complexity of the data in many modern problems, one-size-fits-all solutions are seldom able to provide satisfactory answers. Although data mining is an active field of study, hybrid techniques are rarely scrutinized in detail. Currently, few techniques can handle time-varying properties while performing their core functions, nor can they retrieve and combine information from heterogeneous dimensions, e.g., textual and numerical horizons. This thesis summarizes our investigations of hybrid methods that provide data mining solutions to problems involving non-trivial datasets, such as trajectories, microblogs, and financial data. First, time-varying dynamic Bayesian networks are extended to consider both causal and dynamic regularization requirements. Combined with density-based clustering, the enhancements overcome the difficulties of modeling spatial-temporal data, where heterogeneous patterns, data sparseness, and distribution skewness are common. Second, topic-based methods are proposed for emerging-outbreak and virality prediction on microblogs. Complicated models that consider structural details are popular, while others adopt overly simplified assumptions that sacrifice accuracy for efficiency; our proposed virality prediction solution delivers the benefits of both worlds, considering the important characteristics of a structure without the burden of fine details, which reduces complexity. Third, the proposed topic-based approach for microblog mining is extended to sentiment prediction problems in finance. Sentiment-of-topic models are learned from both commentaries and prices for better risk management. Moreover, the previously proposed supervised topic model provides an avenue to associate market volatility with financial news, yet it displays poor resolution at extreme regions. To overcome this problem, an extreme topic model is proposed to predict volatility in financial markets using supervised learning: by mapping extreme events into Poisson point processes, volatile regions are magnified to reveal their hidden volatility-topic relationships. Lastly, some of the proposed hybrid methods are applied to service computing to verify that they are sufficiently generic for wider applications.
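
    As one concrete ingredient of the first investigation, the density-based clustering step could look like the following minimal sketch, which groups 2-D trajectory points with scikit-learn's DBSCAN; the synthetic coordinates and parameters are illustrative assumptions, not data or settings from the thesis.

```python
# Minimal sketch of density-based clustering of trajectory points with
# scikit-learn's DBSCAN; coordinates and parameters are illustrative.
import numpy as np
from sklearn.cluster import DBSCAN

# Synthetic (x, y) positions: two dense regions plus one outlier.
points = np.array([
    [0.0, 0.0], [0.1, 0.2], [0.2, 0.1],   # dense region 1
    [5.0, 5.0], [5.1, 4.9], [4.9, 5.2],   # dense region 2
    [10.0, 0.0],                          # isolated point
])

# eps bounds the neighborhood radius; min_samples sets the density
# threshold for a point to count as a core point.
labels = DBSCAN(eps=0.5, min_samples=2).fit_predict(points)
print(labels)  # [0 0 0 1 1 1 -1]; -1 marks noise
```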

    Proceedings of the 7th Sound and Music Computing Conference

    Proceedings of the SMC2010 - 7th Sound and Music Computing Conference, July 21st - July 24th 2010

    Data Science and Knowledge Discovery

    Data Science (DS) is gaining significant importance in decision processes thanks to its mix of various areas, including computer science, machine learning, mathematics and statistics, domain/business knowledge, software development, and traditional research. In the business field, applying DS allows the use of scientific methods, processes, algorithms, and systems to extract knowledge and insights from structured and unstructured data to support the decision process. After the data are collected, it is crucial to discover the knowledge they contain. In this step, Knowledge Discovery (KD) tasks are used to create knowledge from structured and unstructured sources (e.g., text, data, and images). The output needs to be in a readable and interpretable format, and it must represent knowledge in a manner that facilitates inferencing. KD is applied in several areas, such as education, health, accounting, energy, and public administration. This book includes fourteen excellent articles that discuss this trending topic and present innovative solutions, showing the importance of Data Science and Knowledge Discovery to researchers, managers, industry, society, and other communities. The chapters address several topics, including data mining, deep learning, data visualization and analytics, semantic data, geospatial and spatio-temporal data, data augmentation, and text mining.

    Fish4Knowledge: Collecting and Analyzing Massive Coral Reef Fish Video Data

    This book gives a start-to-finish overview of the whole Fish4Knowledge project in 18 short chapters, each describing one aspect of the project. The Fish4Knowledge project explored the possibilities of big video data, in this case from undersea video. Recording and analyzing 90 thousand hours of video from ten camera locations, the project gives a 3-year view of fish abundance in several tropical coral reefs off the coast of Taiwan. The research system built a remote recording network, over 100 TB of storage, supercomputer processing, video target detection and

    Anales del XIII Congreso Argentino de Ciencias de la Computación (CACIC)

    Contents: Computer architectures; Embedded systems; Service-oriented architectures (SOA); Communication networks; Heterogeneous networks; Advanced networks; Wireless networks; Mobile networks; Active networks; Network and service administration and monitoring; Quality of Service (QoS, SLAs); Computer security, authentication, and privacy; Infrastructure for digital signatures and digital certificates; Vulnerability analysis and detection; Operating systems; P2P systems; Middleware; Grid infrastructure; Integration services (Web Services or .NET). Red de Universidades con Carreras en Informática (RedUNCI)

    CACIC 2015: XXI Congreso Argentino de Ciencias de la Computación. Libro de actas

    Proceedings of the XXI Argentine Congress of Computer Science (CACIC 2015), held at UNNOBA Junín, October 5-9, 2015. Red de Universidades con Carreras en Informática (RedUNCI)