Transfer nonnegative matrix factorization for image representation
Nonnegative Matrix Factorization (NMF) has received considerable attention due to its psychological and physiological interpretation of naturally occurring data whose representation may be parts-based in the human brain. However, when labeled and unlabeled images are sampled from different distributions, they may be quantized into different basis vector spaces and represented in different coding vector spaces, which may lead to low representation fidelity. In this paper, we investigate how to extend NMF to cross-domain scenarios. We accomplish this goal through TNMF, a novel semi-supervised transfer learning approach. Specifically, we aim to minimize the distribution divergence between labeled and unlabeled images, and incorporate this criterion into the objective function of NMF to construct new robust representations. Experiments show that TNMF outperforms state-of-the-art methods on real datasets.
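For readers unfamiliar with the base technique, the following is a minimal sketch of plain NMF using the standard multiplicative update rules for the Frobenius-norm objective. It is not TNMF: the abstract does not give the transfer term, so no distribution-divergence penalty is included here. The function names `nmf` and `reconstruction_error`, the random initialization, and the iteration count are illustrative assumptions.

```python
import numpy as np


def nmf(X, rank, n_iter=200, eps=1e-9, seed=0):
    """Plain NMF sketch: approximate X (n x m, nonnegative) as W @ H.

    W (n x rank) plays the role of the basis vectors and H (rank x m)
    the coding vectors. This is standard NMF, NOT the TNMF method.
    """
    rng = np.random.default_rng(seed)
    n, m = X.shape
    # Random nonnegative initialization (an assumption of this sketch).
    W = rng.random((n, rank)) + eps
    H = rng.random((rank, m)) + eps
    for _ in range(n_iter):
        # Multiplicative updates keep every entry nonnegative.
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H


def reconstruction_error(X, W, H):
    """Frobenius norm of the residual X - W @ H."""
    return float(np.linalg.norm(X - W @ H))
```

A transfer variant along the lines the abstract describes would add a penalty on the divergence between the codes of labeled and unlabeled images to this objective; the exact form of that penalty is defined in the paper, not here.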
Leveraging cloud computing for the semantic web: review and trends
Semantic and cloud computing technologies have become vital elements for developing and deploying solutions across diverse fields in computing. While they are independent of each other, they can be integrated in diverse ways for developing solutions, and this has been significantly explored in recent times. With the migration of web-based data and applications to cloud platforms and the evolution of the web itself from a social Web 2.0 to a semantic Web 3.0 comes the convergence of both technologies. While existing research has provided several concepts and implementations regarding interactions between the two technologies, without an explicit classification of the modes of interaction it can be quite challenging to articulate them, and building upon them can be a daunting task. Hence, this research identifies and describes the modes of interaction between them. Furthermore, a “cloud-driven” interaction mode, which focuses on fully maximising cloud computing characteristics and benefits for driving the semantic web, is described, providing an approach for evolving the semantic web and delivering automated semantic annotation on a large scale to web applications.
Hessian regularized support vector machines for mobile image annotation on the cloud
With the rapid development of cloud computing and mobile services, users expect a better experience through multimedia computing, such as automatic or semi-automatic personal image and video organization and intelligent user interfaces. These functions depend heavily on the success of image understanding, and thus large-scale image annotation has received intensive attention in recent years. The collaboration between mobile and cloud opens a new avenue for image annotation, because the heavy computation can be transferred to the cloud so that user actions receive an immediate response. In this paper, we present a scheme for image annotation on the cloud, which transmits mobile images compressed by Hamming compressed sensing to the cloud and conducts semantic annotation through a novel Hessian regularized support vector machine on the cloud. We carefully explain the rationale of Hessian regularization for encoding the local geometry of the compact support of the marginal distribution, and prove that conducting a Hessian regularized support vector machine in the reproducing kernel Hilbert space is equivalent to conducting it in the space spanned by the principal components of kernel principal component analysis. We conducted experiments on the PASCAL VOC'07 dataset and demonstrated the effectiveness of the Hessian regularized support vector machine for large-scale image annotation.
Leveraging the Power of Crowds: Automated Test Report Processing for The Maintenance of Mobile Applications
Crowdsourcing is an emerging distributed problem-solving model combining human and machine computation. It collects intelligence and knowledge from a large and diverse workforce to complete complex tasks. In the software engineering domain, crowdsourced techniques have been adopted to facilitate various tasks, such as design, testing, debugging, and development. Specifically, in crowdsourced testing, crowdsourced workers are given testing tasks to perform and submit their feedback in the form of test reports. One of the key advantages of crowdsourced testing is that it is capable of providing software engineers with domain knowledge and feedback from a large number of real users. Based on the diverse software and hardware settings of these users, engineers can find bugs that are not caught by traditional quality assurance techniques. Such benefits are particularly well suited to mobile application testing, which needs rapid development-and-deployment iterations and must support diverse execution environments. However, crowdsourced testing naturally generates an overwhelming number of crowdsourced test reports, and inspecting such a large number of reports becomes a time-consuming yet inevitable task. This dissertation presents a series of techniques, tools and experiments to assist in crowdsourced report processing. These techniques are designed to improve this task in multiple aspects: 1. prioritizing crowdsourced reports to assist engineers in finding as many unique bugs as possible, as quickly as possible; 2. grouping crowdsourced reports to assist engineers in identifying the representative ones in a short time; 3. summarizing the duplicate reports to provide engineers with a concise and accurate understanding of a group of reports. In the first step, I present a text-analysis-based technique to prioritize test reports for manual inspection.
This technique leverages two key strategies: (1) a diversity strategy to help developers inspect a wide variety of test reports and to avoid duplicates and wasted effort on falsely classified faulty behavior, and (2) a risk-assessment strategy to help developers identify test reports that may be more likely to be fault-revealing based on past observations. Together, these two strategies form our technique to prioritize test reports in crowdsourced testing. Moreover, in the mobile testing domain, test reports often consist of more screenshots and shorter descriptive text, and thus text-analysis-based techniques may be ineffective or inapplicable. The shortage and ambiguity of natural-language text information and the well-defined screenshots of activity views within mobile applications motivate me to propose a novel technique that uses image understanding for multi-objective test-report prioritization. This technique employs the Spatial Pyramid Matching (SPM) technique to measure the similarity of the screenshots, and applies natural-language processing to measure the distance between the text of test reports. Next, I design and implement CTRAS: a novel approach to leveraging duplicates to enrich the content of bug descriptions and improve the efficiency of inspecting these reports. CTRAS is capable of automatically aggregating duplicates based on both textual information and screenshots, and further summarizes the duplicate test reports into a comprehensive and comprehensible report. I validate all of these techniques on industrial data by collaborating with several companies. The results show that my techniques can improve both the efficiency and effectiveness of crowdsourced test report processing. I also suggest settings for different usage scenarios and discuss future research directions.
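The combination of a diversity strategy and a risk-assessment strategy described above can be illustrated with a small greedy ranking. This is only a sketch under stated assumptions, not the dissertation's method: it handles text only (no screenshots or SPM), uses Jaccard distance over whitespace tokens as the diversity measure, and scores risk as the share of hypothetical `risk_terms` a report contains. The names `prioritize`, `risk_terms`, and `weight` are invented for illustration.

```python
def tokens(text):
    """Lowercased whitespace tokens of a report, as a set."""
    return set(text.lower().split())


def jaccard_distance(a, b):
    """1 - |a & b| / |a | b|; 0.0 for two empty sets."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def prioritize(reports, risk_terms, weight=0.7):
    """Return report indices in suggested inspection order.

    Each step picks the report maximizing
        weight * diversity + (1 - weight) * risk
    where diversity is the distance to the closest already-selected
    report and risk is the share of risk_terms found in the report.
    Both measures are assumptions of this sketch.
    """
    token_sets = [tokens(r) for r in reports]
    risk = [
        len(t & risk_terms) / len(risk_terms) if risk_terms else 0.0
        for t in token_sets
    ]
    remaining = list(range(len(reports)))
    order = []

    def score(i):
        if order:
            diversity = min(
                jaccard_distance(token_sets[i], token_sets[j]) for j in order
            )
        else:
            diversity = 1.0  # nothing selected yet, so every report is novel
        return weight * diversity + (1 - weight) * risk[i]

    while remaining:
        best = max(remaining, key=score)  # ties go to the earliest report
        order.append(best)
        remaining.remove(best)
    return order
```

With two identical crash reports and two unrelated reports, the first crash report is ranked first because of its risk score, and its duplicate drops to the end because its diversity score is zero once the original has been selected.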
A lifelogging system supporting multimodal access
Today, technology has progressed to allow us to capture our lives digitally, for example by taking pictures, recording videos and sharing experiences over WiFi using smartphones. People’s lifestyles are changing. One example is the shift from traditional memo writing to the digital lifelog. Lifelogging is the process of using digital tools to collect personal data in order to illustrate the user’s daily life (Smith et al., 2011). The availability of smartphones embedded with different sensors, such as cameras and GPS, has encouraged the development of lifelogging. It has also brought new challenges in multi-sensor data collection, large-volume data storage, data analysis and appropriate representation of lifelog data across different devices.
This study is designed to address the above challenges. A lifelogging system was developed to collect, store, analyse, and display data from multiple sensors, i.e. to support multimodal access. In this system, the multi-sensor data (also called data streams) is first transmitted from the smartphone to the server, and only while the phone is being charged. On the server side, six contexts are detected, namely personal, time, location, social, activity and environment. Events are then segmented and a related narrative is generated. Finally, lifelog data is presented differently on three widely used devices: the computer, the smartphone and the E-book reader.
Lifelogging is likely to become a well-accepted technology in the coming years. Manual logging is not possible for most people and is not feasible in the long term, so automatic lifelogging is needed. This study presents a lifelogging system which can automatically collect multi-sensor data, detect contexts, segment events, generate meaningful narratives and display the appropriate data on different devices based on their unique characteristics. The work in this thesis therefore contributes to the development of automatic lifelogging and, in doing so, to the development of the field.