Quantifying aesthetics of visual design applied to automatic design
In today's Instagram world, with advances in ubiquitous computing and access to social networks, digital media has been adopted by art and culture. In this dissertation, we study what makes a good design by investigating mechanisms to bring the aesthetics of design from the realm of subjectivity to objectivity. These mechanisms combine three main approaches: learning theories and principles of design by collaborating with professional designers, mathematically and statistically modeling good designs from large-scale datasets, and crowdsourcing to model the perceived aesthetics of designs from general public responses. We then apply the knowledge gained in automatic design creation tools to help non-designers in self-publishing, and designers in inspiration and creativity. Arguably, unlike the visual arts, where the main goals may be abstract, visual design is conceptualized and created to convey a message and communicate with audiences. Therefore, we develop a semantic design mining framework to automatically link the design elements, layout, color, typography, and photos to linguistic concepts. The inferred semantics are applied to a design expert system that leverages user interactions to create personalized designs via recommendation algorithms based on the user's preferences.
Understanding a large-scale IPTV network via system logs
Recently, there has been a global trend in the telecommunication industry toward the rapid deployment of IPTV (Internet Protocol Television) infrastructure and services. While the industry rushes into the IPTV era, a comprehensive understanding of the status and dynamics of IPTV networks lags behind. Filling this gap requires in-depth analysis of large amounts of measurement data across the IPTV network. One type of data of particular interest is the device or system log, which has not been systematically studied before. In this dissertation, we explore the possibility of utilizing system logs to serve a wide range of IPTV network management purposes, including health monitoring, troubleshooting, and performance evaluation. In particular, we develop a tool to convert raw router syslogs into meaningful network events. In addition, by analyzing set-top box (STB) logs, we propose a series of models to capture both channel popularity and dynamics, and users' activity on the IPTV network. Ph.D. Committee Chair: Jun Xu; Committee Member: Jia Wang; Committee Member: Mostafa H. Ammar; Committee Member: Nick Feamster; Committee Member: Xiaoli M
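The syslog-to-event conversion described above can be sketched as a small template matcher. The log format, field names, and sample lines below are illustrative assumptions, not the actual tool or data from the dissertation; real router syslog formats vary by vendor.

```python
import re
from collections import Counter

# Hypothetical Cisco-style syslog layout; adjust the pattern for other vendors.
SYSLOG_RE = re.compile(
    r"^(?P<timestamp>\w{3}\s+\d+\s[\d:]+)\s+(?P<router>\S+)\s+"
    r"(?P<facility>\w+)-(?P<severity>\d)-(?P<mnemonic>\w+):\s(?P<detail>.*)$"
)

def parse_syslog(lines):
    """Turn raw syslog lines into structured event dicts; skip unmatched lines."""
    events = []
    for line in lines:
        m = SYSLOG_RE.match(line)
        if m:
            events.append(m.groupdict())
    return events

raw = [
    "Jan 12 03:14:07 core-rtr1 LINEPROTO-5-UPDOWN: Interface ge-0/0/1, changed state to down",
    "Jan 12 03:14:09 core-rtr1 LINEPROTO-5-UPDOWN: Interface ge-0/0/1, changed state to up",
]
events = parse_syslog(raw)
print(Counter(e["mnemonic"] for e in events))  # event counts per message type
```

A real pipeline would add per-vendor templates and group bursts of related messages into single network events; this sketch only shows the raw-line-to-structured-record step.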
Data-driven approaches to content selection for data-to-text generation
Data-to-text systems are powerful in generating reports from data automatically and thus simplify the presentation of complex data. Rather than presenting data using visualisation techniques, data-to-text systems use human language, which is the most common way for human-human communication. In addition, data-to-text systems can adapt their output content to users' preferences, background or interests, and can therefore be pleasant for users to interact with. Content selection is an important part of every data-to-text system, because it is the module that decides which of the available information should be conveyed to the user.
This thesis makes three important contributions. Firstly, it investigates data-driven approaches to content selection with respect to users' preferences. It develops, compares and evaluates two novel content selection methods. The first method treats content selection as a Markov Decision Process (MDP), where the content selection decisions are made sequentially, i.e. given the already chosen content, decide what to talk about next. The MDP is solved using Reinforcement Learning (RL) and is optimised with respect to a cumulative reward function. The second approach considers all content selection decisions simultaneously by taking into account data relationships and treats content selection as a multi-label classification task. The evaluation shows that users significantly prefer the output produced by the RL framework, whereas the multi-label classification approach scores significantly higher than the RL method on automatic metrics. The results also show that end users' preferences should be taken into account when developing Natural Language Generation (NLG) systems.
NLG systems are developed with the assistance of domain experts; however, the end users are normally non-experts. Consider, for instance, a student feedback generation system, where the system imitates the teachers. The system will produce feedback based on the lecturers' rather than the students' preferences, although students are the end users. Therefore, the second contribution of this thesis is an approach that adapts the content to 'speakers' and 'hearers' simultaneously. It initially considers two types of known stakeholders: lecturers and students. It develops a novel approach that analyses the preferences of the two groups using Principal Component Regression and uses the derived knowledge to hand-craft a reward function that is then optimised using RL. The results show that the end users prefer the output generated by this system over the output generated by a system that mimics the experts. Therefore, it is possible to model the middle ground of the preferences of different known stakeholders.
In most real-world applications, however, first-time users are generally unknown, which is a common problem for NLG and interactive systems: the system cannot adapt to user preferences without prior knowledge. This thesis contributes a novel framework for addressing unknown stakeholders, such as first-time users, using Multi-objective Optimisation to minimise regret for multiple possible user types. In this framework, the content preferences of potential users are modelled as objective functions, which are simultaneously optimised using Multi-objective Optimisation. This approach outperforms two meaningful baselines and minimises regret for unknown users.
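The regret-minimisation idea can be sketched with a minimax-regret computation over user types. The policy names, user types, and utility values below are invented for illustration; the thesis's actual objective functions and optimiser are not reproduced here.

```python
# Minimax regret: pick the output policy whose worst-case regret across
# possible user types is smallest. All numbers are assumed, not from the thesis.
POLICIES = ["verbose", "concise", "balanced"]
USER_TYPES = ["detail_lover", "skimmer"]
UTILITY = {  # UTILITY[policy][user_type]: how much each user type likes each policy
    "verbose":  {"detail_lover": 0.9, "skimmer": 0.2},
    "concise":  {"detail_lover": 0.3, "skimmer": 0.8},
    "balanced": {"detail_lover": 0.7, "skimmer": 0.6},
}

def minimax_regret(policies, user_types, utility):
    """Return the policy minimising worst-case regret, plus all regret values."""
    best = {u: max(utility[p][u] for p in policies) for u in user_types}
    regret = {p: max(best[u] - utility[p][u] for u in user_types) for p in policies}
    return min(regret, key=regret.get), regret

choice, regret = minimax_regret(POLICIES, USER_TYPES, UTILITY)
print(choice)  # balanced: never far from any user type's favourite
```

The middle-ground policy wins precisely because no user type is ever too unhappy with it, which is the intuition behind minimising regret for unknown users.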
Remote Human Vital Sign Monitoring Using Multiple-Input Multiple-Output Radar at Millimeter-Wave Frequencies
Non-contact respiration rate (RR) and heart rate (HR) monitoring using millimeter-wave (mmWave) radars has gained considerable attention for medical, civilian, and military applications. These mmWave radars are small, light, and portable, and can be deployed in various places. To increase the accuracy of RR and HR detection, distributed multiple-input multiple-output (MIMO) radar can be used to acquire non-redundant information about vital sign signals from different perspectives, because each MIMO channel has a different field of view with respect to the subject under test (SUT). This dissertation investigates the use of a Frequency Modulated Continuous Wave (FMCW) radar operating at 77-81 GHz for this application. The vital sign signal is first reconstructed with the Arctangent Demodulation (AD) method, using the phase-change information collected by the radar due to chest wall displacement from respiration and heartbeat activities. Since the heartbeat signals can be corrupted and concealed by the third/fourth harmonics of the respiratory signals as well as random body motion (RBM) from the SUT, we have developed an automatic Heartbeat Template (HBT) extraction method based on constellation diagrams of the received signals. The extraction method automatically spots and extracts portions of the signal that carry a good amount of heartbeat signal and are not corrupted by the RBM. The extracted HBT is then used as an adapted wavelet for the Continuous Wavelet Transform (CWT) to reduce interference from respiratory harmonics and RBM, as well as to magnify the heartbeat signals. As the nature of RBM is unpredictable, the extracted HBT may not completely cancel the interference from RBM. Therefore, to provide better HR detection accuracy, we have also developed a spectral-based HR selection method to gather frequency spectra of heartbeat signals from different MIMO channels.
Based on this gathered spectral information, we can determine an accurate HR even if the heartbeat signals are significantly concealed by the RBM. To further improve the detection accuracy of RR and HR, two deep learning (DL) frameworks are also investigated. First, a Convolutional Neural Network (CNN) is proposed to optimally select clean MIMO channels and eliminate MIMO channels with low SNR of heartbeat signals. After that, a Multi-layer Perceptron (MLP) neural network (NN) is utilized to reconstruct the heartbeat signals, which are then used to assess and select the final HR with high confidence.
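The arctangent demodulation step can be sketched on synthetic single-channel I/Q data. The sampling rate, breathing and heartbeat rates, and displacement amplitudes below are illustrative assumptions; the dissertation's MIMO processing, HBT extraction, and CWT stages are not reproduced.

```python
import numpy as np

# Synthetic slow-time I/Q data for one radar channel (all parameters assumed).
fs = 20.0                           # slow-time sampling rate, Hz
t = np.arange(0, 32, 1 / fs)        # 32 s observation window
wavelength = 3e8 / 79e9             # ~3.8 mm carrier wavelength at 79 GHz
# Chest displacement: respiration at 0.25 Hz plus a weaker heartbeat at 1.2 Hz.
x = 2e-3 * np.sin(2 * np.pi * 0.25 * t) + 0.2e-3 * np.sin(2 * np.pi * 1.2 * t)
phase = 4 * np.pi * x / wavelength  # round-trip phase shift
i, q = np.cos(phase), np.sin(phase)

# Arctangent demodulation: unwrap atan2(Q, I) to recover the displacement signal.
recovered = np.unwrap(np.arctan2(q, i)) * wavelength / (4 * np.pi)

# Respiration rate from the dominant spectral peak of the recovered signal.
spec = np.abs(np.fft.rfft(recovered - recovered.mean()))
freqs = np.fft.rfftfreq(len(recovered), 1 / fs)
rr_hz = freqs[np.argmax(spec)]
print(f"estimated respiration rate: {rr_hz * 60:.1f} breaths/min")  # 15.0
```

Because the millimetre-scale chest motion spans several radians of phase at this wavelength, the unwrapping step is essential; the heartbeat component is the much weaker ripple that the HBT/CWT machinery in the dissertation is designed to isolate.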
Harvesting and summarizing user-generated content for advanced speech-based human-computer interaction
Thesis (Ph.D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2012. Cataloged from PDF version of thesis. Includes bibliographical references (p. 155-164). There have been many assistant applications on mobile devices that help people obtain rich Web content, such as user-generated data (e.g., reviews, posts, blogs, and tweets). However, online communities and social networks are expanding rapidly, and it is impossible for people to browse and digest all the information via a simple search interface. To help users obtain information more efficiently, both the interface for data access and the information representation need to be improved. An intuitive and personalized interface, such as a dialogue system, could be an ideal assistant, one that engages a user in a continuous dialogue to garner the user's interest and capture the user's intent, and assists the user via speech-navigated interactions. In addition, there is a great need for a type of application that can harvest data from the Web, summarize the information in a concise manner, and present it in an aggregated yet natural way, such as direct human dialogue. This thesis, therefore, aims to conduct research on a universal framework for developing a speech-based interface that can aggregate user-generated Web content and present the summarized information via speech-based human-computer interaction. To accomplish this goal, several challenges must be met. Firstly, how can users' intentions be interpreted correctly from their spoken input? Secondly, how can the semantics and sentiment of user-generated data be interpreted and aggregated into structured yet concise summaries? Lastly, how can a dialogue modeling mechanism be developed to handle discourse and present the highlighted information via natural language? This thesis explores plausible approaches to tackle these challenges.
We explore a lexicon modeling approach for semantic tagging to improve spoken language understanding and query interpretation. We investigate a parse-and-paraphrase paradigm and a sentiment scoring mechanism for information extraction from unstructured user-generated data. We also explore sentiment-involved dialogue modeling and corpus-based language generation approaches for dialogue and discourse. Multilingual prototype systems in multiple domains have been implemented for demonstration. By Jingjing Liu. Ph.D.
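The sentiment scoring idea can be illustrated with a toy lexicon-based scorer. The lexicon entries, negation handling, and example sentences are assumptions for illustration, not the thesis's actual parse-and-paraphrase mechanism.

```python
# Toy lexicon-based sentiment scorer for review-like sentences.
# Lexicon polarities are invented; real systems derive them from data.
LEXICON = {"great": 2, "good": 1, "tasty": 1, "slow": -1, "bad": -1, "awful": -2}
NEGATORS = {"not", "never", "no"}

def score(sentence):
    """Sum lexicon polarities, flipping the sign of a word after a negator."""
    total, flip = 0, 1
    for w in sentence.lower().strip(".!").split():
        if w in NEGATORS:
            flip = -1
            continue
        total += flip * LEXICON.get(w, 0)
        flip = 1  # negation only affects the immediately following word here
    return total

print(score("the food was great but service was slow"))  # 2 - 1 = 1
print(score("not good"))                                  # -1
```

A parse-based approach would additionally attach each sentiment word to the aspect it modifies (e.g. "food" vs "service"), which is what makes aggregation into structured summaries possible.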
E-Learning
E-learning enables students to pace their studies according to their needs, making learning accessible to (1) people who do not have enough free time for studying, as they can schedule their lessons according to their availability; (2) those far from a school (geographical issues); and (3) those unable to attend classes due to a physical or medical restriction. Therefore, cultural, geographical and physical obstacles can be removed, making it possible for students to select their own path and time for the learning course. Students are then allowed to choose the objectives they are best suited to fulfill. This book addresses E-learning challenges, opening a way to understand and discuss questions related to long-distance and lifelong learning and E-learning for people with special needs and, lastly, presenting a case study on the relationship between the quality of interaction and the quality of learning achieved in E-learning experiences.
The electronic broadsheet: all the news that fits the display
Thesis (M.S.)--Massachusetts Institute of Technology, Dept. of Architecture, 1991. Includes bibliographical references (leaves 82-84). Håkon Wium. M.S.
User-centred video abstraction
This thesis was submitted for the degree of Doctor of Philosophy and awarded by Brunel University London. The rapid growth of digital video content in recent years has imposed the need for the development of technologies capable of producing condensed but semantically rich versions of the input video stream in an effective manner. Consequently, the topic of Video Summarisation is becoming increasingly popular in the multimedia community, and numerous video abstraction approaches have been proposed accordingly. These techniques can be divided into two major categories, automatic and semi-automatic, according to the required level of human intervention in the summarisation process. The fully-automated methods mainly adopt low-level visual, aural and textual features alongside mathematical and statistical algorithms to extract the most significant segments of the original video. However, the effectiveness of this type of technique is restricted by a number of factors, such as domain-dependency, computational expense and the inability to understand the semantics of videos from low-level features. The second category of techniques, however, attempts to improve the quality of summaries by involving humans in the abstraction process to bridge the semantic gap. Nonetheless, a single user's subjectivity and other external contributing factors, such as distraction, can potentially deteriorate the performance of this group of approaches. Accordingly, in this thesis we have focused on the development of three user-centred, effective video summarisation techniques that can be applied to different video categories and generate satisfactory results. In our first proposed approach, a novel mechanism for user-centred video summarisation is presented for scenarios in which multiple actors are employed in the video summarisation process in order to minimise the negative effects of relying on a single user.
Based on our recommended algorithm, the video frames were initially scored by a group of video annotators 'on the fly'. These assigned scores were then averaged to generate a single saliency score for each video frame and, finally, the highest-scored video frames, alongside the corresponding audio and textual content, were extracted and included in the final summary. The effectiveness of our approach has been assessed by comparing the video summaries generated by our approach against the results obtained from three existing automatic summarisation tools that adopt different modalities for abstraction purposes. The experimental results indicated that our proposed method is capable of delivering remarkable outcomes in terms of Overall Satisfaction and Precision with an acceptable Recall rate, indicating the usefulness of involving user input in the video summarisation process. In an attempt to provide a better user experience, we have proposed a personalised video summarisation method with the ability to customise the generated summaries in accordance with the viewers' preferences. Accordingly, the end-user's priority levels towards different video scenes were captured and utilised to update the average scores previously assigned by the video annotators. Finally, our earlier proposed summarisation method was adopted to extract the most significant audio-visual content of the video. Experimental results indicated the capability of this approach to deliver superior outcomes compared with our previously proposed method and the three other automatic summarisation tools. Finally, we have attempted to reduce the required level of audience involvement for personalisation purposes by proposing a new method for producing personalised video summaries. Accordingly, SIFT visual features were adopted to identify the semantic categories of video scenes.
By fusing this retrieved data with pre-built user profiles, personalised video abstracts can be created. Experimental results showed the effectiveness of this method in delivering superior outcomes compared with our previously recommended algorithm and the three other automatic summarisation techniques.
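The score-averaging and top-frame selection step described above can be sketched in a few lines. The annotator names, scores, and summary length are invented for illustration; the actual system also extracts the corresponding audio and textual content, which is omitted here.

```python
# Average per-frame saliency scores from multiple annotators, then keep the
# top-scoring frames in their original temporal order. All values are assumed.
annotator_scores = {          # per-annotator saliency scores for 6 frames
    "a1": [0.9, 0.1, 0.4, 0.8, 0.2, 0.7],
    "a2": [0.8, 0.2, 0.5, 0.9, 0.1, 0.6],
    "a3": [0.7, 0.3, 0.3, 0.7, 0.2, 0.8],
}

def summarise(scores, top_k=3):
    """Return indices of the top_k frames by averaged score, in temporal order."""
    n = len(next(iter(scores.values())))
    avg = [sum(s[i] for s in scores.values()) / len(scores) for i in range(n)]
    top = sorted(sorted(range(n), key=lambda i: avg[i], reverse=True)[:top_k])
    return top, avg

frames, avg = summarise(annotator_scores)
print(frames)  # indices of the frames kept in the summary
```

Averaging over several annotators is exactly what dampens any single user's subjectivity; the personalised variant then re-weights `avg` with the end-user's scene preferences before selection.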
Study of result presentation and interaction for aggregated search
The World Wide Web has always attracted researchers and commercial search engine companies due to the enormous amount of information available on it. "Searching" the web has become an integral part of today's world, and many people rely on it when looking for information. The amount and diversity of information available on the Web have also increased dramatically. Consequently, researchers and search engine companies are making constant efforts to make this information accessible to people effectively.
Not only is there an increase in the amount and diversity of information available online, but users are now often seeking information on broader topics. Users seeking information on broad topics gather it from various information sources (e.g. image, video, news, blog, etc.). For such information requests, not only web results but also results from different document genres and multimedia content are becoming relevant. For instance, users looking for information on "Glasgow" might be interested in web results about Glasgow, a map of Glasgow, images of Glasgow, news about Glasgow, and so on.
Aggregated search aims to provide access to this diverse information in a unified manner by aggregating results from different information sources on a single result page, making the information-gathering process for broad topics easier.
This thesis explores aggregated search from the users' perspective. It first and foremost focuses on understanding and describing the phenomena related to the users' search process in the context of aggregated search. The goal is to contribute to building theories and understanding constraints, as well as to provide insights into the interface design space. In building this understanding, the thesis focuses on click behavior, information need, source relevance, and the dynamics of search intents. The understanding comes partly from conducting user studies and partly from analyzing search engine log data.
While the thematic (or topical) relevance of documents is important, this thesis argues that the "source type" (source orientation) may also be an important dimension of the relevance space worth investigating in aggregated search. Relevance is therefore multi-dimensional (topical and source-oriented) within the context of aggregated search. Results from the study suggest that source orientation was a significant factor in an aggregated search scenario, adding another dimension to the relevance space.
The thesis further presents an effective method that combines rule-based and machine learning techniques to identify the source orientation behind a user query.
Furthermore, after analyzing log data from a search engine company and conducting user study experiments, several design issues that may arise with respect to the aggregated search interface are identified. To address these issues, suitable design guidelines that can be beneficial from the interface perspective are also suggested.
To conclude, the aim of this thesis is to explore the emerging field of aggregated search from the users' perspective, since it is very important for front-end technologies. An additional goal is to provide empirical evidence of the influence of aggregated search on users' searching behavior, and to identify some of the key challenges of aggregated search. During this work, several aspects of aggregated search are uncovered. Furthermore, this thesis provides a foundation for future research in aggregated search and highlights potential research directions.
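The rule-based half of a source-orientation classifier can be sketched as keyword triggers with a fallback class. The rules, keywords, and example queries below are illustrative assumptions, not the thesis's actual method; the learned component that handles ambiguous queries is omitted.

```python
# Hypothetical keyword rules mapping a query to its likely result source.
# First matching rule wins; unmatched queries fall back to general web results.
RULES = [
    ("image", {"photo", "photos", "picture", "pictures", "wallpaper", "images"}),
    ("video", {"video", "videos", "trailer", "clip"}),
    ("news",  {"news", "headline", "headlines", "latest"}),
    ("map",   {"map", "directions", "route"}),
]

def source_orientation(query):
    """Classify a query's source orientation by keyword overlap."""
    tokens = set(query.lower().split())
    for source, keywords in RULES:
        if tokens & keywords:
            return source
    return "web"  # fallback; a learned classifier could cover this tail

print(source_orientation("pictures of Glasgow"))  # image
print(source_orientation("Glasgow city centre"))  # web
```

In a combined system, queries the rules cannot resolve would be passed to a machine-learned classifier trained on click logs, which is the division of labour the thesis describes.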