35 research outputs found
Prediction, Recommendation and Group Analytics Models in the domain of Mashup Services and Cyber-Argumentation Platform
Mashup application development is becoming a widespread software development practice due to its appeal for a shorter application development period. Application developers usually use web APIs from different sources to create a new streamlined service and provide various features to end-users. This kind of practice saves time, ensures reliability, accuracy, and security in the developed applications. Mashup application developers integrate these available APIs into their applications. Still, they have to go through thousands of available web APIs and chose only a few appropriate ones for their application. Recommending relevant web APIs might help application developers in this situation. However, very low API invocation from mashup applications creates a sparse mashup-web API dataset for the recommendation models to learn about the mashups and their web API invocation pattern. One research aims to analyze these mashup-specific critical issues, look for supplemental information in the mashup domain, and develop web API recommendation models for mashup applications. The developed recommendation model generates useful and accurate web APIs to reduce the impact of low API invocations in mashup application development.
Cyber-Argumentation platform also faces a similarly challenging issue. In large-scale cyber argumentation platforms, participants express their opinions, engage with one another, and respond to feedback and criticism from others in discussing important issues online. Argumentation analysis tools capture the collective intelligence of the participants and reveal hidden insights from the underlying discussions. However, such analysis requires that the issues have been thoroughly discussed and participant’s opinions are clearly expressed and understood. Participants typically focus only on a few ideas and leave others unacknowledged and underdiscussed. This generates a limited dataset to work with, resulting in an incomplete analysis of issues in the discussion. One solution to this problem would be to develop an opinion prediction model for cyber-argumentation. This model would predict participant’s opinions on different ideas that they have not explicitly engaged.
In cyber-argumentation, individuals interact with each other without any group coordination. However, the implicit group interaction can impact the participating user\u27s opinion, attitude, and discussion outcome. One of the objectives of this research work is to analyze different group analytics in the cyber-argumentation environment. The objective is to design an experiment to inspect whether the critical concepts of the Social Identity Model of Deindividuation Effects (SIDE) are valid in our argumentation platform. This experiment can help us understand whether anonymity and group sense impact user\u27s behavior in our platform. Another section is about developing group interaction models to help us understand different aspects of group interactions in the cyber-argumentation platform.
These research works can help develop web API recommendation models tailored for mashup-specific domains and opinion prediction models for the cyber-argumentation specific area. Primarily these models utilize domain-specific knowledge and integrate them with traditional prediction and recommendation approaches. Our work on group analytic can be seen as the initial steps to understand these group interactions
Recommended from our members
Modeling the Dynamics of Consumer Behavior from Massive Interaction Data
Recent technological innovations (e.g. e-commerce platforms, automated retail stores) have enabled dramatic changes in people's shopping experiences, as well as the accessibility to incredible volumes of consumer-product interaction data. As a result, machine learning (ML) systems can be widely developed to help people navigate relevant information and make decisions. Traditional ML systems have achieved great success on various well-defined problems such as speech recognition and facial recognition. Unlike these tasks where datasets and objectives are clearly benchmarked, modeling consumer behavior can be rather complicated; for example, consumer activities can be affected by real-time shopping contexts, collected interaction data can be noisy and biased, interests from multiple parties (both consumers and producers) can be involved in the predictive objectives.The primary goal of this dissertation is to address the obstacles in modeling consumer activities through computational approaches, but with careful considerations from economic and societal perspectives. Intellectually, such models help us to understand the forces that guide consumer behavior. Methodologically, I build algorithms capable of processing massive interaction datasets by connecting well-developed ML techniques and well-established economic theories. Practically, my work has applications ranging from recommender systems, e-commerce and business intelligence
Machine Learning Models for Educational Platforms
Scaling up education online and onlife is presenting numerous key challenges, such as hardly manageable classes, overwhelming content alternatives, and academic dishonesty while interacting remotely. However, thanks to the wider availability of learning-related data and increasingly higher performance computing, Artificial Intelligence has the potential to turn such challenges into an unparalleled opportunity. One of its sub-fields, namely Machine Learning, is enabling machines to receive data and learn for themselves, without being programmed with rules. Bringing this intelligent support to education at large scale has a number of advantages, such as avoiding manual error-prone tasks and reducing the chance that learners do any misconduct. Planning, collecting, developing, and predicting become essential steps to make it concrete into real-world education.
This thesis deals with the design, implementation, and evaluation of Machine Learning models in the context of online educational platforms deployed at large scale. Constructing and assessing the performance of intelligent models is a crucial step towards increasing reliability and convenience of such an educational medium. The contributions result in large data sets and high-performing models that capitalize on Natural Language Processing, Human Behavior Mining, and Machine Perception. The model decisions aim to support stakeholders over the instructional pipeline, specifically on content categorization, content recommendation, learners’ identity verification, and learners’ sentiment analysis. Past research in this field often relied on statistical processes hardly applicable at large scale. Through our studies, we explore opportunities and challenges introduced by Machine Learning for the above goals, a relevant and timely topic in literature.
Supported by extensive experiments, our work reveals a clear opportunity in combining human and machine sensing for researchers interested in online education. Our findings illustrate the feasibility of designing and assessing Machine Learning models for categorization, recommendation, authentication, and sentiment prediction in this research area. Our results provide guidelines on model motivation, data collection, model design, and analysis techniques concerning the above applicative scenarios. Researchers can use our findings to improve data collection on educational platforms, to reduce bias in data and models, to increase model effectiveness, and to increase the reliability of their models, among others. We expect that this thesis can support the adoption of Machine Learning models in educational platforms even more, strengthening the role of data as a precious asset. The thesis outputs are publicly available at https://www.mirkomarras.com
Recommended from our members
Scalable algorithms for latent variable models in machine learning
Latent variable modeling (LVM) is a popular approach in many machine learning applications, such as recommender systems and topic modeling, due to its ability to succinctly represent data, even in the presence of several missing entries. Existing learning methods for LVMs, while attractive, are infeasible for the large-scale datasets required in modern big data applications. In addition, such applications often come with various types of side information such as the text description of items and the social network among users in a recommender system. In this thesis, we present scalable learning algorithms for a wide range of latent variable models such as low-rank matrix factorization and latent Dirichlet allocation. We also develop simple but effective techniques to extend existing LVMs to exploit various types of side information and make better predictions in many machine learning applications such as recommender systems, multi-label learning, and high-dimensional time-series prediction. In addition, we also propose a novel approach for the maximum inner product search problem to accelerate the prediction phase of many latent variable models.Computer Science
Learning Representations of Social Media Users
User representations are routinely used in recommendation systems by platform
developers, targeted advertisements by marketers, and by public policy
researchers to gauge public opinion across demographic groups. Computer
scientists consider the problem of inferring user representations more
abstractly; how does one extract a stable user representation - effective for
many downstream tasks - from a medium as noisy and complicated as social media?
The quality of a user representation is ultimately task-dependent (e.g. does
it improve classifier performance, make more accurate recommendations in a
recommendation system) but there are proxies that are less sensitive to the
specific task. Is the representation predictive of latent properties such as a
person's demographic features, socioeconomic class, or mental health state? Is
it predictive of the user's future behavior?
In this thesis, we begin by showing how user representations can be learned
from multiple types of user behavior on social media. We apply several
extensions of generalized canonical correlation analysis to learn these
representations and evaluate them at three tasks: predicting future hashtag
mentions, friending behavior, and demographic features. We then show how user
features can be employed as distant supervision to improve topic model fit.
Finally, we show how user features can be integrated into and improve existing
classifiers in the multitask learning framework. We treat user representations
- ground truth gender and mental health features - as auxiliary tasks to
improve mental health state prediction. We also use distributed user
representations learned in the first chapter to improve tweet-level stance
classifiers, showing that distant user information can inform classification
tasks at the granularity of a single message.Comment: PhD thesi
Learning Representations of Social Media Users
User representations are routinely used in recommendation systems by platform
developers, targeted advertisements by marketers, and by public policy
researchers to gauge public opinion across demographic groups. Computer
scientists consider the problem of inferring user representations more
abstractly; how does one extract a stable user representation - effective for
many downstream tasks - from a medium as noisy and complicated as social media?
The quality of a user representation is ultimately task-dependent (e.g. does
it improve classifier performance, make more accurate recommendations in a
recommendation system) but there are proxies that are less sensitive to the
specific task. Is the representation predictive of latent properties such as a
person's demographic features, socioeconomic class, or mental health state? Is
it predictive of the user's future behavior?
In this thesis, we begin by showing how user representations can be learned
from multiple types of user behavior on social media. We apply several
extensions of generalized canonical correlation analysis to learn these
representations and evaluate them at three tasks: predicting future hashtag
mentions, friending behavior, and demographic features. We then show how user
features can be employed as distant supervision to improve topic model fit.
Finally, we show how user features can be integrated into and improve existing
classifiers in the multitask learning framework. We treat user representations
- ground truth gender and mental health features - as auxiliary tasks to
improve mental health state prediction. We also use distributed user
representations learned in the first chapter to improve tweet-level stance
classifiers, showing that distant user information can inform classification
tasks at the granularity of a single message.Comment: PhD thesi