Structuring Wikipedia Articles with Section Recommendations
Sections are the building blocks of Wikipedia articles. They enhance
readability and can be used as a structured entry point for creating and
expanding articles. Structuring a new or already existing Wikipedia article
with sections is a hard task for humans, especially for newcomers or less
experienced editors, as it requires significant knowledge about how a
well-written article looks for each possible topic. Inspired by this need, the
present paper defines the problem of section recommendation for Wikipedia
articles and proposes several approaches for tackling it. Our systems can help
editors by recommending what sections to add to already existing or newly
created Wikipedia articles. Our basic paradigm is to generate recommendations
by sourcing sections from articles that are similar to the input article. We
explore several ways of defining similarity for this purpose (based on topic
modeling, collaborative filtering, and Wikipedia's category system). We use
both automatic and human evaluation approaches for assessing the performance of
our recommendation system, concluding that the category-based approach works
best, achieving precision@10 of about 80% in the human evaluation.
Comment: SIGIR '18 camera-read
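The paradigm described in the abstract above (sourcing sections from similar articles, with similarity defined through shared categories) can be illustrated with a minimal sketch. This is not the authors' system: the toy corpus, the function names `recommend_sections` and `precision_at_k`, and the simple count-based ranking are all assumptions made for illustration only.

```python
from collections import Counter

# Hypothetical toy corpus: each article has a set of categories and a list of section titles.
TOY_CORPUS = [
    {"categories": {"Swiss physicists"},
     "sections": ["Early life", "Career", "Awards"]},
    {"categories": {"Swiss physicists", "Nobel laureates"},
     "sections": ["Early life", "Awards", "Legacy"]},
    {"categories": {"Rivers of Europe"},
     "sections": ["Course", "History"]},
]


def recommend_sections(article_categories, corpus, k=10):
    """Rank section titles by how many category-sharing articles contain them."""
    counts = Counter()
    for other in corpus:
        # An article counts as "similar" if it shares at least one category with the input.
        if article_categories & other["categories"]:
            counts.update(set(other["sections"]))
    # Sort by descending count, then alphabetically so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [title for title, _ in ranked[:k]]


def precision_at_k(recommended, relevant, k=10):
    """Fraction of the top-k recommendations judged relevant (divides by k)."""
    hits = sum(1 for title in recommended[:k] if title in relevant)
    return hits / k


recommendations = recommend_sections({"Swiss physicists"}, TOY_CORPUS, k=3)
score = precision_at_k(recommendations, {"Early life", "Awards"}, k=3)
```

In this sketch the relevance judgments passed to `precision_at_k` stand in for the human evaluation mentioned in the abstract; a real system would need a far larger corpus and a more careful similarity definition.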
Machine Learning Models for Educational Platforms
Scaling up education online and onlife presents numerous key challenges, such as classes that are hard to manage, an overwhelming range of content alternatives, and academic dishonesty during remote interaction. However, thanks to the wider availability of learning-related data and increasingly powerful computing, Artificial Intelligence has the potential to turn such challenges into an unparalleled opportunity. One of its sub-fields, Machine Learning, enables machines to receive data and learn for themselves, without being explicitly programmed with rules. Bringing this intelligent support to education at large scale has a number of advantages, such as avoiding manual, error-prone tasks and reducing the chance of learner misconduct. Planning, collecting, developing, and predicting become essential steps in making this support concrete in real-world education.
This thesis deals with the design, implementation, and evaluation of Machine Learning models in the context of online educational platforms deployed at large scale. Constructing intelligent models and assessing their performance is a crucial step towards increasing the reliability and convenience of such an educational medium. The contributions result in large data sets and high-performing models that capitalize on Natural Language Processing, Human Behavior Mining, and Machine Perception. The model decisions aim to support stakeholders along the instructional pipeline, specifically in content categorization, content recommendation, learners' identity verification, and learners' sentiment analysis. Past research in this field often relied on statistical processes that are difficult to apply at large scale. Through our studies, we explore the opportunities and challenges introduced by Machine Learning for the above goals, a relevant and timely topic in the literature.
Supported by extensive experiments, our work reveals a clear opportunity in combining human and machine sensing for researchers interested in online education. Our findings illustrate the feasibility of designing and assessing Machine Learning models for categorization, recommendation, authentication, and sentiment prediction in this research area. Our results provide guidelines on model motivation, data collection, model design, and analysis techniques for the above application scenarios. Researchers can use our findings to improve data collection on educational platforms, to reduce bias in data and models, and to increase the effectiveness and reliability of their models. We expect this thesis to further support the adoption of Machine Learning models in educational platforms, strengthening the role of data as a precious asset. The thesis outputs are publicly available at https://www.mirkomarras.com