5,887 research outputs found

    Multimodal Approach for Big Data Analytics and Applications

    The thesis presents multimodal conceptual frameworks and their applications in improving the robustness and performance of big data analytics through cross-modal interaction or integration. A joint interpretation of several knowledge renderings such as stream, batch, linguistics, visuals and metadata creates a unified view that can provide a more accurate and holistic approach to data analytics than a single standalone knowledge base. Novel approaches in the thesis involve integrating the multimodal framework with state-of-the-art computational models for big data, cloud computing, natural language processing, image processing, video processing, and contextual metadata. The integration of these disparate fields has the potential to improve computational tools and techniques dramatically. Thus, the contributions place multimodality at the forefront of big data analytics; the research aims at mapping and understanding multimodal correspondence between different modalities. The primary contribution of the thesis is the Multimodal Analytics Framework (MAF), a collaborative ensemble framework for stream and batch processing, along with cues from multiple input modalities like language, visuals and metadata, to combine the benefits of low-latency and high-throughput processing. The framework is a five-step process:
    1. Data ingestion. As a first step towards Big Data analytics, a high-velocity, fault-tolerant streaming data acquisition pipeline is proposed through a distributed big data setup, followed by mining and searching for patterns in the data while it is still in transit. The data ingestion methods are demonstrated using Hadoop ecosystem tools like Kafka and Flume as sample implementations.
    2. Decision making on the ingested data to use the best-fit tools and methods. In Big Data Analytics, the primary challenges often remain in processing heterogeneous data pools with a one-method-fits-all approach. The research introduces a decision-making system to select the best-fit solutions for the incoming data stream. This is the second step towards building the data processing pipeline presented in the thesis. The decision-making system introduces a Fuzzy Graph-based method to provide real-time and offline decision-making.
    3. Lifelong incremental machine learning. In the third step, the thesis describes a Lifelong Learning model at the processing layer of the analytical pipeline, following the data acquisition and decision making at step two for downstream processing. Lifelong learning iteratively increments the training model using a proposed Multi-agent Lambda Architecture (MALA), a collaborative ensemble architecture between the stream and batch data. As part of the proposed MAF, MALA is one of the primary contributions of the research. The work introduces a general-purpose and comprehensive approach to hybrid learning over batch and stream processing to achieve lifelong learning objectives.
    4. Improving machine learning results through ensemble learning. As an extension of the Lifelong Learning model, the thesis proposes a boosting-based Ensemble method as the fourth step of the framework, improving lifelong learning results by reducing the learning error in each iteration of a streaming window. The strategy is to incrementally boost the learning accuracy on each iterating mini-batch, enabling the model to accumulate knowledge faster. The base learners adapt more quickly in smaller intervals of a sliding window, improving the machine learning accuracy rate by countering concept drift.
    5. Cross-modal integration between text, image, video and metadata for more comprehensive data coverage than a text-only dataset. The final contribution of this thesis is a new multimodal method in which three different modalities (text, visuals (image and video) and metadata) are intertwined with real-time and batch data for more comprehensive input data coverage than text-only data.
    The model is validated through a detailed case study on the contemporary and relevant topic of the COVID-19 pandemic. While the remainder of the thesis deals with text-only input, the COVID-19 case study analyzes textual and visual information together. Following completion of this research work, multimodal machine learning is to be investigated as an extension to the current framework and a future research direction.
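The abstract above names a Multi-agent Lambda Architecture (MALA) but gives no implementation detail. As a rough illustration of the general Lambda pattern it builds on, and not of MALA itself, the sketch below keeps a batch view that is periodically rebuilt from the full event log and a speed view that is updated per event, and merges the two at query time. The class name `LambdaCounter`, its methods, and the hashtag-counting task are all hypothetical choices made for this example.

```python
from collections import Counter


class LambdaCounter:
    """Toy Lambda-style pipeline that counts occurrences of string events."""

    def __init__(self):
        self.master = []             # append-only log of every event seen
        self.batch_view = Counter()  # rebuilt from scratch by the batch layer
        self.speed_view = Counter()  # updated incrementally by the speed layer

    def ingest(self, event):
        """Speed layer: log the event and update the real-time view."""
        self.master.append(event)
        self.speed_view[event] += 1

    def run_batch(self):
        """Batch layer: recompute the view over the full log.

        Everything in the log is now covered by the batch view, so the
        speed view is reset to avoid double counting.
        """
        self.batch_view = Counter(self.master)
        self.speed_view = Counter()

    def query(self, key):
        """Serving layer: merge the batch view with the real-time view."""
        return self.batch_view[key] + self.speed_view[key]


pipeline = LambdaCounter()
for tag in ["covid", "vaccine", "covid"]:
    pipeline.ingest(tag)
pipeline.run_batch()      # high-throughput pass over all history
pipeline.ingest("covid")  # low-latency update after the batch run
print(pipeline.query("covid"))  # 3
```

In a learning setting the two counters would be replaced by a batch-trained model and an incrementally updated one, but how the thesis combines them is not stated in the abstract.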

    Towards Efficient Lifelong Machine Learning in Deep Neural Networks

    Humans continually learn and adapt to new knowledge and environments throughout their lifetimes. Rarely does learning new information cause humans to catastrophically forget previous knowledge. While deep neural networks (DNNs) now rival human performance on several supervised machine perception tasks, when updated on changing data distributions, they catastrophically forget previous knowledge. Enabling DNNs to learn new information over time opens the door for new applications such as self-driving cars that adapt to seasonal changes or smartphones that adapt to changing user preferences. In this dissertation, we propose new methods and experimental paradigms for efficiently training continual DNNs without forgetting. We then apply these methods to several visual and multi-modal perception tasks including image classification, visual question answering, analogical reasoning, and attribute and relationship prediction in visual scenes.


    E-learning enables students to pace their studies according to their needs, making learning accessible to (1) people who do not have enough free time for studying, who can schedule their lessons around their availability; and (2) those who live far from a school, or who cannot attend classes because of a physical or medical restriction. Cultural, geographical and physical obstacles can therefore be removed, making it possible for students to select their own path and time for the course. Students can then choose the main objectives they are able to fulfill. This book addresses E-learning challenges, opening a way to understand and discuss questions related to long-distance and lifelong learning and E-learning for people with special needs, and, lastly, presents a case study about the relationship between the quality of interaction and the quality of learning achieved in E-learning training experiences.

    Big data for monitoring educational systems

    This report considers “how advances in big data are likely to transform the context and methodology of monitoring educational systems within a long-term perspective (10-30 years) and impact the evidence based policy development in the sector”. Big data are defined as “large amounts of different types of data produced with high velocity from a high number of various types of sources.” Five independent experts were commissioned by Ecorys to respond to the themes of students' privacy, educational equity and efficiency, student tracking, assessment and skills. The experts were asked to consider the “macro perspective on governance on educational systems at all levels from primary, secondary education and tertiary – the latter covering all aspects of tertiary from further, to higher, and to VET”, prioritising primary and secondary levels of education.

    A submodular optimization framework for never-ending learning: semi-supervised, online, and active learning

    The revolution in information technology and the explosion in the use of computing devices in people's everyday activities have forever changed the perspective of the data mining and machine learning fields. The enormous amount of easily accessible, information-rich data is pushing the data analysis community in general towards a paradigm shift. In the new paradigm, data comes in the form of a stream of billions of records received every day. The dynamic nature of the data and its sheer size make it impossible to use the traditional notion of offline learning, where the whole dataset is accessible at any point in time. Moreover, no amount of human resources is enough to get expert feedback on the data. In this work we have developed a unified optimization-based learning framework that addresses many of the challenges mentioned above. Specifically, we developed a Never-Ending Learning framework which combines incremental/online, semi-supervised, and active learning under a unified optimization framework. The established framework is based on the class of submodular optimization methods. At the core of this work we provide a novel formulation of the Semi-Supervised Support Vector Machines (S3VM) in terms of submodular set functions. The new formulation overcomes the non-convexity issues of the S3VM and provides a state-of-the-art solution that is orders of magnitude faster than the cutting-edge algorithms in the literature. Next, we provide a stream summarization technique via exemplar selection. This technique makes it possible to keep a fixed-size exemplar representation of a data stream that can be used by any label-propagation-based semi-supervised learning technique. The compact data stream representation allows a wide range of algorithms to be extended to the incremental/online learning scenario. Under the same optimization framework, we provide an active learning algorithm that constitutes the feedback between the learning machine and an oracle. Finally, the developed Never-Ending Learning framework is essentially transductive in nature. Therefore, our last contribution is an inductive incremental learning technique for incremental training of SVM using the properties of local kernels. We demonstrated through this work the importance and wide applicability of the proposed methodologies.
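The abstract does not say which submodular objective is used for exemplar selection. As a hedged illustration of the general idea, the sketch below greedily maximizes a facility-location function, a standard monotone submodular objective, to pick a fixed-size set of exemplars from a buffered chunk of points. The function names, the Gaussian similarity, and the sample data are assumptions made for this example and are not taken from the thesis.

```python
import math


def similarity(a, b):
    """Gaussian similarity in (0, 1]; an assumed choice for this sketch."""
    return math.exp(-sum((x - y) ** 2 for x, y in zip(a, b)))


def greedy_exemplars(points, k):
    """Pick up to k exemplar indices by greedy facility-location maximization.

    Objective: f(S) = sum_i max_{j in S} similarity(points[i], points[j]).
    Each round adds the candidate with the largest marginal gain.
    """
    best_sim = [0.0] * len(points)  # similarity of each point to its best exemplar
    chosen = []
    for _ in range(min(k, len(points))):
        best_gain, best_j = -1.0, None
        for j, candidate in enumerate(points):
            if j in chosen:
                continue
            # Marginal gain: how much candidate j improves coverage of every point.
            gain = sum(
                max(similarity(p, candidate) - s, 0.0)
                for p, s in zip(points, best_sim)
            )
            if gain > best_gain:
                best_gain, best_j = gain, j
        chosen.append(best_j)
        best_sim = [
            max(s, similarity(p, points[best_j]))
            for p, s in zip(points, best_sim)
        ]
    return chosen


# Two well-separated clusters; with k=2 the greedy pass covers both.
chunk = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
print(greedy_exemplars(chunk, 2))
```

On this sample the two selected indices fall in different clusters, because a second exemplar from an already covered cluster adds almost no marginal gain. A streaming variant would rerun the selection over the current exemplars plus each new chunk to keep the summary at a fixed size; that extension is likewise an assumption here.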

    MOOC (Massive Open Online Courses)

    Massive Open Online Courses (MOOCs) are free online courses available to anyone who can sign up. MOOCs provide an affordable and flexible way to learn new skills and advance in careers and, to a certain extent, deliver quality educational experiences. Millions of people around the world use MOOCs for a variety of reasons, including career development, career change, college preparation, supplementary learning, lifelong learning, and corporate e-Learning and training.

    MOOClm: Learner Modelling for MOOCs

    Massively Open Online Learning systems, or MOOCs, generate enormous quantities of learning data. Analysis of this data has considerable potential benefits for learners, educators, teaching administrators and educational researchers. How to realise this potential is still an open question. This thesis explores the use of such data to create a rich Open Learner Model (OLM). The OLM is designed to take account of the restrictions and goals of lifelong learner model usage. Towards this end, we structure the learner model around a standard curriculum-based ontology. Since such a learner model may be very large, we integrate a visualisation based on a highly scalable circular treemap representation. The visualisation allows the student either to drill down into increasingly detailed views of the learner model, or to filter the model down to a smaller, selected subset. We introduce the notion of a set of Reference learner models, such as an ideal student, a typical student, or a selected set of learning objectives within the curriculum. Introducing these provides a foundation for a learner to make a meaningful evaluation of their own model by comparing it against a reference model. To validate the work, we created MOOClm to implement this framework, then used it in the context of a Small Private Online Course (SPOC) run at the University of Sydney. We also report a qualitative usability study to gain insights into the ways a learner can make use of the OLM. Our contribution is the design and validation of MOOClm, a framework that harnesses MOOC data to create a learner model with an OLM interface for student and educator usage.
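The abstract describes a learner model organised around a curriculum ontology and compared against reference models, without giving a data format. The sketch below is one possible reading under stated assumptions: the curriculum is a nested dictionary, mastery scores in [0, 1] are stored per leaf topic, a parent's score is the plain mean of its children, and "drill-down" is modelled as a depth limit. None of these names or choices come from MOOClm itself.

```python
def mastery(node, scores):
    """Mean mastery of a curriculum node; leaf scores are looked up by name."""
    children = node.get("children", [])
    if not children:
        return scores.get(node["name"], 0.0)  # unseen topics count as 0.0
    return sum(mastery(child, scores) for child in children) / len(children)


def compare(node, learner, reference, depth=0, max_depth=1):
    """Return (topic, learner_score, reference_score) rows down to max_depth.

    Raising max_depth corresponds to drilling further into the model.
    """
    rows = [(node["name"], mastery(node, learner), mastery(node, reference))]
    if depth < max_depth:
        for child in node.get("children", []):
            rows.extend(compare(child, learner, reference, depth + 1, max_depth))
    return rows


# Hypothetical two-topic curriculum and scores.
curriculum = {
    "name": "Programming",
    "children": [{"name": "Loops"}, {"name": "Recursion"}],
}
learner = {"Loops": 1.0, "Recursion": 0.25}
typical_student = {"Loops": 0.75, "Recursion": 0.75}

for topic, mine, ref in compare(curriculum, learner, typical_student):
    print(f"{topic}: learner={mine:.3f} reference={ref:.3f}")
```

The top-level row shows the learner slightly behind the reference overall, while the per-topic rows reveal that the gap comes entirely from one topic, which is the kind of comparison the reference models are meant to support.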