3 research outputs found

    Non-Negative Discriminative Data Analytics

    Due to advances in data acquisition techniques, datasets that represent samples from multiple views have become increasingly common (Jia et al. 2019). For instance, in genomics, a lymphoma patient’s dataset may include gene expression, single nucleotide polymorphism (SNP), and array comparative genomic hybridization (aCGH) measurements. Learning from multiple views of the same objective generally yields a better understanding of the hidden patterns in the data than learning from a single view. Most existing multi-view learning techniques, such as canonical correlation analysis (Hotelling et al. 1936), multi-view support vector machines (Farquhar et al. 2006), and multiple kernel learning (Zhang et al. 2016), focus on extracting the information shared among multiple datasets. However, in some real-world applications it is appealing to extract the discriminative knowledge of multiple datasets, a task known as discriminative data analytics. For example, suppose one dataset contains gene-expression measurements of cancer patients, the other contains gene-expression levels of healthy volunteers, and the goal is to cluster the cancer patients by molecular subtype. Performing a single-view analysis such as principal component analysis (PCA) on either dataset yields information related to the common knowledge between the two datasets (Garte et al. 1996). To address this challenge, contrastive PCA (Abid et al. 2017) and discriminative PCA (dPCA) (Jia et al. 2019) were proposed to extract the dataset-specific information often missed by PCA. Inspired by dPCA, we propose a novel discriminative multi-view learning algorithm, namely Non-negative Discriminative Analysis (DNA), to extract the unique information of one dataset (a.k.a. view) with respect to the other dataset. This boils down to solving a non-negative matrix factorization problem.
Furthermore, we apply the proposed DNA framework to various real-world downstream machine learning applications such as feature selection, dimensionality reduction, classification, and clustering.
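
To make the contrast with plain PCA concrete, the following is a minimal sketch of contrastive PCA as commonly formulated (top eigenvectors of the target covariance minus a scaled background covariance). It is not the DNA algorithm, whose details the abstract does not give. The function name `contrastive_pca`, the contrast weight `alpha`, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np

def contrastive_pca(target, background, alpha=1.0, n_components=2):
    """Return the top eigenvectors of C_target - alpha * C_background.

    With alpha = 0 this reduces to ordinary PCA on the target data.
    """
    # Center each dataset with its own mean.
    t = target - target.mean(axis=0)
    b = background - background.mean(axis=0)
    # Sample covariance matrices (features x features).
    c_t = t.T @ t / (len(t) - 1)
    c_b = b.T @ b / (len(b) - 1)
    # The contrast matrix is symmetric, so eigh applies; it returns
    # eigenvalues in ascending order, hence the explicit re-sort.
    evals, evecs = np.linalg.eigh(c_t - alpha * c_b)
    order = np.argsort(evals)[::-1][:n_components]
    return evecs[:, order]

# Hypothetical synthetic data: both datasets share large variance along
# feature 0; only the target has two subgroups separated along feature 1.
rng = np.random.default_rng(0)
n = 1000
scale = np.array([6.0, 1.0, 1.0])
background = rng.normal(size=(n, 3)) * scale
target = rng.normal(size=(n, 3)) * scale
target[:, 1] += np.where(np.arange(n) < n // 2, -4.0, 4.0)

pca_dir = contrastive_pca(target, background, alpha=0.0, n_components=1)[:, 0]
cpca_dir = contrastive_pca(target, background, alpha=1.0, n_components=1)[:, 0]
print("PCA direction: ", np.round(np.abs(pca_dir), 2))
print("cPCA direction:", np.round(np.abs(cpca_dir), 2))
```

In this toy setting, plain PCA follows the shared high-variance feature, while the contrastive direction aligns with the feature that separates the target subgroups. The choice of `alpha` is a tuning decision in practice.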

    Optimal Task Scheduling in the Cloud Environment using a Mean Grey Wolf Optimization Algorithm

    Cloud computing is an emerging computing platform that supports heterogeneous, parallel, and distributed environments. An important challenge in cloud computing is task scheduling, which directly influences system performance and efficiency. The primary objective of task scheduling is to assign tasks to resources while minimizing the time span of the schedule. In this study, we propose a Modified Mean Grey Wolf Optimization (MGWO) algorithm to enhance system performance and consequently reduce scheduling issues. The method focuses on minimizing the makespan (execution time) and energy consumption. These two objective functions are elaborated in the algorithm in order to regulate the quality of results based on response and to achieve a near-optimal solution. The proposed algorithm is evaluated using the CloudSim toolkit on standard workloads (normal and uniform). The advantage of the proposed method is evident from the simulation results, which show a comprehensive reduction in makespan and energy consumption. The results show that the proposed Mean GWO algorithm achieves an 8.85% makespan improvement compared to the PSO algorithm, and 3.09% compared to the standard GWO algorithm, for the normal dataset. In addition, the proposed algorithm achieves 9.05% and 9.2% improvements in energy conservation compared to the PSO and standard GWO algorithms, respectively, for the uniform dataset.
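
As an illustration of the makespan objective mentioned above, here is a minimal sketch that scores a task-to-machine assignment and compares two schedules. It assumes the common definition of makespan as the latest finish time over all machines, with each task's run time equal to its length divided by the machine speed. The task lengths, machine speeds, and schedules are hypothetical and are not taken from the study; the optimizer itself (MGWO) is not reproduced here.

```python
def makespan(assignment, task_lengths, vm_speeds):
    """Latest finish time across machines for a given assignment.

    assignment[i] is the index of the machine that runs task i.
    """
    finish = [0.0] * len(vm_speeds)
    for task, vm in enumerate(assignment):
        # Tasks on the same machine run one after another.
        finish[vm] += task_lengths[task] / vm_speeds[vm]
    return max(finish)

def improvement_percent(baseline, candidate):
    """Relative reduction of candidate versus baseline, in percent."""
    return (baseline - candidate) / baseline * 100.0

# Hypothetical workload: task lengths in million instructions,
# machine speeds in million instructions per second.
task_lengths = [400, 200, 600, 800]
vm_speeds = [100, 200]

schedule_a = [0, 1, 1, 1]  # slow machine gets only the first task
schedule_b = [0, 0, 1, 1]  # slow machine gets the two shortest tasks

span_a = makespan(schedule_a, task_lengths, vm_speeds)
span_b = makespan(schedule_b, task_lengths, vm_speeds)
print(span_a, span_b, improvement_percent(span_a, span_b))
```

A metaheuristic such as GWO or PSO would search over candidate assignments and use a function like `makespan` (possibly combined with an energy term) as its fitness.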

    The main features that influence the academic success of bachelors’ students at Nova School of Business and Economics

    Dissertation presented as the partial requirement for obtaining a Master's degree in Information Management, specialization in Knowledge Management and Business Intelligence. The prediction of academic success is a major topic in higher education, especially among the academic community. In this dissertation, we present a data mining approach that takes into consideration the features most relevant to successful academic achievement in the Bachelor’s programs at Nova School of Business and Economics (Nova SBE). Initially, we will perform a literature review in order to understand the framework of academic success and to summarize previous research in the field of educational data mining as used to assess student success. Subsequently, the empirical approach will begin with the extraction of socio-economic, socio-demographic, and academic data of students, which will form our main dataset. After the data discovery, data cleansing, and transformation activities, a set of features will be selected according to their relevance to the subject. Based on the dataset containing these features, several predictive data-driven techniques will be applied, resulting in models that will be assessed in order to understand whether the selected features are relevant enough to answer our problem or whether they need to be replaced by other attributes. This process will involve several iterations that will lend credibility and robustness to the model that demonstrates the best performance in classifying students’ academic success. In the end, it is intended that the insights extracted from the model will give the school’s key stakeholders enough knowledge to take actions that maximize students’ learning success.