898 research outputs found

    Novel Strategies to Accelerate Search Algorithms in Data Reduction

    In our current hyper-connected digital world, where data is growing enormously, instance reduction is an essential pre-processing phase for obtaining cleaner and smaller datasets that are free from noisy, redundant or irrelevant samples (so-called Smart Data). After pre-processing, the data may become more reliable, accurate and useful for subsequent data mining tasks. Instance reduction consists of two types: instance selection and instance generation; each can be formulated as a combinatorial or continuous optimisation problem, depending on whether its decision variable is discrete or continuous, respectively. It is an emerging challenge characterised by multimodality and a large number of decision variables. Given such difficulties, derivative-free methods are promising approaches to the problem. They are powerful search algorithms that seek the nearest local optimum and, unlike derivative-based methods, do not necessarily require the gradient of the objective function. Solutions for instance reduction fall into the intersection of machine learning, data mining and optimisation, where a process from one domain can take part in the execution of another. Thus, the synergy between domains is important for solving the problem more effectively, and this has attracted significant interest from researchers. Among the many derivative-free search approaches, the family of direct search methods has introduced various strategies to tackle numerous modern numerical optimisation problems, of which population-based meta-heuristics and pattern search can be considered two of the most prevalent in the literature. Population-based meta-heuristics are an iterative search framework composed of several subordinate low-level heuristics that control exploration and exploitation for a pool of solution candidates. This set of methods searches for high-quality solutions from multiple points, and thus is usually associated with high computational expense.
Pattern search methods seek an improved solution from candidates that are generated from different directions. They examine trial solutions sequentially by comparing each trial solution with the 'best' solution found so far. In this dissertation, we investigate these derivative-free search strategies to address instance reduction, a critical optimisation problem in the field of data science. Although many derivative-free methods have proved effective in addressing instance reduction, they are usually time-consuming, especially when handling relatively large datasets. This impediment limits their practicality in many data mining systems and thus necessitates a way to accelerate the search process. The need for a fast and effective search framework for instance reduction has motivated us to develop novel search strategies in the family of direct search approaches, aiming to match the solution quality of state-of-the-art techniques in the domain while significantly reducing the runtime of the search process. The three major work packages presented in this thesis cover two direct search approaches for two types of instance reduction, arranged in a progressive order in which findings at an earlier stage contribute to the understanding of the later outcomes. Firstly, a novel evolutionary search framework for instance selection is proposed to balance the number of samples between classes, addressing a case study of imbalanced classification. Secondly, we develop another search framework for instance generation based on single-point search and memetic computing, namely Single-Point Memetic Structure. An accelerated mechanism for computing the objective function is embedded into the proposed search design, significantly reducing the runtime.
Finally, a novel search framework for simultaneous instance selection and generation is designed to handle the instance reduction problem in both combinatorial and continuous search spaces. In summary, the research conducted here introduces a set of novel search strategies based on derivative-free methods to tackle instance reduction problems. They are different search frameworks which aim to produce a high-quality reduced set from a relatively large original source within a reasonable amount of time. This is accomplished by taking advantage of either machine learning integration or the Single-Point Memetic Structure with an accelerated mechanism. The use of machine learning in a meta-heuristic search framework greatly speeds up the computation of the objective function, while the Single-Point Memetic Search allows us to reuse virtually all prior calculations when computing the fitness value of newly evolved individuals. Hence, these novel search strategies can save substantial computational cost. Finally, we leverage the insights previously found to propose another novel search framework that handles instance selection and instance generation simultaneously, and operates in both combinatorial and continuous search spaces. These novel search strategies are examined on a large number of datasets under different hyper-parameter settings. The numerical results are comprehensively analysed and verified by different statistical tests to demonstrate the robustness of the proposed search strategies with respect to other state-of-the-art techniques in the domain.
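
The pattern search idea described in the abstract above (generate trial solutions along different directions and compare each one with the best solution found so far) can be illustrated with a minimal compass-search sketch. This is a generic textbook variant, not the thesis's Single-Point Memetic Structure; the function names `compass_search` and `sphere`, the step-halving rule and all parameter values are assumptions made for illustration.

```python
from typing import Callable, List, Sequence, Tuple


def compass_search(
    f: Callable[[Sequence[float]], float],
    x0: Sequence[float],
    step: float = 1.0,
    tol: float = 1e-6,
    max_iter: int = 10_000,
) -> Tuple[List[float], float]:
    """Minimise f without gradients by probing +/- step along each axis."""
    best_x = list(x0)
    best_f = f(best_x)
    for _ in range(max_iter):
        if step < tol:
            break
        improved = False
        for i in range(len(best_x)):
            for sign in (1.0, -1.0):
                # Build one trial solution and compare it with the best so far.
                trial = list(best_x)
                trial[i] += sign * step
                trial_f = f(trial)
                if trial_f < best_f:
                    best_x, best_f = trial, trial_f
                    improved = True
        if not improved:
            # No direction helped: shrink the pattern (assumed halving rule).
            step *= 0.5
    return best_x, best_f


def sphere(x: Sequence[float]) -> float:
    """Toy objective with its minimum at (1, 1, ..., 1)."""
    return sum((xi - 1.0) ** 2 for xi in x)


if __name__ == "__main__":
    x, fx = compass_search(sphere, [4.0, -3.0])
    print(x, fx)
```

In an instance-generation setting the objective would be something like the classification error of a nearest-neighbour rule on the reduced set, which is far costlier per evaluation than this toy function; that cost is what the abstract's acceleration mechanisms target.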

    ENGLISH-MAJORED STUDENTS' MOST COMMON CAREER OPTIONS AND THE LEVELS OF READINESS FOR THE CAREERS

    EFL students nowadays have a variety of career options. Most of them, however, still do not have a good understanding of, or strong readiness for, their target jobs. This study was conducted to find out EFL students' most common career options, some language standards they consider when selecting careers, and the extent to which students feel ready for their future careers. To answer these questions, we use exploratory questionnaires to survey the participants. The findings demonstrate that teaching English is the most attractive career that EFL students want to pursue after graduating, followed by freelancing. However, a group of students still cannot identify their future careers. Additionally, juniors and seniors are considered to be better prepared for their career prospects than freshmen and sophomores.

    A Review of Occupational Stress among Certain Jobs in Vietnam

    Background: Stress in the modern workplace is globally considered a risk factor for workers' health and safety. However, a review of the prevalence and associated factors of occupational stress in developing countries like Vietnam is largely lacking. This review aimed to describe the situation of occupational stress among certain jobs, based on studies carried out in Vietnam. Methods: The review was carried out by using keywords to search online and offline, international and national databases. After two stages of selection, a total of 25 eligible articles were chosen and used for this review. Results: The prevalence of occupational stress varied, ranging from 6.4% to 90.4%. The study populations were health workers, factory workers, students, academic staff and officers. By occupation, the prevalence ranged from 6.4% to 90.4% in health workers; 20.7% to 89.6% in factory workers; and 22.8% to 68.3% in students. Conclusions: The prevalence of occupational stress varied widely between and within occupations. Therefore, new approaches to improving occupational stress data, particularly in developing countries, are urgently needed.

    Improving Multi-task Learning via Seeking Task-based Flat Regions

    Multi-Task Learning (MTL) is a widely used and powerful learning paradigm for training deep neural networks that allows more than one objective to be learned by a single backbone. Compared to training tasks separately, MTL significantly reduces computational costs, improves data efficiency, and potentially enhances model performance by leveraging knowledge across tasks. Hence, it has been adopted in a variety of applications, ranging from computer vision to natural language processing and speech recognition. Among them, there is an emerging line of work in MTL that focuses on manipulating the task gradients to derive an ultimate gradient descent direction that benefits all tasks. Despite achieving impressive results on many benchmarks, directly applying these approaches without appropriate regularization techniques might lead to suboptimal solutions on real-world problems. In particular, standard training that minimizes the empirical loss on the training data can easily suffer from overfitting to low-resource tasks or be spoiled by noisy-labeled ones, which can cause negative transfer between tasks and an overall performance drop. To alleviate such problems, we propose to leverage a recently introduced training method, named Sharpness-aware Minimization, which can enhance model generalization ability on single-task learning. Accordingly, we present a novel MTL training methodology, encouraging the model to find task-based flat minima for coherently improving its generalization capability on all tasks. Finally, we conduct comprehensive experiments on a variety of applications to demonstrate the merit of our proposed approach over existing gradient-based MTL methods, as suggested by our developed theory. Comment: 29 pages, 11 figures, 6 tables
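
For readers unfamiliar with Sharpness-aware Minimization, the single-task update it builds on can be sketched as follows: perturb the weights along the normalised gradient, then apply the gradient measured at the perturbed point to the original weights. This is a minimal sketch of generic single-task SAM on a toy quadratic loss, not the task-based MTL procedure proposed in the paper; `sam_step`, `quad_loss`, `quad_grad` and the values of `lr` and `rho` are assumptions chosen for illustration.

```python
import math
from typing import Callable, List, Sequence


def sam_step(
    w: Sequence[float],
    grad_fn: Callable[[Sequence[float]], List[float]],
    lr: float = 0.05,
    rho: float = 0.05,
) -> List[float]:
    """One sharpness-aware update on a weight vector w."""
    g = grad_fn(w)
    norm = math.sqrt(sum(gi * gi for gi in g))
    if norm == 0.0:
        # Zero gradient: no ascent direction is defined, so leave w unchanged.
        return list(w)
    # Ascent step: move to the approximate worst-case point within radius rho.
    w_adv = [wi + rho * gi / norm for wi, gi in zip(w, g)]
    # Descent step: apply the gradient taken at the perturbed point to w.
    g_adv = grad_fn(w_adv)
    return [wi - lr * gi for wi, gi in zip(w, g_adv)]


def quad_loss(w: Sequence[float]) -> float:
    """Toy loss 0.5 * (w0^2 + 10 * w1^2), an assumed stand-in for a real model."""
    return 0.5 * (w[0] ** 2 + 10.0 * w[1] ** 2)


def quad_grad(w: Sequence[float]) -> List[float]:
    """Analytic gradient of quad_loss."""
    return [w[0], 10.0 * w[1]]


if __name__ == "__main__":
    w = [2.0, 1.0]
    for _ in range(200):
        w = sam_step(w, quad_grad)
    print(w, quad_loss(w))
```

With a fixed `rho` the iterates hover near the minimum rather than converging to it exactly, since every step evaluates the gradient at a perturbed point.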

    ARSENIC REMOVAL FROM GROUND WATER : RESEARCHES AND PRACTICAL IMPLEMENTATION CONDUCTED AT INSTITUTE OF CHEMISTRY, VIETNAMESE ACADEMY OF SCIENCE AND TECHNOLOGY

    Joint Research on Environmental Science and Technology for the Earth

    Robust Adaptive Cerebellar Model Articulation Controller for 1-DOF Nonlaminated Active Magnetic Bearings

    This paper presents a robust adaptive cerebellar model articulation controller (RACMAC) for 1-DOF nonlaminated active magnetic bearings (AMBs) that drives the rotor to desired positions using a robust sliding mode control-based approach. The dynamic model of the 1-DOF nonlaminated AMB is introduced in fractional-order equations. However, it is challenging to design a controller based on the model's parameters because of undefined components and disturbances such as eddy current losses in the actuator, external disturbances, and model parameters that vary during operation. To tackle this problem, RACMAC, which uses a cerebellar model to estimate nonlinear disturbances, is investigated. Based on this estimation, a robust adaptive controller that approximates the ideal and compensation controllers is calculated. The online parameters of the neural network are adjusted using Lyapunov's stability theory to ensure the stability of the system. Simulation results are presented to demonstrate the effectiveness of the proposed controller. The simulation results indicate that the estimates produced by the CMAC nonlinear estimators are close to the actual nonlinear disturbance values, and show the effectiveness of the proposed RACMAC method compared with the previously studied FOPID and SMC controllers.

    Data Structure Model on the Quality of Public Passenger Transport Services by Bus in Vietnam

    The system for managing public passenger transport services (PPTS) by bus is complex, involving infrastructure, facilities, management and service communication activities between passengers and transport systems. In particular, quality information is the most important factor, providing the data needed for analysis and for setting out measures to improve quality so as to meet the needs of passengers and the requirements of related parties. Given the particular nature of the service, this study selects a database structure model to guide the computerization of quality management (QM) for PPTS by bus in urban areas in Vietnam. The results show four database systems reflecting the quality information of the infrastructure, the means of transport, transport operation and passenger service; each database system is structured by components that ensure proper implementation of the QM process according to the continuous quality improvement cycle. The components of each database system are distributed according to the scope of management to ensure consistency in the quality management process and to facilitate the collection, processing and distribution of information among related parties. Keywords: Quality, transport services, public passengers, Vietnam. DOI: 10.7176/JESD/10-10-13 Publication date: May 31st 201

    A Research on the Quality of Public Transportation Services by Bus in Vietnam

    This study was conducted to assess the quality of public passenger transport services by bus in Hanoi. Data were collected from regular passengers who use buses as a means of transportation in the city, including passengers at stations, in waiting shelters and on vehicles, and students of some universities who travel by bus. We employ descriptive statistics and hierarchical analysis to examine the research topic. The results indicate that the quality of public transport services by bus in Hanoi was judged quite favourably by passengers. In particular, the levels of safety, convenience, security and hygiene are up to 70%, which was higher than the highest quality level. Ratings for speed and reliability are low. Keywords: quality of services, public passenger transport, buses, Vietnam. DOI: 10.7176/RJFA/10-13-04 Publication date: July 31st 201