Approximation contexts in addressing graph data structures
While the application of machine learning algorithms to practical problems has expanded from fixed-size input data to sequence, tree, or graph input data, the composition of learning systems has developed from single models to integrated ones. Recent advances in graph-based learning algorithms include the SOMSD (Self Organizing Map for Structured Data), PMGraphSOM (Probability Measure Graph Self Organizing Map), GNN (Graph Neural Network), and GLSVM (Graph Laplacian Support Vector Machine). A main motivation of this thesis is to investigate whether such algorithms, individually, modified, or in various combinations, provide better performance than more traditional artificial neural networks or kernel machine methods on some challenging practical problems. More succinctly, this thesis seeks to answer the main research question: when, or under what conditions or contexts, can graph-based models be adjusted and tailored to be most efficacious in terms of predictive or classification performance on some challenging practical problems? A range of sub-questions emerges. How do we craft an effective neural learning system that integrates several graph-based and non-graph-based models? How do we integrate various graph-based and non-graph-based kernel machine algorithms? How do we enhance the capability of the integrated model on challenging problems? How do we tackle the long-term dependency issues that degrade the performance of layer-wise graph-based neural systems? This thesis addresses these questions.
Recent research on multi-stage learning models has demonstrated the efficacy of multiple layers of alternating unsupervised and supervised learning approaches. This underlies the very successful front-end feature extraction techniques in deep neural networks. However, much remains to be explored regarding the number of layers required and the types of unsupervised or supervised learning models that should be used. Such issues have not so far been considered when the underlying input data structure is a graph. We explore empirically the capabilities of models of increasing complexity: combinations of the unsupervised learning algorithms SOM or PMGraphSOM, with or without a cascade connection to a multilayer perceptron, and with or without being followed by multiple layers of GNN. Such studies explore the effects of including or ignoring context. A parallel empirical study involving kernel machines with or without graph inputs has also been conducted.
Improving Heterogeneous Graph Learning with Weighted Mixed-Curvature Product Manifold
In graph representation learning, it is important that the complex geometric
structure of the input graph, e.g. hidden relations among nodes, is well
captured in embedding space. However, standard Euclidean embedding spaces have
a limited capacity in representing graphs of varying structures. A promising
candidate for the faithful embedding of data with varying structure is product
manifolds of component spaces of different geometries (spherical, hyperbolic,
or Euclidean). In this paper, we take a closer look at the structure of product
manifold embedding spaces and argue that each component space in a product
contributes differently to expressing structures in the input graph, hence
should be weighted accordingly. This is different from previous works which
consider the roles of different components equally. We then propose
WEIGHTED-PM, a data-driven method for learning embedding of heterogeneous
graphs in weighted product manifolds. Our method utilizes the topological
information of the input graph to automatically determine the weight of each
component in product spaces. Extensive experiments on synthetic and real-world
graph datasets demonstrate that WEIGHTED-PM is capable of learning better graph
representations with lower geometric distortion from input data, and performs
better on multiple downstream tasks, such as word similarity learning, top-k
recommendation, and knowledge graph embedding.
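As an illustration of what weighting the components of a product manifold could look like, the following is a minimal sketch, not the WEIGHTED-PM method itself. It assumes that the distance in the product space is the square root of a weighted sum of squared component distances, that the hyperbolic component uses the Poincare ball model with curvature -1, and that the weights are supplied by hand (the abstract states they are derived from the graph's topology, but does not say how). All function names are hypothetical.

```python
import math

def euclidean_dist(x, y):
    # Straight-line distance in a flat component.
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def spherical_dist(x, y):
    # Geodesic distance on the unit sphere; x and y are assumed to be unit vectors.
    dot = sum(a * b for a, b in zip(x, y))
    return math.acos(max(-1.0, min(1.0, dot)))

def hyperbolic_dist(x, y):
    # Geodesic distance in the Poincare ball (curvature -1); requires ||x||, ||y|| < 1.
    sq_x = sum(a * a for a in x)
    sq_y = sum(b * b for b in y)
    diff = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.acosh(1.0 + 2.0 * diff / ((1.0 - sq_x) * (1.0 - sq_y)))

def weighted_product_dist(x_parts, y_parts, dists, weights):
    # Assumed form: sqrt(sum_i w_i * d_i(x_i, y_i)^2), one term per component space.
    total = 0.0
    for xp, yp, d, w in zip(x_parts, y_parts, dists, weights):
        total += w * d(xp, yp) ** 2
    return math.sqrt(total)

# Hypothetical point pair split across spherical, hyperbolic and Euclidean components.
x = ([1.0, 0.0], [0.0, 0.0], [0.0, 0.0])
y = ([0.0, 1.0], [0.5, 0.0], [3.0, 4.0])
components = (spherical_dist, hyperbolic_dist, euclidean_dist)
print(weighted_product_dist(x, y, components, weights=(0.5, 2.0, 1.0)))
```

Setting all weights to 1 recovers the unweighted product distance, which corresponds to treating every component equally.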
PRESENT-DAY TECTONIC GRADIENT IN THE NINH THUAN AREA AND ITS VICINITY
Estimating present-day tectonic movement and the tectonic gradient (strain rate) is of practical importance in assessing active faults and seismic hazards for the siting of the Ninh Thuan nuclear power plant. Based on three campaigns of GPS measurements in 2012 - 2013, we used the BERNESE 5.0 software to determine present-day velocities of 13 stations in the ITRF08 frame. The GPS stations move eastwards at rates of 22 - 25 mm/yr and southwards at 5 - 10 mm/yr. The standard errors in the latitudinal and longitudinal directions are 1.2 mm/yr and 0.9 mm/yr, respectively. Combining these with GPS data from the project on present-day geodynamics in Tay Nguyen, TN3/06, we determined strain rates ranging from 50 to 100 nanostrains with a standard error of 50 nanostrains. The direction of maximum compressive strain rate is from northwest - southeast to east - west.
Assessing the velocity of present-day tectonic movement and the present-day tectonic gradient is of great practical importance for evaluating active faults and earthquake hazards in support of the construction of the Ninh Thuan nuclear power plant. Based on three GPS measurement campaigns in 2012 - 2013 and using the BERNESE 5.0 software, we established present-day tectonic velocities at 13 GPS sites in the surrounding area, extending from Nha Trang to Phu Quy Island. In the ITRF08 global reference frame, the present-day tectonic velocity varies from 22 to 25 mm/yr eastwards and from 5 to 10 mm/yr southwards. The velocity error ranges from 1.2 to 1.5 mm/yr in the eastward component and from 0.9 to 1.2 mm/yr in the southward component. In combination with GPS measurements from the research project on present-day geodynamics of the Tay Nguyen region, code TN3/T06, we determined strain rates ranging from 50 to 100 nanostrains, with errors of around 50 nanostrains. The axis of maximum compressive strain varies in direction from north - south to northeast - southwest. The axis of maximum extensional strain varies in direction from northwest - southeast to east - west.
PRESENT DAY DEFORMATION IN THE EAST VIETNAM SEA AND SURROUNDING REGIONS
This paper presents velocities of present-day tectonic movement and strain rates in the East Vietnam Sea (South China Sea) and its surroundings, determined from GPS campaigns between 2007 and 2010. We determine absolute velocities of GPS stations in the ITRF05 frame. The results indicate that GPS stations in the north of the East Vietnam Sea move eastwards at rates of 30 - 39 mm/yr and southwards at 8 - 11 mm/yr. The offshore station Song Tu Tay moves eastwards at ~24 mm/yr and southwards at ~9 mm/yr. GPS stations in the south of the East Vietnam Sea move to the east at ~22 mm/yr and to the south at 7 - 11 mm/yr. The relative motion indicates that the Western Margin Fault Zone is active as a left-lateral fault zone with a slip rate of less than 4 mm/yr. In the Western plateau, the first results from 2012 - 2013 GPS measurements show that the eastward velocities vary from 21.5 mm/yr to 24.7 mm/yr and the southward velocities vary from 10.5 mm/yr to 14.6 mm/yr. GPS solutions determined from our campaigns are combined with data from various authors and international projects to determine the strain rate in the East Vietnam Sea. The principal strain rate changes from 15 nanostrain/yr to 9 nanostrain/yr in the East Vietnam Sea. The principal strain rate and maximum shear strain rate along the Red River Fault Zone are on the order of 10 nanostrain/yr. The East Vietnam Sea is considered to belong to the Sunda block.
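To give a feel for the units, the sketch below converts a velocity difference between two stations into a one-dimensional strain rate in nanostrain/yr. It is only an order-of-magnitude illustration: the abstract does not describe how the strain rates were computed, a full analysis would estimate a two-dimensional strain-rate tensor from the whole velocity field, and the station velocities and baseline length used here are hypothetical.

```python
def strain_rate_nanostrain_per_yr(v1_mm_yr, v2_mm_yr, baseline_km):
    """1-D strain rate along a baseline, from the velocity components along it."""
    dv_m_per_yr = (v2_mm_yr - v1_mm_yr) * 1e-3   # mm/yr -> m/yr
    length_m = baseline_km * 1e3                 # km -> m
    return dv_m_per_yr / length_m * 1e9          # dimensionless strain/yr -> nanostrain/yr

# Hypothetical example: two stations 100 km apart whose velocities along the
# baseline differ by 1 mm/yr give a strain rate of about 10 nanostrain/yr.
print(strain_rate_nanostrain_per_yr(22.0, 23.0, 100.0))
```

A negative result indicates shortening (compression) along the baseline and a positive result indicates extension.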
Machine learning approaches to physical activity prediction in young children using accelerometer data
Early childhood development is arguably the most significant period in the course of life. It is widely recognized that physical activity (PA) during early childhood plays an influential role in the current and future development of the child [1]. Partially based on this evidence, the Australian Government has created the Physical Activity Recommendations, which recommend, among other things, that preschoolers should be physically active every day for at least three hours, spread throughout the day [1]. However, difficulties in accurately measuring physical activity in preschoolers have impeded investigations into physical activity classification using data modelling techniques, and the use of such classifications in the estimation of metabolic equivalents (METs), a measure commonly used as a proxy for the extent of the physical activity performed by a subject. The issue of quantifying the extent of physical activity performed by a child is therefore transformed into one of classifying physical activity into categories such as “sedentary”, “light” activity, “medium” activity, “walking”, or “running”. Based on such classifications, the METs can be estimated, and as a result the daily recommended minimum METs can be monitored.
The research reported in this thesis is part of a larger research project which includes the collection of raw data from accelerometry sensors mounted on various parts of the body, over two separate and different small cohorts of young pre-school children, in 2014 (11 participants) and 2016 (16 participants). As these are pre-school children, they often did not adhere to the suggested activity, but instead engaged in unscripted activities during the 5-minute episodes of observation, thus introducing “noise” in the recordings. Despite this imperfection, the accelerometer recordings were labelled by the assigned activity type, irrespective of what the subject was actually doing during the episode, which poses a challenge for data-driven modelling techniques.
Cost-sensitive cascade Graph Neural Networks
This paper introduces a novel cost-sensitive weighted-samples approach to a cascade of Graph Neural Networks for learning from imbalanced data in the graph-structured input domain. This is shown to be very effective in addressing the effects of imbalanced data distribution on learning systems. The proposed idea is based on a weighting mechanism which forces the network to encode misclassified graphs (or nodes) more strongly. We evaluate the approach through an application to the well-known Web spam detection problem, and demonstrate that the generalization performance is improved as a result. Indeed, the results reported in this paper are the best reported so far for both datasets.
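The weighting idea can be sketched roughly as follows. This is not the paper's algorithm: the abstract does not give the update rule, so the multiplicative `boost` factor, the renormalisation to a mean weight of 1, and the weighted squared-error loss are all assumptions made for illustration, and the function names are hypothetical.

```python
def update_sample_weights(weights, y_true, y_pred, boost=2.0):
    """Raise the weight of misclassified samples, then renormalise to mean 1."""
    raised = [w * boost if t != p else w
              for w, t, p in zip(weights, y_true, y_pred)]
    scale = len(raised) / sum(raised)
    return [w * scale for w in raised]

def weighted_squared_error(weights, targets, outputs):
    """Per-sample squared error, scaled by each sample's weight and averaged."""
    return sum(w * (t - o) ** 2
               for w, t, o in zip(weights, targets, outputs)) / len(weights)

# Hypothetical imbalanced toy set: one positive (spam) node and three negatives.
y_true = [1, 0, 0, 0]
y_pred = [0, 0, 0, 0]          # an earlier stage of the cascade misses the positive
weights = update_sample_weights([1.0, 1.0, 1.0, 1.0], y_true, y_pred)
print(weights)                 # the missed positive now carries twice the weight
print(weighted_squared_error(weights, y_true, [0.2, 0.1, 0.0, 0.1]))
```

In a cascade, the updated weights would be passed to the next stage so that its loss emphasises the samples the previous stage got wrong.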
Prediction of activity type in preschool children using machine learning techniques
Objectives: Recent research has shown that machine learning techniques can accurately predict activity classes from accelerometer data in adolescents and adults. The purpose of this study is to develop and test machine learning models for predicting activity type in preschool-aged children. Design: Participants completed 12 standardised activity trials (TV, reading, tablet game, quiet play, art, treasure hunt, cleaning up, active game, obstacle course, bicycle riding) over two laboratory visits. Methods: Eleven children aged 3-6 years (mean age = 4.8 ± 0.87; 55% girls) completed the activity trials while wearing an ActiGraph GT3X+ accelerometer on the right hip. Activities were categorised into five activity classes: sedentary activities, light activities, moderate to vigorous activities, walking, and running. A standard feed-forward Artificial Neural Network and a Deep Learning Ensemble Network were trained on features in the accelerometer data used in previous investigations (10th, 25th, 50th, 75th and 90th percentiles and the lag-one autocorrelation). Results: Overall recognition accuracy for the standard feed-forward Artificial Neural Network was 69.7%. Recognition accuracy for sedentary activities, light activities and games, moderate-to-vigorous activities, walking, and running was 82%, 79%, 64%, 36% and 46%, respectively. In comparison, overall recognition accuracy for the Deep Learning Ensemble Network was 82.6%. For sedentary activities, light activities and games, moderate-to-vigorous activities, walking, and running, recognition accuracy was 84%, 91%, 79%, 73% and 73%, respectively. Conclusions: Ensemble machine learning approaches such as the Deep Learning Ensemble Network can accurately predict activity type from accelerometer data in preschool children.
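The six features named in the Methods can be computed for one window of accelerometer values roughly as follows. This is a sketch under assumptions: the abstract does not state which signal the features are computed on, the window length, or the percentile interpolation convention, so linear interpolation between order statistics is assumed and the function names are hypothetical.

```python
def percentile(values, q):
    """q-th percentile (0-100), linearly interpolated between order statistics."""
    s = sorted(values)
    pos = (len(s) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (s[hi] - s[lo]) * (pos - lo)

def lag_one_autocorrelation(values):
    """Correlation of the series with itself shifted by one sample."""
    n = len(values)
    mean = sum(values) / n
    denom = sum((v - mean) ** 2 for v in values)
    if denom == 0:
        return 0.0  # constant window: autocorrelation is undefined, report 0
    num = sum((values[i] - mean) * (values[i + 1] - mean) for i in range(n - 1))
    return num / denom

def window_features(window):
    """10th, 25th, 50th, 75th, 90th percentiles followed by lag-one autocorrelation."""
    feats = [percentile(window, q) for q in (10, 25, 50, 75, 90)]
    feats.append(lag_one_autocorrelation(window))
    return feats

# Hypothetical window of accelerometer values.
print(window_features([1.0, 2.0, 3.0, 4.0, 5.0]))
```

Each window then yields a six-element feature vector that can be passed to a classifier.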
Sensor-enabled activity class recognition in preschoolers: Hip versus wrist data
PURPOSE: Pattern recognition approaches to accelerometer data processing have emerged as viable alternatives to cut-point methods. However, few studies have explored the validity of pattern recognition approaches in pre-schoolers, and none have compared supervised learning algorithms trained on hip and wrist data. The purpose of this study was to develop, test, and compare activity class recognition algorithms trained on hip, wrist, and combined hip and wrist accelerometer data in pre-schoolers. METHODS: 11 children aged 3 - 6 y (mean age 4.8 +/- 0.9 y) completed 12 developmentally appropriate PA trials while wearing an ActiGraph GT3X+ accelerometer on the right hip and non-dominant wrist. PA trials were categorised as sedentary (SED), light activity games (LG), moderate-to-vigorous games (MVG), walking (WA), and running (RU). Random forest (RF) and support vector machine (SVM) classifiers were trained using time and frequency domain features from the vector magnitude of the raw signal. Features were extracted from 15 s non-overlapping windows. Classifier performance was evaluated using leave-one-out cross-validation. RESULTS: Cross-validation accuracy for the hip, wrist, and combined hip and wrist RF models was 0.80 (95% CI: 0.79 - 0.82), 0.78 (95% CI: 0.77 - 0.80), and 0.82 (95% CI: 0.80 - 0.83), respectively. Accuracy for Hact, Wact, and HWact SVM models was 0.81 (95% CI: 0.80 - 0.83), 0.80 (95% CI: 0.79 - 0.80), and 0.85 (95% CI: 0.84 - 0.86), respectively. Recognition accuracy was consistently excellent for SED (> 90%), moderate for LG, MVG, and RU (69 - 79%), and modest for WA (61 - 71%). CONCLUSIONS: Machine learning algorithms such as RF and SVM are useful for predicting PA class from accelerometer data collected in preschool children. While classifiers trained on hip or wrist data provided acceptable recognition accuracy, the combination of hip and wrist accelerometer data delivered better performance.
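The preprocessing and evaluation steps in the Methods can be sketched as below. Several details are assumptions rather than facts from the abstract: the sampling rate is not stated and is left as a parameter, incomplete trailing windows are dropped, and "leave-one-out" is interpreted here as holding out one child at a time. The function names are hypothetical.

```python
import math

def vector_magnitude(samples):
    """Vector magnitude sqrt(x^2 + y^2 + z^2) of each tri-axial sample."""
    return [math.sqrt(x * x + y * y + z * z) for x, y, z in samples]

def non_overlapping_windows(signal, sample_rate_hz, window_s=15):
    """Split a signal into consecutive windows; a trailing partial window is dropped."""
    size = int(sample_rate_hz * window_s)
    return [signal[i:i + size] for i in range(0, len(signal) - size + 1, size)]

def leave_one_subject_out(subject_ids):
    """Yield (held_out, train_idx, test_idx), holding out one subject per fold."""
    for held_out in sorted(set(subject_ids)):
        train = [i for i, s in enumerate(subject_ids) if s != held_out]
        test = [i for i, s in enumerate(subject_ids) if s == held_out]
        yield held_out, train, test

# Hypothetical usage: 100 samples at 2 Hz give three complete 15 s windows.
magnitude = vector_magnitude([(3.0, 4.0, 0.0)] * 100)
print(len(non_overlapping_windows(magnitude, sample_rate_hz=2)))
for held_out, train, test in leave_one_subject_out(["a", "a", "b", "c"]):
    print(held_out, train, test)
```

Holding out all windows of a child together keeps that child's data out of the training set for the fold in which it is tested.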