
    Time-Series Link Prediction Using Support Vector Machines

    The prominence of social networks motivates developments in network analysis, such as link prediction, which deals with predicting the existence or emergence of links on a given network. The Vector Auto Regression (VAR) technique has been shown to be one of the best for time-series-based link prediction. One VAR implementation uses an unweighted adjacency matrix and five additional matrices based on the similarity metrics of Common Neighbor, Adamic-Adar, Jaccard’s Coefficient, Preferential Attachment, and Resource Allocation Index. In our previous work, we proposed the use of Support Vector Machines (SVM) for this prediction task and, using the same set of matrices, obtained better results. A dataset from DBLP was used to test the performance of the VAR and SVM link prediction models for two lags. In this study, we extended the VAR and SVM models to three, four, and five lags, and both VAR and SVM improved with the additional lag data. The VAR and SVM models achieved their highest ROC-AUC values of 84.96% and 86.32%, respectively, using five lags, compared with 84.26% and 84.98% using two lags. Moreover, we identified that improving the predictive ability of both models is constrained by the difficulty of predicting new links, which we define as links that do not exist in any of the corresponding lags. Hence, we created separate VAR and SVM models for the prediction of new links. The highest ROC-AUC was still achieved by SVM with five lags, although at a lower value of 73.85%. The significant drop in the performance of the VAR and SVM predictors on new links indicates the need for more research in this problem space. Moreover, the results show that SVM can be used as an alternative method for time-series-based link prediction.
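    The five similarity matrices named above can be computed directly with NetworkX's built-in link prediction functions. The sketch below uses a toy graph (not the DBLP data) and assumes integer node labels so each score can be written straight into a node-by-node matrix:

    ```python
    import networkx as nx
    import numpy as np

    # Toy graph standing in for one time slice of the co-authorship network.
    G = nx.Graph([(0, 1), (1, 2), (2, 3), (0, 2), (3, 0)])
    nodes = list(G.nodes())
    pairs = [(u, v) for i, u in enumerate(nodes) for v in nodes[i + 1:]]

    def metric_matrix(gen):
        """Fill a symmetric node-by-node matrix from (u, v, score) triples."""
        M = np.zeros((len(nodes), len(nodes)))
        for u, v, score in gen:
            M[u, v] = M[v, u] = score
        return M

    # The unweighted adjacency matrix plus the five similarity matrices.
    adj = nx.to_numpy_array(G)
    cn = metric_matrix((u, v, len(list(nx.common_neighbors(G, u, v))))
                       for u, v in pairs)
    aa = metric_matrix(nx.adamic_adar_index(G, pairs))
    jc = metric_matrix(nx.jaccard_coefficient(G, pairs))
    pa = metric_matrix(nx.preferential_attachment(G, pairs))
    ra = metric_matrix(nx.resource_allocation_index(G, pairs))

    print(cn[1, 3], jc[1, 3], pa[0, 1])  # 2.0 1.0 6.0
    ```

    In the time-series setting, one such set of matrices would be computed per time slice and stacked across lags to form each node pair's feature vector.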

    NBP 2.0: Updated Next Bar Predictor, an Improved Algorithmic Music Generator

    Deep neural network advancements have enabled machines to produce melodies emulating human-composed music. However, implementing such machines is costly in terms of resources. In this paper, we present NBP 2.0, a refinement of the previous Next Bar Predictor (NBP) model with two notable improvements: first, transforming each training instance to anchor all of its notes to its musical scale, and second, changing the model architecture itself. NBP 2.0 maintains the straightforward and lightweight implementation of its predecessor, which is an advantage over the baseline models. The improvements were assessed using quantitative and qualitative metrics and, based on the results, the gains from these changes are notable.
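    "Anchoring notes to a musical scale" can be read as transposing each training instance to a common key. A minimal sketch, assuming the key root is already known (the key-detection step is omitted here and the function name is illustrative, not from NBP 2.0):

    ```python
    # Transpose a melody so its key center maps to C (MIDI pitch class 0),
    # giving every training instance the same scale anchor.
    def anchor_to_c(midi_pitches, key_root):
        """Shift MIDI pitches down by the key root's offset from C."""
        offset = key_root % 12
        return [p - offset for p in midi_pitches]

    # A melody in D major (root D = pitch class 2) becomes C-anchored:
    melody_in_d = [62, 64, 66, 67, 69]            # D E F# G A
    anchored = anchor_to_c(melody_in_d, 2)
    print(anchored)                               # [60, 62, 64, 65, 67] = C D E F G
    ```

    Normalizing the key this way means the model sees every melody in a single reference scale, so it need not learn the same interval patterns twelve times over.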

    Link Prediction in a Weighted Network Using Support Vector Machine

    Link prediction is a field under network analysis that deals with the existence or emergence of links. In this study, we investigate the effect of using weighted networks on two link prediction techniques: the Vector Auto Regression (VAR) technique and our proposed modification of VAR that uses a Support Vector Machine (SVM). Using a co-authorship network from DBLP as the dataset and the Area Under the Receiver Operating Characteristic Curve (AUC-ROC) as the fitness metric, the results show that the performance of both VAR and SVM is surprisingly lower on the weighted network than on the unweighted network. In an attempt to improve the results on the weighted network, we incorporated features from the unweighted network into the features of the weighted network. This enhancement improved the performance of both VAR and SVM, but the results are still inferior to those on the unweighted network. We identified that the true positive rate was generally lower on the weighted network, resulting in a lower AUC.
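    The feature enhancement described above amounts to concatenating, per node pair, the unweighted similarity scores onto the weighted ones. A minimal sketch with toy arrays standing in for the DBLP-derived feature matrices:

    ```python
    import numpy as np

    # Toy feature matrices: one row per node pair, one column per similarity metric.
    n_pairs, n_metrics = 4, 6
    weighted = np.arange(n_pairs * n_metrics, dtype=float).reshape(n_pairs, n_metrics)
    unweighted = (weighted > 0).astype(float)   # toy binarized counterpart

    # Each pair's weighted scores are augmented with its unweighted scores.
    combined = np.hstack([weighted, unweighted])
    print(combined.shape)  # (4, 12)
    ```

    The enlarged feature vectors then feed the same VAR and SVM pipelines unchanged.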

    Improving the vector auto regression technique for time-series link prediction by using support vector machine

    Predicting links between the nodes of a graph has become an important data mining task because of its direct applications to biology, social networking, communication surveillance, and other domains. Recent literature in time-series link prediction has shown that the Vector Auto Regression (VAR) technique is one of the most accurate for this problem. In this study, we apply Support Vector Machines (SVM) to improve the VAR technique, which uses an unweighted adjacency matrix along with five similarity matrices: Common Neighbor (CN), Adamic-Adar (AA), Jaccard’s Coefficient (JC), Preferential Attachment (PA), and Resource Allocation Index (RA). A DBLP dataset covering the years 2003 to 2013 was collected and transformed into time-sliced graph representations. The appropriate matrices were computed from these graphs, mapped to the feature space, and then used to build baseline VAR models with a lag of 2 and the corresponding SVM classifiers. Using the Area Under the Receiver Operating Characteristic Curve (AUC-ROC) as the main fitness metric, the average result of 82.04% for VAR was improved to 84.78% with SVM. Additional experiments to handle the highly imbalanced dataset by oversampling with SMOTE and undersampling with K-means clusters, however, did not improve the average AUC-ROC of the baseline SVM.
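    The lag-2 SVM setup can be sketched as follows: each node pair's feature vector stacks its similarity scores from the two previous time slices, and the label is whether the link exists in the next slice. Synthetic data stands in for the DBLP features here, so the AUC value is illustrative only:

    ```python
    import numpy as np
    from sklearn.svm import SVC
    from sklearn.metrics import roc_auc_score
    from sklearn.model_selection import train_test_split

    rng = np.random.default_rng(0)
    n_pairs, n_metrics, lags = 500, 6, 2

    # Feature layout: [6 metrics at t-2 | 6 metrics at t-1] per node pair.
    X = rng.random((n_pairs, n_metrics * lags))
    # Toy "link exists at t" label, loosely tied to two of the features.
    y = (X[:, 0] + X[:, n_metrics] > 1.0).astype(int)

    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
    clf = SVC(probability=True, random_state=0).fit(X_tr, y_tr)
    auc = roc_auc_score(y_te, clf.predict_proba(X_te)[:, 1])
    print(f"ROC-AUC: {auc:.3f}")
    ```

    Extending to three, four, or five lags, as in the later study above, just widens the feature vector by another `n_metrics` columns per lag.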

    Artificial Neural Network (ANN) in a Small Dataset to determine Neutrality in the Pronunciation of English as a Foreign Language in Filipino Call Center Agents

    Artificial Neural Networks (ANNs) have continued to be efficient models for solving classification problems. In this paper, we explore the use of an ANN with a small dataset to accurately classify whether Filipino call center agents’ pronunciations are neutral or not based on their employer’s standards. Isolated utterances of the ten most commonly used words in the call center were recorded from eleven agents, creating a dataset of 110 utterances. Two learning specialists were consulted to establish ground truths, and Cohen’s Kappa was computed as 0.82, validating the reliability of the dataset. The first thirteen Mel-Frequency Cepstral Coefficients (MFCCs) were then extracted from each word, and an ANN was trained with ten-fold stratified cross-validation. Experimental results on the model recorded a classification accuracy of 89.60%, supported by an overall F-score of 0.92.
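    The inter-rater check above can be reproduced with scikit-learn's Cohen's kappa implementation. The labels below are toy stand-ins for the two specialists' neutral/non-neutral judgments, not the study's data:

    ```python
    from sklearn.metrics import cohen_kappa_score

    # Toy binary labels from two raters over ten utterances (1 = neutral).
    rater_a = [1, 1, 0, 1, 0, 1, 1, 0, 0, 1]
    rater_b = [1, 1, 0, 1, 0, 1, 0, 0, 0, 1]

    # Kappa corrects raw agreement (9/10 here) for chance agreement.
    kappa = cohen_kappa_score(rater_a, rater_b)
    print(f"Cohen's kappa: {kappa:.2f}")  # 0.80
    ```

    Values above roughly 0.8, as in the study's 0.82, are conventionally read as almost-perfect agreement, which is why the dataset was deemed reliable.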

    Troika Generative Adversarial Network (T-GAN): A Synthetic Image Generator That Improves Neural Network Training for Handwriting Classification

    Training an artificial neural network for handwriting classification requires a sufficiently sized annotated dataset in order to avoid overfitting. In the absence of sufficient instances, data augmentation techniques are normally considered. In this paper, we propose the troika generative adversarial network (T-GAN) for data augmentation to address the scarcity of publicly labeled handwriting datasets. T-GAN has three generator subnetworks architected with some weight-sharing in order to learn the joint distribution from three specific domains. We used T-GAN to augment the data from a subset of the IAM Handwriting Database. We then compared this with other data augmentation techniques by measuring the improvements each technique brought to the handwriting classification accuracies of three types of artificial neural networks (ANNs): a deep ANN, a convolutional neural network (CNN), and a deep CNN. The data augmentation technique involving the T-GAN yielded the highest accuracy improvements for each of the three ANN classifier types – outperforming the standard techniques of image rotation, affine transformation, and their combination – as well as the technique that uses another GAN-based model, the coupled GAN (CoGAN). Furthermore, a paired t-test between the 10-fold cross-validation results of the T-GAN and CoGAN, the second-best augmentation technique in this study, on a deep-CNN classifier confirmed the superiority of the data augmentation technique that uses the T-GAN. Finally, when the synthetic instances generated by the T-GAN were further enhanced using pepper noise removal and a median filter, the classification accuracies of the trained CNN and deep CNN classifiers improved further to 93.54% and 95.45%, respectively. Each is a substantial improvement over the respective original accuracies of 67.43% and 68.32% obtained when the two classifiers were trained on the unaugmented dataset.
    Thus, data augmentation using T-GAN – coupled with the two image noise removal techniques mentioned – can be a preferred pre-training technique for augmenting handwriting datasets with insufficient samples.
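    The median-filter enhancement step can be illustrated with a minimal NumPy sketch: a 3x3 median filter removes isolated pepper (black) pixels from a synthetic image. The window size and padding mode are assumptions; the paper's exact filter settings are not stated here:

    ```python
    import numpy as np

    def median_filter_3x3(img):
        """Apply a 3x3 median filter with edge padding."""
        padded = np.pad(img, 1, mode="edge")
        out = np.empty_like(img)
        h, w = img.shape
        for i in range(h):
            for j in range(w):
                out[i, j] = np.median(padded[i:i + 3, j:j + 3])
        return out

    # A white patch with a single pepper pixel: the filter restores it to white,
    # since the median of eight 255s and one 0 is 255.
    img = np.full((5, 5), 255, dtype=np.uint8)
    img[2, 2] = 0
    clean = median_filter_3x3(img)
    print(clean[2, 2])  # 255
    ```

    Applied to T-GAN outputs, this suppresses speckle artifacts from the generator before the synthetic images are added to the training set.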