13 research outputs found
Predictive Data Mining: Promising Future and Applications
Predictive analytics is the branch of data mining concerned with the prediction of future probabilities and trends. The central element of predictive analytics is the predictor, a variable that can be measured for an individual or other entity to predict future behavior. For example, an insurance company is likely to take into account potential driving safety predictors such as age, gender, and driving record when issuing car insurance policies. Multiple predictors are combined into a predictive model, which, when subjected to analysis, can be used to forecast future probabilities with an acceptable level of reliability. In predictive modeling, data is collected, a statistical model is formulated, predictions are made, and the model is validated (or revised) as additional data becomes available. Predictive analytics is applied to many research areas, including meteorology, security, genetics, economics, and marketing. In this paper, we present an extensive study of various predictive techniques, along with their future directions and applications in various areas.
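A minimal sketch of the predictive-modeling cycle described above (collect data, formulate a statistical model on the predictors, make predictions, validate). The driving-safety predictors and the synthetic data are illustrative assumptions, not from the paper.

```python
# Sketch of the predict-and-validate loop using synthetic driving-safety data.
# Predictor names (age, prior_claims) and the risk rule are hypothetical.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)
n = 500
age = rng.integers(18, 80, n)
prior_claims = rng.integers(0, 5, n)
# Synthetic target: younger drivers or drivers with many prior claims are riskier.
risk = ((age < 25) | (prior_claims > 2)).astype(int)

X = np.column_stack([age, prior_claims])
X_train, X_test, y_train, y_test = train_test_split(X, risk, random_state=0)

model = LogisticRegression().fit(X_train, y_train)          # formulate the model
preds = model.predict(X_test)                               # make predictions
print("held-out accuracy:", accuracy_score(y_test, preds))  # validate / revise
```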
A Novel PSO-FLANN Framework of Feature Selection and Classification for Microarray Data
Feature selection is a method of finding appropriate features from a given dataset. Over the last few years, a number of feature selection methods have been proposed to handle the curse of dimensionality in microarray data. The proposed framework uses two feature selection methods: principal component analysis (PCA) and factor analysis (FA). Microarray data typically contain a large number of genes measured under many conditions, so a good classifier is needed to classify the data. In this paper, particle swarm optimization (PSO) is used for classification because its parameters can be optimized for a given problem; in recent years, PSO has been used increasingly as a technique for solving complex problems. To classify the microarray datasets, a functional link artificial neural network (FLANN) is trained with PSO, which tunes the FLANN parameters. This PSO-FLANN classifier has been applied to three different microarray datasets, and the proposed PSO-FLANN model has also been compared with discriminant analysis (DA). Experiments on the three microarray datasets show that PSO-FLANN achieves more than 80% accuracy.
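A hedged sketch of the PCA + PSO-tuned FLANN idea: PCA reduces the feature dimensionality, a functional-link expansion (here a simple polynomial expansion) maps the reduced features, and a basic PSO loop tunes the output weights. The stand-in dataset, expansion order, and PSO settings are assumptions, not the paper's configuration.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)           # stand-in for microarray data
X = PCA(n_components=5).fit_transform(StandardScaler().fit_transform(X))
Xtr, Xte, ytr, yte = train_test_split(X, y, random_state=0)

def expand(X):
    # Functional-link expansion: original features, squared terms, and a bias.
    return np.hstack([X, X ** 2, np.ones((len(X), 1))])

def accuracy(w, X, y):
    return np.mean(((expand(X) @ w) > 0).astype(int) == y)

dim = expand(Xtr).shape[1]
n_particles, iters = 30, 100
rng = np.random.default_rng(0)
pos = rng.normal(size=(n_particles, dim))
vel = np.zeros_like(pos)
pbest = pos.copy()
pbest_fit = np.array([accuracy(p, Xtr, ytr) for p in pos])
gbest = pbest[pbest_fit.argmax()].copy()

for _ in range(iters):                                # standard PSO velocity/position update
    r1, r2 = rng.random((2, n_particles, dim))
    vel = 0.7 * vel + 1.5 * r1 * (pbest - pos) + 1.5 * r2 * (gbest - pos)
    pos += vel
    fit = np.array([accuracy(p, Xtr, ytr) for p in pos])
    improved = fit > pbest_fit
    pbest[improved], pbest_fit[improved] = pos[improved], fit[improved]
    gbest = pbest[pbest_fit.argmax()].copy()

print("test accuracy:", accuracy(gbest, Xte, yte))
```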
HRI in Indian Education: Challenges and Opportunities
With the recent advancements in the field of robotics and the increased focus on making general-purpose robots widely available to the general public, it has become increasingly necessary to pursue research into Human-robot interaction (HRI). While there have been many works discussing frameworks for teaching HRI in educational institutions, with a few institutions already offering courses to students, a consensus on the course content still eludes the field. In this work, we highlight a few challenges and opportunities in designing an HRI course from an Indian perspective. These topics warrant further deliberation as they have a direct impact on the design of HRI courses and wider implications for the entire field. Comment: Presented at the Designing an Intro to HRI Course Workshop at HRI 2024 (arXiv:2403.05588)
A Coding Theoretic Model for Error-detecting in DNA Sequences
A major problem in communication engineering is transmitting information from a source to a receiver over a noisy channel. To detect errors in the information digits, many error-detecting and error-correcting codes have been developed. The main aim of these codes is to encode the information digits and then decode them to detect and correct common transmission errors. These information-theoretic concepts help in studying information transmission in biological systems and extend the field of coding theory into the biological domain. At the cellular level, the information in DNA is translated into proteins. The sequence of bases Adenine (A), Thymine (T), Guanine (G), and Cytosine (C) in DNA may be considered a digital code that transmits genetic information. This paper investigates whether any form of error-detecting code exists in the DNA structure by encoding DNA sequences using the Hamming code.
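A hedged sketch of the coding-theoretic idea: map DNA bases to 2-bit symbols, encode 4-bit blocks with a systematic Hamming(7,4) code, and flag blocks whose syndrome is nonzero. The base-to-bit mapping is an illustrative assumption; the paper's exact encoding may differ.

```python
import numpy as np

BASE_BITS = {"A": (0, 0), "T": (0, 1), "G": (1, 0), "C": (1, 1)}  # assumed mapping

# Systematic Hamming(7,4) generator and parity-check matrices over GF(2).
G = np.array([[1, 0, 0, 0, 1, 1, 0],
              [0, 1, 0, 0, 1, 0, 1],
              [0, 0, 1, 0, 0, 1, 1],
              [0, 0, 0, 1, 1, 1, 1]])
H = np.array([[1, 1, 0, 1, 1, 0, 0],
              [1, 0, 1, 1, 0, 1, 0],
              [0, 1, 1, 1, 0, 0, 1]])

def encode(seq):
    """Encode a DNA string (even length) into Hamming(7,4) codewords."""
    bits = np.array([b for base in seq for b in BASE_BITS[base]])
    blocks = bits.reshape(-1, 4)                  # two bases -> one 4-bit block
    return (blocks @ G) % 2                       # one 7-bit codeword per block

def syndrome(codewords):
    """A nonzero syndrome row marks a codeword containing an error."""
    return (codewords @ H.T) % 2

cw = encode("ATGCGT")
cw[1, 3] ^= 1                                     # flip one bit to simulate a mutation
s = syndrome(cw)
print("blocks with detected errors:", np.where(s.any(axis=1))[0])
```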
Phylogenetic Tree Construction for DNA Sequences using Clustering Methods
A phylogenetic tree, or evolutionary tree, is a graph that shows the evolutionary relationships among biological species based on their genetic closeness. In the proposed model, individual samples are first selected and a matrix of the genetic distances between individuals is generated. Using this distance matrix, the samples are divided into clusters, and a phylogenetic tree is constructed independently for each cluster. To find the clustering algorithm that gives the most effective clusters for biological data, the k-means, k-medoids, and density-based algorithms are compared. The comparative study concludes that density-based clustering (DBSCAN) is well suited to biological datasets because it performs efficiently on low-dimensional data and is robust to outliers and noise points. A phylogenetic tree is formed for each individual cluster, and these trees are finally joined to create the final phylogenetic tree. The experimental evaluation shows that DBSCAN gives better results, conveys the appropriate information, and is faster than the other two methods.
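A hedged sketch of the pipeline described above: build a pairwise distance matrix over sequences, cluster it with DBSCAN using a precomputed metric, then build a hierarchical tree per cluster. The toy sequences, normalized Hamming distance, and DBSCAN parameters are assumptions for illustration.

```python
import numpy as np
from sklearn.cluster import DBSCAN
from scipy.cluster.hierarchy import linkage
from scipy.spatial.distance import squareform

seqs = ["ATGCAT", "ATGCAA", "ATGGAT", "TTTCGC", "TTTCGG", "TTACGG"]

def hamming(a, b):
    # Fraction of positions at which two equal-length sequences differ.
    return sum(x != y for x, y in zip(a, b)) / len(a)

n = len(seqs)
D = np.array([[hamming(seqs[i], seqs[j]) for j in range(n)] for i in range(n)])

labels = DBSCAN(eps=0.35, min_samples=2, metric="precomputed").fit_predict(D)
print("cluster labels:", labels)

for c in set(labels) - {-1}:                      # -1 marks DBSCAN noise points
    idx = np.where(labels == c)[0]
    sub = D[np.ix_(idx, idx)]
    tree = linkage(squareform(sub), method="average")   # per-cluster tree
    print(f"cluster {c} linkage:\n", tree)
```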
Performance assessment of Deep Learning procedures on Malaria dataset
Malaria detection is a time-consuming procedure; examination of a blood sample is currently the only practice that provides confirmation. Numerous computational methods have now been used to make it faster. The proposed model uses a Convolutional Neural Network (CNN) to reduce the time needed to identify malaria, applying different deep learning architectures to the same dataset to validate their stability. The model compares two CNN variants, a plain Sequential network and a ResNet, where the ResNet uses many more hidden layers than the Sequential model. The ResNet model achieved 96.50% accuracy on the training data, 96.78% on the validation data, and 97% on the testing data, while the Sequential model achieved 98% accuracy on the training data, 96% on the validation data, and 96% on the testing data. Contrary to the initial hypothesis, these results show no significant difference in accuracy when many more layers are used.
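A hedged sketch of the two CNN variants compared above: a small Sequential CNN and a ResNet50-based model, both set up for binary (parasitized vs. uninfected) cell-image classification. The input size, layer sizes, and training call are illustrative assumptions, not the paper's exact architecture.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

def sequential_cnn(input_shape=(64, 64, 3)):
    # Shallow baseline with two convolutional blocks.
    return models.Sequential([
        layers.Conv2D(32, 3, activation="relu", input_shape=input_shape),
        layers.MaxPooling2D(),
        layers.Conv2D(64, 3, activation="relu"),
        layers.MaxPooling2D(),
        layers.Flatten(),
        layers.Dense(64, activation="relu"),
        layers.Dense(1, activation="sigmoid"),    # parasitized vs. uninfected
    ])

def resnet_cnn(input_shape=(64, 64, 3)):
    # Much deeper residual backbone with a binary classification head.
    base = tf.keras.applications.ResNet50(include_top=False, weights=None,
                                          input_shape=input_shape, pooling="avg")
    return models.Sequential([base, layers.Dense(1, activation="sigmoid")])

for build in (sequential_cnn, resnet_cnn):
    model = build()
    model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
    model.summary()
    # model.fit(train_ds, validation_data=val_ds, epochs=10)  # e.g., blood-smear cell images
```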
Behavioral profiling for adaptive video summarization: From generalization to personalization
In today's world of managing multimedia content, the sheer amount of CCTV footage poses challenges related to storage, accessibility, and efficient navigation. To tackle these issues, we propose an encompassing technique for summarizing videos that merges machine-learning techniques with user engagement. Our methodology consists of two phases, each bringing improvements to video summarization. In Phase I, we introduce a method for summarizing videos based on keyframe detection and behavioral analysis, utilizing YOLOv5 for object recognition, Deep SORT for object tracking, and a Single Shot Detector (SSD) for creating video summaries. In Phase II, we present a user-interest-based video summarization system driven by machine learning. By incorporating user preferences into the summarization process, we enhance these techniques with personalized content curation. Leveraging tools such as NLTK, OpenCV, TensorFlow, and the EfficientDet model enables our system to generate customized video summaries tailored to individual preferences. This approach not only enhances user interaction but also efficiently handles the overwhelming amount of video data on digital platforms. By combining these two methodologies, we make progress in applying machine learning techniques while offering a solution to the complex challenges presented by managing multimedia data.
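A hedged sketch of the keyframe-detection step in Phase I: keep frames whose grayscale histogram differs strongly from the last kept frame, as a simple stand-in for the full YOLOv5 + Deep SORT behavioral pipeline. The input file name and distance threshold are illustrative assumptions.

```python
import cv2

def extract_keyframes(path, threshold=0.4):
    cap = cv2.VideoCapture(path)
    keyframes, prev_hist = [], None
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        hist = cv2.calcHist([gray], [0], None, [64], [0, 256])
        hist = cv2.normalize(hist, hist).flatten()
        # Large Bhattacharyya distance to the previous keyframe => scene change.
        if prev_hist is None or cv2.compareHist(prev_hist, hist,
                                                cv2.HISTCMP_BHATTACHARYYA) > threshold:
            keyframes.append(frame)
            prev_hist = hist
    cap.release()
    return keyframes

frames = extract_keyframes("cctv_clip.mp4")       # hypothetical input file
print(f"kept {len(frames)} keyframes")
```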
Machine Learning Styles for Diabetic Retinopathy Detection: A Review and Bibliometric Analysis
Diabetic retinopathy (DR) is a medical condition caused by diabetes. The development of retinopathy significantly depends on how long a person has had diabetes. Initially, there may be no symptoms or just a slight vision problem due to impairment of the retinal blood vessels. Later, it may lead to blindness. Recognizing the early clinical signs of DR is very important for intervening in and effectively treating DR. Thus, regular eye check-ups are necessary to direct the person to a doctor for a comprehensive ocular examination and treatment as soon as possible to avoid permanent vision loss. Nevertheless, due to limited resources, such manual screening is not feasible at scale. As a result, emerging technologies such as artificial intelligence, which enable automatic detection and classification of DR, offer alternative screening methodologies and thereby make the system cost-effective. Researchers have been working on artificial-intelligence-based technologies to detect and analyze DR in recent years. This study aimed to investigate the different machine learning styles chosen for diagnosing retinopathy. A bibliometric analysis was therefore systematically carried out to discover the different machine learning styles used for detecting diabetic retinopathy. The data were exported from popular databases, namely Web of Science (WoS) and Scopus, and analyzed using Biblioshiny and VOSviewer in terms of publications, top countries, sources, subject area, top authors, trend topics, co-occurrences, thematic evolution, factorial map, citation analysis, etc., which form the base for researchers to identify the research gaps in diabetic retinopathy detection and classification.
Improving the accuracy of ensemble machine learning classification models using a novel bit-fusion algorithm for healthcare AI systems
Healthcare AI systems exclusively employ classification models for disease detection. However, recent research in this arena has observed that single classification models achieve limited accuracy in some cases. Fusing the outputs of multiple classifiers into a single classification framework has been instrumental in achieving greater accuracy and performing automated big-data analysis. The article proposes a bit-fusion ensemble algorithm that minimizes the classification error rate and has been tested on various datasets. Five diversified base classifiers are used in the implementation model: k-nearest neighbor (KNN), Support Vector Machine (SVM), Multi-Layer Perceptron (MLP), Decision Tree (DT), and Naïve Bayes (NB). The bit-fusion algorithm works on the individual outputs of these classifiers: the output of each base classifier is treated as a soft class vector (CV), and these vectors are weighted and transformed into binary bits by comparing them against a high reliability threshold, initialized to δ = 0.9. Binary patterns are extracted, and the model is trained and tested again. The standard fusion approach and the proposed bit-fusion algorithm have been compared by average error rate. The bit-fusion algorithm achieved error rates of 5.97, 12.6, 4.64, 0, 0, and 27.28 for Leukemia, Breast Cancer, Lung Cancer, Hepatitis, Lymphoma, and Embryonal Tumors, respectively. The model has also been trained and tested on datasets from the UCI, UEA, and UCR repositories, which likewise showed a reduction in error rates.
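A hedged sketch of the bit-fusion idea as described above: each base classifier's soft class vector is thresholded at the high reliability value δ = 0.9 to form binary bits, and a second-stage model is trained on the resulting bit patterns. This is one interpretation of the description, not the paper's exact algorithm; the stand-in dataset and the choice of meta-classifier are assumptions.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.neural_network import MLPClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

X, y = load_breast_cancer(return_X_y=True)
Xtr, Xte, ytr, yte = train_test_split(X, y, random_state=0)

bases = [KNeighborsClassifier(), SVC(probability=True), MLPClassifier(max_iter=1000),
         DecisionTreeClassifier(), GaussianNB()]
delta = 0.9                                        # high-reliability threshold

def bit_patterns(models, X):
    # Concatenate each classifier's soft class vector, thresholded into bits.
    probs = [m.predict_proba(X) for m in models]
    return np.hstack([(p >= delta).astype(int) for p in probs])

for m in bases:
    m.fit(Xtr, ytr)

fusion = LogisticRegression().fit(bit_patterns(bases, Xtr), ytr)   # second stage
pred = fusion.predict(bit_patterns(bases, Xte))
print("fused test error rate (%):", 100 * (1 - accuracy_score(yte, pred)))
```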