Attribute Selection for Classification
The selection of attributes used to construct a classification model is crucial in machine learning, particularly for instance-similarity methods. We present a new algorithm that selects and ranks attributes by weighting features according to their ability to help class prediction. The algorithm uses the same structure that holds the training records for classification. Attribute values and their classes are projected into a one-dimensional space to account for varying degrees of the relationship between them. With the user deciding on the degree of this relation, any of several potential solutions can serve as the criterion for determining attribute relevance. This low-complexity algorithm increases classification accuracy and also helps to reduce the feature-dimensionality problem.
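The abstract's projection-based relevance criterion can be sketched in Python. The scoring rule below (spread of the class-conditional means divided by the pooled standard deviation) is a hypothetical stand-in for the paper's one-dimensional projection measure, not the authors' actual formula:

```python
import statistics

def rank_attributes(records, labels):
    """Rank attributes by a simple 1-D separability score.

    Hypothetical criterion: spread of class-conditional means
    over the pooled standard deviation of the attribute column.
    A higher score means the attribute helps class prediction more.
    """
    n_attrs = len(records[0])
    scores = []
    for j in range(n_attrs):
        column = [r[j] for r in records]
        by_class = {}
        for value, label in zip(column, labels):
            by_class.setdefault(label, []).append(value)
        class_means = [statistics.mean(v) for v in by_class.values()]
        pooled_sd = statistics.pstdev(column) or 1e-9
        spread = max(class_means) - min(class_means)
        scores.append((spread / pooled_sd, j))
    # Return attribute indices, most relevant first
    return [j for _, j in sorted(scores, reverse=True)]
```

On a toy data set where attribute 0 separates the classes and attribute 1 does not, the ranking places attribute 0 first.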
A Social Network Image Classification Algorithm Based on Multimodal Deep Learning
The complex data structures and massive image volumes of social networks pose a huge challenge to mining the associations between social information. For accurate classification of social network images, this paper proposes a social network image classification algorithm based on multimodal deep learning. First, a social network association clustering model (SNACM) was established and used to calculate trust and similarity, which represent the degree of similarity between users. Based on an artificial ant colony algorithm, the SNACM was subjected to weighted stacking, and the social network image association network was constructed. After that, social network images of three modalities, i.e. the RGB (red-green-blue) image, the grayscale image, and the depth image, were fused. Finally, a three-dimensional neural network (3D NN) was constructed to extract the features of the multimodal social network image. The proposed algorithm was shown to be valid and accurate in experiments. The research results provide a reference for applying multimodal deep learning to image classification in other fields.
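As a rough illustration of the fusion step, the three modalities can be stacked into one multi-channel array before feature extraction. The channel layout below (3 RGB channels + 1 grayscale + 1 depth) is an assumed scheme for illustration, not the paper's actual fusion method:

```python
import numpy as np

def fuse_modalities(rgb, gray, depth):
    """Stack three image modes into a single multi-channel volume
    that a downstream 3D network could consume.

    Assumed layout (hypothetical): channels = 3 (RGB) + 1 (gray) + 1 (depth).
    rgb:   (H, W, 3) array
    gray:  (H, W) array
    depth: (H, W) array
    """
    assert rgb.shape[:2] == gray.shape == depth.shape
    # Add a trailing channel axis to the single-channel modes, then concatenate
    return np.concatenate([rgb, gray[..., None], depth[..., None]], axis=-1)
```

The fused (H, W, 5) volume then serves as the input from which multimodal features are extracted.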
Improving ICD-based semantic similarity by accounting for varying degrees of comorbidity
Finding similar patients is a common objective in precision medicine,
facilitating treatment outcome assessment and clinical decision support.
Choosing widely-available patient features and appropriate mathematical methods
for similarity calculations is crucial. International Statistical
Classification of Diseases and Related Health Problems (ICD) codes are used
worldwide to encode diseases and are available for nearly all patients.
Aggregated as sets consisting of primary and secondary diagnoses they can
display a degree of comorbidity and reveal comorbidity patterns. It is possible
to compute the similarity of patients based on their ICD codes by using
semantic similarity algorithms. These algorithms have traditionally been
evaluated using a single-term, expert-rated data set.
However, real-world patient data often display varying degrees of documented
comorbidities that might impair algorithm performance. To account for this, we
present a scale term that considers documented comorbidity-variance. In this
work, we compared the performance of 80 combinations of established algorithms
in terms of semantic similarity based on ICD-code sets. The sets have been
extracted from patients with a C25.X (pancreatic cancer) primary diagnosis and
provide a variety of different combinations of ICD-codes. Using our scale term
we yielded the best results with a combination of level-based information
content, Leacock & Chodorow concept similarity and bipartite graph matching for
the set similarities reaching a correlation of 0.75 with our expert's ground
truth. Our results highlight the importance of accounting for comorbidity
variance while demonstrating how well current semantic similarity algorithms
perform.
Comment: 11 pages, 6 figures, 1 table
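A minimal sketch of the set-similarity idea, assuming a toy ICD-like taxonomy: Leacock & Chodorow concept similarity over the code hierarchy, with a greedy one-to-one matching as a simplified stand-in for the paper's bipartite graph matching (the level-based information content term is omitted here):

```python
import math

# Toy ICD-like taxonomy (hypothetical): child code -> parent code
PARENT = {"C25.0": "C25", "C25.1": "C25", "C25": "C",
          "E11": "E", "C": "ROOT", "E": "ROOT"}

def path_to_root(code):
    """List of codes from `code` up to the taxonomy root."""
    path = [code]
    while path[-1] != "ROOT":
        path.append(PARENT[path[-1]])
    return path

def lch_similarity(a, b, max_depth=3):
    """Leacock & Chodorow: -log(path_length / (2 * max_depth)),
    with path length counted in nodes via the deepest shared ancestor."""
    pa, pb = path_to_root(a), path_to_root(b)
    common = set(pa) & set(pb)
    dist = min(pa.index(c) + pb.index(c) for c in common) + 1
    return -math.log(dist / (2 * max_depth))

def set_similarity(codes_a, codes_b):
    """Similarity of two ICD-code sets via greedy one-to-one matching
    (a simplified stand-in for bipartite graph matching)."""
    pairs = sorted(((lch_similarity(a, b), a, b)
                    for a in codes_a for b in codes_b), reverse=True)
    used_a, used_b, total = set(), set(), 0.0
    for score, a, b in pairs:
        if a not in used_a and b not in used_b:
            used_a.add(a)
            used_b.add(b)
            total += score
    return total / max(len(codes_a), len(codes_b))
```

Sibling pancreatic-cancer codes (C25.0, C25.1) score higher than codes from an unrelated chapter, and identical code sets score highest.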
Clustering Mining Algorithm of Internet of Things Database Based on Python Language
In order to solve the problem of reading delay in data mining of Internet of Things (IoT) databases, a clustering mining algorithm for IoT databases based on the Python language is proposed. We designed an improved crawler in Python on the open-source Scrapy framework, judged the topical similarity of recruitment data in the IoT database with a Bayesian classifier, and crawled the recruitment data in the IoT database, extracting the number of keywords in the text space, the degree of keyword extraction, and the amount of keyword data in the text space. A time series model is used to eliminate the delay of text features. On this basis, a semi-supervised, semi-clustering analysis method is used to construct the corresponding classifier, complete the adaptive classification of the text data stream, and realize clustering mining of the IoT database in Python. The experimental results show that this method has a low reading delay and can mine the attention, number of posts, and click-time frequency of the IoT database from which the recruitment data are obtained.
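The Bayesian topic filter mentioned above can be sketched as a minimal multinomial naive Bayes classifier with add-one smoothing; the class labels and any training texts used with it are invented for illustration, not taken from the paper:

```python
import math
from collections import Counter

class NaiveBayes:
    """Minimal multinomial naive Bayes with add-one (Laplace) smoothing,
    sketching the Bayesian topic-similarity judgment described above."""

    def fit(self, docs, labels):
        self.counts = {}            # label -> Counter of word frequencies
        self.priors = Counter(labels)
        self.vocab = set()
        for doc, label in zip(docs, labels):
            words = doc.split()
            self.counts.setdefault(label, Counter()).update(words)
            self.vocab.update(words)
        return self

    def predict(self, doc):
        def log_prob(label):
            c = self.counts[label]
            total = sum(c.values())
            lp = math.log(self.priors[label] / sum(self.priors.values()))
            for w in doc.split():
                # Add-one smoothing keeps unseen words from zeroing the score
                lp += math.log((c[w] + 1) / (total + len(self.vocab)))
            return lp
        return max(self.counts, key=log_prob)
```

Trained on a handful of recruitment and non-recruitment snippets, the classifier routes recruitment-like text to the "recruit" topic.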
Incremental learning of independent, overlapping, and graded concept descriptions with an instance-based process framework
Supervised learning algorithms make several simplifying assumptions concerning the characteristics of the concept descriptions to be learned. For example, concepts are often assumed to be (1) defined with respect to the same set of relevant attributes, (2) disjoint in instance space, and (3) characterized by uniform instance distributions. While these assumptions constrain the learning task, they unfortunately limit an algorithm's applicability. We believe that supervised learning algorithms should learn attribute relevancies independently for each concept, allow instances to be members of any subset of concepts, and represent graded concept descriptions. This paper introduces a process framework for instance-based learning algorithms that exploit only specific instance and performance feedback information to guide their concept learning processes. We also introduce Bloom, a specific instantiation of this framework. Bloom is a supervised, incremental, instance-based learning algorithm that learns relative attribute relevancies independently for each concept, allows instances to be members of any subset of concepts, and represents graded concept memberships. We describe empirical evidence to support our claims that Bloom can learn independent, overlapping, and graded concept descriptions.
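A simplified sketch of how per-concept attribute weights yield graded, overlapping memberships. The weight-learning and performance-feedback components of Bloom are omitted, and the exemplar-similarity function below is a hypothetical choice, not the paper's:

```python
def concept_memberships(instance, exemplars, weights):
    """Graded membership in each concept via similarity to the nearest
    stored exemplar, using that concept's own attribute weights.

    exemplars: concept -> list of stored instances (tuples of numbers)
    weights:   concept -> per-attribute weight list for that concept
    Returns a membership score in (0, 1] per concept; an instance can
    score highly for several concepts at once (overlapping concepts).
    """
    memberships = {}
    for concept, stored in exemplars.items():
        w = weights[concept]
        best = 0.0
        for ex in stored:
            # Weighted Euclidean distance: irrelevant attributes
            # (weight 0) are ignored for this concept
            dist = sum(wi * (a - b) ** 2
                       for wi, a, b in zip(w, instance, ex)) ** 0.5
            best = max(best, 1.0 / (1.0 + dist))
        memberships[concept] = best
    return memberships
```

Because each concept carries its own weights, an instance that matches concept A on A's relevant attribute can score highly for A regardless of attributes that only matter to concept B.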