DATA-DRIVEN APPROACH TO IMAGE CLASSIFICATION
Image classification has been a core topic in the computer vision community. Its recent success with convolutional neural networks (CNNs) has led to various real-world applications such as large-scale management of photos and videos on cloud and social-media platforms, image-based search for online retailers, self-driving cars, robotics and healthcare. Image classification can be broadly categorized into binary, multi-class and multi-label classification problems. Binary classification involves assigning one of two class labels to an instance. In a multi-class classification problem, an instance is categorized into exactly one of more than two classes. Multi-label classification generalizes the multi-class problem: each image is assigned multiple labels as opposed to a single label.
In this work, we first present various methods that take advantage of deep representations (the fully connected layer of a CNN pre-trained on the ImageNet dataset) and yield better performance on multi-label classification than methods that use over a dozen conventional visual features. Following the success of deep representations, we intend to build a generic end-to-end deep learning framework to address all three problem categories of image classification. However, there are still no well-established guidelines for building an efficient deep neural network (in terms of the number of layers, the number and size of kernels, the type of regularizer, the choice of non-linear function, etc.), and network architecture design is often specific to a problem or dataset. Hence, we present some initial efforts in building a completely data-driven computational framework called Deep Decision Network (DDN). DDN is a tree-like structure built stage-wise. During the learning phase, starting from the root network node, DDN automatically builds a network that splits the data into disjoint clusters of classes, which are then handled by subsequent expert networks. This results in a tree-structured network driven by the data. The proposed approach provides insight into the data by identifying the groups of classes that are hard to classify and require more attention than others. This feature is crucial for people trying to solve the problem with little or no domain knowledge, especially for applications in the medical domain. Initially, we evaluate DDN on a binary classification problem and later extend it to the more challenging multi-class and multi-label classification problems. The extensions of DDN to multi-class and multi-label classification involve some changes, but they still operate under the same underlying principle.
In all three cases, the proposed approach is tested for its recognition performance and scalability on publicly available datasets, with comparisons to other methods.
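The three problem categories described in this abstract differ mainly in the shape of the target attached to each image. The following is a minimal sketch of that difference, not code from the thesis; the class names and the helper functions `one_hot` and `multi_hot` are hypothetical and chosen only for illustration.

```python
def one_hot(label, classes):
    """Multi-class target: exactly one entry is 1."""
    if label not in classes:
        raise ValueError(f"unknown class: {label}")
    return [1 if c == label else 0 for c in classes]


def multi_hot(labels, classes):
    """Multi-label target: any subset of entries may be 1."""
    active = set(labels)
    unknown = active - set(classes)
    if unknown:
        raise ValueError(f"unknown classes: {sorted(unknown)}")
    return [1 if c in active else 0 for c in classes]


# Hypothetical label vocabulary, used only for this illustration.
CLASSES = ["person", "dog", "car"]

binary_target = 1                                           # binary: one of two labels
multi_class_target = one_hot("dog", CLASSES)                # one class out of many
multi_label_target = multi_hot(["person", "car"], CLASSES)  # several labels at once
```

A multi-class target always sums to one, whereas a multi-label target may have any number of active entries, which is why the abstract treats multi-label classification as the more general problem.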
The correlation between students’ habit in watching English movie and learning style toward listening comprehension
The principal purpose of this study was to determine (1) the correlation between students’ habit of watching English movies (X1) and listening comprehension (Y), (2) the correlation between learning style (X2) and listening comprehension (Y), and (3) the combined correlation of students’ habit of watching English movies and learning style with listening comprehension. The study was carried out with fourth-semester students of IAIN Palangka Raya.
A quantitative method with a correlational design was used to answer the research problem. The population consisted of the fourth-semester students at IAIN Palangka Raya in the 2018/2019 academic year, 57 students in total. The whole population was used as the sample. A test and a questionnaire were used to collect the data. After the data were collected, the researcher scored and analyzed the test results.
The data were analyzed using the Pearson product-moment correlation. The results showed that the significance value for variables X1, X2 and Y was higher than alpha (0.410 > 0.05). It can be concluded that there was no positive correlation between students’ habit of watching English movies and learning style toward listening comprehension. The null hypothesis (Ho) was accepted and the alternative hypothesis (Ha) was rejected.
ABSTRACT
The aim of this research was to find (1) the relationship between students’ habit of watching English-language films (X1) and listening comprehension (Y), (2) the relationship between learning style (X2) and listening comprehension (Y), and (3) the correlation between students’ habit of watching English-language films and learning style toward listening comprehension. The research was conducted with fourth-semester students at IAIN Palangkaraya.
In this research, the researcher used a quantitative method with a correlational study to answer the research problem. The population consisted of fourth-semester students at IAIN Palangkaraya in the 2018/2019 academic year, 57 students in total. The sample was the total population. A test and a questionnaire were used to collect the data. The researcher then scored and analyzed the test results.
The data were analyzed using the Pearson Product Moment Correlation. The results showed that the significance value between variables X1, X2 and Y was higher than the alpha value (0.410 > 0.05). It can be concluded that there was no positive relationship between the habit of watching English-language films and learning style toward students’ listening comprehension. The null hypothesis (Ho) was accepted and the alternative hypothesis (Ha) was rejected.
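The analysis in this abstract rests on the Pearson product-moment correlation between two score lists. The sketch below shows how the coefficient r and its t statistic (with n - 2 degrees of freedom) can be computed. The function names and the sample scores are hypothetical; the study's actual data are not given in the abstract.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need two lists of equal length with at least 3 values")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant variable")
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r, n):
    """t statistic for testing r = 0, with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1 - r * r))


# Hypothetical questionnaire (X1) and listening-test (Y) scores.
habit_scores = [62, 70, 55, 81, 66, 74]
listening_scores = [58, 64, 71, 60, 69, 63]

r = pearson_r(habit_scores, listening_scores)
t = t_statistic(r, len(habit_scores))
```

The null hypothesis of no correlation is retained when the p-value derived from t exceeds the chosen alpha, which is the situation the abstract reports (0.410 against 0.05).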
Spectral Geometric Methods for Deformable 3D Shape Retrieval
As 3D applications ranging from medical imaging to industrial design continue to grow, so does the importance of developing robust 3D shape retrieval systems. A key issue in developing an accurate shape retrieval algorithm is to design an efficient shape descriptor for which an index can be built, and similarity queries can be answered efficiently. While the overwhelming majority of prior work on 3D shape analysis has concentrated primarily on rigid shape retrieval, many real objects such as articulated motions of humans are nonrigid and hence can exhibit a variety of poses and deformations. In this thesis, we present novel spectral geometric methods for analyzing and distinguishing between deformable 3D shapes. First, we comprehensively review recent shape descriptors based on the spectral decomposition of the Laplace-Beltrami operator, which provides a rich set of eigenbases that are invariant to intrinsic isometries. Then we provide a general and flexible framework for the analysis and design of shape signatures from the spectral graph wavelet perspective. In a bid to capture the global and local geometry, we propose a multiresolution shape signature based on a cubic spline wavelet generating kernel. This signature delivers best-in-class shape retrieval performance. Second, we investigate the ambiguity modeling of codebook for the densely distributed low-level shape descriptors. Inspired by the ability of spatial cues to improve discrimination between shapes, we also propose to adopt the isocontours of the second eigenfunction of the Laplace-Beltrami operator to perform surface partition, which can significantly ameliorate the retrieval performance of the time-scaled local descriptors. To further enhance the shape retrieval accuracy, we introduce an intrinsic spatial pyramid matching approach. Extensive experiments are carried out on two 3D shape benchmarks to assess the performance of the proposed spectral geometric approaches in comparison with state-of-the-art methods.
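For orientation, many spectral shape descriptors of the kind reviewed in this abstract share one generic form: a filter $g$ applied to the eigenvalues $\lambda_i$ of the Laplace-Beltrami operator and weighted by the squared eigenfunctions $\phi_i$ at a surface point $x$. The expression below is a standard textbook form given as an assumption-labeled illustration. The heat kernel signature is one well-known instance; the thesis's cubic spline wavelet kernel is not specified in the abstract and is not reproduced here.

```latex
s_g(x) \;=\; \sum_{i \ge 1} g(\lambda_i)\, \phi_i(x)^2,
\qquad
\text{heat kernel signature: } g(\lambda) = e^{-t\lambda}, \quad t > 0 .
```

Because the eigenvalues and eigenfunctions depend only on the intrinsic geometry of the surface, any signature of this form inherits the invariance to isometric deformations mentioned in the abstract.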
High-dimensional Sparse Count Data Clustering Using Finite Mixture Models
Due to the massive amount of available digital data, automating its analysis and modeling for different purposes and applications has become an urgent need. One of the most challenging tasks in machine learning is clustering, which is defined as the process of assigning observations sharing similar characteristics to subgroups. Such a task is significant, especially in implementing complex algorithms to deal with high-dimensional data. Thus, the advancement of computational power in statistical-based approaches is increasingly becoming an interesting and attractive research domain. Among the successful methods, mixture models have been widely acknowledged and successfully applied in numerous fields, as they provide a convenient yet flexible formal setting for unsupervised and semi-supervised learning. An essential problem with these approaches is to develop a probabilistic model that represents the data well by taking into account its nature. Count data are widely used in machine learning and computer vision applications where an object, e.g., a text document or an image, can be represented by a vector corresponding to the appearance frequencies of words or visual words, respectively. Thus, they usually suffer from the well-known curse of dimensionality, as objects are represented with high-dimensional and sparse vectors, i.e., a few thousand dimensions with a sparsity of 95 to 99%, which degrades the performance of clustering algorithms dramatically. Moreover, count data systematically exhibit the burstiness and overdispersion phenomena, neither of which can be handled by a generic multinomial distribution, typically used to model count data, due to its independence assumption.
This thesis is constructed around six related manuscripts, in which we propose several approaches for high-dimensional sparse count data clustering via various mixture models based on hierarchical Bayesian modeling frameworks that have the ability to model the dependency of repetitive word occurrences. In such frameworks, a suitable distribution is used to introduce the prior information into the construction of the statistical model, based on a conjugate distribution to the multinomial, e.g., the Dirichlet, generalized Dirichlet, and Beta-Liouville, which has numerous computational advantages. We also propose a novel model, which we call the Multinomial Scaled Dirichlet (MSD), that uses the scaled Dirichlet as a prior to the multinomial to allow more modeling flexibility. Although these frameworks can model burstiness and overdispersion well, they share a disadvantage: their estimation procedures are very inefficient when the collection size is large. To handle high dimensionality, we considered two approaches. First, we derived close approximations to the distributions in a hierarchical structure to bring them to the exponential-family form, aiming to combine the flexibility and efficiency of these models with the desirable statistical and computational properties of the exponential family of distributions, including sufficiency, which reduces the complexity and computational effort, especially for sparse and high-dimensional data. Second, we proposed a model-based unsupervised feature selection approach for count data to overcome several issues that may be caused by the high dimensionality of the feature space, such as over-fitting, low efficiency, and poor performance.
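To make the burstiness argument concrete, the sketch below compares a plain multinomial with the Dirichlet compound multinomial, i.e. a multinomial whose parameters are integrated out under a Dirichlet prior. This is the simplest member of the family of hierarchical models the abstract mentions, not the thesis's MSD model, and the toy counts and parameter values are assumptions chosen for illustration.

```python
import math


def log_multinomial(counts, probs):
    """Log-probability of a count vector under a multinomial."""
    n = sum(counts)
    out = math.lgamma(n + 1)
    for x, p in zip(counts, probs):
        out -= math.lgamma(x + 1)
        if x > 0:
            out += x * math.log(p)
    return out


def log_dirichlet_multinomial(counts, alphas):
    """Log-probability under a multinomial with a Dirichlet(alphas) prior
    integrated out (Dirichlet compound multinomial)."""
    n = sum(counts)
    a = sum(alphas)
    out = math.lgamma(n + 1) + math.lgamma(a) - math.lgamma(n + a)
    for x, alpha in zip(counts, alphas):
        out += math.lgamma(x + alpha) - math.lgamma(alpha) - math.lgamma(x + 1)
    return out


# Toy two-word vocabulary: a "bursty" document repeats one word four times,
# a "spread" document uses each word twice.
bursty, spread = [4, 0], [2, 2]
uniform_probs = [0.5, 0.5]
small_alphas = [0.1, 0.1]  # small alphas favour repeated words

multinomial_gap = log_multinomial(bursty, uniform_probs) - log_multinomial(spread, uniform_probs)
compound_gap = (log_dirichlet_multinomial(bursty, small_alphas)
                - log_dirichlet_multinomial(spread, small_alphas))
```

Under the multinomial the bursty document is less likely than the spread one (`multinomial_gap` is negative), while under the compound model with small alphas it is more likely (`compound_gap` is positive). This is the sense in which a Dirichlet-type prior lets the model account for repeated word occurrences.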
Furthermore, we handled two significant aspects of mixture-based clustering methods, namely, parameter estimation and model selection. We considered the Expectation-Maximization (EM) algorithm, which is a broadly applicable iterative algorithm for estimating the mixture model parameters, and incorporated several techniques to reduce its dependency on initialization and avoid poor local maxima. For model selection, we investigated different approaches to find the optimal number of components based on the Minimum Message Length (MML) philosophy. The effectiveness of our approaches is evaluated using challenging real-life applications, such as sentiment analysis, hate speech detection on Twitter, topic novelty detection, human interaction recognition in films and TV shows, facial expression recognition, face identification, and age estimation.
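The EM procedure mentioned above can be sketched for the simplest case, a finite mixture of plain multinomials. This is an illustrative assumption rather than the thesis's models, which use richer priors; the function name, the smoothing constant and the toy count vectors are all hypothetical.

```python
import math


def em_multinomial_mixture(docs, init_probs, n_iter=50, smooth=1e-3):
    """EM for a mixture of multinomials over count vectors.

    docs: list of count vectors; init_probs: one initial word distribution
    per component. Returns (weights, probs, responsibilities).
    """
    k, d = len(init_probs), len(docs[0])
    weights = [1.0 / k] * k
    probs = [row[:] for row in init_probs]
    resp = []
    for _ in range(n_iter):
        # E-step: posterior probability of each component for each document.
        resp = []
        for x in docs:
            logp = [
                math.log(weights[j])
                + sum(c * math.log(probs[j][v]) for v, c in enumerate(x) if c)
                for j in range(k)
            ]
            top = max(logp)
            unnorm = [math.exp(lp - top) for lp in logp]
            total = sum(unnorm)
            resp.append([u / total for u in unnorm])
        # M-step: re-estimate weights and word distributions (with smoothing,
        # so that no weight or probability becomes exactly zero).
        for j in range(k):
            nj = sum(r[j] for r in resp)
            weights[j] = (nj + smooth) / (len(docs) + k * smooth)
            totals = [
                smooth + sum(r[j] * x[v] for r, x in zip(resp, docs))
                for v in range(d)
            ]
            norm = sum(totals)
            probs[j] = [t / norm for t in totals]
    return weights, probs, resp


# Toy count vectors over a four-word vocabulary, with a fixed initialization.
docs = [[5, 0, 0, 1], [4, 1, 0, 0], [0, 0, 5, 1], [0, 1, 4, 0]]
init = [[0.4, 0.2, 0.2, 0.2], [0.2, 0.2, 0.4, 0.2]]
weights, probs, resp = em_multinomial_mixture(docs, init)
```

The result depends on `init`, which illustrates the initialization dependency the abstract refers to. Choosing the number of components, which the thesis handles with MML-based criteria, is outside the scope of this sketch.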
IST Austria Thesis
The human ability to recognize objects in complex scenes has driven research in the computer vision field over the past couple of decades. This thesis focuses on the object recognition task in images: given an image, we want the computer system to predict the class of the object that appears in it. A recent successful attempt to bridge the semantic understanding of an image as perceived by humans and by computers uses attribute-based models. Attributes are semantic properties of objects that are shared across different categories and that both humans and computers can decide on. To explore attribute-based models we take a statistical machine learning approach and address two key learning challenges for the object recognition task: learning augmented attributes as a mid-level discriminative feature representation, and learning with attributes as privileged information. Our main contributions are parametric and non-parametric models and algorithms that address these challenges. In the parametric approach, we explore an autoencoder model combined with the large margin nearest neighbor principle for mid-level feature learning, and linear support vector machines for learning with privileged information. In the non-parametric approach, we propose a supervised Indian Buffet Process for automatic augmentation of semantic attributes, and explore the Gaussian Processes classification framework for learning with privileged information. A thorough experimental analysis shows the effectiveness of the proposed models in both the parametric and non-parametric views.
Learning Taxonomy Adaptation in Large-scale Classification
Domain knowledge, uncertainty, and parameter constraints
Ph.D. Committee Chair: Guy Lebanon; Committee Member: Alex Shapiro; Committee Member: Alexander Gray; Committee Member: Chin-Hui Lee; Committee Member: Hongyuan Zh
Automatic inference of causal reasoning chains from student essays
While there has been an increasing focus on higher-level thinking skills arising from the Common Core Standards, many high-school and middle-school students struggle to combine and integrate information from multiple sources when writing essays. Writing is an important learning skill, and there is increasing evidence that writing about a topic develops a deeper understanding in the student. However, grading essays is time consuming for teachers, resulting in an increasing focus on shallower forms of assessment that are easier to automate, such as multiple-choice tests. Existing essay grading software has attempted to ease this burden but relies on shallow lexico-syntactic features and is unable to understand the structure or validity of a student’s arguments or explanations. Without the ability to understand a student’s reasoning processes, it is impossible to write automated formative assessment systems to assist students with improving their thinking skills through essay writing.
In order to understand the arguments put forth in an explanatory essay in the science domain, we need a method of representing the causal structure of a piece of explanatory text. Psychologists use a representation called a causal model to represent a student's understanding of an explanatory text. This consists of a number of core concepts, and a set of causal relations linking them into one or more causal chains, forming a causal model. In this thesis I present a novel system for automatically constructing causal models from student scientific essays using Natural Language Processing (NLP) techniques.
The problem was decomposed into 4 sub-problems: assigning essay concepts to words, detecting causal relations between these concepts, resolving coreferences within each essay, and using the structure of the whole essay to reconstruct a causal model. Solutions to each of these sub-problems build upon the predictions from the solutions to earlier problems, forming a sequential pipeline of models. Designing a system in this way allows later models to correct for false positive predictions from upstream models. However, this also has the disadvantage that errors made in earlier models can propagate through the system, negatively impacting the downstream models and limiting their accuracy. Producing robust solutions for the initial 2 sub-problems, detecting concepts and parsing causal relations between them, was critical in building a robust system.
A number of sequence labeling models were trained to classify the concepts associated with each word, with the most effective approach being a bidirectional recurrent neural network (RNN), a deep learning model commonly applied to word labeling problems. This is because the RNN used pre-trained word embeddings to better generalize to rarer words, and was able to use information from both ends of each sentence to infer a word's concept. The concepts predicted by this model were then used to develop causal relation parsing models for detecting causal connections between these concepts. A shift-reduce dependency parsing model was trained using the SEARN algorithm and out-performed a number of other approaches by better utilizing the structure of the problem and directly optimizing the error metric used.
Two pre-trained coreference resolution systems were used to resolve coreferences within the essays. However, a word tagging model trained to predict anaphors, combined with a heuristic for determining the antecedent, out-performed these two systems. Finally, a model was developed for parsing a causal model from an entire essay, utilizing the solutions to the three previous problems. A beam search algorithm was used to produce multiple parses for each sentence, which in turn were combined to generate multiple candidate causal models for each student essay. A reranking algorithm was then used to select the optimal causal model from all of the generated candidates.
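The beam search step described above keeps several high-scoring partial parses alive instead of committing to a single one. The sketch below is a generic beam search over independent per-step scores, which is a simplifying assumption; the thesis's parser, its action set and its scoring model are not described in the abstract, and the labels used here are hypothetical.

```python
import math


def beam_search(step_log_probs, beam_width):
    """Return the `beam_width` best label sequences and their total log-scores.

    step_log_probs: one dict per step, mapping a label to its log-probability.
    """
    beams = [((), 0.0)]  # (sequence so far, accumulated log-probability)
    for options in step_log_probs:
        candidates = [
            (seq + (label,), score + lp)
            for seq, score in beams
            for label, lp in options.items()
        ]
        # Keep only the highest-scoring partial sequences.
        candidates.sort(key=lambda item: item[1], reverse=True)
        beams = candidates[:beam_width]
    return beams


# Hypothetical two-step decision with made-up probabilities.
steps = [
    {"shift": math.log(0.6), "reduce": math.log(0.4)},
    {"causes": math.log(0.7), "none": math.log(0.3)},
]
top_two = beam_search(steps, beam_width=2)
```

A wider beam returns more candidates for a later reranking stage to choose from, at the cost of more computation and, typically, more low-precision candidates.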
An important contribution of this work is that it represents a system for parsing a complete causal model of a scientific essay from a student's written answer. Existing systems have been developed to parse individual causal relations, but no existing system attempts to parse a sequence of linked causal relations forming a causal model from an explanatory scientific essay. It is hoped that this work can lead to the development of more robust essay grading software and formative assessment tools, and can be extended to build solutions for extracting causality from text in other domains. In addition, I also present 2 novel approaches for optimizing the micro-F1 score within the design of two of the algorithms studied: the dependency parser and the reranking algorithm. The dependency parser uses a custom cost function to estimate the impact of parsing mistakes on the overall micro-F1 score, while the reranking algorithm allows the micro-F1 score to be optimized by tuning the beam search parameter to balance recall and precision.
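The micro-F1 score referred to above pools true positives, false positives and false negatives across all classes before computing precision and recall. A minimal sketch follows; the relation labels and counts are hypothetical and are not results from the thesis.

```python
def micro_f1(per_class_counts):
    """Micro-averaged F1 from per-class dicts with keys 'tp', 'fp', 'fn'."""
    tp = sum(c["tp"] for c in per_class_counts.values())
    fp = sum(c["fp"] for c in per_class_counts.values())
    fn = sum(c["fn"] for c in per_class_counts.values())
    if tp == 0:
        return 0.0  # no correct predictions: precision and recall are both 0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts for two relation labels.
counts = {
    "causes": {"tp": 6, "fp": 1, "fn": 2},
    "enables": {"tp": 2, "fp": 1, "fn": 0},
}
score = micro_f1(counts)
```

Because the counts are pooled, frequent classes dominate the score, and every extra false positive or false negative shifts it directly, which is what makes it possible to trade precision against recall by tuning a single parameter.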
A modular architecture for systematic text categorisation
This work examines and attempts to overcome issues caused by the lack of formal standardisation when defining text categorisation techniques and detailing how they might be appropriately integrated with each other. Despite text categorisation’s long history, the concept of automation is relatively new, coinciding with the evolution of computing technology and the subsequent increase in quantity and availability of electronic textual data. Nevertheless, insufficient descriptions of the diverse algorithms discovered have led to an acknowledged ambiguity when trying to accurately replicate methods, which has made reliable comparative evaluations impossible.
Existing interpretations of general data mining and text categorisation methodologies are analysed in the first half of the thesis and common elements are extracted to create a distinct set of significant stages. Their possible interactions are logically determined and a unique universal architecture is generated that encapsulates all complexities and highlights the critical components. A variety of text related algorithms are also comprehensively surveyed and grouped according to which stage they belong in order to demonstrate how they can be mapped.
The second part reviews several open-source data mining applications, placing an emphasis on their ability to handle the proposed architecture, potential for expansion and text processing capabilities. Finding these inflexible and too elaborate to be readily adapted, designs for a novel framework are introduced that focus on rapid prototyping through lightweight customisations and reusable atomic components.
As a consequence of inadequacies with existing options, a rudimentary implementation is realised along with a selection of text categorisation modules. Finally, a series of experiments is conducted that validates the feasibility of the outlined methodology and the importance of its composition, whilst also establishing the practicality of the framework for research purposes. The simplicity of the experiments and the results gathered clearly indicate the potential benefits that can be gained when a formalised approach is utilised.