Using Stacked Sparse Auto-Encoder and Superpixel CRF for Long-Term Visual Scene Understanding of UGVs
Multiple images are widely used for scene understanding and navigation of unmanned ground vehicles (UGVs) in long-term operations. However, because the volume of visual data across multiple images is huge, the cumulative error in many cases becomes untenable. This paper proposes a novel method that efficiently extracts features from a large dataset of multiple images. Membership K-means clustering is then applied to the high-dimensional features, and the large dataset is divided into N subdatasets used to train N superpixel-based conditional random field (CRF) models. A Softmax subdataset selector decides which of the N CRF models is chosen as the prediction model for labeling images. Furthermore, experiments are conducted to evaluate the feasibility and performance of the proposed approach.
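The selection step can be sketched roughly as follows. This is a minimal illustration, assuming a simple linear selector; the weight matrix `W`, bias `b`, and the 8-dimensional feature vectors are made-up placeholders, not the paper's actual architecture.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def select_crf_model(feature, W, b):
    """Pick which of the N per-subdataset CRF models should label an
    image, given its feature vector and a (hypothetical) linear selector."""
    probs = softmax(W @ feature + b)   # one probability per subdataset
    return int(np.argmax(probs)), probs

rng = np.random.default_rng(0)
W = rng.normal(size=(4, 8))            # N = 4 subdatasets, 8-dim features
b = np.zeros(4)
idx, probs = select_crf_model(rng.normal(size=8), W, b)
```

The image is then labeled by the CRF model at index `idx`, so each model only ever sees inputs resembling its own training subdataset.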
Efficient audio signal processing for embedded systems
We investigated two design strategies that allow us to efficiently process audio signals on embedded systems such as mobile phones and portable electronics. In the first strategy, we exploit properties of the human auditory system to process audio signals. We designed a sound enhancement algorithm to make piezoelectric loudspeakers sound "richer" and "fuller," using a combination of bass extension and dynamic range compression. We also developed an audio energy reduction algorithm for loudspeaker power management that suppresses signal energy below the masking threshold. In the second strategy, we use low-power analog circuits to process the signal before digitizing it. We designed an analog front-end for sound detection and implemented it on a field-programmable analog array (FPAA). The sound classifier front-end can be used in a wide range of applications because programmable floating-gate transistors are employed to store classifier weights. Moreover, we incorporated a feature selection algorithm to simplify the analog front-end. A machine learning algorithm, AdaBoost, is used to select the most relevant features for a particular sound detection application. We also designed the circuits to implement the AdaBoost-based analog classifier.
Ph.D. Committee Chair: Anderson, David; Committee Member: Hasler, Jennifer; Committee Member: Hunt, William; Committee Member: Lanterman, Aaron; Committee Member: Minch, Bradle
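The AdaBoost-based feature selection idea can be sketched in a few lines: boost one-feature decision stumps and keep the features the stumps actually use. This is a toy numpy sketch on synthetic data, not the thesis's actual audio features or circuit implementation.

```python
import numpy as np

def adaboost_select(X, y, rounds=5):
    """Tiny AdaBoost with one-feature threshold stumps (y in {-1, +1}).
    Returns the indices of the features the chosen stumps used, i.e. the
    most relevant features for this detection task."""
    n, d = X.shape
    w = np.full(n, 1.0 / n)                      # example weights
    chosen = set()
    for _ in range(rounds):
        best = None
        for j in range(d):                       # exhaustive stump search
            for t in X[:, j]:
                for sign in (1.0, -1.0):
                    pred = np.where(sign * (X[:, j] - t) > 0, 1, -1)
                    err = w[pred != y].sum()
                    if best is None or err < best[0]:
                        best = (err, j, pred)
        err, j, pred = best
        err = np.clip(err, 1e-10, 1 - 1e-10)
        alpha = 0.5 * np.log((1 - err) / err)    # stump weight
        w *= np.exp(-alpha * y * pred)           # up-weight hard examples
        w /= w.sum()
        chosen.add(j)
    return sorted(chosen)

rng = np.random.default_rng(1)
X = rng.normal(size=(80, 6))
y = np.where(X[:, 2] > 0, 1, -1)                 # only feature 2 is informative
selected = adaboost_select(X, y)
```

In the thesis's setting, each retained feature corresponds to one analog feature-extraction circuit, so dropping unselected features directly simplifies the front-end.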
Multiple Cooperative Swarms for Data Clustering
Exploring a set of unlabeled data to extract similar clusters,
known as data clustering, is an appealing problem in machine
learning. In other words, data clustering organizes the underlying
data into groups using a notion of similarity between
patterns.
A new approach to solve the data clustering problem based on
multiple cooperative swarms is introduced. The proposed approach is
inspired by the social swarming behavior of biological bird flocks
which search for food situated in several places. The proposed
approach is composed of two main phases, namely, initialization and
exploitation. In the initialization phase, the aim is to distribute
the search space among several swarms. That is, a part of the search
space is assigned to each swarm in this phase. In the exploitation
phase, each swarm searches for the center of its associated cluster
while cooperating with other swarms. The search proceeds to converge
to a near-optimal solution. Compared with the single-swarm
clustering approach, the proposed multiple cooperative swarms
provide better solutions, in terms of the fitness measure for
the cluster centers, as the dimensionality of the data and the
number of clusters increase.
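The two phases above can be sketched with a toy particle swarm optimization (PSO) loop. The Gaussian blob data, one-swarm-per-cluster setup, and the standard PSO constants (inertia 0.7, acceleration 1.5) are simplifying assumptions for illustration, not the thesis's exact update rules; cooperation enters through the shared best centers used in the fitness evaluation.

```python
import numpy as np

rng = np.random.default_rng(0)
# Toy data: two Gaussian blobs; one swarm hunts each cluster center.
X = np.vstack([rng.normal(-3, 0.5, (50, 2)), rng.normal(3, 0.5, (50, 2))])

K, P, D = 2, 12, 2                      # swarms (clusters), particles, dims

def fitness(centers):
    """Cooperative objective: total squared distance of every point to its
    nearest center, so each swarm is scored in the context of the others."""
    d2 = ((X[:, None, :] - centers[None]) ** 2).sum(-1)
    return d2.min(axis=1).sum()

# Initialization phase: spread each swarm's particles over the search space.
pos = rng.uniform(X.min(), X.max(), (K, P, D))
vel = np.zeros_like(pos)
pbest = pos.copy()
gbest = pos[:, 0].copy()                # one current-best center per swarm

def swarm_fit(k, cand):
    trial = gbest.copy()
    trial[k] = cand                     # evaluate swarm k's candidate
    return fitness(trial)               # ...given the other swarms' bests

pbest_f = np.array([[swarm_fit(k, pos[k, p]) for p in range(P)]
                    for k in range(K)])

# Exploitation phase: standard PSO updates, cooperating through `gbest`.
for _ in range(60):
    for k in range(K):
        r1, r2 = rng.random((2, P, D))
        vel[k] = (0.7 * vel[k] + 1.5 * r1 * (pbest[k] - pos[k])
                  + 1.5 * r2 * (gbest[k] - pos[k]))
        pos[k] += vel[k]
        for p in range(P):
            f = swarm_fit(k, pos[k, p])
            if f < pbest_f[k, p]:
                pbest_f[k, p], pbest[k, p] = f, pos[k, p].copy()
        gbest[k] = pbest[k, np.argmin(pbest_f[k])]
```

After the loop, `gbest` holds one estimated center per cluster; because each swarm optimizes only its own center while the others are held at their current bests, the search space is effectively partitioned among the swarms.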
The multiple cooperative swarms clustering approach assumes that the
number of clusters is known a priori. The notion of stability
analysis is proposed to extract the number of clusters for the
underlying data using multiple cooperative swarms. Mathematical
explanations of why the proposed approach leads to more stable
and robust results than single-swarm clustering are also
provided.
Application of the proposed multiple cooperative swarms clustering
is considered for one of the most challenging problems in speech
recognition: phoneme recognition. The proposed approach is used to
decompose the recognition task into a number of subtasks or modules.
Each module involves a set of similar phonemes known as a phoneme
family. Basically, the goal is to obtain the best solution for
phoneme families using the proposed multiple cooperative swarms
clustering. Experiments on the standard TIMIT corpus indicate
that the proposed clustering approach considerably boosts the
accuracy of the modular approach to phoneme recognition.
Auxiliary Losses for Learning Generalizable Concept-based Models
The increasing use of neural networks in various applications has led to
growing apprehension, underscoring the necessity to understand their
operations beyond mere final predictions. To enhance model
transparency, Concept Bottleneck Models (CBMs) have gained popularity since
their introduction. CBMs essentially limit the latent space of a model to
human-understandable high-level concepts. While beneficial, CBMs have been
reported to often learn irrelevant concept representations that consequently
degrade model performance. To overcome this performance trade-off, we propose
the cooperative Concept Bottleneck Model (coop-CBM). The concept representation of
our model is particularly meaningful when fine-grained concept labels are
absent. Furthermore, we introduce the concept orthogonal loss (COL) to
encourage the separation between the concept representations and to reduce the
intra-concept distance. This paper presents extensive experiments on real-world
datasets for image classification tasks, namely CUB, AwA2, CelebA and TIL. We
also study the performance of coop-CBM models under various distributional
shift settings. We show that our proposed method achieves higher accuracy in
all distributional shift settings, even compared to black-box models with
the highest concept accuracy.
Comment: NeurIPS 202
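A minimal sketch of what an orthogonality-style penalty on concept representations could look like: normalize each concept direction and penalize off-diagonal entries of the Gram matrix, which separates distinct concepts. This is an assumed formulation for illustration; the paper's exact COL may differ.

```python
import numpy as np

def concept_orthogonal_loss(C):
    """Sketch of an orthogonality penalty over a matrix of concept
    embeddings C (n_concepts x dim): normalize each concept direction,
    then penalize the off-diagonal entries of the Gram matrix so that
    distinct concept representations are pushed apart."""
    Cn = C / np.linalg.norm(C, axis=1, keepdims=True)
    G = Cn @ Cn.T                        # pairwise cosine similarities
    off = G - np.eye(len(C))             # zero out the diagonal
    return float((off ** 2).sum())
```

Perfectly orthogonal concept embeddings incur zero loss, while identical (redundant) concept representations are penalized, which matches the stated goal of separating concept representations.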
The Acquisition Of Lexical Knowledge From The Web For Aspects Of Semantic Interpretation
This work investigates the effective acquisition of lexical knowledge from the Web to perform semantic interpretation. The Web provides an unprecedented amount of natural language from which to gain knowledge useful for semantic interpretation. The knowledge acquired is described as common sense knowledge, information one uses in his or her daily life to understand language and perception. Novel approaches are presented for both the acquisition of this knowledge and the use of the knowledge in semantic interpretation algorithms. The goal is to increase accuracy over other automatic semantic interpretation systems, and in turn enable stronger real-world applications such as machine translation, advanced Web search, sentiment analysis, and question answering. The major contributions of this dissertation consist of two methods of acquiring lexical knowledge from the Web, namely a database of common sense knowledge and Web selectors. The first method is a framework for acquiring a database of concept relationships. To acquire this knowledge, relationships between nouns are found on the Web and analyzed over WordNet using information theory, producing information about concepts rather than ambiguous words. For the second contribution, words called Web selectors are retrieved which take the place of an instance of a target word in its local context. The selectors allow the system to learn the types of concepts that the sense of a target word should be similar to. Web selectors are acquired dynamically as part of a semantic interpretation algorithm, while the relationships in the database are useful to stand-alone programs. A final contribution of this dissertation concerns a novel semantic similarity measure and an evaluation of similarity and relatedness measures on tasks of concept similarity. Such tasks are useful when applying acquired knowledge to semantic interpretation.
Applications to word sense disambiguation, an aspect of semantic interpretation, are used to evaluate the contributions. Disambiguation systems which utilize semantically annotated training data are considered supervised. The algorithms of this dissertation are considered minimally supervised; they do not require training data created by humans, though they may use human-created data sources. In the case of evaluating a database of common sense knowledge, integrating the knowledge into an existing minimally-supervised disambiguation system significantly improved results, a 20.5% error reduction. Similarly, the Web selectors disambiguation system, which acquires knowledge directly as part of the algorithm, achieved results comparable with top minimally-supervised systems, an F-score of 80.2% on a standard noun disambiguation task. This work enables the study of many subsequent related tasks for improving semantic interpretation and its application to real-world technologies. Other aspects of semantic interpretation, such as semantic role labeling, could utilize the same methods presented here for word sense disambiguation. As the Web continues to grow, the capabilities of the systems in this dissertation are expected to increase. Although the Web selectors system achieves strong results, a study in this dissertation shows likely improvements from acquiring more data. Furthermore, the methods for acquiring a database of common sense knowledge could be applied in a more exhaustive fashion for other types of common sense knowledge. Finally, perhaps the greatest benefits from this work will come from the enabling of real-world technologies that utilize semantic interpretation.
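The selector idea reduces, at its final step, to choosing the sense whose associated concepts best match the words the Web supplied as substitutes for the target word. This is a toy illustration only: the senses, signature word sets, and overlap scoring below are made up for the example and are far simpler than the dissertation's WordNet-based similarity measures.

```python
# Toy sketch: the Web supplies "selectors" (words that could replace the
# target word in its local context), and the sense whose signature
# concepts best match those selectors wins. Signatures are hypothetical.
SENSE_SIGNATURES = {
    "bass.fish":  {"trout", "salmon", "perch", "fish"},
    "bass.music": {"guitar", "cello", "tenor", "instrument"},
}

def disambiguate(selectors):
    """Score each candidate sense by overlap with the observed selectors
    and return the best-matching sense."""
    scores = {sense: len(sig & set(selectors))
              for sense, sig in SENSE_SIGNATURES.items()}
    return max(scores, key=scores.get)
```

For instance, selectors like "trout" and "perch" retrieved from Web text pull the target toward the fish sense, while "guitar" and "cello" pull it toward the musical sense.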