360 research outputs found
A database with enterprise application for mining astronomical data obtained by MOA : a thesis submitted in partial fulfilment of the requirements for the degree of the Master of Information Science in Computer Science, Massey University at Albany, Auckland, New Zealand
The MOA (Microlensing Observations in Astrophysics) Project is one of a new generation of modern astronomy endeavours that generate huge volumes of data with enormous scientific data-mining potential. It is now common for astronomers to deal with millions or even billions of records, so managing these large data sets is an important challenge for researchers, and a good database management system is vital for the research. With the modern observation equipment in use, MOA struggles with the growing volume of data, and a database management solution is needed. This study analysed modern database and enterprise application technology. After analysing the data mining requirements of MOA, a prototype data management system based on the MVC pattern was developed. Furthermore, the application supports sharing MOA findings and scientific data on the Internet. It was tested on a 7 GB subset of the archived MOA data set and was found to query data in efficient time and to support data mining.
A new characterization of fuzzy ideals of semigroups and its applications
In this paper, we develop a new technique for constructing fuzzy ideals of a semigroup. Using generalized Green's relations, fuzzy star ideals are constructed. It is shown that this new kind of fuzzy ideal can be used to investigate the relationship between fuzzy sets and the abundance and regularity of an arbitrary semigroup. Appropriate examples of such fuzzy ideals are given in order to illustrate the technique. Finally, we characterize when a semigroup satisfies regularity conditions.
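As a reminder of the standard notion this abstract builds on (the paper's fuzzy star ideals are a refinement not reproduced here), a fuzzy subset of a semigroup is a fuzzy left (right) ideal when:

```latex
\mu : S \to [0,1], \qquad
\mu(xy) \ge \mu(y) \ \ (\text{fuzzy left ideal}), \qquad
\mu(xy) \ge \mu(x) \ \ (\text{fuzzy right ideal}),
\qquad \text{for all } x, y \in S.
```

A fuzzy ideal satisfies both inequalities; the classical characteristic-function case recovers ordinary (left/right) ideals of $S$.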
Resilient neural network training for accelerators with computing errors
With the advancement of neural networks, customized accelerators are increasingly adopted in massive AI applications. To gain higher energy efficiency or performance, many hardware design optimizations such as near-threshold logic or overclocking can be utilized. In these cases, computing errors may occur, and such errors are difficult to capture by conventional training on general-purpose processors (GPPs). Directly applying offline-trained neural network models to accelerators with errors may lead to considerable prediction accuracy loss.
To address this problem, we explore the resilience of neural network models and relax the accelerator design constraints to enable aggressive design options. First, we propose to train the neural network models using the accelerators' forward computing results, so that the models learn both the data and the computing errors. In addition, we observe that some neural network layers are more sensitive to computing errors than others. Based on this observation, we schedule the most sensitive layer onto the attached GPP to reduce the negative influence of the computing errors. According to the experiments, the neural network models obtained from the proposed training significantly outperform the original models when the CNN accelerators are affected by computing errors.
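The error-aware training idea above can be sketched in a few lines: run the forward pass through a simulated faulty accelerator so the weights adapt to its errors. This is a minimal illustration with a toy linear model and a hypothetical random-perturbation fault model, not the paper's actual accelerator or networks.

```python
import numpy as np

rng = np.random.default_rng(0)

def faulty_matmul(x, w, error_rate=0.01, scale=0.5):
    """Emulate an accelerator's forward pass: a small fraction of
    output values are perturbed. The noise model is an assumption
    for illustration, not any specific hardware's fault behaviour."""
    out = x @ w
    mask = rng.random(out.shape) < error_rate
    out[mask] += rng.normal(0.0, scale, mask.sum())
    return out

def train(x, y, steps=200, lr=0.1, faulty=True):
    """Logistic regression whose forward pass runs on the 'accelerator',
    so gradients reflect the computing errors (error-aware training)."""
    w = np.zeros((x.shape[1], 1))
    for _ in range(steps):
        logits = faulty_matmul(x, w) if faulty else x @ w
        p = 1.0 / (1.0 + np.exp(-logits))       # sigmoid
        w -= lr * x.T @ (p - y) / len(x)        # cross-entropy gradient
    return w
```

In the paper's setting the same principle applies to full CNNs, with the most error-sensitive layer scheduled onto the GPP instead.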
Predicting the Silent Majority on Graphs: Knowledge Transferable Graph Neural Network
Graphs consisting of vocal nodes ("the vocal minority") and silent nodes ("the silent majority"), namely VS-Graphs, are ubiquitous in the real world. Vocal nodes tend to have abundant features and labels, whereas silent nodes have only incomplete features and rare labels; e.g., on Twitter's social network the descriptions and political tendencies of politicians (vocal) are abundant while those of ordinary people (silent) are not. Predicting the silent majority remains a crucial yet challenging problem. However, most existing message-passing-based GNNs assume that all nodes belong to the same domain and consider neither the missing features nor the distribution shift between domains, leaving them poorly equipped to deal with VS-Graphs. To combat these challenges, we propose the Knowledge Transferable Graph Neural Network (KT-GNN), which models distribution shifts during message passing and representation learning by transferring knowledge from vocal nodes to silent nodes. Specifically, we design a domain-adapted feature completion and message passing mechanism for node representation learning that preserves the domain difference, followed by a knowledge-transferable classifier based on KL divergence. Comprehensive experiments on real-world scenarios (i.e., company financial risk assessment and political elections) demonstrate the superior performance of our method. Our source code has been open-sourced.
Comment: Paper was accepted by WWW202
A Robust Method for Speech Emotion Recognition Based on Infinite Student’s t
The speech emotion classification method proposed in this paper is based on a Student's t-mixture model with an infinite number of components (iSMM) and can directly and effectively recognize various kinds of speech emotion samples. Compared with the traditional GMM (Gaussian mixture model), a speech emotion model based on a Student's t-mixture can effectively handle the speech-sample outliers that exist in the emotion feature space, and the t-mixture model remains robust to atypical emotion test data. To cope with the high data complexity caused by the high-dimensional space and the problem of insufficient training samples, a global latent space is added to the emotion model. This approach lets the number of components grow without bound, forming the iSMM emotion model, which can automatically determine the best number of components with lower complexity to classify various kinds of emotion-characteristic data. Evaluated on one spontaneous (FAU Aibo Emotion Corpus) and two acted (DES and EMO-DB) universal speech emotion databases, which have high-dimensional feature samples and diverse data distributions, iSMM achieves better recognition performance than the comparison methods. Its effectiveness and its generalization to high-dimensional data and to outliers are thereby verified, establishing the iSMM emotion model as a robust method that remains valid in the presence of outliers and high-dimensional emotion features.
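The robustness claim above rests on the Student's t distribution's heavy tails: an outlier is far less improbable under a t likelihood than under a Gaussian one, so it distorts the fitted model less. A minimal numerical illustration (not the paper's full infinite-mixture model; the degrees-of-freedom value is an arbitrary choice):

```python
import numpy as np
from scipy import stats

# 99 well-behaved samples plus one extreme outlier in 1-D feature space
data = np.concatenate([np.zeros(99), [50.0]])

# Per-sample log-likelihood under a unit Gaussian vs. a Student's t (df=3)
gauss_ll = stats.norm.logpdf(data, loc=0, scale=1)
t_ll = stats.t.logpdf(data, df=3, loc=0, scale=1)

# The Gaussian assigns the outlier an astronomically small likelihood
# (log-density ~ -x^2/2), while the t tail decays only polynomially,
# so the outlier barely dominates the total t log-likelihood.
```

In a mixture-model fit, this difference means a single outlier cannot drag a t component's mean and covariance the way it would a Gaussian component's.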
Search-in-the-Chain: Towards the Accurate, Credible and Traceable Content Generation for Complex Knowledge-intensive Tasks
With the wide application of Large Language Models (LLMs) such as ChatGPT, making the content generated by LLMs accurate and credible has become very important, especially in complex knowledge-intensive tasks. In this paper, we propose a novel framework called Search-in-the-Chain (SearChain) to improve the accuracy, credibility, and traceability of LLM-generated content for multi-hop question answering, a typical complex knowledge-intensive task. SearChain is a framework that deeply integrates the LLM with information retrieval (IR). In SearChain, the LLM constructs a chain-of-query, which is a decomposition of the multi-hop question. Each node of the chain is a query-answer pair consisting of an IR-oriented query and the answer generated by the LLM for that query. IR verifies, completes, and traces the information of each node of the chain, guiding the LLM to construct the correct chain-of-query and finally answer the multi-hop question. SearChain shifts the LLM from trying to give an answer directly to constructing the chain-of-query when faced with a multi-hop question, which stimulates its knowledge-reasoning ability and provides an interface for IR to be deeply involved in the LLM's reasoning process. IR interacts with each node of the chain-of-query: it verifies the information of the node and provides unknown knowledge to the LLM, which ensures the accuracy of the whole chain while the LLM generates the answer. Besides the final answer, the content returned by the LLM to the user also includes the reasoning process for the question, that is, the chain-of-query and the supporting documents retrieved by IR for each node of the chain, which improves the credibility and traceabilityity of the generated content. Experimental results show SearChain outperforms related baselines on four multi-hop question-answering datasets.
Comment: work in progress
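The LLM-IR interaction described above can be sketched as a simple loop: the LLM proposes query-answer nodes one at a time, and IR verifies each node, overriding the answer when retrieval disagrees. The `llm_propose` and `ir_lookup` callables are hypothetical stand-ins, not the paper's actual interfaces or prompts.

```python
def search_in_the_chain(question, llm_propose, ir_lookup, max_nodes=5):
    """Sketch of the SearChain loop.
    llm_propose(question, chain) -> {"query": str, "answer": str} or None
                                    (None signals the chain is complete)
    ir_lookup(query)             -> retrieved answer string, or None if
                                    IR found nothing to verify against
    """
    chain = []
    for _ in range(max_nodes):
        node = llm_propose(question, chain)
        if node is None:
            break
        retrieved = ir_lookup(node["query"])
        if retrieved is not None and retrieved != node["answer"]:
            node["answer"] = retrieved   # IR corrects the node
        chain.append(node)               # verified node extends the chain
    return chain                         # chain = answer + reasoning trace
```

Returning the whole chain (rather than only the last answer) is what gives the user the traceable reasoning process and, in the full system, the supporting documents per node.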
An Ultra-low Power TinyML System for Real-time Visual Processing at Edge
Tiny machine learning (TinyML), executing AI workloads on strictly resource- and power-constrained systems, is an important and challenging topic. This brief first presents an extremely tiny backbone for constructing high-efficiency CNN models for various visual tasks. A specially designed neural co-processor (NCP) is then interconnected with an MCU to build an ultra-low-power TinyML system, which stores all features and weights on-chip and thus completely eliminates the latency and power consumption of off-chip memory access. Furthermore, an application-specific instruction set is presented to enable agile development and rapid deployment. Extensive experiments demonstrate that the proposed TinyML system, based on our model, NCP, and instruction set, yields considerable accuracy and achieves a record ultra-low power of 160 mW while performing object detection and recognition at 30 FPS. The demo video is available at \url{https://www.youtube.com/watch?v=mIZPxtJ-9EY}.
Comment: 5 pages, 5 figures