The relationship between IR and multimedia databases
Modern extensible database systems support multimedia data through ADTs. However, because of the problems with multimedia query formulation, this support is not sufficient. Multimedia querying requires an iterative search process involving many different representations of the objects in the database. The support that is needed is very similar to the processes in information retrieval. Based on this observation, we develop the miRRor architecture for multimedia query processing. We design a layered framework based on information retrieval techniques, to provide a usable query interface to the multimedia database. First, we introduce a concept layer to enable reasoning over low-level concepts in the database. Second, we add an evidential reasoning layer as an intermediate between the user and the concept layer. Third, we add the functionality to process the users' relevance feedback. We then adapt the inference network model from text retrieval to an evidential reasoning model for multimedia query processing. We conclude with an outline for implementation of miRRor on top of the Monet extensible database system.
Data Quality in Predictive Toxicology: Identification of Chemical Structures and Calculation of Chemical Descriptors
Every technique for toxicity prediction and for the detection of structure–activity relationships relies on the accurate estimation and representation of chemical and toxicologic properties. In this paper we discuss the potential sources of errors associated with the identification of compounds, the representation of their structures, and the calculation of chemical descriptors. The discussion is based on a case study in which machine learning techniques were applied to data from noncongeneric compounds and a complex toxicologic end point (carcinogenicity). We propose methods applicable to the routine quality control of large chemical datasets, but our main intention is to raise awareness about this topic and to open a discussion about quality assurance in predictive toxicology. The accuracy and reproducibility of toxicity data will be reported in another paper.
Language Transfer of Audio Word2Vec: Learning Audio Segment Representations without Target Language Data
Audio Word2Vec offers vector representations of fixed dimensionality for
variable-length audio segments using a Sequence-to-sequence Autoencoder (SA).
These vector representations are shown to describe the sequential phonetic
structures of the audio segments to a good degree, with real world applications
such as query-by-example Spoken Term Detection (STD). This paper examines the
capability of language transfer of Audio Word2Vec. We train an SA on one
language (source language) and use it to extract the vector representations of
the audio segments of another language (target language). We found that the SA can
still capture phonetic structure from the audio segments of the target language
if the source and target languages are similar. In query-by-example STD, we
obtain the vector representations from the SA learned from a large amount of
source language data, and found that they surpass the representations from a naive
encoder and from an SA learned directly from a small amount of target language data.
The results show that it is possible to learn an Audio Word2Vec model from
high-resource languages and use it on low-resource languages. This further
expands the usability of Audio Word2Vec.
Comment: arXiv admin note: text overlap with arXiv:1603.0098
Toward Fast and Reliable Potential Energy Surfaces for Metallic Pt Clusters by Hierarchical Delta Neural Networks.
Data-driven machine learning force fields (MLFs) are increasingly popular in atomistic simulations. They use machine learning methods to predict energies and forces for unknown structures based on the knowledge learned from an existing reference database, which usually comes from density functional theory calculations. One main drawback of MLFs is that physical laws are not incorporated in the machine learning models; instead, MLFs are designed to be very flexible so that they can model a complex quantum chemistry potential energy surface (PES). In general, MLFs have poor transferability, and hence a very large training set is required to span the whole target feature space and yield a reliable MLF. This procedure becomes more troublesome when the PES is complicated, with a large number of degrees of freedom: building a large database is then inevitable and very expensive, especially when accurate but costly exchange-correlation functionals have to be used. In this manuscript, we apply a high-dimensional neural network potential (HDNNP) to Pt clusters of sizes from 6 to 20 as one example. Our standard level of energy calculation is DFT GGA (PBE) using a plane wave basis set. We introduce an approximate but fast level with the PBE functional and a minimal atomic orbital basis set; a more accurate but expensive level, using a hybrid functional or nonlocal vdW functional and a plane wave basis set, is then reliably predicted by learning the difference with an HDNNP. The results show that such a differential approach (named ΔHDNNP) can deliver very accurate predictions (error <10 meV/atom) in reference to converged basis set energies as well as more accurate but expensive xc functionals. The overall speedup can be as large as 900 for a 20 atom Pt cluster.
More importantly, ΔHDNNP shows much better transferability due to the intrinsic smoothness of the delta potential energy surface, and accordingly, one can use a much smaller training set to obtain better accuracy than the conventional HDNNP. A multilayer ΔHDNNP is thus proposed to obtain very accurate predictions versus expensive nonlocal vdW functional calculations, in which the required training set is further reduced. The approach can be easily generalized to any other machine learning methods and opens a path to study the structure and dynamics of Pt clusters and nanoparticles.
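The delta-learning idea described in this abstract can be illustrated with a toy sketch. Everything in it is an assumption made for illustration: the one-dimensional functions `low_level` and `high_level` are hypothetical stand-ins for the cheap and expensive energy calculations, and a least-squares line stands in for the neural network. It is not the authors' ΔHDNNP; it only shows why fitting a smooth difference can need far less data than fitting the full surface.

```python
import math


def low_level(x):
    """Hypothetical cheap calculation: captures the rough, wiggly part."""
    return math.sin(3.0 * x)


def high_level(x):
    """Hypothetical expensive calculation: cheap part plus a smooth correction."""
    return math.sin(3.0 * x) + 0.5 * x + 0.1


def fit_line(xs, ys):
    """Ordinary least-squares line; stands in for the learned model."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def max_abs_error(predict, xs):
    """Largest deviation from the expensive reference on the given points."""
    return max(abs(predict(x) - high_level(x)) for x in xs)


# Only three expensive reference points are available for training.
train_x = [0.0, 1.0, 2.0]
test_x = [i * 0.1 for i in range(21)]

# Delta approach: learn high - low, then add the cheap result back at prediction time.
d_slope, d_icept = fit_line(train_x, [high_level(x) - low_level(x) for x in train_x])


def delta_predict(x):
    return low_level(x) + d_slope * x + d_icept


# Direct approach: learn the expensive surface itself with the same model and data.
slope, icept = fit_line(train_x, [high_level(x) for x in train_x])


def direct_predict(x):
    return slope * x + icept


delta_error = max_abs_error(delta_predict, test_x)
direct_error = max_abs_error(direct_predict, test_x)
print("delta approach max error: ", delta_error)
print("direct approach max error:", direct_error)
```

In this toy case the difference between the two levels is exactly linear, so the delta model is essentially exact with three training points, while the same model fitted directly to the expensive surface is not. Real potential energy surfaces will not be this tidy.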
A web-based teaching/learning environment to support collaborative knowledge construction in design
A web-based application has been developed as part of a recently completed research project that proposed a conceptual framework to collect, analyze and compare different design experiences and to construct structured representations of the emerging knowledge in digital architectural design. The paper introduces the theoretical and practical development of this application as a teaching/learning environment, which has significantly contributed to the development and testing of the ideas developed throughout the research. Later in the paper, the application of BLIP in two experimental (design) workshops is reported and evaluated according to the extent to which the application facilitates generation, modification and utilization of design knowledge.
Semantic web technology for web-based teaching and learning: A roadmap
The World-Wide Web has become the predominant platform for computer-aided instruction. Content orientation, access and interactive features have made the Web a successful technology. The Web, however, is still evolving. We expect in particular Semantic Web technology to substantially impact Web-based teaching and learning. In this paper, we examine the potential of this technology and how we expect it to influence content representation and the work of the instructor and the learner.
Mapping Big Data into Knowledge Space with Cognitive Cyber-Infrastructure
Big data research has attracted great attention in science, technology,
industry and society. It is developing with the evolving scientific paradigm,
the fourth industrial revolution, and the transformational innovation of
technologies. However, its nature and fundamental challenge have not been
recognized, and its own methodology has not been formed. This paper explores
and answers the following questions: What is big data? What are the basic
methods for representing, managing and analyzing big data? What is the
relationship between big data and knowledge? Can we find a mapping from big
data into knowledge space? What kind of infrastructure is required to support
not only big data management and analysis but also knowledge discovery, sharing
and management? What is the relationship between big data and science paradigm?
What is the nature and fundamental challenge of big data computing? A
multi-dimensional perspective is presented toward a methodology of big data
computing.
Comment: 59 pages
A conceptual architecture for interactive educational multimedia
Learning is more than knowledge acquisition; it often involves the active participation of the learner in a variety of knowledge- and skills-based learning and training activities. Interactive multimedia technology can support the variety of interaction channels and languages required to facilitate interactive learning and teaching.
A conceptual architecture for interactive educational multimedia can support the development of such multimedia systems. Such an architecture needs to embed multimedia technology into a coherent educational context. A framework based on an integrated interaction model is needed to capture learning and training activities in an online setting from an educational perspective, to describe them in the human-computer context, and to integrate them with mechanisms and principles of multimedia interaction.