A Novel Framework for Highlight Reflectance Transformation Imaging
We propose a novel pipeline and related software tools for processing the multi-light image collections (MLICs) acquired in different application contexts to obtain shape and appearance information of captured surfaces, as well as to derive compact relightable representations of them. Our pipeline extends the popular Highlight Reflectance Transformation Imaging (H-RTI) framework, which is widely used in the Cultural Heritage domain. In particular, we support perspective camera modeling, per-pixel interpolated light direction estimation, and light normalization that corrects vignetting and uneven non-directional illumination. Furthermore, we propose two novel easy-to-use software tools to simplify all processing steps. In addition to supporting easy processing and encoding of pixel data, the tools implement a variety of visualizations, as well as multiple reflectance-model-fitting options. Experimental tests on synthetic and real-world MLICs demonstrate the usefulness of the novel algorithmic framework and the potential benefits of the proposed tools for end-user applications.
Terms: "European Union (EU)" & "Horizon 2020" / Action: H2020-EU.3.6.3. - Reflective societies - cultural heritage and European identity / Acronym: Scan4Reco / Grant number: 665091. DSURF project (PRIN 2015) funded by the Italian Ministry of University and Research. Sardinian Regional Authorities under projects VIGEC and Vis&VideoLa
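As a rough illustration of the highlight-based idea that H-RTI builds on, the sketch below estimates a light direction from the position of a specular highlight on a reflective sphere. It is not the pipeline described in the abstract: it assumes an orthographic camera (the abstract's method adds perspective modeling and per-pixel interpolation), a perfectly specular sphere, and image axes aligned with world axes. The function name and parameters are hypothetical.

```python
import math


def light_direction_from_highlight(hx, hy, cx, cy, radius):
    """Estimate a unit light direction from a highlight on a reflective sphere.

    Assumptions (not taken from the abstract): orthographic camera, so the
    vector toward the camera is v = (0, 0, 1) everywhere; image x/y axes
    aligned with world x/y axes; (cx, cy) and radius describe the sphere
    outline in pixels; (hx, hy) is the highlight centre in pixels.
    """
    # Sphere normal at the highlight pixel (unit length by construction).
    nx = (hx - cx) / radius
    ny = (hy - cy) / radius
    r2 = nx * nx + ny * ny
    if r2 > 1.0:
        raise ValueError("highlight lies outside the sphere outline")
    nz = math.sqrt(1.0 - r2)
    # Mirror reflection of v about n: l = 2 (n . v) n - v, with n . v = nz.
    return (2.0 * nz * nx, 2.0 * nz * ny, 2.0 * nz * nz - 1.0)
```

A highlight at the sphere centre corresponds to a light placed along the viewing axis; highlights closer to the outline correspond to increasingly grazing light directions.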
Integrated speech and morphological processing in a connectionist continuous speech understanding for Korean
A new tightly coupled speech and natural language integration model is
presented for a TDNN-based continuous possibly large vocabulary speech
recognition system for Korean. Unlike popular n-best techniques developed for
integrating mainly HMM-based speech recognition and natural language processing
at the {\em word level}, which is obviously inadequate for morphologically
complex agglutinative languages, our model constructs a spoken language system
based on a {\em morpheme-level} speech and language integration. With this
integration scheme, the spoken Korean processing engine (SKOPE) is designed and
implemented using a TDNN-based diphone recognition module integrated with a
Viterbi-based lexical decoding and symbolic phonological/morphological
co-analysis. Our experimental results show that speaker-dependent continuous
{\em eojeol} (Korean word) recognition and integrated morphological analysis
can be achieved with a success rate of over 80.6% directly from speech inputs
for middle-level vocabularies.
Comment: latex source with a4 style, 15 pages, to be published in computer
processing of oriental language journa
Cross-linguistic activation in bilingual sentence processing: the role of word class meaning
This study investigates how categorial (word class) semantics influences cross-linguistic interactions when reading in L2. Previous homograph studies paid little attention to the possible influence of different word classes in the stimulus material on cross-linguistic activation. The present study examines the word recognition performance of Dutch-English bilinguals who performed a lexical decision task on word targets appearing in a sentence. To determine the influence of word class meaning, the critical words either showed a word class overlap (e.g. the homograph tree [NOUN], which means "step" in Dutch) or not (e.g. big [ADJ], which is a noun in Dutch meaning "piglet"). In the condition of word class overlap, a facilitation effect was observed, suggesting that both languages were active. When there was no word class overlap, the facilitation effect disappeared. This result suggests that categorial meaning affects the word recognition process of bilinguals.
Space and camera path reconstruction for omni-directional vision
In this paper, we address the inverse problem of reconstructing a scene as
well as the camera motion from the image sequence taken by an omni-directional
camera. Our structure from motion results give sharp conditions under which the
reconstruction is unique. For example, if there are three points in general
position and three omni-directional cameras in general position, a unique
reconstruction is possible up to a similarity. We then look at the
reconstruction problem with m cameras and n points, where n and m can be large
and the over-determined system is solved by least-squares methods. The
reconstruction is robust and generalizes to the case of a dynamic environment
where landmarks can move during the movie capture. Possible applications of the
result are computer-assisted scene reconstruction, 3D scanning, autonomous
robot navigation, medical tomography and city reconstructions.
Coalescent Assimilation Across Wordboundaries in American English and in Polish English
Coalescent assimilation (CA), where alveolar obstruents /t, d, s, z/ in word-final position merge with word-initial /j/ to produce postalveolar /tʃ, dʒ, ʃ, ʒ/, is one of the most well-known connected speech processes in English. Because it is so common, CA has been discussed in numerous textbook descriptions of English pronunciation, and yet, upon comparing them, it is difficult to get a clear picture of what factors make its application likely. This paper investigates the application of CA in American English to see a) what factors increase the likelihood of its application for each of the four alveolar obstruents, and b) what the allophonic realization of the plosives /t, d/ is if CA does not apply. To do so, the Buckeye Corpus (Pitt et al. 2007) of spoken American English is analyzed quantitatively. As a second step, these results are compared with Polish English; statistics analogous to the ones listed above for American English are gathered for Polish English based on the PLEC corpus (Pęzik 2012). The last section focuses on the consequences the findings have for teaching based on a native-speaker model. It is argued that a description of the phenomenon that reflects the behavior of speakers of American English more accurately than extant textbook accounts could be beneficial to the acquisition of these patterns.
Photometric stereo for strong specular highlights
Photometric stereo (PS) is a fundamental technique in computer vision known
to produce 3-D shape with high accuracy. The setting of PS is defined by using
several input images of a static scene taken from one and the same camera
position but under varying illumination. The vast majority of studies in this
3-D reconstruction method assume orthographic projection for the camera model.
In addition, they mainly consider the Lambertian reflectance model as the way
that light scatters at surfaces. As a result, providing reliable PS results
for real-world objects remains a challenging task. We address 3-D reconstruction
by PS using a more realistic set of assumptions combining for the first time
the complete Blinn-Phong reflectance model and perspective projection. To this
end, we compare two different methods of incorporating the perspective
projection into our model. Experiments are performed on both synthetic and
real-world images. Note that our real-world experiments do not benefit from
laboratory conditions. The results show the high potential of our method even
for complex real-world applications such as medical endoscopy images, which may
include high amounts of specular highlights.
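For context, the classical baseline that the abstract contrasts itself with (Lambertian reflectance, orthographic camera, known distant lights) can be sketched per pixel as a small least-squares problem. This is only that textbook baseline, not the Blinn-Phong and perspective method of the paper; the function names are hypothetical, and shadows and highlights are assumed absent.

```python
import math


def solve3(a, b):
    """Solve the 3x3 linear system a x = b by Cramer's rule."""
    def det(m):
        return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
                - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
                + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

    d = det(a)
    if abs(d) < 1e-12:
        raise ValueError("light directions are (nearly) coplanar")
    x = []
    for k in range(3):
        # Replace column k of a with the right-hand side b.
        m = [[b[i] if j == k else a[i][j] for j in range(3)] for i in range(3)]
        x.append(det(m) / d)
    return x


def lambertian_normal(lights, intensities):
    """Least-squares albedo and unit normal for a single pixel.

    lights: unit light directions (lx, ly, lz), at least three, not coplanar.
    intensities: observed pixel values, one per light, assumed unshadowed.
    Model (Lambertian assumption): I_k = albedo * dot(l_k, n).
    """
    # Normal equations (L^T L) g = L^T I, where g = albedo * n.
    ata = [[sum(l[i] * l[j] for l in lights) for j in range(3)] for i in range(3)]
    atb = [sum(l[i] * v for l, v in zip(lights, intensities)) for i in range(3)]
    g = solve3(ata, atb)
    albedo = math.sqrt(sum(c * c for c in g))
    if albedo == 0.0:
        raise ValueError("pixel is black under every light; normal undefined")
    return albedo, tuple(c / albedo for c in g)
```

Strong specular highlights violate the linear model above, which is one reason a specular term such as Blinn-Phong is of interest for surfaces like those in endoscopy images.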
3D head tracking using normal flow constraints in a vehicle environment
Head tracking is a key component in applications such as human computer interaction, person monitoring, driver monitoring, video conferencing, and object-based compression. The motion of a driver’s head can tell us a lot about his/her mental state; e.g. whether he/she is drowsy, alert, aggressive, comfortable, tense, distracted, etc. This paper reviews an optical flow based method to track the head pose, both orientation and position, of a person and presents results from real world data recorded in a car environment.