Second language learning in the context of MOOCs

Authors: Fitzgerald Alannah; Witten Ian H.; Wu Shaoqun
Publication venue: Scitepress
Publication date: 01/01/2014

Massive Open Online Courses are becoming popular educational vehicles through which universities reach out to non-traditional audiences. Many enrolees hail from other countries and cultures, and struggle to cope with the English language in which these courses are invariably offered. Moreover, most such learners have a strong desire and motivation to extend their knowledge of academic English, particularly in the specific area addressed by the course. Online courses provide a compelling opportunity for domain-specific language learning. They supply a large corpus of interesting linguistic material relevant to a particular area, including supplementary images (slides), audio and video. We contend that this corpus can be automatically analysed, enriched, and transformed into a resource that learners can browse and query in order to extend their ability to understand the language used, and help them express themselves more fluently and eloquently in that domain. To illustrate this idea, an existing online corpus-based language learning tool (FLAX) is applied to a Coursera MOOC entitled Virology 1: How Viruses Work, offered by Columbia University.

Concordia University Research Repository

Research Commons@Waikato

Finding Person Relations in Image Data of the Internet Archive

Authors: A Gangemi; A Moro; C Ding; I Masi; L Best-Rowden; R Navigli; Y Guo
Publication venue: Springer Science and Business Media LLC
Publication date: 28/05/2019

The multimedia content in the World Wide Web is rapidly growing and contains valuable information for many applications in different domains. For this reason, the Internet Archive initiative has been gathering billions of time-versioned web pages since the mid-nineties. However, the huge amount of data is rarely labeled with appropriate metadata and automatic approaches are required to enable semantic search. Normally, the textual content of the Internet Archive is used to extract entities and their possible relations across domains such as politics and entertainment, whereas image and video content is usually neglected. In this paper, we introduce a system for person recognition in image content of web news stored in the Internet Archive. Thus, the system complements entity recognition in text and allows researchers and analysts to track media coverage and relations of persons more precisely. Based on a deep learning face recognition approach, we suggest a system that automatically detects persons of interest and gathers sample material, which is subsequently used to identify them in the image data of the Internet Archive. We evaluate the performance of the face recognition system on an appropriate standard benchmark dataset and demonstrate the feasibility of the approach with two use cases.

arXiv.org e-Print Archive

Crossref

Fast and Lean Immutable Multi-Maps on the JVM based on Heterogeneous Hash-Array Mapped Tries

Authors: Steindorfer Michael J.; Vinju Jurgen J.
Publication date: 02/08/2016

An immutable multi-map is a many-to-many, thread-friendly map data structure with expected fast insert and lookup operations. This data structure is used in applications that process graphs or many-to-many relations, as arise in static analysis of object-oriented systems. When processing such large data sets, the memory overhead of the data structure encoding itself becomes a bottleneck. Motivated by reuse and type-safety, libraries for Java, Scala and Clojure typically implement immutable multi-maps by nesting sets as the values of a trie map. Based on our measurements, the expected overhead of this encoding for a sparse multi-map adds up to around 65B per stored entry, which makes it infeasible to compute with effectively on the JVM. In this paper we propose a general framework for Hash-Array Mapped Tries on the JVM that can store type-heterogeneous keys and values: a Heterogeneous Hash-Array Mapped Trie (HHAMT). Among other applications, this allows for a highly efficient multi-map encoding by (a) not reserving space for empty value sets and (b) inlining the values of singleton sets, while (c) maintaining a type-safe API. We detail the necessary encoding and optimizations to mitigate the overhead of storing and retrieving heterogeneous data in a hash-trie. Furthermore, we evaluate HHAMT specifically for the application to multi-maps, comparing it to state-of-the-art encodings of multi-maps in Java, Scala and Clojure. We isolate key differences using microbenchmarks and validate the resulting conclusions on a real-world case in static analysis. The new encoding brings the per key-value storage overhead down to 30B: a 2x improvement. With additional inlining of primitive values it reaches a 4x improvement.

arXiv.org e-Print Archive

CWI's Institutional Repository

Clear Visual Separation of Temporal Event Sequences

Authors: Grønbæk Kaj; Mathisen Andreas
Publication date: 17/10/2017

Extracting and visualizing informative insights from temporal event sequences becomes increasingly difficult when data volume and variety increase. Besides dealing with high event type cardinality and many distinct sequences, it can be difficult to tell whether it is appropriate to combine multiple events into one or utilize additional information about event attributes. Existing approaches often make use of frequent sequential patterns extracted from the dataset; however, these patterns are limited in terms of interpretability and utility. In addition, it is difficult to assess the role of absolute and relative time when using pattern mining techniques. In this paper, we present methods that address these challenges by automatically learning composite events, which enables better aggregation of multiple event sequences. By leveraging event sequence outcomes, we present appropriate linked visualizations that allow domain experts to identify critical flows, to assess validity and to understand the role of time. Furthermore, we explore information gain and visual complexity metrics to identify the most relevant visual patterns. We compare composite event learning with two approaches for extracting event patterns using real-world company event data from an ongoing project with the Danish Business Authority. Comment: In Proceedings of the 3rd IEEE Symposium on Visualization in Data Science (VDS), 201

arXiv.org e-Print Archive

Crossref

Tooth characters of protohippine horses with special reference to species from the Merychippus zone, California

Author: Bode Francis D.
Publication venue: Carnegie Institution of Washington
Publication date: 20/12/1934

The critical review of equine tooth characters attempted in this paper is the result of a study of the protohippine horses obtained from the Merychippus Zone of the north Coalinga district, California. During the conduct of extensive excavations in this zone since 1928 by the California Institute, more than two thousand teeth of the genus Merychippus have been collected. In addition to the types represented by the equine material, a number of associated land mammals have been secured. The faunal list, which includes some fifteen species, suggests that this locality occupies a stratigraphic position approximately late middle Miocene in age. The variation displayed in the dental characters of the merychippine material from the Merychippus Zone necessitated comparisons with cheek-teeth of Equidae from practically all of the Miocene formations furnishing vertebrate remains in the Pacific Coast and Great Basin Provinces. A comprehensive study of these collections clearly demonstrates that many of the cheek-tooth characters employed in the description of type specimens of fossil horses are variable to an extent which renders them unreliable in a determination of species. The variation of these characters within a large collection also indicates that it is possible for teeth referable to a particular species to have a wider stratigraphic range than has been hitherto appreciated. The conclusion is reached that the presence of a species has less value in reaching an age determination of the strata in which it occurs than evidence furnished by an association of several species.

Caltech Authors

Depth Fields: Extending Light Field Techniques to Time-of-Flight Imaging

Authors: Jayasuriya Suren; Molnar Alyosha; Pediredla Adithya; Sivaramakrishnan Sriram; Veeraraghavan Ashok
Publication date: 02/09/2015

A variety of techniques such as light field, structured illumination, and time-of-flight (TOF) are commonly used for depth acquisition in consumer imaging, robotics and many other applications. Unfortunately, each technique suffers from its own limitations, which prevent robust depth sensing. In this paper, we explore the strengths and weaknesses of combining light field and time-of-flight imaging, particularly the feasibility of an on-chip implementation as a single hybrid depth sensor. We refer to this combination as depth field imaging. Depth fields combine light field advantages such as synthetic aperture refocusing with TOF imaging advantages such as high depth resolution and coded signal processing to resolve multipath interference. We show applications including synthesizing virtual apertures for TOF imaging, improved depth mapping through partial and scattering occluders, and single-frequency TOF phase unwrapping. Utilizing space, angle, and temporal coding, depth fields can improve depth sensing in the wild and generate new insights into the dimensions of light's plenoptic function. Comment: 9 pages, 8 figures, accepted to 3DV 201

arXiv.org e-Print Archive

CiteSeerX

Crossref