243 research outputs found
A Coherent Unsupervised Model for Toponym Resolution
Toponym Resolution, the task of assigning a location mention in a document to
a geographic referent (i.e., latitude/longitude), plays a pivotal role in
analyzing location-aware content. However, the ambiguities of natural language
and the huge number of possible interpretations for toponyms pose significant
hurdles for this task. In this paper, we study the problem of
toponym resolution with no additional information other than a gazetteer and no
training data. We demonstrate that the dearth of sufficiently large annotated
data makes supervised methods less capable of generalizing. Our proposed method
estimates the geographic scope of documents and leverages the connections
between nearby place names as evidence to resolve toponyms. We explore the
interactions between multiple interpretations of mentions and the relationships
between different toponyms in a document to build a model that finds the most
coherent resolution. Our model is evaluated on three news corpora, two from the
literature and one collected and annotated by us; then, we compare our methods
to the state-of-the-art unsupervised and supervised techniques. We also examine
three commercial products including Reuters OpenCalais, Yahoo! YQL Placemaker,
and Google Cloud Natural Language API. The evaluation shows that our method
outperforms the unsupervised technique as well as Reuters OpenCalais and Google
Cloud Natural Language API on all three corpora; also, our method shows a
performance close to that of the state-of-the-art supervised method and
outperforms it when the test data has 40% or more toponyms that are not seen in
the training data.
Comment: 9 pages (+1 page of references), WWW '18: Proceedings of the 2018 World Wide Web Conference.
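The paper's coherence idea can be illustrated with a minimal sketch: for every toponym, pick the gazetteer candidate that makes the chosen referents geographically closest to one another. The toy gazetteer and the brute-force search below are our own illustrative assumptions, not the authors' model.

```python
from itertools import product
from math import radians, sin, cos, asin, sqrt

# Toy gazetteer (illustrative, not the paper's): each surface form maps to
# candidate (lat, lon) referents.
GAZETTEER = {
    "Paris":  [(48.86, 2.35), (33.66, -95.56)],   # Paris, France vs. Paris, Texas
    "London": [(51.51, -0.13), (42.98, -81.25)],  # London, UK vs. London, Ontario
    "Berlin": [(52.52, 13.41), (39.79, -74.93)],  # Berlin, Germany vs. Berlin, NJ
}

def haversine_km(a, b):
    """Great-circle distance in kilometres between two (lat, lon) points."""
    lat1, lon1, lat2, lon2 = map(radians, (*a, *b))
    h = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371 * asin(sqrt(h))

def resolve(toponyms):
    """Brute-force coherent resolution: the joint assignment of referents that
    minimises the sum of pairwise distances between the chosen locations."""
    candidates = [GAZETTEER[t] for t in toponyms]
    best, best_cost = None, float("inf")
    for combo in product(*candidates):
        cost = sum(haversine_km(p, q)
                   for i, p in enumerate(combo) for q in combo[i + 1:])
        if cost < best_cost:
            best, best_cost = combo, cost
    return dict(zip(toponyms, best))

print(resolve(["Paris", "London", "Berlin"]))  # all three resolve to Europe
```

A document mentioning Paris, London, and Berlin together thus resolves all three to their European referents, since any mixed assignment incurs large transatlantic distances.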
Examining Scientific Writing Styles from the Perspective of Linguistic Complexity
Publishing articles in high-impact English journals is difficult for scholars
around the world, especially for non-native English-speaking scholars (NNESs),
most of whom struggle with proficiency in English. In order to uncover the
differences in English scientific writing between native English-speaking
scholars (NESs) and NNESs, we collected a large-scale data set containing more
than 150,000 full-text articles published in PLoS between 2006 and 2015. We
divided these articles into three groups according to the ethnic backgrounds of
the first and corresponding authors, obtained by Ethnea, and examined the
scientific writing styles in English from a two-fold perspective of linguistic
complexity: (1) syntactic complexity, including measurements of sentence length
and sentence complexity; and (2) lexical complexity, including measurements of
lexical diversity, lexical density, and lexical sophistication. The
observations suggest marginal differences between groups in syntactic and
lexical complexity.
Comment: 6 figures.
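The measures named above are standard and easy to sketch. The helper names, the toy function-word list, and the sample sentence below are ours, not the paper's code; a real study would use a POS tagger and proper sentence segmentation.

```python
import re

# Toy closed-class word list for a lexical-density estimate; a real study
# would use a POS tagger or a full stopword list.
FUNCTION_WORDS = {"the", "a", "an", "of", "in", "on", "and", "or", "to",
                  "is", "are", "was", "were", "it", "that", "this", "with", "for"}

def tokenize(text):
    return re.findall(r"[a-z]+", text.lower())

def type_token_ratio(tokens):
    """Lexical diversity: distinct word types over total tokens."""
    return len(set(tokens)) / len(tokens)

def lexical_density(tokens):
    """Share of content (non-function) words among all tokens."""
    return sum(t not in FUNCTION_WORDS for t in tokens) / len(tokens)

def mean_sentence_length(text):
    """Syntactic complexity proxy: mean number of tokens per sentence."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    return sum(len(tokenize(s)) for s in sentences) / len(sentences)

sample = "The cell divides. The division of the cell is rapid and continuous."
tokens = tokenize(sample)
print(type_token_ratio(tokens))      # 0.75
print(lexical_density(tokens))       # 0.5
print(mean_sentence_length(sample))  # 6.0
```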
A Google Trends spatial clustering approach for a worldwide Twitter user geolocation
User location data is valuable for diverse social media analytics. In this paper, we address the non-trivial task of estimating a worldwide city-level Twitter user location considering only historical tweets. We propose a purely unsupervised approach that is based on a synthetic geographic sampling of Google Trends (GT) city-level frequencies of tweet nouns and three clustering algorithms. The approach was validated empirically by using a recently collected dataset, with 3,268 worldwide city-level locations of Twitter users, obtaining competitive results when compared with a state-of-the-art Word Distribution (WD) user location estimation method. The best overall results were achieved by the GT noun DBSCAN (GTN-DB) method, which is computationally fast, and correctly predicts the ground truth locations of 15%, 23%, 39% and 58% of the users for tolerance distances of 250 km, 500 km, 1,000 km and 2,000 km.
The work of P. Cortez was supported by FCT - Fundação para a Ciência e Tecnologia within the R&D Units Project Scope: UIDB/00319/2020. We would also like to thank the anonymous reviewers for their helpful suggestions.
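A rough sketch of the GTN-DB idea, under our own simplifying assumptions (Euclidean distance in degrees instead of geodesic distance, a toy two-city frequency table, and a minimal hand-rolled DBSCAN): sample jittered points around each candidate city in proportion to its frequency, cluster them, and predict the centroid of the largest cluster.

```python
import random

def dbscan(points, eps, min_pts):
    """Minimal DBSCAN over 2-D points; returns one cluster label per point (-1 = noise)."""
    labels = [None] * len(points)

    def neighbours(i):
        x, y = points[i]
        return [j for j, (px, py) in enumerate(points)
                if (x - px) ** 2 + (y - py) ** 2 <= eps ** 2]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        nb = neighbours(i)
        if len(nb) < min_pts:
            labels[i] = -1  # noise (may later become a border point)
            continue
        cluster += 1
        labels[i] = cluster
        seeds = list(nb)
        while seeds:
            j = seeds.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise absorbed as a border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nb_j = neighbours(j)
            if len(nb_j) >= min_pts:  # j is a core point: keep expanding
                seeds.extend(nb_j)
    return labels

def estimate_location(city_freqs, eps=2.0, min_pts=5, samples=300, seed=0):
    """Synthetic geographic sampling + DBSCAN: draw jittered points around each
    city in proportion to its frequency, cluster them, and return the centroid
    of the largest cluster as the predicted user location."""
    rng = random.Random(seed)
    coords = list(city_freqs)
    weights = [city_freqs[c] for c in coords]
    pts = []
    for _ in range(samples):
        lat, lon = rng.choices(coords, weights)[0]
        pts.append((lat + rng.gauss(0, 0.3), lon + rng.gauss(0, 0.3)))
    labels = dbscan(pts, eps, min_pts)
    clusters = {}
    for p, lab in zip(pts, labels):
        if lab >= 0:
            clusters.setdefault(lab, []).append(p)
    biggest = max(clusters.values(), key=len)
    return (sum(p[0] for p in biggest) / len(biggest),
            sum(p[1] for p in biggest) / len(biggest))

# Hypothetical GT noun frequencies: a user tweets mostly New-York-related nouns.
freqs = {(40.71, -74.01): 8, (51.51, -0.13): 2}
print(estimate_location(freqs))  # centroid near New York City
```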
A Survey on Cross-domain Recommendation: Taxonomies, Methods, and Future Directions
Traditional recommendation systems are faced with two long-standing
obstacles, namely, data sparsity and cold-start problems, which promote the
emergence and development of Cross-Domain Recommendation (CDR). The core idea
of CDR is to leverage information collected from other domains to alleviate the
two problems in one domain. Over the last decade, much effort has been
devoted to cross-domain recommendation. Recently, with the development of deep
learning and neural networks, a large number of methods have emerged. However,
there is a limited number of systematic surveys on CDR, especially regarding
the latest proposed methods as well as the recommendation scenarios and
recommendation tasks they address. In this survey paper, we first propose a
two-level taxonomy of cross-domain recommendation which classifies different
recommendation scenarios and recommendation tasks. We then introduce and
summarize existing cross-domain recommendation approaches under different
recommendation scenarios in a structured manner. We also organize datasets
commonly used. We conclude this survey by providing several potential research
directions for this field.
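One family of CDR methods such surveys cover, mapping-based transfer (in the spirit of EMCDR), can be sketched briefly. The toy vectors, helper names, and per-dimension linear map below are our own illustrative choices, not a method from the survey:

```python
# Mapping-based cross-domain transfer (a common CDR pattern, sketched with toy
# data): fit a per-dimension linear map from source-domain user vectors to
# target-domain vectors on overlapping users, then apply it to a user who is
# cold-start in the target domain. All vectors and names are illustrative.

def fit_linear(xs, ys):
    """Ordinary least squares for y = a*x + b on scalar lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    var = sum((x - mx) ** 2 for x in xs)
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / var
    return a, my - a * mx

def fit_mapping(src, tgt, overlap):
    """One (a, b) pair per latent dimension, fitted on the overlapping users."""
    dims = len(next(iter(src.values())))
    return [fit_linear([src[u][d] for u in overlap],
                       [tgt[u][d] for u in overlap]) for d in range(dims)]

def transfer(mapping, vec):
    """Map a source-domain user vector into the target domain."""
    return [a * x + b for (a, b), x in zip(mapping, vec)]

# Toy 2-D user vectors in a source (e.g. books) and a target (e.g. movies) domain.
src = {"u1": [1.0, 0.0], "u2": [0.0, 1.0], "u3": [1.0, 1.0], "u4": [0.9, 0.1]}
tgt = {"u1": [2.0, 0.5], "u2": [0.0, 1.5], "u3": [2.0, 1.5]}  # u4 is cold-start

mapping = fit_mapping(src, tgt, overlap=["u1", "u2", "u3"])
print(transfer(mapping, src["u4"]))  # estimated target-domain vector for u4
```

The transferred vector can then seed any target-domain recommender for the cold-start user, which is exactly the sparsity/cold-start alleviation the abstract describes.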
Luck of the Draw III: Using AI to Examine Decision-Making in Federal Court Stays of Removal
This article examines decision-making in Federal Court of Canada immigration law applications for stays of removal, focusing on how the rates at which stays are granted depend on which judge decides the case. The article deploys a form of computational natural language processing, using a large language model machine learning process (GPT-3) to extract data from online Federal Court dockets. The article reviews patterns in outcomes in thousands of stay of removal applications identified through this process and reveals a wide range in stay grant rates across many judges. The article argues that the Federal Court should take measures to encourage more consistency in stay decision-making and cautions against relying heavily on stays of removal to ensure that deportation complies with constitutional procedural justice protections. The article is also a demonstration of how machine learning can be used to pursue empirical legal research projects that would have been cost-prohibitive or technically challenging only a few years ago, and shows how technology that is increasingly used to enhance the power of the state at the expense of marginalized migrants can instead be used to scrutinize legal decision-making in the immigration law field, hopefully in ways that enhance the rights of migrants. The article also contributes to the broader field of computational legal research in Canada by making available to other non-commercial researchers the code used for the project, as well as a large dataset of Federal Court dockets.
Designing semantic Application Programming Interfaces for open government data
Many countries currently maintain a national data catalog, which provides access to the
available datasets, sometimes via an Application Programming Interface (API). These APIs play a
crucial role in realizing the benefits of open data as they are the means by which data is
discovered and accessed by applications that make use of it. This article proposes semantic APIs
as a way of improving access to open data. A semantic API helps to retrieve datasets according to
their type (e.g., sensor, climate, finance), and facilitates reasoning about and learning from
data. The article examines categories of open datasets from 40 European open data catalogs to
gather some insights into types of datasets which should be considered while building semantic
APIs for open government data. The results show that the probability of inter-country agreement
between open data catalogs is less than 30 percent, and that few categories stand out as
candidates for a transnational semantic API. They stress the need for coordination, at the
local, regional, and national levels, between the data providers of Germany, France, Spain,
and the United Kingdom.
The authors gratefully acknowledge funding from the European Union through the GEO-C project
(H2020-MSCA-ITN-2014, Grant Agreement Number 642332, http://www.geoc.eu/). Carlos Granell has
been funded by the Ramón y Cajal Programme (grant number RYC-2014-16913). Sergio Trilles has
been funded by the postdoctoral programme Vali+d (GVA) (grant number APOSTD/2016/058).
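What a semantic API over such catalogs might look like can be sketched with a toy example. The catalog entries, category names, and function signatures below are hypothetical illustrations, not the article's actual API:

```python
# Hypothetical in-memory catalog and endpoint names; a real semantic API would
# sit on top of a national open-data portal and a shared dataset-type vocabulary.
CATALOG = [
    {"id": "de-001", "country": "DE", "type": "sensor",  "title": "Air quality stations"},
    {"id": "fr-014", "country": "FR", "type": "climate", "title": "Monthly rainfall"},
    {"id": "es-102", "country": "ES", "type": "finance", "title": "Municipal budgets"},
    {"id": "uk-077", "country": "UK", "type": "sensor",  "title": "River level gauges"},
]

def get_datasets(dataset_type, country=None):
    """Retrieve datasets by semantic type, optionally filtered by country."""
    return [d for d in CATALOG
            if d["type"] == dataset_type
            and (country is None or d["country"] == country)]

def agreement(categories_a, categories_b):
    """Jaccard overlap between the category sets of two national catalogs,
    a simple way to quantify inter-country (dis)agreement."""
    a, b = set(categories_a), set(categories_b)
    return len(a & b) / len(a | b)

print([d["id"] for d in get_datasets("sensor")])  # ['de-001', 'uk-077']
print(agreement({"sensor", "climate", "health"}, {"sensor", "transport"}))  # 0.25
```

Retrieval by type rather than by free-text title is what makes the API "semantic": a query for sensor data works the same against every national catalog that adopts the shared vocabulary.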
Educational Technology and Related Education Conferences for June to December 2011
This potpourri of educational technology conferences includes gems such as "Saving Your Organisation from Boring eLearning" and "Lessons and Insights from Ten eLearning Masters". And, if you wish, you can "Be an Open Learning Hero". You will also find that the number of mobile learning conferences (and conferences that have a mobile learning component) has increased significantly. Countries such as China, Indonesia, Japan, and Thailand have shown a keen interest in mobile learning.
It would be impossible for you to be present at all the conferences that you would like to attend. But, you could go to the conference website/url during and after the conference. Many conference organizers post abstracts, full papers, and/or videos of conference presentations. Thus, you can visit the conference virtually and may encounter information and contacts that would be useful in your work.
The list below covers selected events focused primarily on the use of technology in educational settings and on teaching, learning, and educational administration. Only listings until December 2011 are complete as dates, locations, or URLs are not available for a number of events held after December 2011. But, take a look at the conference organizers who planned ahead in 2012.
A Word 2003 format is used to enable people who do not have access to Word 2007 or a higher version, and those with limited or high-cost Internet access, to find a conference that is congruent with their interests or obtain conference proceedings. (If you are seeking a more interactive listing, refer to online conference sites.) Consider using the "Find" tool under Microsoft Word's "Edit" tab or a similar tab in OpenOffice to locate the name of a particular conference, association, city, or country. If you enter the country "Australia" or "Singapore" in the "Find" tool, all conferences that occur in Australia or Singapore will be highlighted. Or, enter the word "research". Then, "cut and paste" a list of suitable events for yourself and your colleagues.
Please note that events, dates, titles, and locations may change; thus, CHECK the specific conference website. Note also that some events will be cancelled at a later date. All Internet addresses were verified at the time of publication. No liability is assumed for any errors that may have been introduced inadvertently during the assembly of this conference list. If possible, do not remove the contact information when you re-distribute the list, as that is how I receive updates and corrections. If you mount the list on the web, please note its source.
Exploring attributes, sequences, and time in Recommender Systems: From classical to Point-of-Interest recommendation
Unpublished doctoral thesis, defended at the Universidad Autónoma de Madrid, Escuela Politécnica Superior, Departamento de Ingeniería Informática. Defense date: 08-07-2021.
Since the emergence of the Internet and the spread of digital communications
throughout the world, the amount of data stored on the Web has been
growing exponentially. In this new digital era, a large number of companies
have emerged with the purpose of filtering the information available on the
web and providing users with interesting items. The algorithms and models
used to recommend these items are called Recommender Systems. These
systems are applied to a large number of domains, from music, books, or
movies to dating or Point-of-Interest (POI), which is an increasingly popular
domain where users receive recommendations of different places when
they arrive in a city.
In this thesis, we focus on exploiting the use of contextual information, especially
temporal and sequential data, and apply it in novel ways in both
traditional and Point-of-Interest recommendation. We believe that this type
of information can be used not only for creating new recommendation models
but also for developing new metrics for analyzing the quality of these
recommendations. In one of our first contributions we propose different
metrics, some of them derived from previously existing frameworks, using
this contextual information. Besides, we also propose an intuitive algorithm
that is able to provide recommendations to a target user by exploiting the
last common interactions with other similar users of the system.
At the same time, we conduct a comprehensive review of the algorithms
that have been proposed in the area of POI recommendation between 2011
and 2019, identifying the common characteristics and methodologies used.
Once this classification of the algorithms proposed to date is completed, we
design a mechanism to recommend complete routes (not only independent
POIs) to users, making use of reranking techniques. In addition, due to the
great difficulty of making recommendations in the POI domain, we propose
the use of data aggregation techniques to use information from different
cities to generate POI recommendations in a given target city.
In the experimental work we present our approaches on different datasets
belonging to both classical and POI recommendation. The results obtained
in these experiments confirm the usefulness of our recommendation proposals,
in terms of ranking accuracy and other dimensions like novelty, diversity,
and coverage, and the appropriateness of our metrics for analyzing temporal
information and biases in the recommendations produced.
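The "intuitive algorithm" the abstract mentions, recommending to a target user from what similar users consumed after their interactions shared with the target, can be sketched roughly as follows. The similarity weighting, parameter names, and toy check-in histories are our own assumptions, not the thesis's exact formulation:

```python
# A rough sketch (our own formulation) of recommending to a target user from
# what similar users consumed *after* the interactions they share with the
# target; similarity is simply the count of shared recent items.

def recommend(histories, target, k_recent=2, top_n=2):
    """histories: user -> chronologically ordered list of item ids."""
    recent = set(histories[target][-k_recent:])
    seen = set(histories[target])
    scores = {}
    for user, seq in histories.items():
        if user == target:
            continue
        sim = len(recent & set(seq))  # overlap with the target's recent items
        if sim == 0:
            continue
        # Position of this neighbour's last interaction shared with the target.
        last_shared = max(i for i, item in enumerate(seq) if item in recent)
        for item in seq[last_shared + 1:]:  # what the neighbour did next
            if item not in seen:
                scores[item] = scores.get(item, 0) + sim
    return sorted(scores, key=scores.get, reverse=True)[:top_n]

# Toy POI check-in histories; "alice" is the target user.
histories = {
    "alice": ["louvre", "orsay"],
    "bob":   ["louvre", "orsay", "eiffel", "montmartre"],
    "carol": ["orsay", "pantheon"],
    "dave":  ["brandenburg", "reichstag"],  # no overlap with alice
}
print(recommend(histories, "alice"))
```

Here "bob" shares both of alice's recent check-ins, so what he visited next outranks the suggestion from the weaker neighbour "carol".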
Making Sense of Document Collections with Map-Based Visualizations
As map-based visualizations of documents become more ubiquitous, there is a greater need for them to support intellectual and creative high-level cognitive activities with collections of non-cartographic materials -- documents. This dissertation concerns the conceptualization of map-based visualizations as tools for sensemaking and collection understanding. As such, map-based visualizations would help people use georeferenced documents to develop understanding, gain insight, discover knowledge, and construct meaning. This dissertation explores the role of graphical representations (such as maps, Kohonen maps, pie charts, and others) and interactions with them for developing map-based visualizations capable of facilitating sensemaking activities such as collection understanding. While graphical representations make document collections more perceptually and cognitively accessible, interactions allow users to adapt representations to their contextual needs. By interacting with representations of documents or collections and being able to construct representations of their own, people are better able to make sense of information, comprehend complex structures, and integrate new information into their existing mental models. In sum, representations and interactions may reduce cognitive load and consequently expedite the overall time necessary for completion of sensemaking activities, which typically take much time to accomplish. The dissertation proceeds in three phases. The first phase develops a conceptual framework for translating ontological properties of collections to representations and for supporting visual tasks by means of graphical representations. The second phase concerns the cognitive benefits of interaction. It conceptualizes how interactions can help people during complex sensemaking activities.
Although the interactions are explained using the example of a prototype built with Google Maps, they are independent of Google Maps and applicable to various other technologies. The third phase evaluates the utility, analytical capabilities, and usability of the additional representations when users interact with a visualization prototype, VIsual COLlection EXplorer. The findings suggest that additional representations can enhance understanding of map-based visualizations of library collections: specifically, they can allow users to see trends, gaps, and patterns in ontological properties of collections.
Languages of games and play: A systematic mapping study
Digital games are a powerful means for creating enticing, beautiful, educational, and often highly addictive interactive experiences that impact the lives of billions of players worldwide. We explore what informs the design and construction of good games to learn how to speed up game development. In particular, we study to what extent languages, notations, patterns, and tools can offer experts the theoretical foundations, systematic techniques, and practical solutions they need to raise their productivity and improve the quality of games and play. Despite the growing number of publications on this topic, there is currently no overview describing the state of the art that relates research areas, goals, and applications. As a result, efforts and successes are often one-off, lessons learned go overlooked, language reuse remains minimal, and opportunities for collaboration and synergy are lost. We present a systematic map that identifies relevant publications and gives an overview of research areas and publication venues. In addition, we categorize research perspectives along common objectives, techniques, and approaches, illustrated by summaries of selected languages. Finally, we distill challenges and opportunities for future research and development.
- …