Relation Discovery from Web Data for Competency Management
This paper describes a technique for automatically discovering associations between people and expertise from an analysis of very large data sources (including web pages, blogs and emails), using a family of algorithms that perform accurate named-entity recognition, assign different weights to terms according to an analysis of document structure, and assess distances between terms in a document. My contribution is to add a social networking approach called BuddyFinder which relies on associations within a large enterprise-wide "buddy list" to help delimit the search space and also to provide a form of 'social triangulation' whereby the system can discover documents from your colleagues that contain pertinent information about you. This work has been influential in the information retrieval community generally, as it is the basis of a landmark system that achieved overall first place in every category in the Enterprise Search Track of TREC2006.
A systematic literature review
Albuquerque, V., Dias, M. S., & Bacao, F. (2021). Machine learning approaches to bike-sharing systems: A systematic literature review. ISPRS International Journal of Geo-Information, 10(2), 1-25. [62]. https://doi.org/10.3390/ijgi10020062
Cities are moving towards new mobility strategies to tackle smart cities’ challenges such as carbon emission reduction, urban transport multimodality and mitigation of pandemic hazards, emphasising the implementation of shared modes, such as bike-sharing systems. This paper poses a research question and introduces a corresponding systematic literature review, focusing on machine learning techniques’ contributions applied to bike-sharing systems to improve cities’ mobility. The preferred reporting items for systematic reviews and meta-analyses (PRISMA) method was adopted to identify specific factors that influence bike-sharing systems, resulting in an analysis of 35 papers published between 2015 and 2019, creating an outline for future research. By means of systematic literature review and bibliometric analysis, machine learning algorithms were identified in two groups: classification and prediction.
Social Search with Missing Data: Which Ranking Algorithm?
Online social networking tools are extremely popular, but can miss potential discoveries latent in the social 'fabric'. Matchmaking services which can do naive profile matching with old database technology are too brittle in the absence of key data, and even modern ontological markup, though powerful, can be onerous at data-input time. In this paper, we present a system called BuddyFinder which can automatically identify buddies who can best match a user's search requirements specified in a term-based query, even in the absence of stored user-profiles. We deploy and compare five statistical measures, namely, our own CORDER, mutual information (MI), phi-squared, improved MI and Z score, and two TF/IDF based baseline methods to find online users who best match the search requirements based on 'inferred profiles' of these users in the form of scavenged web pages. These measures identify statistically significant relationships between online users and a term-based query. Our user evaluation on two groups of users shows that BuddyFinder can find users highly relevant to search queries, and that CORDER achieved the best average ranking correlations among all seven algorithms and improved the performance of both baseline methods.
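To make the idea of a statistical association measure concrete, here is a minimal sketch of three textbook scores computed from document co-occurrence counts: pointwise mutual information, phi-squared and a Z score under a binomial approximation. The function names, the count-based inputs and the exact formulas are assumptions for illustration; the abstract does not give the paper's definitions, and CORDER and "improved MI" are not reproduced here.

```python
import math


def association_scores(n_both, n_user, n_term, n_docs):
    """Score a user-term association from document counts (illustrative only).

    n_both: documents mentioning both the user and the query term
    n_user: documents mentioning the user
    n_term: documents mentioning the query term
    n_docs: total number of documents
    """
    # Cells of the 2x2 contingency table.
    a = n_both                            # user and term
    b = n_user - n_both                   # user only
    c = n_term - n_both                   # term only
    d = n_docs - n_user - n_term + n_both # neither

    # Co-occurrence count expected if user and term were independent.
    expected = n_user * n_term / n_docs

    # Pointwise mutual information (log ratio of observed to expected).
    pmi = math.log2(a / expected) if a > 0 and expected > 0 else float("-inf")

    # Phi-squared: squared phi coefficient of the 2x2 table.
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    phi2 = (a * d - b * c) ** 2 / denom if denom else 0.0

    # Z score: deviation of the observed count from its expectation,
    # treating the user's documents as n_user binomial trials.
    p = n_term / n_docs
    variance = n_user * p * (1 - p)
    z = (a - expected) / math.sqrt(variance) if variance > 0 else 0.0

    return {"pmi": pmi, "phi2": phi2, "z": z}


def rank_users(user_counts, n_term, n_docs, measure="phi2"):
    """Rank users by one measure; user_counts maps user -> (n_both, n_user)."""
    scored = {
        user: association_scores(n_both, n_user, n_term, n_docs)[measure]
        for user, (n_both, n_user) in user_counts.items()
    }
    return sorted(scored, key=scored.get, reverse=True)
```

All three scores are zero when the observed co-occurrence equals the count expected under independence, which is what lets them separate statistically significant user-term links from chance mentions.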
Negative Statements Considered Useful
Knowledge bases (KBs), pragmatic collections of knowledge about notable entities, are an important asset in applications such as search, question answering and dialogue. Rooted in a long tradition in knowledge representation, all popular KBs only store positive information, while they abstain from taking any stance towards statements not contained in them. In this paper, we make the case for explicitly stating interesting statements which are not true. Negative statements would be important to overcome current limitations of question answering, yet due to their potential abundance, any effort towards compiling them needs a tight coupling with ranking. We introduce two approaches towards compiling negative statements. (i) In peer-based statistical inferences, we compare entities with highly related entities in order to derive potential negative statements, which we then rank using supervised and unsupervised features. (ii) In query-log-based text extraction, we use a pattern-based approach for harvesting search engine query logs. Experimental results show that both approaches hold promising and complementary potential. Along with this paper, we publish the first datasets on interesting negative information, containing over 1.1M statements for 100K popular Wikidata entities.
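The peer-based approach (i) can be pictured with a small sketch: statements that an entity's peers hold but the entity itself lacks become candidate negatives, ranked here by the fraction of peers that hold them. This is an assumption-laden illustration, not the paper's method. The function name, the dictionary-of-sets KB layout and the single frequency score are hypothetical, and the paper describes ranking with richer supervised and unsupervised features.

```python
from collections import Counter


def candidate_negatives(entity, peers, kb):
    """Return candidate negative statements for `entity`, ranked by peer frequency.

    kb maps an entity name to a set of (property, value) statements
    (a hypothetical layout chosen for this sketch).
    """
    own = kb.get(entity, set())
    counts = Counter()
    for peer in peers:
        # Statements the peer has that the target entity does not.
        for statement in kb.get(peer, set()) - own:
            counts[statement] += 1

    # Most frequent first; ties broken alphabetically for determinism.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    n_peers = len(peers)
    return [(statement, count / n_peers) for statement, count in ranked]
```

With a toy KB in which two peers both hold an award that the target entity lacks, that award surfaces as the top candidate with a score of 1.0. Absence from the KB does not by itself prove a statement false, so such candidates would still need checking.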
Social Network Based Substance Abuse Prevention via Network Modification (A Preliminary Study)
Substance use and abuse is a significant public health problem in the United
States. Group-based intervention programs offer a promising means of preventing
and reducing substance abuse. While effective, unfortunately, inappropriate
intervention groups can result in an increase in deviant behaviors among
participants, a process known as deviancy training. This paper investigates the
problem of optimizing the social influence related to the deviant behavior via
careful construction of the intervention groups. We propose a Mixed Integer
Optimization formulation that decides on the intervention groups, captures the
impact of the groups on the structure of the social network, and models the
impact of these changes on behavior propagation. In addition, we propose a
scalable hybrid meta-heuristic algorithm that combines Mixed Integer
Programming and Large Neighborhood Search to find near-optimal network
partitions. Our algorithm is packaged in the form of GUIDE, an AI-based
decision aid that recommends intervention groups. Being the first quantitative
decision aid of this kind, GUIDE is able to assist practitioners, in particular
social workers, in three key areas: (a) GUIDE proposes near-optimal solutions
that are shown, via extensive simulations, to significantly improve over the
traditional qualitative practices for forming intervention groups; (b) GUIDE is
able to identify circumstances when an intervention will lead to deviancy
training, thus saving time, money, and effort; (c) GUIDE can evaluate current
strategies of group formation and discard strategies that will lead to deviancy
training. In developing GUIDE, we are primarily interested in substance use
interventions among homeless youth as a high risk and vulnerable population.
GUIDE is developed in collaboration with Urban Peak, a homeless-youth serving
organization in Denver, CO, and is under preparation for deployment.
Bibliometric Analysis of Bioscience Trends Journal (2007-2017): Knowledge dynamics and visualization
BioScience Trends (BST) is a peer-reviewed journal belonging to the International Research and Cooperation Association for Bio & Socio-Sciences Advancement (IRCA-BSSA) Group of Japan. Despite a decade of existence, no study has measured the bibliometric profile of the journal. The objective of this study was to investigate the bibliometric characteristics of BST. The analysis specifically measures: 1) the growth rate of scientific publications, 2) the dynamics of authorship and collaboration patterns, 3) the core research themes of published articles, and 4) the citation pattern of BST. Bibliographical archives of BST were obtained from the Core Collection database of the Web of Science (WoS). We divided the dataset into three periods: 2007-2010, 2011-2014 and 2015-2017. Data processing and analysis were performed using Bibliometrix, a bibliometric analysis package in R software, VOSViewer 1.66, Orange 3.15 and CitNetExplorer. Within one decade of scientific production, BST continues to attract global researchers in life sciences. However, it is still dominated by authors from China and Japan. The annual growth rate of BST is 12.83%. By the end of the first decade, the number of first authors and the number of countries of origin had multiplied 20 and 5 times, respectively, compared to the first year. Research themes are consistent with the Aims and Scope of the journal, with a strong emphasis on molecular biology, biochemistry, and clinical research. Entering the second decade, strategies to promote and enlarge author participation from countries that are not in the current list are encouraged.