325 research outputs found

    Weighted Random Walk Sampling for Multi-Relational Recommendation

    In the information-overloaded web, personalized recommender systems are essential tools to help users find the most relevant information. The most heavily used recommendation frameworks assume user interactions that are characterized by a single relation. However, for many tasks, such as recommendation in social networks, user-item interactions must be modeled as a complex network of multiple relations, not only a single relation. Recent research on multi-relational factorization and hybrid recommender models has shown that using extended meta-paths to capture additional information about both users and items in the network can enhance the accuracy of recommendations in such networks. Most of this work is focused on unweighted heterogeneous networks, and to apply these techniques, weighted relations must be simplified into binary ones. However, information associated with weighted edges, such as user ratings, which may be crucial for recommendation, is lost in such binarization. In this paper, we explore a random walk sampling method in which the frequency of edge sampling is a function of edge weight, and apply it to generate extended meta-paths in weighted heterogeneous networks. With this sampling technique, we demonstrate improved performance on multiple data sets in terms of both recommendation accuracy and model generation efficiency.
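
As a rough illustration of the sampling idea in this abstract, the sketch below draws each step of a random walk with probability proportional to edge weight. The graph layout, the node names, and the `weighted_walk` function are hypothetical and chosen only for illustration; the abstract does not specify the paper's actual sampling function or how meta-paths are assembled from the walks.

```python
import random


def weighted_walk(graph, start, length, rng=None):
    """Walk up to `length` steps, picking each edge with probability
    proportional to its weight (hypothetical helper, not the paper's code)."""
    rng = rng or random.Random()
    path = [start]
    node = start
    for _ in range(length):
        edges = graph.get(node, [])
        if not edges:
            break  # dead end: stop the walk early
        neighbors = [n for n, _ in edges]
        weights = [w for _, w in edges]
        node = rng.choices(neighbors, weights=weights, k=1)[0]
        path.append(node)
    return path


# Toy user-item network; weights stand in for ratings (assumed values).
toy_graph = {
    "user:ann": [("item:book1", 5.0), ("item:book2", 1.0)],
    "item:book1": [("user:ann", 5.0), ("user:bob", 4.0)],
    "item:book2": [("user:ann", 1.0)],
    "user:bob": [("item:book1", 4.0)],
}

print(weighted_walk(toy_graph, "user:ann", 6, random.Random(0)))
```

With these toy weights, a walk leaving `user:ann` reaches `item:book1` about five times as often as `item:book2`, so the rating information survives instead of being flattened into a binary edge.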

    Fairness in Information Access Systems

    Recommendation, information retrieval, and other information access systems pose unique challenges for investigating and applying the fairness and non-discrimination concepts that have been developed for studying other machine learning systems. While fair information access shares many commonalities with fair classification, the multistakeholder nature of information access applications, the rank-based problem setting, the centrality of personalization in many cases, and the role of user response complicate the problem of identifying precisely what types and operationalizations of fairness may be relevant, let alone measuring or promoting them. In this monograph, we present a taxonomy of the various dimensions of fair information access and survey the literature to date on this new and rapidly-growing topic. We preface this with brief introductions to information access and algorithmic fairness, to facilitate use of this work by scholars with experience in one (or neither) of these fields who wish to learn about their intersection. We conclude with several open problems in fair information access, along with some suggestions for how to approach research in this space

    Exploring Author Gender in Book Rating and Recommendation

    Collaborative filtering algorithms find useful patterns in rating and consumption data and exploit these patterns to guide users to good items. Many of the patterns in rating datasets reflect important real-world differences between the various users and items in the data; other patterns may be irrelevant or possibly undesirable for social or ethical reasons, particularly if they reflect undesired discrimination, such as gender or ethnic discrimination in publishing. In this work, we examine the response of collaborative filtering recommender algorithms to the distribution of their input data with respect to a dimension of social concern, namely content creator gender. Using publicly-available book ratings data, we measure the distribution of the genders of the authors of books in user rating profiles and recommendation lists produced from this data. We find that common collaborative filtering algorithms differ in the gender distribution of their recommendation lists, and in the relationship of that output distribution to user profile distribution

    Searching for transits in the Wide Field Camera Transit Survey with difference-imaging light curves

    The Wide Field Camera Transit Survey is a pioneer program aimed at searching for extra-solar planets in the near-infrared. The images from the survey are processed by a data reduction pipeline, which uses aperture photometry to construct the light curves. We produce an alternative set of light curves using the difference-imaging method for the most complete field in the survey and carry out a quantitative comparison between the photometric precision achieved with both methods. The results show that difference-photometry light curves present an important improvement for stars with J > 16. We report an implementation of the box-fitting transit detection algorithm, which performs a trapezoid fit to the folded light curve, providing more accurate results than the box-fitting model. We describe and optimize a set of selection criteria to search for transit candidates, including the V-shape parameter calculated by our detection algorithm. The optimized selection criteria are applied to the aperture photometry and difference-imaging light curves, resulting in the automatic detection of the best 200 transit candidates from a sample of ~475 000 sources. We carry out a detailed analysis of the 18 best detections and classify them as transiting planet and eclipsing binary candidates. We present one planet candidate orbiting a late G-type star. No planet candidate around M-stars has been found, confirming the null detection hypothesis and upper limits on the occurrence rate of short-period giant planets around M-dwarfs presented in a prior study. We extend the search for transiting planets to stars with J ≤ 18, which enables us to set a stricter upper limit of 1.1%. Furthermore, we present the detection of five faint extremely-short period eclipsing binaries and three M-dwarf/M-dwarf binary candidates. The detections demonstrate the benefits of using the difference-imaging light curves, especially when going to fainter magnitudes.

    Warp signatures of the Galactic disk as seen in mid infrared from Midcourse Space Experiment

    The gross features in the distribution of stars as well as warm (T >~ 100 K) interstellar dust in the Galactic disk have been investigated using the recent mid infrared survey by the Midcourse Space Experiment (MSX) at 8, 12, 14 & 21 micron bands. An attempt has been made to determine the location of the Galactic mid-plane at various longitudes, using two approaches: (i) fitting exponential functions to the latitude profiles and (ii) statistical indicators. The former method is successful for the inner Galaxy (-90 < l < 90), and quantifies characteristic angular scales along latitude, which have been translated to linear scale heights (z_h) and radial length scales (R_l) using a geometric description of the Galactic disk. The distribution of warm dust in the Galactic disk is found to be characterised by R_l < 6 kpc and 60 < z_h <~ 100 pc, in agreement with other studies. The location of the Galactic mid-plane as a function of longitude, for stars as well as warm dust, has been searched for signatures of a warp-like feature in their distribution, by fitting a sinusoid with phase and amplitude as parameters. In every case, the warp signature has been detected. An identical analysis of the DIRBE/COBE data in all its ten bands covering the entire infrared spectrum (1.25-240 micron) also leads to detection of warp signatures with very similar phase as found from the MSX data. Our results have been compared with those from other studies. Comment: To be published in 'Astronomy and Astrophysics' (12 pages including 9 figures & 4 tables)

    Accuracy of optical spectroscopy for the detection of cervical intraepithelial neoplasia without colposcopic tissue information; a step toward automation for low resource settings

    Optical spectroscopy has been proposed as an accurate and low-cost alternative for detection of cervical intraepithelial neoplasia. We previously published an algorithm using optical spectroscopy as an adjunct to colposcopy and found good accuracy (sensitivity = 1.00 [95% confidence interval (CI) = 0.92 to 1.00], specificity = 0.71 [95% CI = 0.62 to 0.79]). Those results used measurements taken by expert colposcopists as well as the colposcopy diagnosis. In this study, we trained and tested an algorithm for the detection of cervical intraepithelial neoplasia (i.e., identifying those patients who had histology reading CIN 2 or worse) that did not include the colposcopic diagnosis. Furthermore, we explored the interaction between spectroscopy and colposcopy, examining the importance of probe placement expertise. The colposcopic diagnosis-independent spectroscopy algorithm had a sensitivity of 0.98 (95% CI = 0.89 to 1.00) and a specificity of 0.62 (95% CI = 0.52 to 0.71). The difference in the partial area under the ROC curves between spectroscopy with and without the colposcopic diagnosis was statistically significant at the patient level (p = 0.05) but not the site level (p = 0.13). The results suggest that the device has high accuracy over a wide range of provider accuracy and hence could plausibly be implemented by providers with limited training.

    Strategies and challenges associated with recruiting retirement village communities and residents into a group exercise intervention

    Background: Randomized controlled trials (RCTs) provide the highest level of scientific evidence, but successful participant recruitment is critical to ensure the external and internal validity of results. This study describes the strategies associated with recruiting older adults at increased falls risk residing in retirement villages into an 18-month cluster RCT designed to evaluate the effects of a dual-task exercise program on falls and physical and cognitive function. Methods: Recruitment of adults aged ≥65 at increased falls risk residing within retirement villages (size 60–350 residents) was initially designed to occur over 12 months using two distinct cohorts (C). Recruitment occurred via a three-stage approach that included liaising with: 1) village operators, 2) independent village managers, and 3) residents. To recruit residents, a variety of approaches were used, including distribution of information packs, on-site presentations, free muscle and functional testing, and posters displayed in common areas. Results: Due to challenges with recruitment, three cohorts were established between February 2014 and April 2015 (14 months). Sixty retirement villages were initially invited, of which 32 declined or did not respond, leaving 28 villages that expressed interest. A total of 3947 individual letters of invitation were subsequently distributed to residents of these villages, from which 517 (13.1%) expressions of interest (EOI) were received. Across the three cohorts, which adopted different recruitment strategies, there were only modest differences in the proportion of EOI received (10.5 to 15.3%), which suggests that no particular recruitment approach was most effective. Following the initial screening of these residents, 398 (77.0%) participants were deemed eligible to participate, but a final sample of 300 (58.0% of the 517 EOI) consented and was randomized; this is 7.6% of the 3947 residents invited. Principal reasons for not participating, despite being eligible, were poor health, lack of time and no GP approval. Conclusion: This study highlights that there are significant challenges associated with recruiting sufficient numbers of older adults from independent living retirement villages into an exercise intervention designed to improve health and well-being. Trial registration: Australian New Zealand Clinical Trials Registry: ACTRN12613001161718. Date registered 23rd October 2013
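
The recruitment percentages quoted in this abstract can be re-derived from the reported counts. The short check below is only a sketch: the counts come from the abstract, while the variable names and the `pct` helper are introduced here for illustration.

```python
def pct(part, whole):
    """Percentage of `whole` represented by `part`, rounded to one decimal."""
    return round(100.0 * part / whole, 1)


# Counts as reported in the abstract.
villages_invited = 60
villages_declined = 32
letters_sent = 3947
eoi_received = 517
eligible = 398
randomized = 300

villages_interested = villages_invited - villages_declined

funnel = {
    "villages expressing interest": villages_interested,
    "EOI as % of letters sent": pct(eoi_received, letters_sent),
    "eligible as % of EOI": pct(eligible, eoi_received),
    "randomized as % of EOI": pct(randomized, eoi_received),
    "randomized as % of letters sent": pct(randomized, letters_sent),
}

for label, value in funnel.items():
    print(f"{label}: {value}")
```

Each computed value matches the figure given in the abstract (28 villages, 13.1%, 77.0%, 58.0%, and 7.6%).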

    Implications of climate change for agricultural productivity in the early twenty-first century

    This paper reviews recent literature concerning a wide range of processes through which climate change could potentially impact global-scale agricultural productivity, and presents projections of changes in relevant meteorological, hydrological and plant physiological quantities from a climate model ensemble to illustrate key areas of uncertainty. Few global-scale assessments have been carried out, and these are limited in their ability to capture the uncertainty in climate projections, and omit potentially important aspects such as extreme events and changes in pests and diseases. There is a lack of clarity on how climate change impacts on drought are best quantified from an agricultural perspective, with different metrics giving very different impressions of future risk. The dependence of some regional agriculture on remote rainfall, snowmelt and glaciers adds to the complexity. Indirect impacts via sea-level rise, storms and diseases have not been quantified. Perhaps most seriously, there is high uncertainty in the extent to which the direct effects of CO2 rise on plant physiology will interact with climate change in affecting productivity. At present, the aggregate impacts of climate change on global-scale agricultural productivity cannot be reliably quantified

    Seismic risk assessment for developing countries : Pakistan as a case study

    Modern Earthquake Risk Assessment (ERA) methods usually require seismo-tectonic information for Probabilistic Seismic Hazard Assessment (PSHA) that may not be readily available in developing countries. To bypass this drawback, this paper presents a practical event-based PSHA method that uses instrumental seismicity, available historical seismicity, as well as limited information on geology and tectonic setting. Historical seismicity is integrated with instrumental seismicity to determine the long-term hazard. The tectonic setting is included by assigning seismic source zones associated with known major faults. Monte Carlo simulations are used to generate earthquake catalogues with randomized key hazard parameters. A case study region in Pakistan is selected to demonstrate the effectiveness of the method. The results indicate that the proposed method produces seismic hazard maps consistent with previous studies, thus being suitable for generating such maps in regions where limited data are available. The PSHA procedure is developed as an integral part of an ERA framework named EQRAM. The framework is also used to determine seismic risk in terms of annual losses for the study region
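
To give a feel for what generating a synthetic earthquake catalogue by Monte Carlo simulation can look like, here is a minimal sketch. It is not the EQRAM procedure: the abstract does not say which distributions are randomized, so the Poisson event timing, the truncated Gutenberg-Richter magnitude distribution, the `simulate_catalogue` function, and every parameter value below are assumptions made purely for illustration.

```python
import math
import random


def simulate_catalogue(rate_per_year, years, b_value, m_min, m_max, rng=None):
    """Return a list of (time_in_years, magnitude) for one synthetic catalogue.

    Assumptions (not taken from the paper): events follow a Poisson process,
    and magnitudes follow a Gutenberg-Richter law truncated to [m_min, m_max].
    """
    rng = rng or random.Random()
    beta = b_value * math.log(10.0)
    span = 1.0 - math.exp(-beta * (m_max - m_min))
    events = []
    # Exponential inter-arrival times give Poisson-distributed event counts.
    t = rng.expovariate(rate_per_year)
    while t < years:
        u = rng.random()
        # Inverse-transform sample from the truncated exponential distribution.
        magnitude = m_min - math.log(1.0 - u * span) / beta
        events.append((t, magnitude))
        t += rng.expovariate(rate_per_year)
    return events


# Hypothetical source zone: 2 events/year above M4 over 50 years, b = 1.0.
catalogue = simulate_catalogue(2.0, 50.0, 1.0, 4.0, 8.0, random.Random(1))
print(len(catalogue), max(m for _, m in catalogue))
```

Repeating the simulation many times, one catalogue per run, is what would let a hazard or loss estimate be averaged over the randomized parameters.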