1,793 research outputs found
Semantic modelling of user interests based on cross-folksonomy analysis
The continued increase in Web usage, in particular participation in folksonomies, reveals a trend towards a more dynamic and interactive Web where individuals can organise and share resources. Tagging has emerged as the de-facto standard for the organisation of such resources, providing a versatile and reactive knowledge management mechanism that users find easy to use and understand. It is common nowadays for users to have multiple profiles in various folksonomies, thus distributing their tagging activities. In this paper, we present a method for the automatic consolidation of user profiles across two popular social networking sites, and subsequent semantic modelling of their interests utilising Wikipedia as a multi-domain model. We evaluate how much can be learned from such sites, and in which domains the knowledge acquired is focussed. Results show that far richer interest profiles can be generated for users when multiple tag-clouds are combine
Online visualization of bibliography Using Visualization Techniques
Visualization represents raw data as graphs, images, charts, and similar forms, helping the end user correlate data elements and understand the relationships between them on a single screen. Representing the bibliographic information of computer science journals and proceedings with visualization techniques would let a user choose a particular author and navigate through the hierarchy to find out what papers the author has published, the keywords of those papers, which papers cite them, the co-authors, how many papers the selected author has published, and so on, all on a single page. This information is currently scattered, and the user has to search websites such as Google Scholar [1] and CiteSeer [2] to get these bibliographic records. With visualization techniques, all the information can be accessed on a single page as a graph of points, where the user can search for a particular author, and the author and their co-authors are represented as points. The goal of this project is to enhance current bibliography web services with an intuitive interactive visualization interface and to improve user understanding and conceptualization. In this project, we develop a simple web interface that takes a search query from the user and finds related information such as the author's name, the co-authors, the number of papers published by the author, related keywords, citations, etc. The project uses the bibliographic records available as XML files from the CiteSeer database [2], extracts the data into a database, and then queries the database for results using a web service. The extracted data is then presented visually so that the user can conceptualize the results better and find articles of interest with ease.
In addition, the user can interactively navigate the visual results to get more information about any article or author displayed. We present both a paper-centric view and an author-centric view to the user by representing the data as graphs. The nodes in the graphs for both views are color coded based on the paper's weight parameter (the popularity of the paper). For the paper-centric view, papers that refer to other papers are represented by a directed arrow from referred paper to referenced paper. Overall, the idea was to represent this related data in the form of a tree, so that the user can correlate all the data and see the relationships between them
Estimating and Sampling Graphs with Multidimensional Random Walks
Estimating characteristics of large graphs via sampling is a vital part of
the study of complex networks. Current sampling methods such as (independent)
random vertex and random walks are useful but have drawbacks. Random vertex
sampling may require too many resources (time, bandwidth, or money). Random
walks, which normally require fewer resources per sample, can suffer from large
estimation errors in the presence of disconnected or loosely connected graphs.
In this work we propose a new m-dimensional random walk that uses m
dependent random walkers. We show that the proposed sampling method, which we
call Frontier sampling, exhibits all of the nice sampling properties of a
regular random walk. At the same time, our simulations over large real world
graphs show that, in the presence of disconnected or loosely connected
components, Frontier sampling exhibits lower estimation errors than regular
random walks. We also show that Frontier sampling is more suitable than random
vertex sampling to sample the tail of the degree distribution of the graph
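
The abstract above does not spell out the sampling procedure, so the following is only a minimal sketch of one common reading of Frontier sampling: keep a list of walkers, pick one with probability proportional to its degree, move it along a uniformly random incident edge, and record that edge. The function name `frontier_sampling`, the adjacency-dict input format, and the parameters `m`, `steps`, and `seed` are illustrative assumptions, not taken from the listing.

```python
import random


def frontier_sampling(adj, m, steps, seed=None):
    """Sketch of a multi-walker (Frontier-style) edge sampler.

    adj   : dict mapping each node to a list of its neighbours
            (assumed undirected, every node with degree >= 1)
    m     : number of dependent walkers (assumed parameter name)
    steps : number of edges to sample
    """
    rng = random.Random(seed)
    nodes = list(adj)
    # Start the m walkers at uniformly chosen nodes.
    walkers = [rng.choice(nodes) for _ in range(m)]
    sampled_edges = []
    for _ in range(steps):
        # Choose a walker with probability proportional to its node's degree.
        weights = [len(adj[u]) for u in walkers]
        i = rng.choices(range(m), weights=weights, k=1)[0]
        u = walkers[i]
        # Move that walker along a uniformly random incident edge.
        v = rng.choice(adj[u])
        sampled_edges.append((u, v))
        walkers[i] = v
    return sampled_edges


if __name__ == "__main__":
    # Two disconnected components: a triangle and a single edge.
    graph = {
        "a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b"],
        "x": ["y"], "y": ["x"],
    }
    for edge in frontier_sampling(graph, m=3, steps=5, seed=0):
        print(edge)
```

Because the walkers start at independent random nodes, several components can be covered in one run, which is the situation where the abstract reports lower estimation errors than a single random walk.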
Micromobility evolution and expansion: Understanding how docked and dockless bikesharing models complement and compete – A case study of San Francisco
Shared micromobility – the shared use of bicycles, scooters, or other low-speed modes – is an innovative transportation strategy growing across the United States that includes various service models such as docked, dockless, and e-bike service models. This research focuses on understanding how docked bikesharing and dockless e-bikesharing models complement and compete with respect to user travel behaviors. To inform our analysis, we used two datasets from February 2018 of Ford GoBike (docked) and JUMP (dockless electric) bikesharing trips in San Francisco. We employed three methodological approaches: 1) travel behavior analysis, 2) discrete choice analysis with a destination choice model, and 3) geospatial suitability analysis based on the Spatial Temporal Economic Physiological Social (STEPS) to Transportation Equity framework. We found that dockless e-bikesharing trips were longer in distance and duration than docked trips. The average JUMP trip was about a third longer in distance and about twice as long in duration as the average GoBike trip. JUMP users were far less sensitive to estimated total elevation gain than were GoBike users, making trips with total elevation gain about three times larger than those of GoBike users, on average. The JUMP system achieved greater usage rates than GoBike, with 0.8 more daily trips per bike and 2.3 more miles traveled on each bike per day, on average. The destination choice model results suggest that JUMP users traveled to lower-density destinations, and GoBike users were largely traveling to dense employment areas. Bike rack density was a significant positive factor for JUMP users. The location of GoBike docking stations may attract users and/or be well placed for the destination preferences of users. The STEPS-based bikeability analysis revealed opportunities for the expansion of both bikesharing systems in areas of the city where high job density and bike facility availability converge with older resident populations
NOSQL design for analytical workloads: Variability matters
Big Data has recently gained popularity and has strongly questioned relational databases as universal storage systems, especially in the presence of analytical workloads. As a result, co-relational alternatives, commonly known as NOSQL (Not Only SQL) databases, are extensively used for Big Data. As the primary focus of NOSQL is on performance, NOSQL databases are directly designed at the physical level, and consequently the resulting schema is tailored to the dataset and access patterns of the problem at hand. However, we believe that NOSQL design can also benefit from traditional design approaches. In this paper we present a method to design databases for analytical workloads. Starting from the conceptual model and adopting the classical 3-phase design used for relational databases, we propose a novel design method considering the new features brought by NOSQL and encompassing relational and co-relational design altogether.