Mapping the UK Webspace: Fifteen Years of British Universities on the Web
This paper maps the national UK web presence on the basis of an analysis of
the .uk domain from 1996 to 2010. It reviews previous attempts to use web
archives to understand national web domains and describes the dataset. Next, it
presents an analysis of the .uk domain, including the overall number of links
in the archive and changes in the link density of different second-level
domains over time. We then explore changes over time within a particular
second-level domain, the academic subdomain .ac.uk, and compare linking
practices with variables, including institutional affiliation, league table
ranking, and geographic location. We do not detect institutional affiliation
affecting linking practices and find only partial evidence of league table
ranking affecting network centrality, but find a clear inverse relationship
between the density of links and the geographical distance between
universities. This echoes prior findings regarding offline academic activity,
which allows us to argue that real-world factors like geography continue to
shape academic relationships even in the Internet age. We conclude with
directions for future uses of web archive resources in this emerging area of
research.

Comment: To appear in the proceedings of WebSci 201
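The inverse relationship between link density and geographic distance described above can be illustrated with a small sketch; all domain names, coordinates, and link counts below are hypothetical, and the great-circle distance is computed with the standard haversine formula.

```python
# Minimal sketch (hypothetical data): link counts between pairs of university
# domains versus the geographic distance between them, in the spirit of the
# .ac.uk analysis above.
from math import radians, sin, cos, asin, sqrt

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two (lat, lon) points in kilometres."""
    dlat, dlon = radians(lat2 - lat1), radians(lon2 - lon1)
    a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    return 6371 * 2 * asin(sqrt(a))

# Hypothetical inter-domain link counts and campus coordinates.
links = {("a.ac.uk", "b.ac.uk"): 120, ("a.ac.uk", "c.ac.uk"): 15}
coords = {"a.ac.uk": (51.5, -0.1), "b.ac.uk": (51.8, -0.3), "c.ac.uk": (55.9, -3.2)}

# Sort (distance, link count) pairs by distance; under an inverse
# relationship, nearer pairs carry more links.
pairs = sorted((haversine_km(*coords[u], *coords[v]), n) for (u, v), n in links.items())
for dist, n in pairs:
    print(f"{dist:7.1f} km  {n:4d} links")
```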
Finding Relevant Answers in Software Forums
Online software forums provide a huge amount of valuable content. Developers and users often ask questions and receive answers in such forums. The availability of a vast number of thread discussions in forums provides ample opportunities for knowledge acquisition and summarization. For a given search query, current search engines use traditional information retrieval approaches to extract webpages containing
Intent-Aware Contextual Recommendation System
Recommender systems take inputs from user history, use an internal ranking
algorithm to generate results and possibly optimize this ranking based on
feedback. However, often the recommender system is unaware of the actual intent
of the user and simply provides recommendations dynamically without properly
understanding the thought process of the user. An intelligent recommender
system is not only useful for the user but also for businesses which want to
learn the tendencies of their users. Finding out tendencies or intents of a
user is a difficult problem to solve.
Keeping this in mind, we set out to create an intelligent system which
tracks the user's activity on a web application and determines the intent of
the user in each session. We devised a way to encode the user's activity
across sessions. We then represent the information seen by the user in a
high-dimensional format, which is reduced to lower dimensions using tensor
factorization techniques; intent awareness (or scoring) is handled at this
stage. Finally, combining the user activity data with the contextual
information gives the recommendation score.
The final recommendations are then ranked using filtering and collaborative
recommendation techniques to show the top-k recommendations to the user. A
provision for feedback is also envisioned in the current system which informs
the model to update the various weights in the recommender system. Our overall
model aims to combine both frequency-based and context-based recommendation
systems and quantify the intent of a user to provide better recommendations.
We ran experiments on real-world timestamped user activity data, in the
setting of recommending reports to the users of a business analytics tool,
and the results outperform the baselines. We also tuned certain aspects of
our model to arrive at optimized results.

Comment: Presented at the 5th International Workshop on Data Science and Big
Data Analytics (DSBDA), 17th IEEE International Conference on Data Mining
(ICDM) 2017; 8 pages; 4 figures; due to the limitation "The abstract field
cannot be longer than 1,920 characters," the abstract appearing here is
slightly shorter than the one in the PDF file.
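As a rough illustration of the final scoring step described above, the following sketch blends a frequency-based score with a context-based score, weighted by a per-session intent score, and ranks the top-k items. The function name, the linear blending form, and all numbers are assumptions for illustration; the tensor-factorization and feedback stages are omitted.

```python
# Hedged sketch (hypothetical names and weights): intent-weighted blend of
# frequency-based and context-based recommendation scores, then top-k ranking.
import numpy as np

def recommend(freq_scores, ctx_scores, intent, k=3, alpha=0.5):
    """Blend the two score vectors; `intent` in [0, 1] shifts the mix
    toward the contextual signal (assumed linear form)."""
    w = alpha * intent
    combined = (1 - w) * freq_scores + w * ctx_scores
    return np.argsort(combined)[::-1][:k]   # indices of the top-k items

freq = np.array([0.9, 0.1, 0.4, 0.7])   # how often each report was used
ctx = np.array([0.2, 0.8, 0.6, 0.1])    # similarity to the session context
print(recommend(freq, ctx, intent=0.9, k=2))
```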
Scatter networks: a new approach for analysing information scatter
Information on any given topic is often scattered across the Web. Previously this scatter has been characterized through the inequality of the distribution of facts (i.e. pieces of information) across webpages. Such an approach conceals how specific facts (e.g. rare facts) occur in specific types of pages (e.g. fact-rich pages). To reveal such regularities, we construct bipartite networks consisting of two types of vertices: the facts contained in webpages and the webpages themselves. Such a representation enables the application of a series of network analysis techniques, revealing structural features such as connectivity, robustness and clustering. Not only does network analysis yield new insights into information scatter, but we also illustrate the benefit of applying new and existing analysis techniques directly to a bipartite network as opposed to its one-mode projection. We discuss the implications of each network feature for users' ability to find comprehensive information online. Finally, we compare the bipartite graph structure of webpages and facts with the hyperlink structure between the webpages.

Peer Reviewed
http://deepblue.lib.umich.edu/bitstream/2027.42/58170/2/njp7_7_231.pd
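The fact–webpage representation described above can be sketched with networkx, which provides bipartite density, clustering, and one-mode projection routines; the facts, pages, and edges below are toy data.

```python
# Illustrative sketch (toy data): a fact-webpage bipartite network and its
# one-mode projection onto pages, mirroring the analysis described above.
import networkx as nx
from networkx.algorithms import bipartite

G = nx.Graph()
facts = ["f1", "f2", "f3"]   # pieces of information
pages = ["p1", "p2"]         # webpages containing them
G.add_nodes_from(facts, bipartite=0)
G.add_nodes_from(pages, bipartite=1)
G.add_edges_from([("f1", "p1"), ("f2", "p1"), ("f2", "p2"), ("f3", "p2")])

print(bipartite.density(G, facts))   # edge density of the bipartite graph
print(bipartite.clustering(G))       # bipartite clustering coefficients

# One-mode projection: pages are linked when they share at least one fact.
pages_proj = bipartite.projected_graph(G, pages)
print(list(pages_proj.edges()))
```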
Geoportals: an internet marketing perspective
A geoportal is a web site that presents an entry point to geo-products (including geo-data) on the web. Despite their importance in (spatial) data infrastructures, the literature suggests stagnating or even declining trends in visitor numbers. In this paper, relevant ideas and techniques for improving performance are derived from the internet marketing literature. We tested the extent to which these ideas are already applied in practice through a survey among 48 geoportals worldwide. Results show in many cases a positive correlation with trends in visitor numbers. These ideas can be useful for geoportal managers developing their marketing strategy.