A Model for Personalized Keyword Extraction from Web Pages using Segmentation
The World Wide Web caters to the needs of billions of users in heterogeneous groups. Each user accessing the World Wide Web may have specific interests and expect the web to respond to those specific requirements. The process of making the web react in a customized manner is achieved through personalization. This paper proposes a novel model for extracting keywords from a web page with personalization incorporated into it. The keyword extraction problem is approached through web page segmentation, which simplifies the problem and allows it to be solved effectively. The proposed model is implemented as a prototype, and the experiments conducted on it empirically validate the model's efficiency.
Comment: 6 pages, 2 figures
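The idea of combining segmentation with a user profile can be illustrated with a minimal sketch. The scoring scheme below (term frequency with a flat boost for profile terms) is an assumption for illustration only, not the paper's actual algorithm:

```python
# Hypothetical sketch: segment-wise keyword extraction where terms from a
# personal interest profile are boosted. Weights and tokenization are
# illustrative assumptions, not the model described in the paper.
from collections import Counter
import re

def segment_keywords(segments, profile, top_k=3):
    """For each text segment, score terms by frequency, double the score
    of terms in the user's interest profile, and return the top terms."""
    results = []
    for text in segments:
        terms = re.findall(r"[a-z]+", text.lower())
        counts = Counter(terms)
        scored = {t: c * (2.0 if t in profile else 1.0) for t, c in counts.items()}
        top = sorted(scored, key=scored.get, reverse=True)[:top_k]
        results.append(top)
    return results
```

Because each segment is scored independently, a term that dominates one segment cannot drown out keywords elsewhere on the page, which is one way segmentation simplifies the extraction problem.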
Morpes: A Model for Personalized Rendering of Web Content on Mobile Devices
With the tremendous growth of the information communication sector, mobile phones have become the primary information communication devices. The convergence of traditional telephony with modern web-enabled communication on mobile devices has made communication more effective and simpler. As mobile phones become a crucial means of accessing the contents of the World Wide Web, which was originally designed for personal computers, a new challenge has opened up: accommodating web content on smaller mobile devices. This paper proposes an approach towards building a model for rendering web pages on mobile devices. The proposed model is based on a multi-dimensional web page segment evaluation model. The incorporation of personalization into the proposed model makes the rendering user-centric. The proposed model is validated with a prototype implementation.
Comment: 10 pages, 2 figures
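A multi-dimensional segment evaluation can be sketched as a weighted score per segment, with only sufficiently relevant segments selected for the small screen. The dimension names, weights, and threshold below are assumptions for illustration, not the paper's actual evaluation model:

```python
# Hypothetical sketch: score each web page segment over several dimensions
# (e.g. content density, user interest) and render only segments whose
# weighted score clears a threshold. All names and values are assumptions.
def select_segments(segments, weights, threshold=0.5):
    """segments: list of dicts mapping dimension name -> score in [0, 1].
    weights: dict mapping dimension name -> weight (weights sum to 1)."""
    selected = []
    for seg in segments:
        score = sum(weights[d] * seg.get(d, 0.0) for d in weights)
        if score >= threshold:
            selected.append(seg)
    return selected
```

Personalization would enter through the user-interest dimension, so the same page can yield a different rendered subset per user.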
Applying digital content management to support localisation
The retrieval and presentation of digital content such as that on the World Wide Web (WWW) is a substantial area of research. While recent years have seen huge expansion in the size of web-based archives that can be searched efficiently by commercial search engines, the presentation of potentially relevant content is still limited to ranked document lists represented by simple text snippets or image keyframe surrogates. There is expanding interest in techniques to personalise the presentation of content to improve the richness and effectiveness of the user experience. One of the most significant challenges to achieving this is the increasingly multilingual nature of this data, and the need to provide suitably localised responses to users based on this content. The Digital Content Management (DCM) track of the Centre for Next Generation Localisation (CNGL) is seeking to develop technologies to support advanced personalised access and presentation of information by combining elements from the existing research areas of Adaptive Hypermedia and Information Retrieval. The combination of these technologies is intended to produce significant improvements in the way users access information. We review key features of these technologies and introduce early ideas for how these technologies can support localisation and localised content before concluding with some impressions of future directions in DCM.
CaSePer: An efficient model for personalized web page change detection based on segmentation
Users who visit a web page repeatedly at frequent intervals are more interested in the recent changes that have occurred on the page than in the entire contents of the page. Because of the increased dynamism of web pages, it would be difficult for the user to identify the changes manually. This paper proposes an enhanced model for detecting changes in web pages, called CaSePer (Change detection based on Segmentation with Personalization). The change detection is micro-managed by introducing web page segmentation, and the change detection process is made efficient through a dual-step process. The proposed method reduces the complexity of change detection by focusing only on the segments in which changes have occurred. User-specific personalized change detection is also incorporated into the proposed model. The model is validated with the help of a prototype implementation, and the experiments conducted on it confirm a 77.8% improvement and a 97.45% accuracy rate.
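The dual-step idea (first locate changed segments cheaply, then inspect only those) can be sketched with per-segment digests. The function below is an illustrative assumption, not CaSePer's actual implementation:

```python
# Hypothetical sketch: step 1 of a dual-step change detector. Hash each
# segment of the old and new versions of a page and report the indices of
# segments whose digests differ; a detailed diff (step 2) would then run
# only on those segments. Names are illustrative, not from CaSePer.
import hashlib

def changed_segments(old_segments, new_segments):
    old_hashes = [hashlib.sha256(s.encode()).hexdigest() for s in old_segments]
    new_hashes = [hashlib.sha256(s.encode()).hexdigest() for s in new_segments]
    return [i for i, (a, b) in enumerate(zip(old_hashes, new_hashes)) if a != b]
```

Comparing fixed-size digests instead of full segment contents is what keeps the first step cheap; personalization could restrict the comparison to the segments a given user cares about.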
Mining user-generated comments
Social-media websites, such as newspapers, blogs, and forums, are the main places of generation and exchange of user-generated comments. These comments are viable sources for opinion mining, descriptive annotations, and information extraction. User-generated comments are formatted using an HTML template; they are therefore entwined with the other information in the HTML document. Their unsupervised extraction is thus a taxing issue, even more so when considering the extraction of nested answers by different users. This paper presents a novel technique (CommentsMiner) for unsupervised user comment extraction. Our approach uses both the theoretical framework of frequent subtree mining and data extraction techniques. We demonstrate that the comment mining task can be modelled as a constrained closed induced subtree mining problem followed by a learning-to-rank problem. Our experimental evaluations show that CommentsMiner solves the plain-comment and nested-comment extraction problems for 84% of a representative and accessible dataset, while outperforming existing baseline techniques.
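The intuition behind frequent-subtree mining for comment extraction is that each comment instantiates the same HTML template, so its structure repeats in the DOM. The sketch below is a heavily simplified proxy (counting repeated root-to-node tag paths), not the constrained closed induced subtree mining that CommentsMiner actually uses:

```python
# Hypothetical, much-simplified illustration: count root-to-node tag paths
# in an HTML document; paths that repeat at least min_support times hint at
# a template (e.g. a comment block). This only illustrates the intuition
# behind frequent-subtree mining, not CommentsMiner's algorithm.
from html.parser import HTMLParser
from collections import Counter

class PathCounter(HTMLParser):
    def __init__(self):
        super().__init__()
        self.stack, self.paths = [], Counter()

    def handle_starttag(self, tag, attrs):
        self.stack.append(tag)
        self.paths["/".join(self.stack)] += 1

    def handle_endtag(self, tag):
        if self.stack and self.stack[-1] == tag:
            self.stack.pop()

def frequent_paths(html, min_support=2):
    parser = PathCounter()
    parser.feed(html)
    return {path for path, n in parser.paths.items() if n >= min_support}
```

Full subtree mining generalizes this from paths to whole subtrees, and the learning-to-rank step then chooses which frequent structure actually corresponds to comments.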
Extraction of User Comments on the Web
In this article, we present CommentsMiner, an unsupervised solution for extracting user comments. Our approach is based on a combination of frequent subtree mining, data extraction, and learning-to-rank techniques. Our experiments show that CommentsMiner solves the comment extraction problem on 84% of a representative and publicly accessible dataset, far ahead of existing extraction techniques.
A personalized web page content filtering model based on segmentation
In view of the massive content explosion on the World Wide Web through diverse sources, content filtering tools have become mandatory. Filtering the contents of web pages holds greater significance in cases of access by minors. Traditional web page blocking systems follow a Boolean methodology of either displaying the full page or blocking it completely. With the increased dynamism of web pages, it has become a common phenomenon that different portions of a web page hold different types of content at different time instances. This paper proposes a model to block contents at a fine-grained level, i.e., instead of completely blocking the page, it blocks only those segments which hold the contents to be blocked. The advantages of this method over the traditional methods are the fine-grained level of blocking and the automatic identification of the portions of the page to be blocked. The experiments conducted on the proposed model indicate 88% accuracy in filtering out the segments.
Comment: 11 pages, 6 figures
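The contrast with Boolean blocking can be sketched in a few lines: a whole-page blocker returns nothing when any term matches, while a segment-level filter replaces only the offending segments. The term-matching rule and placeholder below are assumptions for illustration, not the paper's classifier:

```python
# Hypothetical sketch of fine-grained filtering: instead of a Boolean
# allow/block decision for the whole page, replace only the segments that
# contain flagged terms. The matching rule and placeholder are assumptions.
def filter_segments(segments, blocked_terms, placeholder="[blocked]"):
    out = []
    for text in segments:
        lowered = text.lower()
        if any(term in lowered for term in blocked_terms):
            out.append(placeholder)  # block just this segment
        else:
            out.append(text)         # keep the rest of the page visible
    return out
```

A real system would replace the substring match with a content classifier per segment, but the structure (decide per segment, not per page) is the point of the model.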