
    User Identification, Classification and Recommendation in Web Usage Mining – An Approach for Personalized Web Mining

    In recent years, Web Analytics (WA) has become an emerging research topic, driven by extensive advances in the techniques for accessing the web content that millions of people share online. Information connected to the theme being searched may not always be recognised if the personalization system relies on usage-dependent outcomes alone. This research introduces a new method for a Personalized Web Search system in which users can access the web pages most relevant to their choice from the URL list. The first stage deals with Semantic Web Personalization, which merges content semantics with usage data expressed as ontology terms. The system supports the computation of semantically enriched navigational patterns, from which constructive recommendations can be generated. Unlike other semantic web personalization systems, the system described here can also be employed on non-semantic web sites. The second stage augments the quality of the recommendations based on the structure underlying the website. Finally, testing is carried out using a prolonged database link, after which the variation among the different classes of parameters is analysed, with privacy formulated in terms of memory usage and execution time.
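The abstract does not detail how navigational patterns are mined; as an illustrative sketch only (the page URLs, session data and function names below are invented, not taken from the paper), usage-based recommendation can start from simple page-transition counts over user sessions:

```python
from collections import Counter, defaultdict

def mine_transitions(sessions):
    """Count page-to-page transitions across user sessions (toy usage mining)."""
    transitions = defaultdict(Counter)
    for session in sessions:
        for current, nxt in zip(session, session[1:]):
            transitions[current][nxt] += 1
    return transitions

def recommend(transitions, current_page, k=3):
    """Recommend the k pages most often visited right after current_page."""
    return [page for page, _ in transitions[current_page].most_common(k)]

# Hypothetical sessions reconstructed from a server access log
sessions = [
    ["/home", "/products", "/laptops"],
    ["/home", "/products", "/phones"],
    ["/home", "/products", "/laptops", "/checkout"],
]
transitions = mine_transitions(sessions)
print(recommend(transitions, "/products"))  # pages most often viewed next
```

A semantic variant, as the abstract suggests, would map URLs to ontology terms before counting, so that patterns generalise across different pages about the same topic.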

    Technologies to enhance self-directed learning from hypertext

    With the growing popularity of the World Wide Web, materials presented to learners in the form of hypertext have become a major instructional resource. Despite the potential of hypertext to facilitate access to learning materials, self-directed learning from hypertext is often associated with many concerns. Self-directed learners, because of their different viewpoints, may follow different navigation paths and thus have different interactions with knowledge. As a result, learners can end up disoriented or cognitively overloaded because of the potential gap between what they need and what actually exists on the Web. In addition, while a great deal of research has gone into supporting the task of finding web resources, less attention has been paid to supporting the interpretation of Web pages. The inability to interpret the content of pages leads learners to interrupt their current browsing activities to seek help from other people or from explanatory learning materials. Such interruptions can weaken learner engagement and lower motivation to learn. This thesis aims to promote self-directed learning from hypertext resources by proposing solutions to the above problems. It first presents Knowledge Puzzle, a tool that offers a constructivist approach to learning from the Web. Its main contribution to Web-based learning is that self-directed learners can adapt the path of instruction and the structure of hypertext to their way of thinking, regardless of how the Web content is delivered. This can effectively reduce the gap between what they need and what exists on the Web. SWLinker is another system proposed in this thesis, aimed at supporting the interpretation of Web pages through ontology-based semantic annotation. It is an extension to the Internet Explorer Web browser that automatically creates a semantic layer of explanatory information and instructional guidance over Web pages. It also aims to break the conventional view of Web browsing as an individual activity by leveraging the notion of ontology-based collaborative browsing. Both of the tools presented in this thesis were evaluated by students within the context of particular learning tasks. The results show that they effectively fulfilled the intended goals by facilitating learning from hypertext without introducing high overheads in terms of usability or browsing effort.

    Applying Wikipedia to Interactive Information Retrieval

    There are many opportunities to improve the interactivity of information retrieval systems beyond the ubiquitous search box. One idea is to use knowledge bases—e.g. controlled vocabularies, classification schemes, thesauri and ontologies—to organize, describe and navigate the information space. These resources are popular in libraries and specialist collections, but have proven too expensive and too narrow to be applied to everyday web-scale search. Wikipedia has the potential to bring structured knowledge into more widespread use. This online, collaboratively generated encyclopaedia is one of the largest and most consulted reference works in existence. It is broader, deeper and more agile than the knowledge bases put forward to assist retrieval in the past. Rendering this resource machine-readable is a challenging task that has captured the interest of many researchers. Many see it as a key step towards breaking the knowledge acquisition bottleneck that crippled previous efforts. This thesis claims that the roadblock can be sidestepped: Wikipedia can be applied effectively to open-domain information retrieval with minimal natural language processing or information extraction. The key is to focus on gathering and applying human-readable rather than machine-readable knowledge. To demonstrate this claim, the thesis tackles three separate problems: extracting knowledge from Wikipedia; connecting it to textual documents; and applying it to the retrieval process. First, we demonstrate that a large thesaurus-like structure can be obtained directly from Wikipedia, and that accurate measures of semantic relatedness can be efficiently mined from it. Second, we show that Wikipedia provides the necessary features and training data for existing data mining techniques to accurately detect and disambiguate topics when they are mentioned in plain text. Third, we provide two systems and user studies that demonstrate the utility of the Wikipedia-derived knowledge base for interactive information retrieval.
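The abstract does not spell out its relatedness measure; one widely used link-based formulation in this line of work (a normalised distance over shared incoming links, in the style of Milne and Witten) can be sketched as follows. The inlink sets and article count below are toy assumptions, not real Wikipedia data:

```python
import math

def link_relatedness(inlinks_a, inlinks_b, total_articles):
    """Relatedness of two articles from the overlap of their incoming links.

    Returns a value in [0, 1]; 1 means identical inlink sets, 0 means none shared.
    """
    a, b = set(inlinks_a), set(inlinks_b)
    common = a & b
    if not common:
        return 0.0
    num = math.log(max(len(a), len(b))) - math.log(len(common))
    den = math.log(total_articles) - math.log(min(len(a), len(b)))
    return max(0.0, 1.0 - num / den)

# Toy inlink sets (hypothetical article ids) in a 1000-article "Wikipedia"
cat = {1, 2, 3, 4}
dog = {2, 3, 4, 5}
print(round(link_relatedness(cat, dog, 1000), 3))  # → 0.948
```

Because it needs only the link graph, not the article text, this kind of measure is cheap to mine at scale, which matches the abstract's claim that useful knowledge can be extracted with minimal natural language processing.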

    CHORUS Deliverable 2.1: State of the Art on Multimedia Search Engines

    Based on information provided by European projects and national initiatives related to multimedia search, as well as by domain experts who participated in the CHORUS Think-Tanks and workshops, this document reports on the state of the art in multimedia content search from both a technical and a socio-economic perspective. The technical perspective includes an up-to-date view of content-based indexing and retrieval technologies, multimedia search in the context of mobile devices and peer-to-peer networks, and an overview of current evaluation and benchmark initiatives to measure the performance of multimedia search engines. From a socio-economic perspective, we inventory the impact and legal consequences of these technical advances and point out future directions of research.

    Providing personalised information based on individual interests and preferences.

    The main aim of personalised Information Retrieval (IR) is to provide an effective IR system whereby relevant information can be presented according to individual users' interests and preferences. In response to their queries, all Web users expect to obtain the search results in a rank order with the most relevant items at the top. Effective IR systems rank the less relevant documents below the relevant documents. However, a commonly stated problem of Web search is matching the users' queries to the information base. The key challenge is to return a list of search results containing few non-relevant documents while not missing out the relevant ones. To address this problem, keyword-based search using the Vector Space Model is employed as an IR technique to model Web users and build their interest profiles. Semantic-based search through an ontology is further employed to represent documents matching the users' needs without being directly contained in the users' specified keywords. The users' log files are one of the most important sources from which implicit feedback is detected through their profiles. These provide valuable information on the basis of which alternative learning approaches (i.e. dwell-based search) can be incorporated into standard IR measures (i.e. tf-idf), allowing a further improvement in the personalisation of Web document search and thus increasing the performance of IR systems. Incorporating such a non-textual data type (i.e. dwell) into the hybridisation of the keyword-based and semantic-based searches entails a complex interaction of information attributes in the index structure. A dwell-based filter called dwell-tf-idf, which allows a standard tokeniser to be converted into a keyword tokeniser, is thus proposed. The proposed filter uses an efficient hybrid indexing technique to bring textual and non-textual data types under one umbrella, thus moving beyond simple keyword matching to improve future retrieval applications for web browsers. Adopting precision and recall, the most common evaluation measures, the superiority of the hybridisation of these approaches lies in pushing significantly relevant documents to the top of the ranked lists, compared to any traditional search system. The results were empirically confirmed through human subjects who conducted several real-life Web searches.
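The thesis's dwell-based filter is not defined in detail in this abstract; a minimal sketch of the general idea, folding normalised dwell time into a tf-idf score as an implicit-feedback boost, could look like the following. All names, data and the specific weighting choice are assumptions for illustration, not the thesis's implementation:

```python
import math
from collections import Counter

def dwell_tfidf(docs, dwell, term):
    """Score documents for one query term: tf-idf boosted by normalised dwell.

    docs:  {doc_id: list of tokens}
    dwell: {doc_id: seconds the user spent on the page (implicit feedback)}
    """
    n = len(docs)
    df = sum(1 for toks in docs.values() if term in toks)
    idf = math.log(n / df) if df else 0.0
    max_dwell = max(dwell.values(), default=0) or 1
    scores = {}
    for doc_id, toks in docs.items():
        tf = Counter(toks)[term] / len(toks)
        boost = 1 + dwell.get(doc_id, 0) / max_dwell  # longer reads weigh more
        scores[doc_id] = tf * idf * boost
    return scores

docs = {"a": ["web", "search"], "b": ["web", "ranking", "web"], "c": ["news"]}
dwell = {"a": 30, "b": 30, "c": 5}
print(dwell_tfidf(docs, dwell, "web"))  # "b" outranks "a"; "c" scores 0
```

In a real system the boost would be stored alongside each posting at indexing time, which is presumably what the hybrid index bringing "textual and non-textual data types under one umbrella" refers to.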

    The development of a model of information seeking behaviour of students in higher education when using internet search engines.

    This thesis develops a model of the Web information seeking behaviour of postgraduate students, with a specific focus on the use of Web search engines. It extends Marchionini's eight-stage model of information seeking, geared towards electronic environments, to holistically encompass the physical, cognitive, affective and social dimensions of Web users' behaviour. The study recognises the uniqueness of the Web environment as a vehicle for information dissemination and retrieval, drawing on the distinction between information searching and information seeking, and emphasises the importance of following user-centred, holistic approaches to studying information seeking behaviour. It reviews the research in the field and demonstrates that there is no comprehensive model that explains the behaviour of Web users when employing search engines for information retrieval. The methods followed in the study are explained with a detailed analysis of the four dimensions of information seeking (physical, cognitive, affective and social). Emphasis is placed on the significance of combined qualitative and quantitative methods and the ways in which they can enrich the examination of human behaviour, concluding with a discussion of methodological issues. The study is supported by an empirical investigation, which examines the relationship between interactive information retrieval using Web search engines and human information seeking processes. This investigates the influence of cognitive elements (such as learning and problem style, and creative ability) and affective characteristics (e.g. confidence, loyalty, familiarity, ease of use), as well as the role that system experience, domain knowledge and demographics play in information seeking behaviour and in users' overall satisfaction with the retrieval result. The influence of these factors is analysed by identifying users' patterns of behaviour and the tactics adopted to solve specific problems. The findings of the empirical study are incorporated into an enriched information-seeking model, encompassing the use of search engines, which reveals a complex interplay between physical, cognitive, affective and social elements; none of these characteristics can be seen in isolation when attempting to explain the complex phenomenon of information seeking behaviour. Although the model is presented in a linear fashion, the dynamic, reiterative and circular character of the information seeking process is explained through an emphasis on transition patterns between the different stages. The research concludes with a discussion of problems encountered by Web information seekers, providing a detailed analysis of the reasons why users express satisfaction or dissatisfaction with the results of Web searching, and identifying areas in which Web search engines can be improved as well as issues related to the need for students to be given additional training and support. These include planning and organising information, recognising different dimensions of information intents and needs, emphasising the importance of variety in Web information seeking, promoting effective formulation of queries and ranking, reducing information overload and assisting effective selection of Web sites and critical examination of results.

    A study of lawyers’ information behaviour leading to the development of two methods for evaluating electronic resources

    In this thesis we examine the information behaviour displayed by a broad cross-section of academic and practising lawyers and feed our findings into the development of the Information Behaviour (IB) methods – two novel methods for evaluating the functionality and usability of electronic resources. We captured lawyers' information behaviour by conducting naturalistic observations, in which we asked participants to think aloud whilst using existing resources to 'find information required for their work.' Lawyers' information behaviours closely matched those observed in other disciplines by Ellis and others, serving to validate Ellis's existing model in the legal domain. Our findings also extend Ellis's model to include behaviours pertinent to legal information-seeking, broaden the scope of the model to cover information use (in addition to information-seeking) behaviours, and enhance the potential analytical detail of the model through the identification of a range of behavioural 'subtypes' and the levels at which behaviours can operate. The identified behaviours were used as the basis for developing two methods for evaluating electronic resources – the IB functionality method (which mainly involves examining whether and how information behaviours are currently, or might in future be, supported by an electronic resource) and the IB usability method (which involves setting users behaviour-focused tasks, asking them to think aloud whilst performing the tasks, and identifying usability issues from the think-aloud data). Finally, the IB methods were themselves evaluated by stakeholders working for LexisNexis Butterworths – a large electronic legal resource development firm. Stakeholders were recorded using the methods, and focus group and questionnaire data were collected, with the aim of ascertaining how usable, useful and learnable they considered the methods to be and how likely they would be to use them in future. Overall, findings were positive regarding both methods, and useful suggestions for improving them were made.

    Facebook’s Anticompetitive Lean in Strategies

    Facebook is under fire on several fronts, and with good reason. Regulators strive to make sense of, and address, a plethora of seemingly unrelated issues that arise from the operation of its platform. These range from antitrust and privacy violations to the dissemination of harmful content and speech, deception, polarisation and political manipulation. This paper identifies Facebook's unrestricted and excessive data collection as a unifying theme that requires immediate antitrust action. Once a privacy-oriented social network, Facebook soon mutated into a surveillance machine designed to hoover up people's personal data in order to identify and understand their interests, preferences and emotions, and to turn that knowledge into profit through the sale of targeted ads. Since people's innate preference for privacy stood in the way of Facebook's growth, Facebook resorted to privacy intrusions and deception to access as much user data as possible, thereby gaining market power. Currently, its overwhelmingly dominant position in the social media market means that no matter how much data Facebook extracts from users, how transparent its information about its data processing practices is, and how many privacy scandals ensue from its reckless handling of data, users have nowhere else to go. This paper provides a course of action to correct this unacceptable anticompetitive outcome. The imposition of unfair commercial terms on consumers, the distortion of the competitive process through privacy violations and misleading practices, the squeezing of news publishers' traffic and the foreclosure of actual and potential competitors by Facebook can be stopped. Data and consumer protection measures alone cannot stop Facebook's actions, but antitrust enforcement can be used to curb Facebook's ability to reinforce its data-driven abuse of market power.