
    Malware distributions and graph structure of the Web

    Knowledge about the graph structure of the Web is important for understanding this complex socio-technical system and for devising proper policies supporting its future development. Knowledge about the differences between clean and malicious parts of the Web is important for understanding potential threats to its users and for devising protection mechanisms. In this study, we apply data science methods to a large crawl of surface and deep Web pages with the aim of increasing such knowledge. To accomplish this, we answer the following questions. Which theoretical distributions explain important local characteristics and network properties of websites? How do these characteristics and properties differ between clean and malicious (malware-affected) websites? What is the predictive power of local characteristics and network properties for classifying malware websites? To the best of our knowledge, this is the first large-scale study describing the differences in global properties between malicious and clean parts of the Web. In other words, our work builds on and bridges the gap between Web science, which tackles large-scale graph representations, and Web cyber security, which is concerned with malicious activities on the Web. The results presented herein can also help antivirus vendors in devising approaches to improve their detection algorithms.

    A survey of UK university web management: staffing, systems and issues

    Purpose: The purpose of the paper is to summarize the findings of a survey of UK universities about how their web site is managed and resourced, which technologies are in use and what are seen as the main issues and priorities. Methodology/approach: The paper is based on a web-based questionnaire distributed in summer 2006, which received 104 usable responses from 87 institutions. Findings: The survey showed that some web teams were based in IT and some in external relations, yet in both cases the site typically served internal and external audiences. The role of web manager is partly management of resources, time and people, partly about marketing and liaison and partly also concerned with more technical aspects including interface design and HTML. But it is a diverse role with a wide spread of responsibilities. On the whole, web teams were relatively small. Three quarters of responding institutions had a CMS, but specific systems in use were diverse. 60% had a portal. There was evidence of increasing use of blogs and wikis. The key driver for the web site is student recruitment, with institutional reputation and information to stakeholders also being important. The biggest perceived weaknesses were maintaining consistency with devolved content creation and currency of content; lack of resourcing was a key threat, while comprehensiveness was a key strength. Current and wished-for projects pointed again to the diversity of the sector. Research implications/limitations: The lack of comparative data and difficulties of interpreting responses to closed questions where respondents could have quite different status (partly reflecting divergent patterns of governance of the web across the sector) create issues with the reliability of the research. Practical implications: Data about resourcing of web management, technology in use etc. at comparable institutions is invaluable for practitioners in their efforts to gain resource in their own context.
Originality/value of paper: The paper adds more systematic, current data to our limited knowledge about how university web sites are managed.

    Why People Search for Images using Web Search Engines

    What are the intents or goals behind human interactions with image search engines? Knowing why people search for images is of major concern to Web image search engines because user satisfaction may vary as intent varies. Previous analyses of image search behavior have mostly been query-based, focusing on what images people search for, rather than intent-based, that is, why people search for images. To date, there is no thorough investigation of how different image search intents affect users' search behavior. In this paper, we address the following questions: (1) Why do people search for images in text-based Web image search systems? (2) How does image search behavior change with user intent? (3) Can we predict user intent effectively from interactions during the early stages of a search session? To this end, we conduct both a lab-based user study and a commercial search log analysis. We show that user intents in image search can be grouped into three classes: Explore/Learn, Entertain, and Locate/Acquire. Our lab-based user study reveals different user behavior patterns under these three intents, such as first click time, query reformulation, dwell time and mouse movement on the result page. Based on user interaction features during the early stages of an image search session, that is, before mouse scroll, we develop an intent classifier that is able to achieve promising results for classifying intents into our three intent classes. Given that all features can be obtained online and unobtrusively, the predicted intents can provide guidance for choosing ranking methods immediately after scrolling.

    If you build it, will they come? How researchers perceive and use web 2.0

    Over the past 15 years, the web has transformed the way we seek and use information. In the last 5 years in particular a set of innovative techniques – collectively termed ‘web 2.0’ – have enabled people to become producers as well as consumers of information. It has been suggested that these relatively easy-to-use tools, and the behaviours which underpin their use, have enormous potential for scholarly researchers, enabling them to communicate their research and its findings more rapidly, broadly and effectively than ever before. This report is based on a study commissioned by the Research Information Network to investigate whether such aspirations are being realised. It seeks to improve our currently limited understanding of whether, and if so how, researchers are making use of various web 2.0 tools in the course of their work, the factors that encourage or inhibit adoption, and researchers’ attitudes towards web 2.0 and other forms of communication. Context: How researchers communicate their work and their findings varies in different subjects or disciplines, and in different institutional settings. Such differences have a strong influence on how researchers approach the adoption – or not – of new information and communications technologies. It is also important to stress that ‘web 2.0’ encompasses a wide range of interactions between technologies and social practices which allow web users to generate, repurpose and share content with each other. We focus in this study on a range of generic tools – wikis, blogs and some social networking systems – as well as those designed specifically by and for people within the scholarly community. Method: Our study was designed not only to capture current attitudes and patterns of adoption but also to identify researchers’ needs and aspirations, and problems that they encounter. 
We began with an online survey, which collected information about researchers’ information gathering and dissemination habits and their attitudes towards web 2.0. This was followed by in-depth, semi-structured interviews with a stratified sample of survey respondents to explore in more depth their experience of web 2.0, including perceived barriers as well as drivers to adoption. Finally, we undertook five case studies of web 2.0 services to investigate their development and adoption across different communities and business models. Key findings: Our study indicates that a majority of researchers are making at least occasional use of one or more web 2.0 tools or services for purposes related to their research: for communicating their work; for developing and sustaining networks and collaborations; or for finding out about what others are doing. But frequent or intensive use is rare, and some researchers regard blogs, wikis and other novel forms of communication as a waste of time or even dangerous. In deciding whether to make web 2.0 tools and services part of their everyday practice, the key questions for researchers are the benefits they may secure from doing so, and how it fits with their use of established services. Researchers who use web 2.0 tools and services do not see them as comparable to or substitutes for other channels and means of communication, but as having their own distinctive role for specific purposes and at particular stages of research. And frequent use of one kind of tool does not imply frequent use of others as well.