79 research outputs found

    Hybrid focused crawling on the Surface and the Dark Web

    Focused crawlers enable the automatic discovery of Web resources about a given topic by automatically navigating through the Web link structure and selecting the hyperlinks to follow by estimating their relevance to the topic of interest. This work proposes a generic focused crawling framework for discovering resources on any given topic that reside on the Surface or the Dark Web. The proposed crawler is able to seamlessly navigate through the Surface Web and several darknets present in the Dark Web (i.e., Tor, I2P, and Freenet) during a single crawl by automatically adapting its crawling behavior and its classifier-guided hyperlink selection strategy based on the destination network type and the strength of the local evidence present in the vicinity of a hyperlink. It investigates 11 hyperlink selection methods, including a novel strategy based on the dynamic linear combination of a link-based and a parent Web page classifier. This hybrid focused crawler is demonstrated for the discovery of Web resources containing recipes for producing homemade explosives. The evaluation experiments indicate the effectiveness of the proposed focused crawler both for the Surface and the Dark Web.
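
    The dynamic linear combination mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not say how the weight is derived, so everything here is an assumption made for illustration: the weight grows with the number of tokens found around a hyperlink, and the names `evidence_weight`, `combined_score`, `Frontier`, the `saturation` parameter, and the 0.5 threshold are all hypothetical. It is not the authors' implementation.

```python
import heapq
from dataclasses import dataclass, field


@dataclass(order=True)
class FrontierEntry:
    priority: float                   # negated score, so heapq pops the best link first
    url: str = field(compare=False)


def evidence_weight(local_tokens: int, saturation: int = 10) -> float:
    """Hypothetical weight in [0, 1]: more text around a link means more trust in the link-based score."""
    if saturation <= 0:
        raise ValueError("saturation must be positive")
    return min(max(local_tokens, 0), saturation) / saturation


def combined_score(link_score: float, parent_score: float,
                   local_tokens: int, saturation: int = 10) -> float:
    """Linear combination of the two classifier scores, weighted by local evidence (assumed scheme)."""
    w = evidence_weight(local_tokens, saturation)
    return w * link_score + (1.0 - w) * parent_score


class Frontier:
    """Priority queue of URLs to visit, ordered by combined relevance score."""

    def __init__(self, threshold: float = 0.5):
        self.threshold = threshold    # assumed cut-off below which links are not followed
        self._heap = []
        self._seen = set()

    def offer(self, url: str, link_score: float, parent_score: float, local_tokens: int) -> bool:
        """Queue the URL if it is new and scores at or above the threshold."""
        if url in self._seen:
            return False
        score = combined_score(link_score, parent_score, local_tokens)
        if score < self.threshold:
            return False
        self._seen.add(url)
        heapq.heappush(self._heap, FrontierEntry(-score, url))
        return True

    def pop(self) -> str:
        """Return the highest-scoring queued URL."""
        return heapq.heappop(self._heap).url
```

    In this sketch, a link with little surrounding text inherits most of its score from the parent page, while a link with rich local context is judged mainly on its own evidence. The two scores would come from trained classifiers, which are outside the scope of the example.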

    A Broad Evaluation of the Tor English Content Ecosystem

    Tor is among the most well-known darknets in the world. It has noble uses, including as a platform for free speech and information dissemination under the guise of true anonymity, but may be culturally better known as a conduit for criminal activity and as a platform to market illicit goods and data. Past studies on the content of Tor support this notion, but were carried out by targeting popular domains likely to contain illicit content. A survey of past studies may thus not yield a complete evaluation of the content and use of Tor. This work addresses this gap by presenting a broad evaluation of the content of the English Tor ecosystem. We perform a comprehensive crawl of the Tor dark web and, through topic and network analysis, characterize the types of information and services hosted across a broad swath of Tor domains and their hyperlink relational structure. We recover nine domain types defined by the information or service they host and, among other findings, unveil how some types of domains intentionally silo themselves from the rest of Tor. We also present measurements that (regrettably) suggest that marketplaces of illegal drugs and services do emerge as the dominant type of Tor domain. Our study is the product of crawling over 1 million pages from 20,000 Tor seed addresses, yielding a collection of over 150,000 Tor pages. We intend to make the domain structure publicly available as a dataset at https://github.com/wsu-wacs/TorEnglishContent. Comment: 11 page

    HOMER: A semantically enhanced knowledge management approach in the domain of homemade explosives intelligence.

    This paper presents a new approach to handling data (encoding, managing and retrieving) in secure, sensitive and classified organisations (such as Law Enforcement Agencies (LEAs)) that utilises Web 3.0 technologies as well as knowledge management techniques and pushing of information. This approach signals a departure from the current use of databases and pulling of information technologies, and allows separation of concerns between how data are organised/structured and how data are manipulated/processed. Such an approach utilises an adaptive knowledge management platform capable of supporting organisational operations of LEAs using data aggregated from assorted, heterogeneous and online sources. Such knowledge is then pushed to the users, using recommenders, in an effortless manner addressing the needs of the organisation. Moreover, the system is designed to afford easier change of operational needs through the addition and removal of multiple folksonomies (representing changes in focus or new trends). These changes are further enriched with semantics providing specialised domain-specific content recommendations and semantically enriched search capabilities. This approach to knowledge retrieval has been applied to the domain of homemade explosives and counter-terrorism efforts as part of the HOMER project, where data are aggregated from sources such as police databases, online forums and explosives wikis. Data are stored in an unstructured manner and annotated by the users, ultimately being categorised as per the knowledge retrieval needs of the organisation, which in this case is to carry out efficient and effective investigations regarding homemade explosives. We describe the architecture of a system that can efficiently and effectively support related investigatory activities, and we also present an evaluation from the perspective of the end-users.

    An evidence synthesis of strategies, enablers and barriers for keeping secrets online regarding the procurement and supply of illicit drugs

    This systematic review attempts to understand how people keep secrets online, and in particular how people use the internet when engaging in covert behaviours and activities regarding the procurement and supply of illicit drugs. With the Internet and social media being part of everyday life for most people in western and non-western countries, there are ever-growing opportunities for individuals to engage in covert behaviours and activities online that may be considered illegal or unethical. A search strategy using Medical Subject Headings terms and relevant key words was developed. A comprehensive literature search of published and unpublished studies in electronic databases was conducted. Additional studies were identified from reference lists of previous studies and (systematic) reviews that had similar objectives as this search, and were included if they fulfilled our inclusion criteria. Two researchers independently screened abstracts and full-texts for study eligibility and evaluated the quality of included studies. Disagreements were resolved by a consensus procedure. The systematic review includes 33 qualitative studies and one cross-sectional study, published between 2006 and 2018. Five covert behaviours were identified: the use of communication channels; anonymity; visibility reduction; limited posts in public; following forum rules and recommendations. The same technologies that provide individuals with easy access to information, such as social networking sites and forums, digital devices, digital tools and services, also increase the prevalence of inaccurate information, loss of privacy, identity theft and disinhibited communication. This review takes a rigorous interdisciplinary approach to synthesising knowledge on the strategies adopted by people in keeping secrets online. Whilst the focus is on the procurement and supply of illicit drugs, this knowledge is transferrable to a range of contexts where people keep secrets online. It has particular significance for those who design online/social media applications, and for law enforcement and security agencies.

    Data-driven Technology Foresight: Text Analysis of Emerging Technologies

    Thesis (Ph.D.) -- Seoul National University Graduate School: College of Engineering, Department of Industrial Engineering and Naval Architecture, 2018. 2. 박용태.
    This dissertation argues for new directions in the field of technology foresight. Technology foresight was formulated on the basis of qualitative and participatory research. Initially, most foresight activities were triggered by the prospect of a handful of experts, but recent studies highlight theoretical paradigm shifts toward a more comprehensive and data-driven approach to creating shared insights on the future of emerging technologies. Much of the research up to now, however, has been descriptive in nature, and a definite method of realizing the notion has not yet been addressed in the existing literature to a large extent. To this end, we have attempted to formalize the concept of data-driven technology foresight by incorporating unconventional data sources – future-oriented web data, Wikipedia data, and scientific publication data – and different analytical tools – Latent Semantic Analysis, IdeaGraph, and Morphological Analysis. Four distinct foresight frameworks were proposed for the proactive management process of emerging technologies: impact identification, impact analysis, plan development, and technology ideation. The study was guided by the following research questions: (1) What kinds of data sources are available on the web and which of those are considered useful in foresight studies? (2) Where could we incorporate these data sources and which techniques are most suitable for the given purposes? (3) Which foresight-related fields would particularly benefit from applying a data-driven approach and what are the positive effects? The proposals outlined should be considered exploratory and open-ended. They are designed to determine the nature of the problem, rather than to offer definitive and conclusive answers. Nevertheless, the proposed scheme may well provide not just a rationale but a theoretical grounding for this newly introduced notion. This dissertation is expected to yield a foothold for the readers to better comprehend and act on this new shift in the field of technology foresight.

    Table of contents:
    Chapter 1 Introduction: 1.1 Emergence of Technology Foresight; 1.2 Towards a Data-driven Technology Foresight; 1.3 Problem Statement; 1.4 Dissertation Overview
    Chapter 2 Data Sources and Methodologies: 2.1 Data Sources; 2.1.1 Future-oriented Web Data; 2.1.2 Wikipedia Data; 2.1.3 Scientific Publication Data; 2.2 Methodologies; 2.2.1 Latent Semantic Analysis (LSA); 2.2.2 IdeaGraph; 2.2.3 Morphological Analysis (MA)
    Chapter 3 Foresight for Impact Identification: 3.1 Introduction; 3.2 Emerging Technology and its Social Impacts; 3.2.1 Distinctive Nature of Emerging Technology; 3.2.2 Technology Assessment; 3.3 LSA for Constructing Scenarios; 3.4 Research Framework; 3.4.1 Step 1: Data Collection; 3.4.2 Step 2: Scenario Development; 3.4.2.1 Pre-LSA: Preprocessing Future-oriented Web Data; 3.4.2.2 LSA: Applying Latent Semantic Analysis; 3.4.2.3 Post-LSA: Constructing Scenarios; 3.5 Illustrative Case Study: Drone Technology; 3.6 Discussion; 3.6.1 Categorization of Social Impacts; 3.6.2 Comparative Analysis; 3.6.3 Implication for Theory, Practice, and Policy; 3.7 Conclusion
    Chapter 4 Foresight for Impact Analysis: 4.1 Introduction; 4.2 Uncertainty and Complexity; 4.3 Data-driven Foresight Process; 4.4 Scenario Building Beyond the Obvious; 4.4.1 Capturing Plausibility using LSA; 4.4.2 Capturing Creativity using IdeaGraph; 4.5 Research Framework; 4.5.1 Step 1. Pre-Analysis: Data Preparation; 4.5.1.1 Target Technology Selection; 4.5.1.2 Data Acquisition; 4.5.1.3 Data Preprocessing; 4.5.2 Step 2. Text Analysis: Scenario Building; 4.5.2.1 General Glimpse using Overt Structures; 4.5.2.2 Hidden Details using Latent Structures; 4.5.3 Step 3. Post-Analysis: Analytical Interpretation; 4.5.3.1 Individual Impact Scenario; 4.5.3.2 Overall Latent Impacts; 4.6 Illustrative Case Study: 3D Printing Technology; 4.7 Discussion; 4.7.1 Scenarios Beyond the Obvious; 4.7.2 Comparative Analysis; 4.8 Conclusion
    Chapter 5 Foresight for Plan Development: 5.1 Introduction; 5.2 Theoretical Paradigm Shift; 5.2.1 Technology-focused vs. Society-focused; 5.2.2 Co-evolution of Technology and Society; 5.2.3 Responsible Development; 5.3 Methodological Paradigm Shift; 5.3.1 Participatory Approach; 5.3.2 Data-driven Approach; 5.4 Rationale for using LSA; 5.5 Research Framework; 5.5.1 Step 1. Envisioning Social Issues; 5.5.1.1 Collection of Future-oriented Web Data; 5.5.1.2 Construction of Impact Scenarios; 5.5.1.3 Conceptualization of Impact Scenarios; 5.5.2 Step 2. Deriving Technical Solutions; 5.5.2.1 Collection of Scientific Publication Data; 5.5.2.2 Construction of Solution Concepts; 5.6 Illustrative Case Study: Autonomous Vehicle; 5.7 Discussion; 5.7.1 Comparative Analysis; 5.7.2 Major Strengths in Envisioning Social Impacts; 5.7.3 Major Strengths in Overviewing Solutions; 5.8 Conclusion
    Chapter 6 Foresight for Technology Ideation: 6.1 Introduction; 6.2 Related Studies; 6.2.1 Generating Creative Ideas; 6.2.2 Data-driven Morphological Analysis; 6.3 Technology Foresight using Wikipedia; 6.3.1 Wikipedia as a Good Remedy; 6.3.2 Preliminaries: How to Apply Wikipedia; 6.4 Research Framework; 6.4.1 Basic Model; 6.4.2 Extended Model; 6.4.2.1 Phase 1: Preliminary Phase; 6.4.2.2 Phase 2: Dimension Development Phase; 6.4.2.3 Phase 3: Value Development Phase; 6.4.2.4 Phase 4: Sub-dimension Development Phase; 6.5 Illustrative Case Study: Drone Technology; 6.5.1 Basic Model; 6.5.2 Extended Model; 6.6 Comparative Analysis; 6.6.1 Experimental Setup; 6.6.2 Comparison of Results; 6.7 Intrinsic Limitations of Applying Wikipedia; 6.8 Conclusion
    Chapter 7 Concluding Remarks
    Bibliography
    Appendix: Appendix A Result of overt and latent structures of each impact scenario; Appendix B Result of Wikipedia-based morphological matrix (basic model); Appendix C Result of Wikipedia-based morphological matrix using superordinate seed terms (extended model); Appendix D Result of Wikipedia-based morphological matrix after applying subordinate value seed terms (extended model); Appendix E Result of Wikipedia-based morphological matrix after developing sub-dimensions (extended model)
    Docto

    The Resonance of Unseen Things: Poetics, Power, Captivity, and UFOs in the American Uncanny

    The Resonance of Unseen Things offers an ethnographic meditation on the “uncanny” persistence and cultural freight of conspiracy theory. The project is a reading of conspiracy theory as an index of a certain strain of late 20th-century American despondency and malaise, especially as understood by people experiencing downward social mobility. Written by a cultural anthropologist with a literary background, this deeply interdisciplinary book focuses on the enduring American preoccupation with captivity in a rapidly transforming world. Captivity is a trope that appears in both ordinary and fantastic iterations here, and Susan Lepselter shows how multiple troubled histories—of race, class, gender, and power—become compressed into stories of uncanny memory. “We really don’t have anything like this in terms of a focused, sympathetic, open-minded ethnographic study of UFO experiencers. . . . The author’s semiotic approach to the paranormal is immensely productive, positive, and, above all, resonant with what actually happens in history.” —Jeffrey J. Kripal, J. Newton Rayzor Professor of Religion, Rice University “Lepselter relates a weave of intimate alien sensibilities in out-of-the-way places which are surprisingly, profoundly, close to home. Readers can expect to share her experience of contact with complex logics of feeling, and to do so in a contemporary America they may have thought they understood.” —Debbora Battaglia, Mount Holyoke College “An original and beautifully written study of contemporary American cultural poetics. . . . The book convincingly brings into relief the anxieties of those at the margins of American economic and civic life, their perceptions of state power, and the narrative continuities that bond them to histories of violence and expansion in the American West.” —Deirdre de la Cruz, University of Michiga

    Robotics and the Future of International Asymmetric Warfare

    In the post-Cold War world, the world's most powerful states have cooperated or avoided conflict with each other, easily defeated smaller state governments, engaged in protracted conflicts against insurgencies and resistance networks, and lost civilians to terrorist attacks. This dissertation explores various explanations for this pattern, proposing that some non-state networks adapt to major international transitions more quickly than bureaucratic states. Networks have taken advantage of the information technology revolution to enhance their capabilities, but states have begun to adjust, producing robotic systems with the potential to grant them an advantage in asymmetric warfare.