2,418 research outputs found

D-cores: measuring collaboration of directed graphs based on degeneracy

Author: Giatsidis Christos
Thilikos Dimitrios M.
Vazirgiannis Michalis
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 27/09/2012

Community detection and evaluation is an important task in graph mining. In many cases, a community is defined as a subgraph characterized by dense connections or interactions between its nodes. A variety of measures have been proposed to evaluate different quality aspects of such communities, in most cases ignoring the directed nature of edges. In this paper, we introduce novel metrics for evaluating the collaborative nature of directed graphs, a property not captured by single-node metrics or by other established community evaluation metrics. To accomplish this objective, we capitalize on the concept of graph degeneracy and define a novel D-core framework, extending the classic graph-theoretic notion of k-cores for undirected graphs to directed ones. Based on the D-core, which essentially can be seen as a measure of the robustness of a community under degeneracy, we devise a wealth of novel metrics used to evaluate graph collaboration features of directed graphs. We applied the D-core approach to large synthetic and real-world graphs such as Wikipedia, DBLP, and ArXiv and report interesting results at the graph level as well as at the node level.
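To make the idea of extending k-cores to directed graphs concrete, here is a minimal Python sketch. It assumes, as an illustration rather than a quotation of the paper's definition, that a core is parameterised by two thresholds: every surviving node must keep in-degree at least k and out-degree at least l. The function name `d_core` and the edge-list input format are hypothetical.

```python
from collections import defaultdict


def d_core(edges, k, l):
    """Return the nodes left after peeling a directed graph.

    edges: iterable of (u, v) pairs meaning u -> v.
    Assumption: a node survives only if its in-degree >= k and its
    out-degree >= l, both counted among surviving nodes.
    """
    succ = defaultdict(set)  # node -> nodes it points to
    pred = defaultdict(set)  # node -> nodes pointing to it
    alive = set()
    for u, v in edges:
        alive.update((u, v))
        if u != v:  # self-loops are ignored (assumption)
            succ[u].add(v)
            pred[v].add(u)

    # Start from every node that already violates a threshold.
    stack = [n for n in alive if len(pred[n]) < k or len(succ[n]) < l]
    while stack:
        n = stack.pop()
        if n not in alive:
            continue
        alive.discard(n)
        # Removing n lowers the in-degree of its successors ...
        for m in succ[n]:
            pred[m].discard(n)
            if m in alive and len(pred[m]) < k:
                stack.append(m)
        # ... and the out-degree of its predecessors.
        for m in pred[n]:
            succ[m].discard(n)
            if m in alive and len(succ[m]) < l:
                stack.append(m)
    return alive
```

For example, on a directed 3-cycle with one extra node pointing into it, `d_core(edges, 1, 1)` keeps only the cycle, because the extra node has no incoming edge.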

Crossref

HAL Descartes

HAL-Polytechnique

TR-2004017: Towards a Formal Concept Analysis Approach to Exploring Communities on the World Wide Web

Author: Haralick Robert M.
Rome Jayson E.
Publication venue: CUNY Academic Works
Publication date: 01/01/2004

City University of New York

Towards a Formal Concept Analysis Approach to Exploring Communities on the World Wide Web

Author: B. Berendt
B. Ganter
C. Carpineto
D.S. Modha
F.R.K. Chung
J.M. Kleinberg
M.R. Henzinger
P. Pirolli
R. Cole
R.J. Cole
R.M. Haralick
S. Brin
S. Chakrabarti
S.R. Kumar
Y. Kalfoglou
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2005

Crossref

The bipartite clique: A topological paradigm for Web user search customization and Web site restructuring

Author: Choyce-Miles Brenda F.
Publication venue: Louisiana Tech Digital Commons
Publication date: 01/04/2005

The objective of this dissertation research is to aid the Web user to achieve his search objective at a host Web site by organizing a strongly connected neighborhood of Web pages that are thematically and spatially related to the user's search interest. Therefore, methods were developed to (1) find all Web pages at a given Web site that are thematically similar to a user's initial choice of a Web page (selected from the set of Web pages returned in response to a query by any popular search engine), and (2) organize these pages hierarchically in terms of their relevance to the user's initial Web page request. This selection and organization of pages is dynamically adjusted in order to make these methods responsive to the user's choice of pages defining his search agenda. The methods developed in this work skillfully incorporate the production of the bipartite clique graph structure to simulate both spatial and thematic relatedness of Web pages. By ranking the user's initial page choice as the most relevant page, the authority page, link analysis is used to identify a set of pages with out-links to this authority page and assemble these into a hub of relevant pages. The authority set (initially containing only the user's initial page choice) is then expanded to include other pages with in-links from the set of hub pages. The authority-hub relationship signified by Web page links is used to define the two partite sets of the biclique graph. The partite set of authority pages contains the user's initial page choice and other thematically and spatially similar pages. The partite set of hub pages contains pages whose out-links to the authority pages serve as validation of their thematic relevance to the user's search objective.
Two maximal biclique neighborhoods of Web pages specific to the user's interest, containing eight and five pages respectively, were successfully extracted from Web server access logs containing 47,635 entries and 1,140 distinct request pages. The iterative use of these methods in association with three Web page metrics introduced in this research facilitated extending a neighborhood dynamically to include nine additional relevant pages.
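The hub and authority expansion described in this abstract can be sketched roughly as follows. This is an illustrative Python sketch, not the dissertation's method: the function names, the `links` dictionary format, and the choice to keep only authorities linked by every hub (so that the two sets form a complete bipartite structure) are all assumptions.

```python
def expand_neighborhood(links, seed):
    """links: dict mapping each page to the set of pages it links to.

    Hubs are pages with an out-link to the seed page; authorities are
    the seed plus every page that receives a link from some hub.
    """
    hubs = {page for page, outs in links.items() if seed in outs}
    authorities = {seed}
    for hub in hubs:
        authorities |= links[hub]
    # Keep the two partite sets disjoint (assumption).
    authorities = (authorities - hubs) | {seed}
    return hubs, authorities


def common_authorities(links, hubs):
    """Pages linked by every hub; with the hubs they form a biclique."""
    if not hubs:
        return set()
    return set.intersection(*(set(links[h]) for h in hubs))


def is_biclique(links, hubs, authorities):
    """True if every hub links to every authority."""
    return all(a in links.get(h, set()) for h in hubs for a in authorities)
```

Taking the union of hub out-links gives a candidate authority set; restricting it with `common_authorities` is one simple way to obtain a set that passes `is_biclique`.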

Louisiana Tech Digital Commons

Algorithms For Discovering Communities In Complex Networks

Author: Balakrishnan Hemant
Publication venue: 'Information Bulletin on Variable Stars (IBVS)'
Publication date: 01/01/2006

It has been observed that real-world random networks like the WWW, Internet, social networks, citation networks, etc., organize themselves into closely-knit groups that are locally dense and globally sparse. These closely-knit groups are termed communities. Nodes within a community are similar in some aspect. For example, in a WWW network, communities might consist of web pages that share similar contents. Mining these communities facilitates better understanding of their evolution and topology, and is of great theoretical and commercial significance. Community-related research has focused on two main problems: community discovery and community identification. Community discovery is the problem of extracting all the communities in a given network, whereas community identification is the problem of identifying the community to which a given set of nodes belongs. We make a comparative study of various existing community-discovery algorithms. We then propose a new algorithm based on bibliographic metrics, which addresses the drawbacks in existing approaches. Bibliographic metrics are used to study similarities between publications in a citation network. Our algorithm classifies nodes in the network based on the similarity of their neighborhoods. One of the drawbacks of the current community-discovery algorithms is their computational complexity. These algorithms do not scale up to the enormous size of real-world networks. We propose a hash-table-based technique that helps us compute the bibliometric similarity between nodes in O(mΔ) time. Here m is the number of edges in the graph and Δ is the largest degree. Next, we investigate different centrality metrics. Centrality metrics are used to portray the importance of a node in the network. We propose an algorithm that utilizes centrality metrics of the nodes to compute the importance of the edges in the network.
Removal of the edges in ascending order of their importance breaks the network into components, each of which represents a community. We compare the performance of the algorithm on synthetic networks with a known community structure using several centrality metrics. Performance was measured as the percentage of nodes that were correctly classified. As an illustration, we model the ucf.edu domain as a web graph and analyze the changes in its properties like densification power law, edge density, degree distribution, diameter, etc., over a five-year period. Our results show super-linear growth in the number of edges with time. We observe (and explain) that despite the increase in average degree of the nodes, the edge density decreases with time.
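One way a hash table can yield a bibliometric similarity within a bound of the number of edges times the largest degree is sketched below. This is an assumption-laden illustration, not the thesis's algorithm: it uses bibliographic coupling (the number of references two nodes share) as the similarity, and the function name `coupling_counts` is hypothetical. Each cited node contributes one increment per pair of its citers, so the total work is at most the sum of squared in-degrees, which is bounded by the largest degree times the edge count.

```python
from collections import defaultdict
from itertools import combinations


def coupling_counts(edges):
    """Count shared out-neighbours for every pair of nodes sharing any.

    edges: iterable of (citing, cited) pairs.
    Returns a dict keyed by (a, b) with a < b; node labels must be
    mutually comparable (assumption).
    """
    citers = defaultdict(set)  # cited node -> nodes that cite it
    for citing, cited in edges:
        citers[cited].add(citing)

    shared = defaultdict(int)  # hash table of pair -> shared references
    for group in citers.values():
        for a, b in combinations(group, 2):
            key = (a, b) if a < b else (b, a)
            shared[key] += 1
    return dict(shared)
```

Pairs that share no reference never appear in the table, which is what keeps the cost tied to the edges rather than to all pairs of nodes.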

University of Central Florida (UCF): STARS (Showcase of Text, Archives, Research & Scholarship)

Integrating Network Analysis and Data Mining Techniques into Effective Framework for Web Mining and Recommendation. A Framework for Web Mining and Recommendation

Author: Nagi Mohamad
Publication venue: School of Electrical Engineering and Computer Science
Publication date: 01/01/2015

The main motivation for the study described in this dissertation is to benefit from the development in technology and the huge amount of available data which can be easily captured, stored and maintained electronically. We concentrate on Web usage (i.e., log) mining and Web structure mining. Analysing Web log data reveals valuable feedback on how effective the current structure of a web site is and helps the owner of a web site understand the behaviour of its visitors. We developed a framework that integrates statistical analysis, frequent pattern mining, clustering, classification and network construction and analysis. We concentrated on the statistical data related to the visitors and how they surf and pass through the various pages of a given web site to land at some target pages. Further, the frequent pattern mining technique was used to study the relationship between the various pages constituting a given web site. Clustering is used to study the similarity of users and pages. Classification suggests a target class for a given new entity by comparing the characteristics of the new entity to those of the known classes. Network construction and analysis is also employed to identify and investigate the links between the various pages constituting a Web site: a network is constructed based on the frequency of access to the Web pages, such that pages are linked if the frequent pattern mining process identifies them as frequently accessed together. The knowledge discovered by analysing a web site and its related data should be considered valuable for online shoppers and commercial web site owners. Benefitting from the outcome of the study, a recommendation system was developed to suggest pages to visitors based on their profiles as compared to similar profiles of other visitors.
The conducted experiments using popular datasets demonstrate the applicability and effectiveness of the proposed framework for Web mining and recommendation. As a by-product of the proposed method, we demonstrate how it is effective in another domain for feature reduction by concentrating on gene expression data analysis as an application with some interesting results reported in Chapter 5.
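The network construction step described in this abstract, linking pages that are frequently accessed together, can be illustrated with a small Python sketch. It is a simplification under stated assumptions: only frequent pairs are counted rather than running a full frequent pattern mining algorithm, and the function name `co_access_network`, the session format, and the `min_support` threshold are hypothetical.

```python
from collections import Counter
from itertools import combinations


def co_access_network(sessions, min_support):
    """Build weighted page-to-page links from visitor sessions.

    sessions: iterable of page lists, one list per visit.
    Two pages are linked if they occur together in at least
    min_support sessions; the weight is that session count.
    """
    pair_counts = Counter()
    for pages in sessions:
        # Count each unordered pair once per session.
        for a, b in combinations(sorted(set(pages)), 2):
            pair_counts[(a, b)] += 1
    return {pair: n for pair, n in pair_counts.items() if n >= min_support}
```

The resulting dictionary can be read as an undirected weighted edge list and passed to any graph analysis step.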

Bradford Scholars

Diagnosing A Silent Epidemic: The Historical Ecology of Metal Pollution in the Sonoran Desert

Publication date: 01/01/2019

This research investigates the biophysical and institutional mechanisms affecting the distribution of metals in the Sonoran Desert of Arizona. To date, a long-term, interdisciplinary perspective on metal pollution in the region has been lacking. To address this gap, I integrated approaches from environmental chemistry, historical geography, and institutional economics to study the history of metal pollution in the desert. First, by analyzing the chemistry embodied in the sequentially-grown spines of long-lived cacti, I created a record of metal pollution that details biogeochemical trends in the desert since the 1980s. These data suggest that metal pollution is not simply a legacy of early industrialization. Instead, I found evidence of recent metal pollution in both the heart of the city and a remote, rural location. To understand how changing land uses may have contributed to this, I next explored the historical geography of industrialization in the desert. After identifying cities and mining districts as hot spots for airborne metals, I used a mixture of historical reports, maps, and memoirs to reconstruct the industrial history of these polluted landscapes. In the process, I identified three key transitions in the energy-metal nexus that drove the redistribution of metals from mineral deposits to urban communities. These transitions coincided with the Columbian exchange, the arrival of the railroads, and the economic restructuring that accompanied World War II. Finally, to determine how legal and political forces may be influencing the fate of metals, I studied the evolution of the rights and duties affecting metals in their various forms. This allowed me to track changes in the institutions regulating metals from the mining laws of the 19th century through their treatment as occupational and public health hazards in the 20th century.
In the process, I show how Arizona’s environmental and resource institutions were often transformed by extra-territorial concerns. Ultimately, this created an institutional system that compartmentalizes metals and fails to appreciate their capacity to mobilize across legal and biophysical boundaries to accumulate in the environment. Long-term, interdisciplinary perspectives such as this are critical for untangling the complex web of elements and social relations transforming the modern world.
Dissertation/Thesis. Doctoral Dissertation Sustainability 201

ASU Digital Repository

Identification and Modeling Social Media Influence Pathways: a Characterization of a Disinformation Campaign Using the Flooding-the-zone Strategy via Transfer Entropy

Author: Jasser Jasser
Publication venue: 'Information Bulletin on Variable Stars (IBVS)'
Publication date: 01/01/2023

The internet has made it easy for narratives to spread quickly and widely without regard for accuracy or the harm they may cause to society. Unfortunately, this has led to the rise of bad actors who use fake and misleading articles to spread harmful misinformation. These actors flood the information space with low-quality articles in an effort to disrupt opposing narratives, sow confusion, and discourage the pursuit of truth. In societies that prioritize free speech, maintaining control over the information space remains a persistent challenge. Achieving this requires strategic planning to protect the dissemination of information in ways that promote dialogue towards organic consensus building and protect users from undue manipulation from foreign adversarial state actors. The objective of this dissertation is to investigate how bad actors can manipulate the information space in societies that value free speech. To achieve this objective, we will define the different narratives used to flood the information space, identify the controversial elements that contribute to their spread, and analyze the actors involved in promoting these narratives and their levels of influence. To gain a deeper understanding of the dynamics that underlie information space flooding, we will examine the flow of influence from news organizations to online users across multiple social networks, and explore the formation of online communities and echo chambers that align with specific narratives. We will also investigate the role of controversiality in information and influence spread, specifically examining how controversial authors tend to be sources of influence in these networks. By addressing these objectives, we hope to provide an analysis of the ways in which bad actors can manipulate the information space. Furthermore, we aim to provide insights into how we can develop strategies to counteract these efforts and protect the integrity of the information ecosystem. 
Through our investigation, we hope to contribute to the growing body of research focused on understanding and addressing the challenges posed by bad actors in the information space.

University of Central Florida (UCF): STARS (Showcase of Text, Archives, Research & Scholarship)