
160 research outputs found

Pinpoint Clustering of Web Pages and Mining Implicit Crossover Concepts

Author: Makoto Haraguchi
Yoshiaki Okubo
Publication venue: 'IntechOpen'
Publication date: 01/03/2010

IntechOpen

Crossref

Finding Conceptual Document Clusters Based on Top-N Formal Concept Search: Pruning Mechanism and Empirical Effectiveness

Author: Makoto Haraguchi
Yoshiaki Okubo
Publication venue: 'IntechOpen'
Publication date: 26/04/2011

IntechOpen

Clique-based data mining for related genes in a biomedical database

Author: A Bairoch
A Hamosh
AL Barabási
B Adamcsek
B Baudin
Chikara Yonemori
DJ Cook
DJ Watts
DM Wilkinson
E Tomita
E Tomita
EE Snyder
Etsuji Tomita
H Hu
H Müller
J Chen
J Hauer
JP Benzecri
K Almind
K Oda
KI Goh
LJ Jensen
M Haraguchi
M Kanehisa
Masaaki Muramatsu
MEJ Newman
MEJ Newman
MK Halushka
MY Galperin
NCEP
O Seda
PC White
PM Roberts
R Dunn
R Sharan
RA De Fronzo
RH Eckel
T Aittokallio
T Matsunaga
T Uno
Tsutomu Matsunaga
X Yan
Y Wang
Y Zhang
Publication venue: BioMed Central
Publication date: 01/07/2009

Background: Progress in the life sciences cannot be made without integrating biomedical knowledge on numerous genes in order to help formulate hypotheses on the genetic mechanisms behind various biological phenomena, including diseases. There is thus a strong need for a way to automatically and comprehensively search from biomedical databases for related genes, such as genes in the same families and genes encoding components of the same pathways. Here we address the extraction of related genes by searching for densely-connected subgraphs, which are modeled as cliques, in a biomedical relational graph. Results: We constructed a graph whose nodes were gene or disease pages, and edges were the hyperlink connections between those pages in the Online Mendelian Inheritance in Man (OMIM) database. We obtained over 20,000 sets of related genes (called 'gene modules') by enumerating cliques computationally. The modules included genes in the same family, genes for proteins that form a complex, and genes for components of the same signaling pathway. The results of experiments using 'metabolic syndrome'-related gene modules show that the gene modules can be used to get a coherent holistic picture helpful for interpreting relations among genes. Conclusion: We presented a data mining approach extracting related genes by enumerating cliques. The extracted gene sets provide a holistic picture useful for comprehending complex disease mechanisms.
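
The clique enumeration described in this abstract can be illustrated with a minimal sketch. The abstract does not say which enumeration algorithm the authors used; the code below assumes the standard Bron-Kerbosch procedure with pivoting, and the page names in the toy graph are hypothetical placeholders rather than real OMIM entries.

```python
from typing import Dict, FrozenSet, Iterator, Set

# Undirected graph as an adjacency map: page -> set of linked pages.
# Assumption: the map is symmetric and contains no self-loops.
Graph = Dict[str, Set[str]]


def maximal_cliques(graph: Graph) -> Iterator[FrozenSet[str]]:
    """Yield every maximal clique (Bron-Kerbosch with pivoting)."""

    def expand(r: Set[str], p: Set[str], x: Set[str]):
        # r: current clique, p: candidates that can extend it,
        # x: vertices already processed (used to reject non-maximal sets).
        if not p and not x:
            yield frozenset(r)
            return
        # Pick the pivot with the most neighbours among the candidates;
        # only non-neighbours of the pivot need to be branched on.
        pivot = max(p | x, key=lambda v: len(graph[v] & p))
        for v in list(p - graph[pivot]):
            yield from expand(r | {v}, p & graph[v], x & graph[v])
            p = p - {v}
            x = x | {v}

    yield from expand(set(), set(graph), set())


# Hypothetical toy graph: three mutually linked gene pages and one disease page.
graph: Graph = {
    "GENE_A": {"GENE_B", "GENE_C"},
    "GENE_B": {"GENE_A", "GENE_C"},
    "GENE_C": {"GENE_A", "GENE_B", "DISEASE_X"},
    "DISEASE_X": {"GENE_C"},
}

modules = set(maximal_cliques(graph))
```

On this toy graph the sketch finds two maximal cliques: the three gene pages together, and the pair formed by GENE_C and DISEASE_X.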

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Enumeration of condition-dependent dense modules in protein interaction networks

Author: Avis
Bader
Bader
Breitkreutz
Cenciarelli
Chatr-aryamontri
Chavez
Chen
Chuang
Dudley
Elbing
Elisabeth Georgii
Everett
Farkas
Gavin
Gavin
Guldener
Guldener
Han
Hanisch
Haraguchi
Hermjakob
Huang
Ideker
Janeway
Jansen
Kipreos
Koepp
Koji Tsuda
Koyuturk
Krogan
Lei
Ling
Newman
O'Brien
Orphanides
Palla
Pei
Peri
Philipp Pagel
Ruepp
Sabine Dietmann
Segal
Shamir
Sharan
Spirin
Su
Takeaki Uno
Tanay
Ulitsky
Uno
van Dongen
Vincent
Wurmser
Xenarios
Yan
Zeng
Zhao
Zheng
Publication venue: Oxford University Press
Publication date: 01/01/2009

Motivation: Modern systems biology aims at understanding how the different molecular components of a biological cell interact. Often, cellular functions are performed by complexes consisting of many different proteins. The composition of these complexes may change according to the cellular environment, and one protein may be involved in several different processes. The automatic discovery of functional complexes from protein interaction data is challenging. While previous approaches use approximations to extract dense modules, our approach exactly solves the problem of dense module enumeration. Furthermore, constraints from additional information sources such as gene expression and phenotype data can be integrated, so we can systematically mine for dense modules with interesting profiles.

The bipartite clique: A topological paradigm for Web user search customization and Web site restructuring

Author: Choyce-Miles Brenda F.
Publication venue: Louisiana Tech Digital Commons
Publication date: 01/04/2005

The objective of this dissertation research is to aid the Web user to achieve his search objective at a host Web site by organizing a strongly connected neighborhood of Web pages that are thematically and spatially related to the user's search interest. Therefore, methods were developed to (1) find all Web pages at a given Web site that are thematically similar to a user's initial choice of a Web page (selected from the set of Web pages returned in response to a query by any popular search engine), and (2) organize these pages hierarchically in terms of their relevance to the user's initial Web page request. This selection and organization of pages is dynamically adjusted in order to make these methods responsive to the user's choice of pages defining his search agenda. The methods developed in this work skillfully incorporate the production of the bipartite clique graph structure to simulate both spatial and thematic relatedness of Web pages. By ranking the user's initial page choice as the most relevant page, the authority page, link analysis is used to identify a set of pages with out-links to this authority page and assemble these into a hub of relevant pages. The authority set (initially containing only the user's initial page choice) is then expanded to include other pages with in-links from the set of hub pages. The authority-hub relationship signified by Web page links is used to define the two partite sets of the biclique graph. The partite set of authority pages contains the user's initial page choice and other thematically and spatially similar pages. The partite set of hub pages contains pages whose out-links to the authority pages serve as validation of their thematic relevance to the user's search objective.
Two maximal biclique neighborhoods of Web pages specific to the user's interest, containing eight and five pages respectively, were successfully extracted from Web server access logs containing 47,635 entries and 1,140 distinct request pages. The iterative use of these methods in association with three Web page metrics introduced in this research facilitated extending a neighborhood dynamically to include nine additional relevant pages.
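
The hub and authority construction described in this abstract can be sketched roughly as follows. This is a simplified illustration, not the dissertation's method: the function name `seed_biclique`, the toy link map, and the rule of keeping only pages linked from every hub (so that the two sets form a biclique) are assumptions made here, and the three Web page metrics mentioned in the abstract are not modeled.

```python
from typing import Dict, Set, Tuple

# Directed link map: page -> set of pages it links to (hypothetical data).
Links = Dict[str, Set[str]]


def seed_biclique(links: Links, seed: str) -> Tuple[Set[str], Set[str]]:
    """Return (hubs, authorities) built around the user's initial page."""
    # Hubs: every other page with an out-link to the seed (authority) page.
    hubs = {page for page, targets in links.items()
            if seed in targets and page != seed}
    if not hubs:
        return set(), {seed}
    # Authorities: pages that every hub links to, so each hub points to
    # each authority. Hubs are excluded to keep the two sets disjoint.
    authorities = set.intersection(*(links[h] for h in hubs)) - hubs
    return hubs, authorities


# Hypothetical site: two navigation pages that both link to /a and /b.
links: Links = {
    "/index": {"/a", "/b"},
    "/sitemap": {"/a", "/b", "/c"},
    "/a": {"/c"},
    "/b": set(),
    "/c": set(),
}

hubs, authorities = seed_biclique(links, "/a")
```

With "/a" as the initial page, the hubs are "/index" and "/sitemap", and the authority set grows from "/a" alone to include "/b", the only other page linked from both hubs.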

Louisiana Tech Digital Commons

Search Rank Fraud Prevention in Online Systems

Author: Rahman Md Mizanur
Publication venue: FIU Digital Commons
Publication date: 01/01/2018

The survival of products in online services such as Google Play, Yelp, Facebook and Amazon is contingent on their search rank. This, along with the social impact of such services, has also turned them into a lucrative medium for fraudulently influencing public opinion. Motivated by the need to aggressively promote products, communities that specialize in social network fraud (e.g., fake opinions and reviews, likes, followers, app installs) have emerged, to create a black market for fraudulent search optimization. Fraudulent product developers exploit these communities to hire teams of workers willing and able to commit fraud collectively, emulating realistic, spontaneous activities from unrelated people. We call this behavior “search rank fraud”. In this dissertation, we argue that fraud needs to be proactively discouraged and prevented, instead of only reactively detected and filtered. We introduce two novel approaches to discourage search rank fraud in online systems. First, we detect fraud in real time, when it is posted, and impose resource-consuming penalties on the devices that post activities. We introduce and leverage several novel concepts that include (i) stateless, verifiable computational puzzles that impose minimal performance overhead, but enable the efficient verification of their authenticity, (ii) a real-time, graph-based solution to assign fraud scores to user activities, and (iii) mechanisms to dynamically adjust puzzle difficulty levels based on fraud scores and the computational capabilities of devices. In a second approach, we introduce the problem of fraud de-anonymization: reveal the crowdsourcing site accounts of the people who post large amounts of fraud, thus their bank accounts, and provide compelling evidence of fraud to the users of products that they promote. We investigate the ability of our solutions to ensure that fraud does not pay off.

DigitalCommons@Florida International University

Dagstuhl Reports : Volume 1, Issue 2, February 2011

Author: Schloss Dagstuhl Leibniz-Zentrum für Informatik
Publication date: 09/09/2011

Online Privacy: Towards Informational Self-Determination on the Internet (Dagstuhl Perspectives Workshop 11061): Simone Fischer-Hübner, Chris Hoofnagle, Kai Rannenberg, Michael Waidner, Ioannis Krontiris and Michael Marhöfer
Self-Repairing Programs (Dagstuhl Seminar 11062): Mauro Pezzé, Martin C. Rinard, Westley Weimer and Andreas Zeller
Theory and Applications of Graph Searching Problems (Dagstuhl Seminar 11071): Fedor V. Fomin, Pierre Fraigniaud, Stephan Kreutzer and Dimitrios M. Thilikos
Combinatorial and Algorithmic Aspects of Sequence Processing (Dagstuhl Seminar 11081): Maxime Crochemore, Lila Kari, Mehryar Mohri and Dirk Nowotka
Packing and Scheduling Algorithms for Information and Communication Services (Dagstuhl Seminar 11091): Klaus Jansen, Claire Mathieu, Hadas Shachnai and Neal E. Youn

Hochschulschriftenserver - Universität Frankfurt am Main

Automatic assistants for database exploration

Author: Sellam T.H.J. (Thibault)
Publication date: 03/11/2016

CWI's Institutional Repository

Logic learning and optimized drawing: two hard combinatorial problems

Author: Pastore Tommaso
Publication date: 10/12/2018

Nowadays, information extraction from large datasets is a recurring operation in countless fields of application. The purpose of this thesis is to follow the data flow along its journey, describing some hard combinatorial problems that arise from two key processes, one consecutive to the other: information extraction and representation. The approaches considered here focus mainly on metaheuristic algorithms, to address the need for fast and effective optimization methods. The problems studied include data extraction instances, such as Supervised Learning in Logic Domains and the Max Cut-Clique Problem, as well as two different Graph Drawing Problems. Moreover, stemming from these main topics, additional themes are discussed, namely two different approaches to handle Information Variability in Combinatorial Optimization Problems (COPs), and Topology Optimization of lightweight concrete structures.

Università degli Studi di Napoli Federico II Open Archive

Mobile Search Engine using Clustering and Query Expansion

Author: Nguyen Huy
Publication venue: SJSU ScholarWorks
Publication date: 01/01/2010

Internet content is growing exponentially, and searching for useful content is a tedious task that we all deal with today. Mobile phones' lack of screen space and limited interaction methods make traditional search engine interfaces very inefficient. As the use of mobile internet continues to grow, there is a need for an effective search tool. I have created a mobile search engine that uses clustering and query expansion to find relevant web pages efficiently. Clustering organizes web pages into groups that reflect different components of a query topic. Users can ignore clusters that they find irrelevant, so they are not forced to sift through a long list of off-topic web pages. Query expansion uses query results, dictionaries, and cluster labels to formulate additional terms to manipulate the original query. The new manipulated query gives a more in-depth result that eliminates noise. I believe that these two techniques are effective and can be combined to make the ultimate mobile search engine.

SJSU ScholarWorks