13 research outputs found

    Quality information retrieval for the World Wide Web

    The World Wide Web is an unregulated communication medium with very limited means of quality control. Quality assurance has become a key issue for many information retrieval services on the Internet, e.g. web search engines. This paper introduces methods for evaluating and assessing the quality of web pages. The proposed quality evaluation mechanisms are based on a set of quality criteria extracted from a targeted user survey. A weighted algorithmic interpretation of the most significant user-quoted quality criteria is proposed. In addition, the paper utilizes machine learning methods to predict the quality of web pages before they are downloaded. The set of quality criteria allows us to implement a web search engine with quality ranking schemes, leading to web crawlers which can directly crawl quality web pages. The proposed approaches produce very promising results on a sizable web repository.
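
    A minimal sketch, assuming hypothetical criterion names and weights, of the kind of weighted aggregation of user-quoted quality criteria that the abstract describes; the actual survey-derived criteria and weights are those reported in the paper itself.

```python
# Hypothetical criteria and weights for illustration only; the paper derives
# its own set from a targeted user survey.
CRITERIA_WEIGHTS = {
    "currency": 0.15,     # e.g. a recent last-modified date
    "authority": 0.25,    # e.g. an identifiable author or organisation
    "accuracy": 0.30,     # e.g. presence of references or citations
    "objectivity": 0.15,  # e.g. low proportion of promotional language
    "coverage": 0.15,     # e.g. sufficient textual content
}

def quality_score(criterion_scores: dict[str, float]) -> float:
    """Combine per-criterion scores in [0, 1] into one weighted score in [0, 1]."""
    total_weight = sum(CRITERIA_WEIGHTS.values())
    weighted = sum(
        CRITERIA_WEIGHTS[name] * criterion_scores.get(name, 0.0)
        for name in CRITERIA_WEIGHTS
    )
    return weighted / total_weight

# Example: a page that is strong on authority and accuracy but weak on currency.
page = {"currency": 0.2, "authority": 0.9, "accuracy": 0.8,
        "objectivity": 0.6, "coverage": 0.7}
print(round(quality_score(page), 2))  # 0.69
```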

    Linking Climate Change and Groundwater


    Building a prototype for quality information retrieval from the World Wide Web

    Given the phenomenal rate at which the World Wide Web is changing, retrieval methods and quality assurance have become bottleneck issues for many information retrieval services on the Internet, e.g. Web search engine designs. In this thesis, approaches that increase the efficiency of information retrieval methods and provide quality assurance for information obtained from the Web are developed through the implementation of a quality-focused information retrieval system. A novel approach to the retrieval of quality information from the Internet is introduced. Implemented as a component of a vertical search application, this results in a focused crawler which is capable of retrieving quality information from the Internet. The three main contributions of this research are: (1) An effective and flexible crawling application that is well suited for information retrieval tasks on the dynamic World Wide Web (WWW) is implemented. The resulting crawling application (crawler) is designed on the basis of observations of web evolution made through regular monitoring of the WWW; it also addresses the shortcomings of some existing crawlers, making it a practical implementation. (2) A mechanism is developed that converts human quality judgement, gathered through user surveys, into an algorithm, so that user perceptions of a set of criteria which may determine the quality of the content of a web page can be applied to a large number of Web documents with minimal manual effort. This was achieved through a relatively large user survey conducted in collaborative research with Dr Shirlee-Ann Knight of Edith Cowan University. The survey determined which criteria Web documents are perceived to need to meet in order to qualify as quality documents. The result is an aggregate numeric score between 0 and 1 for each web page, where 0 indicates that the page meets none of the quality criteria and 1 indicates that it meets all quality criteria perfectly. (3) An approach is proposed to predict the quality of a web page before it is retrieved by a crawler. The approach can be incorporated into a vertical search application which focuses on the retrieval of quality information. Experimental results on real-world data show that the proposed approach is more effective than the brute-force approaches published so far. The proposed methods produce a numerical quality score for any text-based Web document. This thesis shows that such a score can also be used as a web page ranking criterion for horizontal search engines. As part of this research project, this ranking scheme has been implemented and embedded into a working search engine. The observed user feedback confirms that search results ranked by quality score satisfy user needs better than results ranked by other popular schemes such as PageRank or relevancy ranking. It is also investigated whether combining the quality score with existing ranking schemes can further enhance the user experience with search engines.
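
    A minimal sketch of how a [0, 1] quality score could be blended with an existing ranking signal such as relevancy, as investigated in the final part of the thesis; the linear blend and the 0.4 weight are assumptions for illustration, not the thesis's actual combination scheme.

```python
def combined_rank_score(relevance: float, quality: float,
                        quality_weight: float = 0.4) -> float:
    """Blend a relevancy score and a quality score, both assumed to lie in [0, 1]."""
    return (1.0 - quality_weight) * relevance + quality_weight * quality

# Example: a highly relevant but low-quality page versus a slightly less
# relevant but high-quality page.
results = [
    {"url": "a.example", "relevance": 0.92, "quality": 0.35},
    {"url": "b.example", "relevance": 0.85, "quality": 0.90},
]
results.sort(key=lambda r: combined_rank_score(r["relevance"], r["quality"]),
             reverse=True)
print([r["url"] for r in results])  # b.example overtakes a.example
```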

    A scalable lightweight distributed crawler for crawling with limited resources

    Web page crawlers are an essential component in a number of Web applications. The sheer size of the Internet can pose problems in the design of Web crawlers. All currently known crawlers implement approximations or have limitations so as to maximize the throughput of the crawl and, hence, maximize the number of pages that can be retrieved within a given time frame. This paper proposes a distributed crawling concept which is designed to avoid approximations, to limit the network overhead, and to run on relatively inexpensive hardware. A set of experiments and comparisons highlights the effectiveness of the proposed approach.
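
    One common way a distributed crawler can keep coordination overhead low is to partition work by hashing hostnames to crawler nodes, so that politeness and duplicate checks stay local to a node. The sketch below illustrates that generic idea only; it is not the specific design proposed in the paper, and the node count is an assumption.

```python
import hashlib
from urllib.parse import urlparse

NUM_NODES = 4  # assumed number of inexpensive crawler machines

def node_for_url(url: str) -> int:
    """Assign a URL to a crawler node by hashing its hostname, so that all
    pages of one host are handled by the same node."""
    host = urlparse(url).netloc.lower()
    digest = hashlib.sha1(host.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % NUM_NODES

urls = [
    "http://example.org/index.html",
    "http://example.org/about.html",
    "http://another.example.net/",
]
for u in urls:
    print(node_for_url(u), u)  # both example.org pages map to the same node
```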

    Self organizing maps for the clustering of large sets of labeled graphs

    Graph Self-Organizing Maps (GraphSOMs) are a new concept in the processing of structured objects using machine learning methods. The GraphSOM is a generalization of the Self-Organizing Map for Structured Domain (SOM-SD), which had been shown to be a capable unsupervised machine learning method for some types of graph-structured information. The SOM-SD was applied to document mining tasks on the clustering of XML-formatted documents as part of an international competition, the Initiative for the Evaluation of XML Retrieval (INEX), and the method won the competition in 2005 and in 2006. This paper applies the GraphSOM to the clustering of a larger dataset in the INEX competition 2007. The results are compared with those obtained when utilizing the more traditional SOM-SD approach. Experimental results show that (1) the GraphSOM is computationally more efficient than the SOM-SD, (2) the performances of both approaches on the larger INEX 2007 dataset are not competitive when compared with those obtained by other participants of the competition using other approaches, and (3) different structural representations of the same dataset can influence the performance of the proposed GraphSOM technique.
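
    Both the SOM-SD and the GraphSOM build on the standard Self-Organizing Map update: find the best-matching unit for an input vector and pull nearby codebook vectors towards it. The sketch below shows only this shared update step; how the input vector is formed from a node's label and its neighbours' mappings, which is where the two models differ, is not shown, and the map size, learning rate, and radius are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
map_h, map_w, dim = 8, 8, 10  # assumed map size and input dimensionality
weights = rng.random((map_h, map_w, dim))
coords = np.stack(np.meshgrid(np.arange(map_h), np.arange(map_w),
                              indexing="ij"), axis=-1)

def train_step(x, lr=0.1, radius=2.0):
    """Find the best-matching unit for x and update its neighbourhood."""
    dists = np.linalg.norm(weights - x, axis=-1)
    bmu = np.unravel_index(np.argmin(dists), dists.shape)
    grid_dist = np.linalg.norm(coords - np.array(bmu), axis=-1)
    influence = np.exp(-(grid_dist ** 2) / (2 * radius ** 2))[..., None]
    weights[:] = weights + lr * influence * (x - weights)
    return bmu  # grid coordinates of the winning neuron

x = rng.random(dim)  # stand-in for a node's composite input vector
print(train_step(x))
```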

    A machine learning approach to link prediction for interlinked documents

    This paper explains how a recently developed machine learning approach, namely the Probability Measure Graph Self-Organizing Map (PM-GraphSOM), can be used for the generation of links between referenced or otherwise interlinked documents. This new generation of SOM models is capable of projecting generic graph-structured data onto a fixed-sized display space. Such a mechanism is normally used for dimension reduction, visualization, or clustering purposes. This paper shows that the PM-GraphSOM training algorithm “inadvertently” encodes relations that exist between the atomic elements in a graph. If the nodes in the graph represent documents, and the links in the graph represent the reference (or hyperlink) structure of the documents, then it is possible to obtain a set of links for a test document whose link structure is unknown. A significant finding of this paper is that the described approach is scalable in that links can be extracted in linear time. It is also shown that the proposed approach is capable of predicting the pages which would be linked to a new document, and of predicting the links from a given test document to other documents. The approach is applied to web pages from Wikipedia, a relatively large XML text database consisting of many referenced documents.
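
    An illustrative sketch of the link-prediction idea in the abstract: once documents are projected onto a trained map, candidate links for a test document can be read off from the documents that were mapped to the same or nearby neurons, which is a linear-time lookup. The document-to-neuron mapping below is an assumed input, not actual PM-GraphSOM output.

```python
from collections import defaultdict

# Assumed training result: each document id -> (row, col) of its winning neuron.
doc_mapping = {
    "doc_a": (3, 4),
    "doc_b": (3, 4),
    "doc_c": (3, 5),
    "doc_d": (7, 0),
}

neuron_to_docs = defaultdict(list)
for doc, neuron in doc_mapping.items():
    neuron_to_docs[neuron].append(doc)

def predict_links(test_neuron, radius=1):
    """Return documents mapped within `radius` grid steps of the test document's neuron."""
    r0, c0 = test_neuron
    candidates = []
    for (r, c), docs in neuron_to_docs.items():
        if max(abs(r - r0), abs(c - c0)) <= radius:
            candidates.extend(docs)
    return candidates

# A test document whose graph representation maps to neuron (3, 4) would be
# offered doc_a, doc_b and doc_c as candidate link targets.
print(predict_links((3, 4)))
```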