1,507 research outputs found

Towards ultrahigh dimensional feature selection for big data

Author: Tan M
Tsang IW
Wang L
Publication date: 01/01/2014

In this paper, we present a new adaptive feature scaling scheme for ultrahigh-dimensional feature selection on Big Data, and then reformulate it as a convex semi-infinite programming (SIP) problem. To address the SIP, we propose an efficient feature generating paradigm. Unlike traditional gradient-based approaches that conduct optimization on all input features, the proposed paradigm iteratively activates a group of features and solves a sequence of multiple kernel learning (MKL) subproblems. To further speed up the training, we propose to solve the MKL subproblems in their primal forms through a modified accelerated proximal gradient approach. Owing to this optimization scheme, some efficient cache techniques are also developed. The feature generating paradigm is guaranteed to converge globally under mild conditions, and can achieve lower feature selection bias. Moreover, the proposed method can tackle two challenging tasks in feature selection: 1) group-based feature selection with complex structures, and 2) nonlinear feature selection with explicit feature mappings. Comprehensive experiments on a wide range of synthetic and real-world data sets of tens of millions of data points with O(10^14) features demonstrate the competitive performance of the proposed method over state-of-the-art feature selection methods in terms of generalization performance and training efficiency. © 2014 Mingkui Tan, Ivor W. Tsang and Li Wang

OPUS - University of Technology Sydney

DR-NTU (Digital Repository of NTU)

Principal Graph and Structure Learning Based on Reversed Graph Embedding

Author: Mao Q
Sun Y
Tsang IW
Wang L
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/11/2017

© 2017 IEEE. Many scientific datasets are of high dimension, and the analysis usually requires retaining the most important structures of data. Principal curve is a widely used approach for this purpose. However, many existing methods work only for data with structures that are mathematically formulated by curves, which is quite restrictive for real applications. A few methods can overcome the above problem, but they either require complicated human-made rules for a specific task and lack the flexibility to adapt to different tasks, or cannot obtain explicit structures of data. To address these issues, we develop a novel principal graph and structure learning framework that captures the local information of the underlying graph structure based on reversed graph embedding. As showcases, models that can learn a spanning tree or a weighted undirected ℓ1 graph are proposed, and a new learning algorithm is developed that simultaneously learns a set of principal points and a graph structure from data. The new algorithm is simple with guaranteed convergence. We then extend the proposed framework to deal with large-scale data. Experimental results on various synthetic and six real-world datasets show that the proposed method compares favorably with baselines and can uncover the underlying structure correctly.

OPUS - University of Technology Sydney

Increased entropy of signal transduction in the cancer metastasis phenotype

Author: A Barrat
A Naderi
A Ozgür
A Platzer
AA Samani
AE Teschendorff
AL Barabasi
Andrew E Teschendorff
B Derrida
C Sotiriou
D Pardoll
D Yee
DM Bates
DP Tuck
F Rapaport
H Jeong
H Yu
HY Chuang
I Ulitsky
IW Taylor
J Schäfer
JD Han
JD Storey
JJ Hornberg
K Chin
KR Brown
M Barthelemy
M Neuberg
M Schmidt
MA Pujana
P Farmer
PF Jonsson
RK Nibbe
S Loi
S Negrini
SF Chin
Simone Severini
SL Carter
TS Prasad
U Stelzl
Y Wang
Publication date: 01/01/2010

Studies into the statistical properties of biological networks have led to important biological insights, such as the presence of hubs and hierarchical modularity. There is also a growing interest in studying the statistical properties of networks in the context of cancer genomics. However, relatively little is known as to what network features differ between the cancer and normal cell physiologies, or between different cancer cell phenotypes. Based on the observation that frequent genomic alterations underlie a more aggressive cancer phenotype, we asked if such an effect could be detectable as an increase in the randomness of local gene expression patterns. Using a breast cancer gene expression data set and a model network of protein interactions we derive constrained weighted networks defined by a stochastic information flux matrix reflecting expression correlations between interacting proteins. Based on this stochastic matrix we propose and compute an entropy measure that quantifies the degree of randomness in the local pattern of information flux around single genes. By comparing the local entropies in the non-metastatic versus metastatic breast cancer networks, we here show that breast cancers that metastasize are characterised by a small yet significant increase in the degree of randomness of local expression patterns. We validate this result in three additional breast cancer expression data sets and demonstrate that local entropy better characterises the metastatic phenotype than other non-entropy based measures. We show that increases in entropy can be used to identify genes and signalling pathways implicated in breast cancer metastasis. Further exploration of such integrated cancer expression and protein interaction networks will therefore be a fruitful endeavour. Comment: 5 figures, 2 Supplementary Figures and Table
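
The abstract above does not give the formula for its entropy measure, so the following is only an illustrative sketch of one plausible reading: the Shannon entropy of each gene's outgoing weights, after normalizing them into a row of a stochastic matrix, divided by the logarithm of the number of neighbours so that values fall between 0 and 1. The function name `local_entropies`, the dictionary layout, and the toy gene names are assumptions made for illustration and are not taken from the paper.

```python
import math


def local_entropies(weights):
    """Normalized Shannon entropy of each node's outgoing weight distribution.

    `weights` maps node -> {neighbour: non-negative weight} (hypothetical layout).
    Each node's weights are normalized to sum to 1, i.e. one row of a
    row-stochastic matrix, before the entropy is computed.
    """
    entropies = {}
    for node, neighbours in weights.items():
        total = sum(neighbours.values())
        degree = len(neighbours)
        if total <= 0 or degree < 2:
            continue  # entropy is degenerate for isolated or degree-1 nodes
        probs = [w / total for w in neighbours.values()]
        shannon = -sum(p * math.log(p) for p in probs if p > 0)
        entropies[node] = shannon / math.log(degree)  # scale into [0, 1]
    return entropies


# Toy example with made-up weights: geneA spreads flux evenly, geneB does not.
toy = {
    "geneA": {"geneB": 1.0, "geneC": 1.0},
    "geneB": {"geneA": 9.0, "geneC": 1.0},
}
result = local_entropies(toy)
```

In this toy network, the evenly weighted node reaches the maximum normalized entropy, while the node dominated by a single interaction scores lower; the measure used in the paper may differ in its weighting and normalization.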

arXiv.org e-Print Archive

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

UCL Discovery

PubMed Central

A critical evaluation of network and pathway based classifiers for outcome prediction in breast cancer

Author: A Subramanian
C Desmedt
Christine Staiger
D Hanahan
E Lee
F Reyal
G Abraham
GR Mishra
Gunnar W. Klau
HY Chuang
I Ulitsky
IW Taylor
Joaquín Dopazo
K Chin
KR Brown
L Ein-Dor
L Tian
LD Miller
LFA Wessels
LJ van’t Veer
Lodewyk F. A. Wessels
M Kanehisa
Marcus Dittrich
MH van Vliet
MJ van de Vijver
ML Gatza
MT Dittrich
P Dao
Raul Kooter
S Loi
S Ma
SA Chowdhury
Sidney Cadot
Tobias Müller
TSK Prasad
V Popovici
Y Pawitan
Y Wang
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 01/10/2011

Recently, several classifiers that combine primary tumor data, like gene expression data, and secondary data sources, such as protein-protein interaction networks, have been proposed for predicting outcome in breast cancer. In these approaches, new composite features are typically constructed by aggregating the expression levels of several genes. The secondary data sources are employed to guide this aggregation. Although many studies claim that these approaches improve classification performance over single gene classifiers, the gain in performance is difficult to assess. This stems mainly from the fact that different breast cancer data sets and validation procedures are employed to assess the performance. Here we address these issues by employing a large cohort of six breast cancer data sets as benchmark set and by performing an unbiased evaluation of the classification accuracies of the different approaches. Contrary to previous claims, we find that composite feature classifiers do not outperform simple single gene classifiers. We investigate the effect of (1) the number of selected features; (2) the specific gene set from which features are selected; (3) the size of the training set and (4) the heterogeneity of the data set on the performance of composite feature and single gene classifiers. Strikingly, we find that randomization of secondary data sources, which destroys all biological information in these sources, does not result in a deterioration in performance of composite feature classifiers. Finally, we show that when a proper correction for gene set size is performed, the stability of single gene sets is similar to the stability of composite feature sets. Based on these results there is currently no reason to prefer prognostic classifiers based on composite features over single gene classifiers for predicting outcome in breast cancer

arXiv.org e-Print Archive

Public Library of Science (PLOS)

Crossref

VU Research Portal

CWI's Institutional Repository

Directory of Open Access Journals

PubMed Central

Online-Publikations-Server der Universität Würzburg

FigShare

Violations of local stochastic independence exaggerate scalability in Mokken scaling analysis of the Chinese Mandarin SF-36

Author: A Bedford
BT Hempker
CD DeSante
CJ Liu
David R Thompson
GD Mishra
GJ Boyle
IJL Egberink
IW Molenaar
IW Nader
J Starkweather
JE Ware
JH Straat
K Niemöller
K Sijtsma
LA Van der Ark
P Kline
PMG Van der Heijden
R Ligtvoet
R Mokken
R Watson
RE Kuijpers
RJ De Ayala
Roger Watson
RR Meijer
SD Shenkin
W Wang
Wenru Wang
WHM Emons
Publication venue: 'Springer Science and Business Media LLC'

Crossref

Biofilter aquaponic system for nutrients removal from fresh market wastewater

Author: A Endut
A Manarangi
AA Al-Gheethi
AR Rahmani
AW Mayo
B Holm
B Marques
CA Madera-Parra
CE Lim
CH Sim
CY Wang
E Noman
G Provolo
GH Bindu
H Huang
H Teiri
IW Witus
J Del-Pozo
J Popp
JL Catley
JP Fry
KM Buzby
L Dediu
L Silva
NM Apandi
NM Jais
PV Haseena
RA Criley
S Jasrotia
S Mishra
S Naidoo
S Rysgaard
S Wongkiew
SE Boxman
SN Nandeshwar
SO Ojoawo
SS Lam
VK Gupta
XW Chen
Y Xu
YY Fang
Z Hu
Publication venue: Springer Nature
Publication date: 01/01/2020

Aquaponics is a significant wastewater treatment system that combines conventional aquaculture (raising aquatic organisms) with hydroponics (cultivating plants in water) in a symbiotic environment. Because it is natural and environmentally friendly, this system has a high capacity for removing nutrients compared with conventional methods. This chapter reviews the possible application of aquaponic systems to the treatment of fresh market wastewater, with the aim of highlighting the mechanism of phytoremediation that occurs in an aquaponic system. The literature shows that aquaponic systems are able to remove nutrients in the form of nitrogen and phosphorus.

UTHM Institutional Repository

Crossref

Structure and mechanism of human DNA polymerase η

Author: A Alt
A Mees
A Nicholls
A Scrima
AJ McCoy
Alan R. Lehmann
AR Lehmann
AT Brünger
BC Broughton
C Masutani
C Vonrhein
Chikahide Masutani
Christian Biertümpfel
DE Brash
DF Jarosz
DG Vassylyev
DV Bugreev
E Bassett
E Glick
F Wang
Fumio Hanaoka
H Inui
H Ling
H Park
H Saribasak
IW Davis
J Bauer
J Di Lucca
J Trincao
J Yao
Jae Young Lee
JH Min
K Sugasawa
L Jia
L Rey
L Wang
M Tanioka
Mark Gregory
MJ McIlwraith
P Emsley
R Kusumoto
RC Wilson
RE Johnson
RJ Kokoska
S Broyde
S Creighton
S Lone
Santiago Ramón-Maiques
SD McCulloch
SG Chaney
T Hishida
T Kawamoto
TC Terwilliger
W Kabsch
W Yang
WA Hendrickson
Wei Yang
Y Li
Ye Zhao
Yuji Kondo
Z Otwinowski
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 24/06/2010

The variant form of the human syndrome xeroderma pigmentosum (XPV) is caused by a deficiency in DNA polymerase eta (Pol eta), a DNA polymerase that enables replication through ultraviolet-induced pyrimidine dimers. Here we report high-resolution crystal structures of human Pol eta at four consecutive steps during DNA synthesis through cis-syn cyclobutane thymine dimers. Pol eta acts like a 'molecular splint' to stabilize damaged DNA in a normal B-form conformation. An enlarged active site accommodates the thymine dimer with excellent stereochemistry for two-metal ion catalysis. Two residues conserved among Pol eta orthologues form specific hydrogen bonds with the lesion and the incoming nucleotide to assist translesion synthesis. On the basis of the structures, eight Pol eta missense mutations causing XPV can be rationalized as undermining the molecular splint or perturbing the active-site alignment. The structures also provide an insight into the role of Pol eta in replicating through D loop and DNA fragile sites

Crossref

PubMed Central

Sussex Research Online

An effective theory for jet propagation in dense QCD matter: jet broadening and medium-induced bremsstrahlung

Author: A Hornig
A Idilbi
AV Belitsky
AV Manohar
B-W Zhang
CW Bauer
G Aad
GF Sterman
GL Bayatian
Grigory Ovanesyan
GT Bodwin
I Vitev
Ivan Vitev
IW Stewart
J-w Qiu
JC Collins
JR Ellis
K Aamodt
LF Abbott
M Baumgart
M Djordjevic
M Gyulassy
M Ploskon
ME Luke
PB Arnold
R Baier
R Sharma
RB Neufeld
RP Feynman
S Fleming
S Mantry
S Salur
T Becher
T Renk
V Ahrens
X-d Ji
X-N Wang
Y-S Lai
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 19/12/2011

Two effects, jet broadening and gluon bremsstrahlung induced by the propagation of a highly energetic quark in dense QCD matter, are reconsidered from an effective theory point of view. We modify the standard Soft Collinear Effective Theory (SCET) Lagrangian to include Glauber modes, which are needed to implement the interactions between the medium and the collinear fields. We derive the Feynman rules for this Lagrangian and show that it is invariant under soft and collinear gauge transformations. We find that the newly constructed theory SCET$_{\rm G}$ recovers exactly the general result for the transverse momentum broadening of jets. In the limit where the radiated gluons are significantly less energetic than the parent quark, we obtain a jet energy-loss kernel identical to the one discussed in the reaction operator approach to parton propagation in matter. In the framework of SCET$_{\rm G}$ we present results for the fully-differential bremsstrahlung spectrum for both the incoherent and the Landau-Pomeranchuk-Migdal suppressed regimes beyond the soft-gluon approximation. Gauge invariance of the physics results is demonstrated explicitly by performing the calculations in both the light-cone and covariant $R_{\xi}$ gauges. We also show how the process-dependent medium-induced radiative corrections factorize from the jet production cross section on the example of the quark jets considered here. Comment: 52 pages, 15 pdf figures, as published in JHE

arXiv.org e-Print Archive

Crossref

Effects of Redispersible Polymer Powder on Mechanical and Durability Properties of Preplaced Aggregate Concrete with Recycled Railway Ballast

Author: A Qudoos
ACI Committee 304
ACI Committee 318
ASTM C109
ASTM C192
ASTM C39
ASTM C469
ASTM C666
ASTM C78
B Persson
E Sakai
F Shaker
H Choi
IW Lee
J Li
JA Rossignolo
JB Kardon
JF Muñoz
JH Kim
JK Norvell
JY Choi
K Kim
K McNeil
K Murao
Korean Standards Association
L Jun
LK Aggarwal
M Wang
MF Najjar
P Galvín
R Wang
S Liu
S Miura
SY Jang
THK Kang
Y Bezin
Y Ohama
Y Zhang
Z Su
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/11/2018

The rapid-hardening method employing the injection of calcium sulfoaluminate (CSA) cement mortar into voids between preplaced ballast aggregates has recently emerged as a promising approach for the renovation of existing ballasted railway tracks to concrete tracks. This method typically involves the use of a redispersible polymer powder to enhance the durability of the resulting recycled aggregate concrete. However, the effects of the amount of polymer on the mechanical and durability properties of recycled ballast aggregate concrete were not clearly understood. In addition, the effects of the cleanness condition of ballast aggregates were never examined. This study aimed at investigating these two aspects through compression and flexure tests, shrinkage tests, freezing-thawing resistance tests, and optical microscopy. The results revealed that an increase in the amount of polymer generally decreased the compressive strength at the curing age of 28 days. However, the use of a higher polymer ratio enhanced the modulus of rupture, freezing-thawing resistance, and shrinkage resistance, likely because it improved the microstructure of the interfacial transition zones between recycled ballast aggregates and injected mortar. In addition, a higher cleanness level of ballast aggregates generally improved the mechanical and durability qualities of concrete

Crossref

Directory of Open Access Journals

ScholarWorks@UNIST

Prognostic gene network modules in breast cancer hold promise

Author: A Calabrò
AE Teschendorff
Andrew E Teschendorff
C Desmedt
C Sotiriou
Carlos Caldas
HY Chuang
IW Taylor
J Li
LJ van't Veer
M Schmidt
MJ van de Vijver
S Paik
Y Wang
Yan Jiao
Publication venue: BioMed Central
Publication date: 08/12/2010

A substantial proportion of lymph node-negative patients who receive adjuvant chemotherapy do not derive any benefit from this aggressive and potentially toxic treatment. However, standard histopathological indices cannot reliably detect patients at low risk of relapse or distant metastasis. In the past few years several prognostic gene expression signatures have been developed and shown to potentially outperform histopathological factors in identifying low-risk patients in specific breast cancer subgroups with predictive values of around 90%, and therefore hold promise for clinical application. We envisage that further improvements and insights may come from integrative expression pathway analyses that dissect prognostic signatures into modules related to cancer hallmarks

Crossref

PubMed Central

UCL Discovery