
    Feature Selection with the Boruta Package

    This article describes the R package Boruta, which implements a novel feature selection algorithm for finding all relevant variables. The algorithm is designed as a wrapper around a Random Forest classification algorithm. It iteratively removes features that a statistical test shows to be less relevant than random probes. The Boruta package provides a convenient interface to the algorithm. A short description of the algorithm and examples of its application are presented.
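    The shadow-feature idea described above can be sketched briefly. This is a minimal Python illustration, not the package's R implementation: the function names are hypothetical, and a simple absolute-correlation score stands in for the Random Forest importances the real algorithm uses.

    ```python
    import numpy as np

    def shadow_importances(X, y, importance_fn, rng):
        # Build "shadow" probes by permuting each real feature independently,
        # then score real and shadow features together.
        X_shadow = np.apply_along_axis(rng.permutation, 0, X)
        imp = importance_fn(np.hstack([X, X_shadow]), y)
        n = X.shape[1]
        return imp[:n], imp[n:]

    def boruta_sketch(X, y, importance_fn, n_iter=20, seed=0):
        # A feature scores a "hit" whenever it beats the best random probe.
        # The real package decides via a binomial test on the hit counts;
        # a fixed majority threshold stands in here.
        rng = np.random.default_rng(seed)
        hits = np.zeros(X.shape[1], dtype=int)
        for _ in range(n_iter):
            real, shadow = shadow_importances(X, y, importance_fn, rng)
            hits += (real > shadow.max()).astype(int)
        return hits > n_iter // 2

    def abs_corr(X, y):
        # Toy importance: absolute correlation with the target
        # (a stand-in for Random Forest importance).
        Xc = X - X.mean(axis=0)
        yc = y - y.mean()
        denom = np.sqrt((Xc ** 2).sum(axis=0) * (yc ** 2).sum()) + 1e-12
        return np.abs(Xc.T @ yc) / denom
    ```

    On toy data where only the first feature drives the response, the sketch keeps that feature and rejects the noise columns, mirroring the all-relevant selection the package performs with Random Forest importances.
    
    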

    Generalized Strong Curvature Singularities and Cosmic Censorship

    A new definition of a strong curvature singularity is proposed. This definition is motivated by the definitions given by Tipler and Krolak, but is significantly different and more general. All causal geodesics terminating at these new singularities, which we call generalized strong curvature singularities, are classified into three possible types; the classification is based on certain relations between the curvature strength of the singularities and the causal structure in their neighborhood. A cosmic censorship theorem is formulated and proved which shows that only one class of generalized strong curvature singularities, corresponding to a single type of geodesics in our classification, can be naked. Implications of this result for the cosmic censorship hypothesis are indicated. (11 pages, no figures; to appear in Mod. Phys. Lett.)

    The need for standardisation in life science research - an approach to excellence and trust

    Today, academic researchers benefit from the changes driven by digital technologies and the enormous growth of knowledge and data, from globalisation, the enlargement of the scientific community, and the linkage between different scientific communities and society. To fully benefit from this development, however, information needs to be shared openly and transparently. Digitalisation plays a major role here because it permeates all areas of business, science and society and is one of the key drivers of innovation and international cooperation. To address the resulting opportunities, the EU, through its European strategy for Open Science (OS), promotes the development and use of collaborative ways to produce and share knowledge and data as early as possible in the research process, while also appropriately securing results. It is now widely recognised that making research results more accessible to all societal actors contributes to more effective and efficient science; it also serves as a boost for innovation in the public and private sectors. However, for research data to be findable, accessible, interoperable and reusable, the use of standards is essential. At the metadata level, considerable efforts in standardisation have already been made (e.g. Data Management Plans and the FAIR Principles), whereas for raw data these fundamental efforts are still fragmented and in some cases completely missing. The CHARME consortium, funded by the European Cooperation in Science and Technology (COST) programme, has identified needs and gaps in the field of standardisation in the life sciences and discussed potential hurdles for the implementation of standards in current practice. Here, the authors suggest four measures in response to current challenges to ensure a high quality of life science research data and their re-usability for research and innovation.

    Prediction of overall survival for patients with metastatic castration-resistant prostate cancer: development of a prognostic model through a crowdsourced challenge with open clinical trial data

    Background: Improvements to prognostic models in metastatic castration-resistant prostate cancer have the potential to augment clinical trial design and guide treatment strategies. In partnership with Project Data Sphere, a not-for-profit initiative allowing data from cancer clinical trials to be shared broadly with researchers, we designed an open-data, crowdsourced DREAM (Dialogue for Reverse Engineering Assessments and Methods) challenge to not only identify a better prognostic model for prediction of survival in patients with metastatic castration-resistant prostate cancer, but also engage a community of international data scientists to study this disease.

    Methods: Data from the comparator arms of four phase 3 clinical trials in first-line metastatic castration-resistant prostate cancer were obtained from Project Data Sphere: 476 patients treated with docetaxel and prednisone from the ASCENT2 trial; 526 patients treated with docetaxel, prednisone, and placebo in the MAINSAIL trial; 598 patients treated with docetaxel, prednisone or prednisolone, and placebo in the VENICE trial; and 470 patients treated with docetaxel and placebo in the ENTHUSE 33 trial. Datasets consisting of more than 150 clinical variables were curated centrally, including demographics, laboratory values, medical history, lesion sites, and previous treatments. Data from ASCENT2, MAINSAIL, and VENICE were released publicly as training data for predicting the outcome of interest, namely overall survival. Clinical data were also released for ENTHUSE 33, but the outcome variables (overall survival and event status) were hidden from challenge participants so that ENTHUSE 33 could be used for independent validation. Methods were evaluated using the integrated time-dependent area under the curve (iAUC). The reference model, based on eight clinical variables and a penalised Cox proportional-hazards model, was used to compare method performance. Further validation was done using data from a fifth trial, ENTHUSE M1, in which 266 patients with metastatic castration-resistant prostate cancer were treated with placebo alone.

    Findings: 50 independent methods were developed to predict overall survival and were evaluated through the DREAM challenge. The top performer was based on an ensemble of penalised Cox regression models (ePCR), which uniquely identified predictive interaction effects with immune biomarkers and markers of hepatic and renal function. Overall, ePCR outperformed all other methods (iAUC 0.791; Bayes factor >5) and surpassed the reference model (iAUC 0.743; Bayes factor >20). Both the ePCR and reference models stratified patients in the ENTHUSE 33 trial into high-risk and low-risk groups with significantly different overall survival (ePCR: hazard ratio 3.32, 95% CI 2.39-4.62, p …).

    Interpretation: Novel prognostic factors were delineated, and the assessment of 50 methods developed by independent international teams establishes a benchmark for development of methods in the future. The results of this effort show that data sharing, when combined with a crowdsourced challenge, is a robust and powerful framework to develop new prognostic models in advanced prostate cancer. Peer reviewed.
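    The model family behind both the reference model and the winning ePCR ensemble, a penalised Cox proportional-hazards fit, can be sketched briefly. This is a minimal numpy illustration under assumptions, not the challenge code: the function names are hypothetical, the penalty is plain ridge, and a central-difference numerical gradient keeps the sketch short where a real fit would use the analytic gradient or a survival library.

    ```python
    import numpy as np

    def cox_ridge_nll(beta, X, time, event, lam):
        # Negative log partial likelihood of a Cox model plus an L2 penalty.
        eta = X @ beta
        nll = 0.0
        for i in np.flatnonzero(event):
            at_risk = time >= time[i]  # subjects still under observation at this event
            nll -= eta[i] - np.log(np.exp(eta[at_risk]).sum())
        return nll + lam * (beta ** 2).sum()

    def fit_cox_ridge(X, time, event, lam=0.1, lr=0.005, n_steps=400):
        # Plain gradient descent with a central-difference gradient (sketch only).
        beta = np.zeros(X.shape[1])
        eps = 1e-5
        for _ in range(n_steps):
            grad = np.zeros_like(beta)
            for k in range(beta.size):
                up, down = beta.copy(), beta.copy()
                up[k] += eps
                down[k] -= eps
                grad[k] = (cox_ridge_nll(up, X, time, event, lam)
                           - cox_ridge_nll(down, X, time, event, lam)) / (2 * eps)
            beta -= lr * grad
        return beta
    ```

    On simulated data with one truly prognostic covariate, the fitted coefficient recovers the effect's sign and reduces the penalised likelihood; the ePCR approach described in the abstract would train many such penalised models and combine their risk predictions.
    
    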