Spectral information criterion for automatic elbow detection
We introduce a generalized information criterion that contains other
well-known information criteria, such as Bayesian information criterion (BIC)
and Akaike information criterion (AIC), as special cases. Furthermore, the
proposed spectral information criterion (SIC) is also more general than the
other information criteria, e.g., since the knowledge of a likelihood function
is not strictly required. SIC extracts geometric features of the error curve
and, as a consequence, it can be considered an automatic elbow detector. SIC
provides a subset of all possible models, with a cardinality that often is much
smaller than the total number of possible models. The elements of this subset
are elbows of the error curve. A practical rule for selecting a unique model
within the set of elbows is suggested as well. Theoretical invariance
properties of SIC are analyzed. Moreover, we test SIC in ideal scenarios where
it always provides the optimal expected results. We also test SIC in several
numerical experiments: some involving synthetic data, and two experiments
involving real datasets. They are all real-world applications such as
clustering, variable selection, or polynomial order selection, to name a few.
The results show the benefits of the proposed scheme. Matlab code related to
the experiments is also provided. Possible future research lines are finally
discussed.
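To make the idea of an elbow of an error curve concrete, here is a minimal sketch of a generic geometric heuristic. It is not the SIC procedure itself, whose details the abstract does not give. It rescales both axes to [0, 1] and returns the point lying farthest below the chord joining the first and last points of the curve. The function name `elbow_index` and the sample error values are illustrative assumptions.

```python
import math


def elbow_index(errors):
    """Return the index of the point farthest below the chord joining
    the first and last points of a decreasing error curve.

    Generic heuristic for illustration only; NOT the SIC algorithm.
    """
    n = len(errors)
    if n < 3:
        raise ValueError("need at least three points")
    span = float(errors[0]) - float(errors[-1])
    if span <= 0:
        raise ValueError("curve must decrease from first to last point")

    best_k, best_d = 0, float("-inf")
    for k, e in enumerate(errors):
        x = k / (n - 1)                      # model index rescaled to [0, 1]
        y = (float(e) - errors[-1]) / span   # error rescaled: 1 at start, 0 at end
        # The chord runs from (0, 1) to (1, 0), i.e. the line x + y = 1.
        # Signed distance is positive for points below the chord.
        d = (1.0 - x - y) / math.sqrt(2.0)
        if d > best_d:
            best_k, best_d = k, d
    return best_k


# Hypothetical error curve, e.g. fitting error for 1..7 components.
errors = [100, 40, 12, 10, 9, 8.5, 8]
print(elbow_index(errors))
```

The returned value is a zero-based position on the curve, so if the first entry corresponds to one component, the selected number of components is the index plus one.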
Universal and Automatic Elbow Detection for Learning the Effective Number of Components in Model Selection Problems
We design a Universal Automatic Elbow Detector (UAED) for deciding the
effective number of components in model selection problems. The relationship
with the information criteria widely employed in the literature is also
discussed. The proposed UAED does not require the knowledge of a likelihood
function and can be easily applied in diverse applications, such as regression
and classification, feature and/or order selection, clustering, and dimension
reduction. Several experiments involving synthetic and real data show the
advantages of the proposed scheme over benchmark techniques in the literature.
Thread-level information for comment classification in community question answering
Community Question Answering (cQA) is a new application of QA in social contexts (e.g., fora). It presents new interesting challenges and research directions, e.g., exploiting the dependencies between the different comments of a thread to select the best answer for a given question. In this paper, we explored two ways of modeling such dependencies: (i) by designing specific features looking globally at the thread; and (ii) by applying structured prediction models. We trained and evaluated our models on data from SemEval-2015 Task 3 on Answer Selection in cQA. Our experiments show that: (i) the thread-level features consistently improve the performance for a variety of machine learning models, yielding state-of-the-art results; and (ii) sequential dependencies between the answer labels captured by structured prediction models are not enough to improve the results, indicating that more information is needed in the joint model.
Sweet cherry production in South Patagonia
In South Patagonia, the total sweet cherry (Prunus avium L.) area has increased from 176 ha in 1997 to 507 ha in 2004, of which 232 ha are located in Los Antiguos (46°19′ SL; 220 m elevation), 158 ha in the Lower Valley of Chubut River (LVCHR) (43°16′ SL; 30 m elevation), 52 ha in Sarmiento (45°35′ SL; 270 m elevation), 35 ha in Esquel (42°55′ SL; 570 m elevation) and 30 ha in Comodoro Rivadavia (45°52′ SL; 50 m elevation). The most common varieties are 'Lapins', 'Bing', 'Newstar', 'Sweetheart', 'Stella', 'Sunburst' and 'Van' grafted on 'Mahaleb', 'Pontaleb', 'SL 64', 'Colt' or 'Mazzard' rootstocks. Trees generally are drip-irrigated and planted at high densities, using training systems such as Tatura, central leader and modified vase (2700, 1100 and 1000 trees ha⁻¹, respectively). Growers in Los Antiguos are more traditional, planting mainly as vase (400 to 1000 trees ha⁻¹) or freestanding trees (280 trees ha⁻¹) and irrigating by gravity (74% of the area). Only 4.4% of the area of Los Antiguos is frost protected, as growers rely strongly on the moderating effect of Lake Buenos Aires. Frost control systems are absent in Comodoro Rivadavia because the established orchards are located next to the sea, in an area with low risk of frost. The frost-protected area is 49% in Sarmiento, 35% in Esquel and 57% in LVCHR. Fruit are harvested from November (LVCHR) to the end of January (Los Antiguos and Esquel), and the harvest-only labour demand during the 2004/2005 season was 100,000 h. In that season, seven packinghouses exported 390 t (45% of the total production) to Europe. Most orchards have not yet reached their mature stage and new ones are being established. Therefore, fruit volumes will continue to increase and shortages of labour and packing facilities may become a constraint.
Overview of the CLEF-2018 CheckThat! Lab on Automatic Identification and Verification of Political Claims. Task 1: Check-Worthiness
We present an overview of the CLEF-2018 CheckThat! Lab on Automatic Identification and Verification of Political Claims, with focus on Task 1: Check-Worthiness. The task asks to predict which claims in a political debate should be prioritized for fact-checking. In particular, given a debate or a political speech, the goal was to produce a ranked list of its sentences based on their worthiness for fact-checking. We offered the task in both English and Arabic, based on debates from the 2016 US Presidential Campaign, as well as on some speeches during and after the campaign. A total of 30 teams registered to participate in the Lab and seven teams actually submitted systems for Task 1. The most successful approaches used by the participants relied on recurrent and multi-layer neural networks, as well as on combinations of distributional representations, on matching claims' vocabulary against lexicons, and on measures of syntactic dependency. The best systems achieved mean average precision of 0.18 and 0.15 on the English and on the Arabic test datasets, respectively. This leaves large room for further improvement, and thus we release all datasets and the scoring scripts, which should enable further research in check-worthiness estimation.
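For readers unfamiliar with the metric quoted above, the following is a minimal sketch of mean average precision over ranked lists, using the standard textbook definition. It is not the lab's released scoring script, and the function names and toy relevance labels are illustrative assumptions.

```python
def average_precision(ranked_labels):
    """Average precision for one ranked list.

    ranked_labels: sequence of 0/1 flags in ranked order, where 1 marks
    a check-worthy sentence (assumed encoding for this sketch).
    """
    hits = 0
    total = 0.0
    for rank, relevant in enumerate(ranked_labels, start=1):
        if relevant:
            hits += 1
            total += hits / rank  # precision at this relevant position
    return total / hits if hits else 0.0


def mean_average_precision(runs):
    """Mean of the per-list average precision, e.g. one list per debate."""
    if not runs:
        raise ValueError("need at least one ranked list")
    return sum(average_precision(r) for r in runs) / len(runs)


# Two hypothetical debates with ranked sentence labels.
debates = [[1, 0, 1, 0], [0, 1]]
print(round(mean_average_precision(debates), 4))
```

In this sketch, a list with no relevant sentence contributes an average precision of zero; an actual scorer may treat that case differently.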
Overview of the CLEF-2018 CheckThat! Lab on Automatic Identification and Verification of Political Claims. Task 2: Factuality
We present an overview of the CLEF-2018 CheckThat! Lab on Automatic Identification and Verification of Political Claims, with focus on Task 2: Factuality. The task asked to assess whether a given check-worthy claim made by a politician in the context of a debate/speech is factually true, half-true, or false. In terms of data, we focused on debates from the 2016 US Presidential Campaign, as well as on some speeches during and after the campaign (we also provided translations in Arabic), and we relied on comments and factuality judgments from factcheck.org and snopes.com, which we further refined manually. A total of 30 teams registered to participate in the lab, and five of them actually submitted runs. The most successful approaches used by the participants relied on the automatic retrieval of evidence from the Web. Similarities and other relationships between the claim and the retrieved documents were used as input to classifiers in order to make a decision. The best-performing official submissions achieved mean absolute error of 0.705 and 0.658 for the English and for the Arabic test sets, respectively. This leaves plenty of room for further improvement, and thus we release all datasets and the scoring scripts, which should enable further research in fact-checking.
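As a rough illustration of how a mean absolute error can be computed over the three factuality labels, here is a minimal sketch. It assumes the labels are placed on an ordinal scale (false = 0, half-true = 1, true = 2); the abstract does not state the encoding used by the official scorer, so both the mapping and the function name are assumptions.

```python
# Assumed ordinal encoding of the three factuality labels (not from the source).
LABEL_TO_SCORE = {"false": 0, "half-true": 1, "true": 2}


def mean_absolute_error(gold, predicted):
    """Mean absolute difference between gold and predicted ordinal labels."""
    if len(gold) != len(predicted) or not gold:
        raise ValueError("label lists must be non-empty and of equal length")
    total = sum(
        abs(LABEL_TO_SCORE[g] - LABEL_TO_SCORE[p])
        for g, p in zip(gold, predicted)
    )
    return total / len(gold)


# Hypothetical gold and predicted labels for four claims.
gold = ["true", "false", "half-true", "true"]
pred = ["half-true", "false", "false", "false"]
print(mean_absolute_error(gold, pred))
```

Under this encoding, confusing true with false costs twice as much as confusing either with half-true, which is the usual motivation for an ordinal error measure over plain accuracy.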
Overview of the CLEF-2018 CheckThat! lab on automatic identification and verification of political claims
We present an overview of the CLEF-2018 CheckThat! Lab on Automatic Identification and Verification of Political Claims. In its starting year, the lab featured two tasks. Task 1 asked to predict which (potential) claims in a political debate should be prioritized for fact-checking; in particular, given a debate or a political speech, the goal was to produce a ranked list of its sentences based on their worthiness for fact-checking. Task 2 asked to assess whether a given check-worthy claim made by a politician in the context of a debate/speech is factually true, half-true, or false. We offered both tasks in English and in Arabic. In terms of data, for both tasks, we focused on debates from the 2016 US Presidential Campaign, as well as on some speeches during and after the campaign (we also provided translations in Arabic), and we relied on comments and factuality judgments from factcheck.org and snopes.com, which we further refined manually. A total of 30 teams registered to participate in the lab, and 9 of them actually submitted runs. The evaluation results show that the most successful approaches used various neural networks (esp. for Task 1) and evidence retrieval from the Web (esp. for Task 2). We release all datasets, the evaluation scripts, and the submissions by the participants, which should enable further research in both check-worthiness estimation and automatic claim verification.
Crop residue grazing and tillage systems effects on soil physical properties and corn (Zea mays L.) performance
Crop-livestock systems under no till (NT) could negatively affect soil physical properties and crop performance, due to the additive effects of reduced soil cover and cattle trampling due to livestock grazing, and the absence of tillage. We evaluated the effects of four grazing strategies and of a shallow tillage (ST) on soil physical properties and corn (Zea mays L.) performance for a mollisol after 15 years under crop-livestock systems under NT in Argentina. Grazing strategies evaluated were: closure (C), one grazing (OG), high stocking rate (HR) and farmer's management (FM), and the tillage systems were: NT and ST. Bulk density (BD), penetration resistance (PR), hydraulic conductivity (ks), plant population, surface root distribution, aboveground dry matter accumulation, aboveground total N (TN) accumulation and corn yield were evaluated. High stocking rate and FM increased PR. On the other hand, ST decreased PR and BD and increased ks. Corn yield was higher under ST than under NT, and under HR than under the other grazing strategies. Total N accumulation was higher under HR than under the rest of the grazing strategies. Rational grazing management and use of tillage systems on resilient soils could have prevented soil physical properties from being affected beyond critical thresholds. Facultad de Ciencias Agrarias y Forestales
Evaluation of prognostic factors among patients with chronic graft-versus-host disease
Background: Chronic graft-versus-host disease (cGVHD) is a major complication after allogeneic stem cell transplantation with an adverse effect on both mortality and morbidity. In 2005, the National Institutes of Health (NIH) proposed new criteria for diagnosis and classification of chronic graft-versus-host disease for clinical trials. New sub-categories were recognized such as late onset acute graft-versus-host disease and overlap syndrome.
Design and methods: We evaluated the prognostic impact of the new sub-categories as well as the clinical scoring system proposed by the National Institutes of Health in a retrospective, multicenter study of 820 patients undergoing allogeneic stem cell transplantation between 2000 and 2006 at 3 different institutions. Patients were retrospectively categorized according to the National Institutes of Health criteria from patients' medical histories.
Results: As far as the new sub-categories are concerned, in univariate analysis diagnosis of overlap syndrome adversely affected the outcome. Also, the number of organs involved for a cut-off value of 4 significantly influenced both cGVHD related mortality and survival. In multivariate analysis, in addition to NIH score, platelet count and performance score at the time of cGVHD diagnosis, plus gut involvement, significantly influenced outcome. These 3 variables allowed us to develop a simple score system which identifies 4 subgroups of patients with 84%, 64%, 43% and 0% overall survival at five years after cGVHD diagnosis (score 0: HR=15.96 (95% CI: 6.85-37.17), P<0.001; score 1: HR=5.47 (95% CI: 2.6-11.5), P<0.001; score 2: HR=2.8 (95% CI: 1.32-5.93), P=0.007).
Conclusions: In summary, we have identified a powerful and simple tool to discriminate different subgroups of patients in terms of chronic graft-versus-host disease related mortality and survival.