68 research outputs found
Alternative Methods of Linear Regression
This paper is a survey on traditional linear regression techniques using the l1-, l2-, and
l∞-norm. We derive the characterization of the respective regression estimates (including optimality
and uniqueness criteria) and discuss some of their statistical properties. Statistics Working Papers Series
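To make the difference between the three norms concrete, here is a minimal sketch for the simplest possible regression, fitting a single constant to the data. It is an illustration only and is not taken from the paper; the function name `fit_constant` and the sample data are hypothetical. In this intercept-only case the l1 fit is the median, the l2 fit is the mean, and the l∞ fit is the midrange.

```python
from statistics import mean, median

def fit_constant(y, norm):
    """Return the constant c that minimises the chosen norm of the residuals y_i - c."""
    if norm == "l1":      # least absolute deviations: the median
        return median(y)
    if norm == "l2":      # least squares: the arithmetic mean
        return mean(y)
    if norm == "linf":    # minimax (Chebyshev) fit: the midrange
        return (min(y) + max(y)) / 2
    raise ValueError(f"unknown norm: {norm}")

# Hypothetical data with one gross outlier.
y = [1.0, 2.0, 3.0, 4.0, 100.0]
fits = {norm: fit_constant(y, norm) for norm in ("l1", "l2", "linf")}
print(fits)
```

The outlier barely moves the l1 fit, pulls the l2 fit noticeably, and dominates the l∞ fit. With an even number of observations the l1 minimiser is not unique (any value between the two middle observations is optimal), which is the kind of uniqueness question the abstract refers to.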
A Fast Algorithm for Robust Regression with Penalised Trimmed Squares
The presence of groups containing high leverage outliers makes linear
regression a difficult problem due to the masking effect. The available high
breakdown estimators based on Least Trimmed Squares often do not succeed in
detecting masked high leverage outliers in finite samples.
An alternative to the LTS estimator, called Penalised Trimmed Squares (PTS)
estimator, was introduced by the authors in \cite{ZiouAv:05,ZiAvPi:07} and it
appears to be less sensitive to the masking problem. This estimator is defined
by a Quadratic Mixed Integer Programming (QMIP) problem whose objective
function includes a penalty cost for each observation, which serves as an
upper bound on the residual error for any feasible regression line. Since the
PTS does not require presetting the number of outliers to delete from the data
set, it has better efficiency than other estimators. However, due to
the high computational complexity of the resulting QMIP problem, computing exact
solutions for moderately large regression problems is infeasible.
In this paper we further establish the theoretical properties of the PTS
estimator, such as high breakdown and efficiency, and propose an approximate
algorithm called Fast-PTS to compute the PTS estimator for large data sets
efficiently. Extensive computational experiments on sets of benchmark instances
with varying degrees of outlier contamination indicate that the proposed
algorithm performs well in identifying groups of high leverage outliers in
reasonable computational time. Comment: 27 pages
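The penalised objective described above can be sketched for a fixed candidate line. This is a simplified reading of the abstract, not the authors' Fast-PTS algorithm: it assumes a single penalty value shared by all observations, and the names `pts_cost` and `penalty` are hypothetical. For a given line, each observation either contributes its squared residual or is deleted at the cost of the penalty, whichever is smaller.

```python
def pts_cost(xs, ys, slope, intercept, penalty):
    """Penalised trimmed cost of one candidate line (illustrative sketch).

    Each observation contributes min(squared residual, penalty); observations
    whose squared residual exceeds the penalty are treated as deleted.
    """
    cost = 0.0
    deleted = []
    for i, (x, y) in enumerate(zip(xs, ys)):
        sq_res = (y - (intercept + slope * x)) ** 2
        if sq_res > penalty:
            cost += penalty      # pay the penalty instead of the residual
            deleted.append(i)
        else:
            cost += sq_res
    return cost, deleted

# Hypothetical data: four points on y = x plus one high leverage outlier.
xs = [0.0, 1.0, 2.0, 3.0, 10.0]
ys = [0.0, 1.0, 2.0, 3.0, 0.0]
cost, deleted = pts_cost(xs, ys, slope=1.0, intercept=0.0, penalty=4.0)
print(cost, deleted)
```

Because the penalty caps each observation's contribution, the number of deleted points falls out of the optimisation rather than being fixed in advance. The hard part, which this sketch does not attempt, is searching over candidate lines.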
Neo-Rawlsian fringes: A new approach to market segmentation and product development
A new approach to market segmentation and product development
Effect of organic and biodynamic management on chemical characteristics, macrofauna and biological activity of soil in a vineyard of cv. BRS Carmen
Organic agriculture is based on improving biodiversity and maintaining plant cover, which can favor nutrient cycling, soil aggregation, water storage, organic matter maintenance, and macro- and microorganisms. In this study, we compared the characteristics of the soil in areas with grapevines cv. BRS Carmem cultivated under organic and biodynamic management. The trial was carried out in Guarapuava, Paraná State, Southern Brazil, from September 2013, when the grapevines were planted, until June 2017. The soil was handled in the same way in both treatments, but in the plots of the biodynamic treatment the following biodynamic preparations were applied: horn manure (500), Equisetum (508) and Fladen. All plants were fertilized with the same organic compost; however, those from the biodynamic treatment received the preparations 502 (Achillea millefolium), 503 (Chamomilla officinalis), 504 (Urtica dioica), 505 (Quercus robur), 506 (Taraxacum officinale) and 507 (Valeriana officinalis). The following soil traits were evaluated: chemical analysis (0-10 and 10-20 cm), quantification of soil macrofauna with pitfall traps and soil monoliths, number of cysts of ground-pearls (Eurhizococcus brasiliensis) in the vine roots, and β-glucosidase enzyme activity in soil. Soil with biodynamic preparations showed higher K and H + Al content in both depth layers. A larger number of ground-pearl cysts was observed in the roots of plants under organic treatment. No statistical difference was observed for β-glucosidase enzyme activity.
Assessment of chromium bioaccumulation in Pseudokirchneriella subcapitata (Korshikov) Hindak by the Central Composite Design (CCD) and Response Surface Methodology (RSM)
The effects of chromium bioaccumulation in Pseudokirchneriella subcapitata were evaluated by Central Composite Design (CCD), 2² factorial, and Response Surface Methodology (RSM). All the regression models generated by CCD were highly significant, with R² between 77 and 88%, where R² is the percentage of variability in the response that the model can account for. This indicates that the models represent the process satisfactorily and that their data can be used to simulate responses. The maximum biovolume shrinkage was a 28–69% reduction compared to controls. Results from this study suggest that smaller algal cells amplify metal binding sites, leading to increased bioaccumulation and a consequently increased capacity to accumulate chromium. Nevertheless, the absorption capacity decreases at higher chromium concentrations and with longer exposure. Keywords: Algae, Biovolume, Central Composite Design, Metal, Selenastrum capricornutum
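As a reminder of what the R² figures quoted above measure, here is a minimal sketch of the coefficient of determination. The function name `r_squared` and the numbers are hypothetical and are not data from the study.

```python
def r_squared(observed, predicted):
    """Share of the variability in `observed` that the predictions account for."""
    mean_obs = sum(observed) / len(observed)
    ss_tot = sum((y - mean_obs) ** 2 for y in observed)               # total variability
    ss_res = sum((y - p) ** 2 for y, p in zip(observed, predicted))   # unexplained part
    return 1.0 - ss_res / ss_tot

# Hypothetical responses and model predictions.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 1.5, 3.5, 3.5]
r2 = r_squared(observed, predicted)
print(f"R^2 = {r2:.0%}")
```

Here the model leaves 1.0 of the 5.0 units of total squared variation unexplained, so it accounts for 80% of the variability in the response.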
Filtering Outliers in One Step with Genetic Programming
Outliers are one of the most difficult issues when dealing with real-world modeling tasks. Even a small percentage of outliers can impede a learning algorithm’s ability to fit a dataset. While robust regression algorithms exist, they fail when more than 50% of a dataset consists of outliers (the breakdown point). In the case of Genetic Programming (GP), robust regression has not been properly studied. In this paper we present a method that works as a filter, removing outliers from the target variable (vertical outliers). The algorithm is simple: it uses a randomly generated population of GP trees to determine which target values should be labeled as outliers. The method is highly efficient. Results show that it can return a clean dataset when contamination reaches as high as 90%, and it may be able to handle higher levels of contamination. In this study only synthetic univariate benchmarks are used to evaluate the approach, but it must be stressed that no other approaches can deal with such high levels of outlier contamination while requiring such small computational effort.