SAS/STAT software, SAS technical support. A similar result was mentioned by Donoho and Rousseeuw at the 1985 Oberwolfach workshop on robustness. A very important problem in finance is the construction of portfolios of assets that balance risk and reward in an optimal way. Least squares, for example, minimizes the variance of the residuals and is a special case of S-estimators. Robust least squares refers to a variety of regression methods designed to be robust, or less sensitive, to outliers. The asymptotics of S-estimators in the linear regression model (JSTOR).
(Huber, 1964, 1973), least median of squares (LMS; Rousseeuw, 1984), least trimmed squares (LTS; Rousseeuw, 1985), S-estimation (Rousseeuw and Yohai, 1984), and MM-estimation (Yohai, 1987) are elaborated in the book of Rousseeuw and Leroy (2005). We find (1) that robust regression applications are appropriate for modeling stock returns in global markets. Introduction to Rousseeuw (1984), least median of squares. To overcome these limitations, Maronna (2011) has recently proposed an. All of these estimates, however, have very low efficiency under a regression model with normal errors. Supandi et al.: S-estimators. S-estimators were first introduced in the context of regression by Rousseeuw and Yohai (1984). For more details see Salibian-Barrera and Yohai (2006) or Thieler, Fried and Rathjens (2016).
In addition, asymptotic distributions of the estimators are given, coupled with second-order corrections to the bias of the estimators. Therefore, a single unusual observation can have a large impact on the OLS estimate. A class of robust and fully efficient regression estimators. The use of alternative regression methods in social. The asymptotic distribution of MM-estimates has been studied by Yohai (1987) under the assumption that H = H_0 (central parametric model). S-estimators of regression parameters, proposed by Rousseeuw and Yohai (1984), search for the slope and intercept values that minimize some measure of scale associated with the residuals. We advocate the least median of squares method (Rousseeuw, 1984) because it appeals to the intuition and is easy to use. It leads to the notion of breakdown (Hodges, 1967; Hampel, 1974; Rousseeuw, 1984) and bias robustness (see, for example, Donoho and Liu, 1988, or Martin, Yohai and Zamar, 1989). Model of robust regression with parametric and nonparametric. For this reason, Rousseeuw and Yohai (1984) propose to minimize. These are all called high-breakdown estimators since they can be tuned to resist contamination in up to 50% of the observations. In the latter two papers, the authors construct regression estimators which have both high breakdown points and high efficiency.
One of the commonly used robust loss functions is Huber's function (Huber, 1981), where. This algorithm, which we call fast-S, is based on modifying each candidate with a step that improves the S-optimality criterion, and thus allows a reduction in the number of subsamples required to obtain a desired breakdown. More on S-estimates below (see Rousseeuw and Yohai, 1984). In this analysis of the risk and return of stocks in global markets, we apply several robust regression techniques in producing stock selection models and several optimization techniques in portfolio construction in global stock universes. High-breakdown point estimation of some regression models.
Rousseeuw (1984) proposed the least median of squares (LMS) and the least trimmed squares (LTS). The S-estimator: Rousseeuw and Yohai (1984) develop a high-breakdown estimator that minimizes the dispersion of the residuals. A robust learning approach for regression models based on. Robust regression and outlier detection, Rousseeuw, Peter. We will consider estimators of scale defined by a function, which satisfy. These estimates have a very high computational complexity and therefore the usual algorithms compute only approximate solutions. Both of these estimators are useful for variable selection, but can only be tuned to be either highly robust or highly efficient under the normal model (Yohai, 1987). On the basis of these robust estimates, Rousseeuw and Leroy. PhD thesis, University of Michigan, University Micro. Later, they were applied to the multivariate scale and location estimation problem (Davies, 1992).
The goal of S-estimators is to have a simple high-breakdown regression estimator which shares the flexibility and nice asymptotic properties of M-estimators. MM-estimates (Yohai, 1987) are obtained by an iterative procedure. High breakdown-point estimates of regression by means of. Part of the Springer Series in Statistics book series (SSS). (Rousseeuw, 1984), least trimmed squares (Rousseeuw, 1984), R-estimators (Jaeckel, 1972), M-estimators (Huber, 1964), generalised M-estimators (Hampel et al. Its self-contained treatment allows readers to skip the mathematical material, which is concentrated in a few sections. Outlier detection using nonconvex penalized regression, Yiyuan She and Art B. To compute it, they use a modified version of the forward search algorithm (see e. Therefore it can be viewed as a statistical theory dealing with approximate parametric models and a bridge between the Fisherian parametric approach and the full nonparametric approach.
Rousseeuw (1984) showed that such estimators achieve a high breakdown value; that is, they continue to give reasonable results even in the presence of many bad observations. A critical issue in portfolio development is how to address data outliers that reflect very unusual, generally nonrecurring, market conditions. CiteSeerX: A fast algorithm for S-estimates of regression. Rousseeuw and Yohai (1984) indicated that OLS estimates have a breakdown point (BP) of 1/n, which tends to zero as the sample size n grows. He obtained his PhD in 1981 at the Vrije Universiteit Brussel, following research carried out at the ETH in Zurich in the group of Frank Hampel, which led to a book on influence functions. Note that the maximum-likelihood estimator is an M-estimator, obtained by putting. The maximum-likelihood estimator can give arbitrarily bad results when the underlying assumptions (e. CiteSeerX citation query: Robust regression by means of. The concept of robust estimators has been further extended in Huber, Rousseeuw and Yohai, Rousseeuw and Tyler, and is broadly discussed in the existing literature in the context of robust methods for principal component analysis, as in Maronna or Huber and Ronchetti. M-estimation is the simplest approach both computationally. Robust regression by means of S-estimators (SpringerLink). Rousseeuw (1984) proposed an approximate algorithm based on drawing random subsamples of the same size as the number of carriers. A fast algorithm for the minimum covariance determinant estimator. To remedy this problem, Rousseeuw and Bassett introduced a new robust estimator which.
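The 1/n breakdown point of OLS mentioned above can be illustrated numerically. The following is a minimal sketch, assuming a simple straight-line model and NumPy; the helper name `ols_fit` and the toy data are hypothetical and chosen only for illustration. Corrupting one response out of ten is enough to move the fitted slope arbitrarily far from the true value.

```python
import numpy as np

def ols_fit(x, y):
    """Return (intercept, slope) of the ordinary least-squares line."""
    X = np.column_stack([np.ones_like(x), x])   # design matrix with intercept column
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

# Clean toy data lying exactly on the line y = 1 + 2x (illustrative assumption).
x = np.arange(10, dtype=float)
y = 1.0 + 2.0 * x
clean = ols_fit(x, y)

# Corrupt a single response: 1 bad point out of n = 10.
y_bad = y.copy()
y_bad[-1] = 1e6
contaminated = ols_fit(x, y_bad)

print("clean fit:       ", clean)
print("contaminated fit:", contaminated)
```

Making the corrupted value larger moves the slope further without bound, which is what a breakdown point of 1/n means in practice.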
If this is the case, then should a robust method be used for RAIM? It has a higher statistical efficiency than S-estimation. In this paper, we propose to use instead a modification of the C-step algorithm proposed by Rousseeuw and Van Driessen (1999), which is actually a lot faster. The feasible solution algorithm for least trimmed squares. Rousseeuw and Yohai (1984) minimize a robust measure of the dispersion of the residuals. The performance of this method was improved by the FAST-LTS algorithm of Rousseeuw and Van Driessen (1998). Rousseeuw and Yohai (1984) proposed S-estimates, defined by the property of minimizing an M-estimate of the residual scale. A combination of the high breakdown value method and M-estimation is MM-estimation (Yohai, 1987). The breakdown point approach is highly attractive for a number of reasons, not the least. MM-estimation, introduced by Yohai (1987), which combines high breakdown value estimation and M-estimation. It is shown that an S-estimate based on a jump-function type rho solves the min-max bias problem for the class of M-estimates with very general scale. Robust and Nonlinear Time Series Analysis, 256-272, 1984.
EViews offers three different methods for robust least squares. These estimates have a very high computational complexity, and thus the usual algorithms compute only approximate solutions. No background knowledge or choice of tuning constants is needed. The books by Hampel et al. and Rousseeuw and Leroy [18] also cover robust tests. This observation allows us to elaborate on a property of high-breakdown estimators first noted by Rousseeuw (1984) and formally defined by Yohai and Zamar (1988). The results directly paralleled the uncorrected analyses. Self-efficacy, aftercare and relapse in a treatment program for alcoholics. Outlier detection using nonconvex penalized regression. The LWS is regression, scale, and affine equivariant, similar to the LMS and the LTS (Rousseeuw and Leroy, 1987). A comparison of outlier detection procedures and robust. Penalized weighted least squares for outlier detection and. Econometrics, free full text: Financial big data solutions.
Outlier detection using distributionally robust optimization. Rousseeuw, Tutorial to robust statistics, J. Chemometrics, 1991, KU Leuven. Ronchetti, Rousseeuw, and Stahel (1986), Maronna, Martin, and Yohai (2006), and Dell'Aquila and Ronchetti (2006) for an overview. Unfortunately, another common feature of these estimators is their time-consuming nature. Individual differences in the perception of biological motion. That is, the estimate a_n minimizes the M-scale implicitly defined by equation (2). Introduction to Rousseeuw (1984), least median of squares regression. MM-estimation, introduced by Yohai (1987), combines high breakdown value estimation and M-estimation.
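Because the M-scale is defined only implicitly, a small numerical sketch may help. It assumes the usual defining equation (1/n) * sum_i rho(r_i / s) = b, a Tukey biweight rho rescaled so that its maximum is 1, and the conventional constants c = 1.547 and b = 0.5 (commonly associated with a 50% breakdown point). The function names `rho_biweight`, `m_scale` and `s_objective`, the constants, and the fixed-point iteration are illustrative assumptions, not taken from the sources quoted above.

```python
import numpy as np

def rho_biweight(u, c=1.547):
    """Tukey biweight rho, rescaled so that rho(u) = 1 for |u| >= c."""
    v = np.minimum(np.abs(u) / c, 1.0)
    return 1.0 - (1.0 - v**2) ** 3

def m_scale(r, b=0.5, c=1.547, tol=1e-10, max_iter=200):
    """Solve mean(rho(r / s)) = b for s by a simple fixed-point iteration."""
    r = np.asarray(r, dtype=float)
    s = np.median(np.abs(r)) / 0.6745        # start from the normalized MAD
    if s == 0.0:
        return 0.0                           # more than half the residuals are zero
    for _ in range(max_iter):
        s_new = s * np.sqrt(np.mean(rho_biweight(r / s, c)) / b)
        if abs(s_new - s) <= tol * s:
            return s_new
        s = s_new
    return s

def s_objective(beta, X, y):
    """Criterion that an S-estimate minimizes over beta: the M-scale of the residuals."""
    return m_scale(y - X @ beta)

# Usage: the scale stays bounded even with 30% gross outliers.
rng = np.random.default_rng(0)
r = rng.normal(size=500)
r_bad = r.copy()
r_bad[:150] = 1e6
print(m_scale(r), m_scale(r_bad))
```

An S-estimate is then a minimizer of `s_objective` over beta. The sketch covers only the scale computation; the search over beta is the expensive part that the subsampling algorithms quoted in this section address.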
Rousseeuw and Yohai (1984) proposed S-estimates, defined by the property of minimizing an M-estimate of the residual scale. The maximal bias under arbitrary contaminations of size. The asymptotic breakdown point of the S-estimator is given by Rousseeuw and Yohai (1984). In Robust and Nonlinear Time Series, eds. Franke, Härdle and Martin. Setting int to true adds an intercept column to the design matrix. Robust estimation in simultaneous equations models, Journal of Statistical. S-estimators, proposed by Rousseeuw and Yohai (1984), were the first high. Its breakdown is 50% when h is approximately n/2 (Rousseeuw and Leroy, 1987). Given the same breakdown value, S-estimation has a higher statistical efficiency than LTS estimation. Rousseeuw and Yohai (1984), by permission of Springer-Verlag, New York. The following dataset can be found in the World Almanac and Book of Facts.
Proteomic biomarkers study using novel robust penalized. S-estimation, which is a high breakdown value method that was introduced by Rousseeuw and Yohai (1984). Rousseeuw and Yohai (1984) proved consistency and asymptotic normality. Robust regression and outlier detection, Rousseeuw, Peter J. The performance of this method was improved by the FAST-LTS algorithm of Rousseeuw and Van Driessen (2000). It is worth noting that ss is a highly nonlinear, nondifferentiable function with multiple local maxima.
It seeks to provide a robust estimator by minimizing over the subsets. However, all these estimates have very low efficiency under a regression model with normal errors. This does not mean one should not use LTS, just that one should be aware of the Gaussian efficiency price one is paying, as a function of the fraction of contamination/breakdown point, and probably opt for lower breakdown points when confident that the fraction of. Robust regression by means of S-estimators, in Robust and Nonlinear Time Series Analysis. These estimates, called MM-estimates, have simultaneously the following properties.
(Rousseeuw, 1984). The asymptotic breakdown point is then defined as (2). In this talk we present an algorithm for S-estimates (see Rousseeuw and Yohai, 1984) similar to the FAST-LTS. On robust properties of convex risk minimization methods for pattern recognition. Yohai (1984), and S-estimators for multivariate location and scatter have been. Croux: a measure of dispersion of the residuals that is less sensitive to extreme values than the variance. S-estimation is a high breakdown value method introduced by Rousseeuw and Yohai (1984). Fast robust SUR with economical and actuarial applications. For comparison to the partial correlation and linear regression analyses summarized above, we also conducted robust regression analyses using the S (Rousseeuw and Yohai, 1984) and MM (Yohai, 1987) estimation procedures, both of which correct estimates for the effects of outliers.
Rousseeuw and Yohai (1984) [24] introduced the trimmed least squares (TLS) regression, which is a highly robust method for fitting a linear regression model. With the same breakdown value, it has a higher statistical efficiency. This article presents some applications of an HBP estimator called the S-estimator (Rousseeuw and Yohai, Robust and Nonlinear Time Series Analysis, eds W. Part of the Lecture Notes in Statistics book series (LNS, volume 26). Least trimmed squares (LTS) regression is based on the subset of h cases out of n whose least squares fit possesses the smallest sum of squared residuals. An empirical comparison between robust estimation and. Rousseeuw (born October 1956) is a statistician known for his work on robust statistics and cluster analysis.
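The LTS definition above (the h-subset whose least squares fit has the smallest sum of squared residuals) can be approximated with concentration steps started from random subsets. The sketch below is a simplified illustration in that spirit, not the full FAST-LTS algorithm; the function names `c_step` and `lts_fit`, the number of starts and steps, and the toy data are all assumptions made for this example.

```python
import numpy as np

def c_step(X, y, subset, h):
    """Fit OLS on `subset`, then keep the h cases with the smallest squared residuals."""
    beta, *_ = np.linalg.lstsq(X[subset], y[subset], rcond=None)
    r2 = (y - X @ beta) ** 2
    order = np.argsort(r2)
    # LTS objective of this beta: sum of its h smallest squared residuals.
    return beta, order[:h], r2[order[:h]].sum()

def lts_fit(X, y, h, n_starts=50, n_steps=20, seed=0):
    """Approximate LTS fit: random p-point starts, each refined by C-steps."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    best_obj, best_beta = np.inf, None
    for _ in range(n_starts):
        subset = rng.choice(n, size=p, replace=False)
        for _ in range(n_steps):
            beta, new_subset, obj = c_step(X, y, subset, h)
            if set(new_subset) == set(subset):   # subset stopped changing
                break
            subset = new_subset
        if obj < best_obj:
            best_obj, best_beta = obj, beta
    return best_beta, best_obj

# Usage on toy data: y = 1 + 2x with small noise and 30% vertical outliers.
rng = np.random.default_rng(1)
x = rng.uniform(0, 10, size=100)
y = 1.0 + 2.0 * x + 0.1 * rng.normal(size=100)
y[:30] = -20.0
X = np.column_stack([np.ones_like(x), x])
beta_lts, obj_lts = lts_fit(X, y, h=60)
print(beta_lts)
```

Each C-step cannot increase the trimmed sum of squares, so every start converges to a local solution; using many random starts is what makes finding a good subset likely.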
The paper will provide an overview of robust regression methods, describe the syntax of PROC ROBUSTREG, and illustrate the use of the procedure to. The name S-estimators was chosen as they are based on estimators of scale. Following seminal papers by Box (1953) and Tukey (1960), which demonstrated the need for robust statistical procedures, the theory of robust statistics blossomed in the 1960s and 1970s. Efficient robust regression via two-stage generalized empirical. Examples include the least median of squares (LMS; Rousseeuw, 1984), which minimizes the median of the absolute residuals, the least trimmed squares (LTS; Rousseeuw, 1985), which minimizes the sum of the q smallest squared residuals, and S-estimation (Rousseeuw and Yohai, 1984), which has a higher statistical efficiency than LTS with the same breakdown value. The breakdown value is a measure of the proportion of contamination that an estimation method can withstand and still maintain its robustness.