Sum of Squared Errors Calculator
How to use Sum of Squared Errors Calculator?
This is an online tool for calculating Sum of Squared Errors.
Usage:
1. Enter the X and Y values.
2. Click the Calculate SSE button.
3. The results are generated automatically.
Contact: [email protected]
Sum of Squared Errors (SSE) Calculation in Linear Regression Analysis
Introduction
In linear regression analysis, the Sum of Squared Errors (SSE) measures the total deviation of the model's predictions from the data, that is, the sum of the squared differences between the observed and predicted values. It equals the total variability in the observed data minus the variability explained by the regression model. An SSE of zero indicates that the model perfectly explains all the variability in the data.
Calculation Steps
To compute SSE, follow these steps:
Step 1: Calculate $SS_{XX}$ (Sum of Squares for X)
$SS_{XX}$ represents the total variability in the predictor variable $X$. It is calculated using the formula:
$SS_{XX} = \sum_{i=1}^{n} X_i^2 - \frac{1}{n} \left(\sum_{i=1}^{n} X_i\right)^2$
where $X_i$ are the values of the predictor variable, and $n$ is the number of observations.
Step 2: Calculate $SS_{YY}$ (Sum of Squares for Y)
$SS_{YY}$ represents the total variability in the response variable $Y$. It is calculated using:
$SS_{YY} = \sum_{i=1}^{n} Y_i^2 - \frac{1}{n} \left(\sum_{i=1}^{n} Y_i\right)^2$
where $Y_i$ are the values of the response variable, and $n$ is the number of observations.
Step 3: Calculate $SS_{XY}$ (Sum of Squares for X and Y)
$SS_{XY}$ measures the covariance between $X$ and $Y$. It is calculated using:
$SS_{XY} = \sum_{i=1}^{n} X_i Y_i - \frac{1}{n} \left(\sum_{i=1}^{n} X_i\right) \left(\sum_{i=1}^{n} Y_i\right)$
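As a quick sketch, the three sums of squares can be computed directly from these formulas in Python (the function names `ss_xx`, `ss_yy`, and `ss_xy` are illustrative, not from any library):

```python
def ss_xx(x):
    """Sum of squares for X: sum(X_i^2) - (sum X_i)^2 / n."""
    n = len(x)
    return sum(xi ** 2 for xi in x) - sum(x) ** 2 / n

def ss_yy(y):
    """Sum of squares for Y: sum(Y_i^2) - (sum Y_i)^2 / n."""
    n = len(y)
    return sum(yi ** 2 for yi in y) - sum(y) ** 2 / n

def ss_xy(x, y):
    """Sum of cross-products: sum(X_i * Y_i) - (sum X_i)(sum Y_i) / n."""
    n = len(x)
    return sum(xi * yi for xi, yi in zip(x, y)) - sum(x) * sum(y) / n
```

With the example data used later on this page (X = [1, 2, 3, 4, 5], Y = [2, 4, 6, 8, 10]), these return 10, 40, and 20 respectively.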
Step 4: Calculate Regression Coefficients
 Slope ($\hat{\beta_1}$):
$\hat{\beta_1} = \frac{SS_{XY}}{SS_{XX}}$
 Intercept ($\hat{\beta_0}$):
$\hat{\beta_0} = \frac{\sum_{i=1}^{n} Y_i}{n} - \hat{\beta_1} \times \frac{\sum_{i=1}^{n} X_i}{n}$
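A minimal sketch of the coefficient estimates, folding the sums of squares from the previous steps into one hypothetical helper:

```python
def regression_coefficients(x, y):
    """Estimate slope and intercept from the sums-of-squares formulas."""
    n = len(x)
    ss_xx = sum(xi ** 2 for xi in x) - sum(x) ** 2 / n
    ss_xy = sum(xi * yi for xi, yi in zip(x, y)) - sum(x) * sum(y) / n
    slope = ss_xy / ss_xx                       # beta_1 hat
    intercept = sum(y) / n - slope * sum(x) / n  # beta_0 hat = mean(Y) - beta_1 * mean(X)
    return slope, intercept
```

For the example data below, this yields a slope of 2 and an intercept of 0.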
Step 5: Calculate $SS_{R}$ (Sum of Squares for Regression)
$SS_{R}$ represents the portion of the total variability in $Y$ that is explained by the regression model. It is given by:
$SS_{R} = \hat{\beta_1} \times SS_{XY}$
Step 6: Calculate SSE (Sum of Squared Errors)
SSE measures the portion of the total variability in $Y$ that is not explained by the regression model. It is given by:
$SSE = SS_{YY} - SS_{R}$
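Putting steps 1 through 6 together, a hypothetical `sse` function might look like this:

```python
def sse(x, y):
    """SSE = SS_YY - SS_R, where SS_R = slope * SS_XY."""
    n = len(x)
    ss_xx = sum(xi ** 2 for xi in x) - sum(x) ** 2 / n
    ss_yy = sum(yi ** 2 for yi in y) - sum(y) ** 2 / n
    ss_xy = sum(xi * yi for xi, yi in zip(x, y)) - sum(x) * sum(y) / n
    slope = ss_xy / ss_xx
    ss_r = slope * ss_xy
    return ss_yy - ss_r
```

For data that lies exactly on a line the result is 0; for any other data it is strictly positive.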
Example Calculation
Given the following data:
 X values: [1, 2, 3, 4, 5]
 Y values: [2, 4, 6, 8, 10]
We calculate:

$SS_{XX}$:
$SS_{XX} = \sum X_i^2 - \frac{1}{n} \left(\sum X_i\right)^2 = 55 - \frac{1}{5}(15)^2 = 10$
$SS_{YY}$:
$SS_{YY} = \sum Y_i^2 - \frac{1}{n} \left(\sum Y_i\right)^2 = 220 - \frac{1}{5}(30)^2 = 40$

$SS_{XY}$:
$SS_{XY} = \sum X_i Y_i - \frac{1}{n} \left(\sum X_i\right) \left(\sum Y_i\right) = 110 - \frac{1}{5}(15 \times 30) = 20$
Slope ($\hat{\beta_1}$):
$\hat{\beta_1} = \frac{SS_{XY}}{SS_{XX}} = \frac{20}{10} = 2$

Intercept ($\hat{\beta_0}$):
$\hat{\beta_0} = \frac{\sum Y_i}{n} - \hat{\beta_1} \times \frac{\sum X_i}{n} = 6 - 2 \times 3 = 0$

$SS_{R}$:
$SS_{R} = \hat{\beta_1} \times SS_{XY} = 2 \times 20 = 40$

SSE:
$SSE = SS_{YY} - SS_{R} = 40 - 40 = 0$
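The worked example above can be reproduced line by line in Python:

```python
x = [1, 2, 3, 4, 5]
y = [2, 4, 6, 8, 10]
n = len(x)

ss_xx = sum(v ** 2 for v in x) - sum(x) ** 2 / n                 # 55 - 45 = 10.0
ss_yy = sum(v ** 2 for v in y) - sum(y) ** 2 / n                 # 220 - 180 = 40.0
ss_xy = sum(a * b for a, b in zip(x, y)) - sum(x) * sum(y) / n   # 110 - 90 = 20.0
slope = ss_xy / ss_xx                                            # 2.0
intercept = sum(y) / n - slope * sum(x) / n                      # 6 - 6 = 0.0
ss_r = slope * ss_xy                                             # 40.0
sse = ss_yy - ss_r                                               # 0.0
print(ss_xx, ss_yy, ss_xy, slope, intercept, ss_r, sse)
```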
Conclusion
SSE quantifies the discrepancy between the observed data and the predictions made by the regression model. An SSE of 0 indicates that the model perfectly explains the variability in the response variable.
Alternatives to SSE in statistics
Mean Absolute Error (MAE)
Definition: MAE measures the average magnitude of errors in a set of predictions, without considering their direction. It is the mean of the absolute differences between predicted values and actual values.
Formula:
$\text{MAE} = \frac{1}{n} \sum_{i=1}^{n} \left| y_i - \hat{y}_i \right|$
Advantages:
 MAE is less sensitive to outliers compared to MSE because it does not square the errors.
 Provides a more intuitive measure of average error magnitude.
Disadvantages:
 Does not penalize larger errors more heavily than smaller ones, since all errors are weighted linearly.
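A minimal MAE implementation matching the formula above (the argument names `y_true` and `y_pred` are illustrative):

```python
def mae(y_true, y_pred):
    """Mean absolute error: average of |y_i - yhat_i|."""
    return sum(abs(a - b) for a, b in zip(y_true, y_pred)) / len(y_true)
```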
Root Mean Squared Error (RMSE)
Definition: RMSE is the square root of the average of the squared differences between predicted and actual values. It penalizes large errors more severely than MAE due to the squaring of errors.
Formula:
$\text{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2}$
Advantages:
 RMSE provides a measure of error in the same units as the dependent variable, making it easier to interpret.
 Emphasizes larger errors more than MAE, which can be useful if large errors are particularly undesirable.
Disadvantages:
 Sensitive to outliers due to the squaring of errors.
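A corresponding RMSE sketch, again with illustrative argument names:

```python
import math

def rmse(y_true, y_pred):
    """Root mean squared error: sqrt of the average squared difference."""
    n = len(y_true)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / n)
```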
Mean Absolute Percentage Error (MAPE)
Definition: MAPE measures the size of the error in percentage terms. It is the mean of the absolute percentage errors between predicted and actual values.
Formula:
$\text{MAPE} = \frac{1}{n} \sum_{i=1}^{n} \left| \frac{y_i - \hat{y}_i}{y_i} \right| \times 100\%$
Advantages:
 Provides error metrics in percentage terms, which can be easier to interpret in some contexts.
 Useful for comparing forecast performance across different scales.
Disadvantages:
 Can be problematic if actual values are close to zero, leading to very high percentage errors.
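A MAPE sketch that makes the zero-denominator caveat explicit (function and argument names are illustrative):

```python
def mape(y_true, y_pred):
    """Mean absolute percentage error; undefined when any actual value is zero."""
    if any(a == 0 for a in y_true):
        raise ValueError("MAPE is undefined when an actual value is zero")
    n = len(y_true)
    return sum(abs((a - b) / a) for a, b in zip(y_true, y_pred)) / n * 100
```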