3 Widely used Methods of Regression Analysis


Learn the basics of regression analysis through easy-to-understand examples. Regression is one of the most widely used statistical concepts in data analytics, marketing research and other areas of applied statistics. Regression analysis is the process of constructing a mathematical model that can be used to predict one variable from one or more other variables.

In this module we will cover the concept and usage of bivariate (simple linear) regression with examples.

What is regression

  • While correlation describes the strength and direction of a relationship, regression describes how much one variable moves in response to movement in another
  • Regression generates an equation that quantifies the relationship between ‘X’ and ‘Y’
  • This equation can then be used to predict values of ‘Y’ at a given value of ‘X’ within the study range

Types of Regression Analysis

There are three main types of regression analysis commonly used in analysis and data modeling. They are distinguished by the number of independent variables and the data type of the dependent variable.

1. Simple Linear Regression :

Regression of Y on a single X, where both variables should be continuous. This is explained in detail later in this article.

2. Multiple Regression :

Regression of Y on more than one X, where all variables should be continuous. This is the most widely used concept for data modeling, and there are two methods, “best subset” and “stepwise”, which we can use to choose the X’s for our model.

a) Best Subset – Best subsets regression identifies the best-fitting regression models that can be constructed from the independent variables you specify. It is an efficient way to identify models that achieve your goals with as few X’s as possible.
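As a rough illustration of the idea, the following Python sketch fits every combination of candidate X’s and keeps the best one of each size. Everything in it is an assumption made for the example: the data are made up, the helper `sse_of_fit` is a hypothetical name, and adjusted R² is used as the ranking criterion (statistical packages may report other criteria as well).

```python
from itertools import combinations

def sse_of_fit(columns, y):
    """Least-squares fit of y on an intercept plus `columns`; returns the
    residual sum of squares. Assumes the columns are not perfectly collinear."""
    rows = [[1.0] + [c[i] for c in columns] for i in range(len(y))]
    p = len(rows[0])
    # Augmented normal equations: (X'X | X'y).
    aug = [[sum(row[j] * row[k] for row in rows) for k in range(p)]
           + [sum(row[j] * yi for row, yi in zip(rows, y))]
           for j in range(p)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(p):
        pivot = max(range(col, p), key=lambda i: abs(aug[i][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for i in range(p):
            if i != col:
                factor = aug[i][col] / aug[col][col]
                aug[i] = [v - factor * w for v, w in zip(aug[i], aug[col])]
    beta = [aug[j][p] / aug[j][j] for j in range(p)]
    fitted = [sum(b * v for b, v in zip(beta, row)) for row in rows]
    return sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))

# Made-up data: y depends mainly on x1, a little on x2, and not on x3.
predictors = {
    "x1": [1, 2, 3, 4, 5, 6, 7, 8],
    "x2": [5, 3, 8, 1, 7, 2, 6, 4],
    "x3": [1, 0, 1, 0, 0, 1, 1, 0],
}
y = [7.6, 9.4, 15.05, 14.45, 20.6, 20.9, 25.95, 28.05]

n = len(y)
sst = sse_of_fit([], y)   # intercept-only model gives the total sum of squares

# For each model size k, keep the subset with the highest adjusted R-squared.
best_by_size = {}
for k in range(1, len(predictors) + 1):
    for names in combinations(predictors, k):
        sse = sse_of_fit([predictors[name] for name in names], y)
        adj_r2 = 1 - (sse / (n - k - 1)) / (sst / (n - 1))
        if k not in best_by_size or adj_r2 > best_by_size[k][1]:
            best_by_size[k] = (names, adj_r2)

for k, (names, adj_r2) in best_by_size.items():
    print(k, names, round(adj_r2, 4))
```

Comparing the best model of each size then shows how much is gained by each extra X, which is how a small but adequate model is usually picked.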

b) Stepwise Regression – As with best subset, the purpose here is to select a few important X’s from a large number of X’s (predictor variables). The approaches used here are:

Forward inclusion – X’s are entered one at a time, and only if they meet a criterion specified in terms of the F ratio. The order in which the variables are entered is based on their contribution to the explained variance.

Backward elimination – Initially, all the variables are included in the model; X’s are then removed one at a time based on the F ratio.
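Forward inclusion can be sketched in Python as follows (backward elimination would run the same test in reverse, starting from the full model). This is only a sketch under stated assumptions: the data are made up, `sse_of_fit` is a hypothetical helper, and the F-to-enter threshold of 4.0 is an illustrative choice rather than a value prescribed here.

```python
def sse_of_fit(columns, y):
    """Least-squares fit of y on an intercept plus `columns`; returns the
    residual sum of squares. Assumes the columns are not perfectly collinear."""
    rows = [[1.0] + [c[i] for c in columns] for i in range(len(y))]
    p = len(rows[0])
    # Augmented normal equations: (X'X | X'y).
    aug = [[sum(row[j] * row[k] for row in rows) for k in range(p)]
           + [sum(row[j] * yi for row, yi in zip(rows, y))]
           for j in range(p)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(p):
        pivot = max(range(col, p), key=lambda i: abs(aug[i][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for i in range(p):
            if i != col:
                factor = aug[i][col] / aug[col][col]
                aug[i] = [v - factor * w for v, w in zip(aug[i], aug[col])]
    beta = [aug[j][p] / aug[j][j] for j in range(p)]
    fitted = [sum(b * v for b, v in zip(beta, row)) for row in rows]
    return sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))

# Made-up data: y depends mainly on x1, a little on x2, and not on x3.
predictors = {
    "x1": [1, 2, 3, 4, 5, 6, 7, 8],
    "x2": [5, 3, 8, 1, 7, 2, 6, 4],
    "x3": [1, 0, 1, 0, 0, 1, 1, 0],
}
y = [7.6, 9.4, 15.05, 14.45, 20.6, 20.9, 25.95, 28.05]

F_TO_ENTER = 4.0   # assumed entry threshold, chosen for illustration

n = len(y)
selected = []
remaining = list(predictors)
current_sse = sse_of_fit([], y)   # start from the intercept-only model

while remaining:
    # F ratio each remaining X would have if it were the next one entered.
    scores = []
    for name in remaining:
        cols = [predictors[s] for s in selected + [name]]
        new_sse = sse_of_fit(cols, y)
        df_error = n - len(cols) - 1
        f_ratio = (current_sse - new_sse) / (new_sse / df_error)
        scores.append((f_ratio, name, new_sse))
    f_ratio, name, new_sse = max(scores)
    if f_ratio < F_TO_ENTER:
        break   # no remaining X adds enough explained variance
    selected.append(name)
    remaining.remove(name)
    current_sse = new_sse
    print(f"entered {name} (F = {f_ratio:.1f})")

print("selected:", selected)
```

Each pass enters the X with the largest F ratio, so the order of entry reflects each variable's contribution to the explained variance.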

3. Logistic Regression

In many ways logistic regression is like ordinary regression. It requires a dependent variable, Y, and one or more independent variables. Logistic regression is used to model situations in which the dependent variable can take only two discrete values, such as 0 and 1. So in ordinary regression Y is continuous, whereas in logistic regression Y is discrete.
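A minimal Python sketch of a logistic regression with one X is shown below. The data are made up, the function name `predict_probability` is hypothetical, and plain gradient ascent with an assumed step size is used only to keep the example dependency-free; statistical software typically fits the model with more refined algorithms.

```python
import math

# Made-up data: X is continuous, Y takes only the values 0 and 1.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [0, 0, 0, 1, 0, 1, 1, 1]

def predict_probability(a, b, xi):
    """Logistic model: P(Y = 1 | X = xi) = 1 / (1 + exp(-(a + b * xi)))."""
    return 1.0 / (1.0 + math.exp(-(a + b * xi)))

# Fit a and b by gradient ascent on the average log-likelihood.
a, b = 0.0, 0.0
learning_rate = 0.1   # assumed step size
for _ in range(10000):
    errors = [yi - predict_probability(a, b, xi) for xi, yi in zip(x, y)]
    a += learning_rate * sum(errors) / len(x)
    b += learning_rate * sum(e * xi for e, xi in zip(errors, x)) / len(x)

for xi in (2, 4.5, 7):
    print(f"P(Y = 1 | X = {xi}) = {predict_probability(a, b, xi):.3f}")
```

Instead of predicting Y directly, the model predicts the probability that Y equals 1, which always stays between 0 and 1.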

Bivariate/Simple Linear Regression

A simple linear regression equation is simply a linear equation fitted between ‘Y’ and ‘X’.

Regression Model: Y = a + bX + c

where Y = Dependent variable / output / response
X = Independent variable / input / predictor (only 1 independent variable in the case of bivariate or simple linear regression)
a = Intercept of fitted line on Y axis
b = Slope of the fitted line
c = Error in the model
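To show how a, b and c relate to data, here is a small Python sketch. It assumes the line is fitted by ordinary least squares (the usual choice, though not stated above) and uses made-up numbers; the variable names `s_xy` and `s_xx` are introduced only for this example.

```python
# Made-up data for illustration only.
x = [2, 4, 6, 8, 10]
y = [5.1, 8.9, 13.2, 16.8, 21.0]

n = len(x)
x_mean = sum(x) / n
y_mean = sum(y) / n

# Least-squares estimates of the slope b and the intercept a.
s_xy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
s_xx = sum((xi - x_mean) ** 2 for xi in x)
b = s_xy / s_xx
a = y_mean - b * x_mean

# c is what the fitted line leaves unexplained at each observed point.
c = [yi - (a + b * xi) for xi, yi in zip(x, y)]

print(f"Y = {a:.3f} + {b:.3f} X")
print("residuals:", [round(ci, 3) for ci in c])
```

With a least-squares fit the residuals always sum to zero (up to rounding), so the error term c describes scatter around the line rather than a systematic offset.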

Simple Linear Regression with Example using Minitab

In this example, the number of passengers is X, and we want to predict the cost Y based on the number of passengers.