# Learn about 2 main data types and their sub types

Agenda

Data type is a simple but very important topic because it forms the foundation of data analysis and hypothesis testing. Work through this module and you should have no trouble identifying data types in your future data analysis work.

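This module covers the types of data, the primary scales of measurement, the distributions associated with each data type, and basic statistics for continuous data.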

Types of Data

There are three types of data: discrete, continuous and locational. In data analysis we mostly use continuous and discrete data. Before applying a particular analysis to test a hypothesis, we first have to make sure that the required data type is available, because the choice of analysis is tied to the type of data. If our data is discrete, we cannot apply the analyses that work only with continuous data (please refer to Fig-2).

• Data is objective information that everyone can agree on
• What we measure is not the object but some characteristic of it

Data Type I – Discrete Data Type

Discrete Data

Discrete data is counted data, so its values are whole numbers such as 2, 40 or 41. Counted data, also called attribute data, answers questions like “how many”, “how often” or “pass/fail count”.

Binary Data

Only two possible outcomes (yes / no, on time / late, Ok / Not Ok)

• A cab is either on time or late
• An agent is either present or not present

Count Data

Count of incidences

• Number of computer breakdowns in a week
• Number of times an agent puts a client on hold during a call
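
The two discrete subtypes can be illustrated with a minimal sketch. The records below are made-up values, and the variable names are hypothetical; they only show that binary observations take one of two outcomes and that count data are whole numbers.

```python
# Hypothetical records: one entry per cab trip, True if the cab was on time.
cab_on_time = [True, False, True, True, False, True]

# Binary data: every observation is one of two outcomes (on time / late).
on_time_count = sum(cab_on_time)              # True counts as 1
late_count = len(cab_on_time) - on_time_count

# Count data: hypothetical number of computer breakdowns in each of 5 weeks.
breakdowns_per_week = [2, 0, 1, 3, 0]
total_breakdowns = sum(breakdowns_per_week)

print("On time:", on_time_count, "Late:", late_count)
print("Breakdowns over 5 weeks:", total_breakdowns)
```

Both results are whole numbers, which is what makes the data discrete.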

Data Type II – Continuous Data Type

Continuous Data

Continuous data, also called variable data, can take any real value, such as 2.12, 3.33 or -3.3. It is measured on a continuous scale: distance, time, weight, length and so on. Measured data is generally regarded as better than counted data because it is more precise and contains more information. For example, knowing how much it rained each day is much better information than knowing the number of days it rained. However, collecting continuous data is more time-consuming and expensive than collecting counted (discrete) data.

Continuous data can be measured on a continuous scale, with a resolution limited only by the precision of the measuring equipment.

Examples

• Time it takes to Close a Call
• Actual reporting time of a cab at the gate
• Temperature of the room
• Exchange rate of a currency
• Height of a person
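
The rainfall comparison made earlier in this section can be sketched in a few lines. The rainfall figures are invented for illustration; the point is only that measured data can always be reduced to a count, while the count alone cannot be turned back into measurements.

```python
# Hypothetical daily rainfall in millimetres for one week (continuous data).
rainfall_mm = [0.0, 12.4, 3.1, 0.0, 0.0, 7.8, 0.0]

# Reducing the measurements to a count keeps only "did it rain?" per day.
rainy_days = sum(1 for mm in rainfall_mm if mm > 0)

# The measurements also support totals and averages; the count does not.
total_rain = sum(rainfall_mm)
average_on_rainy_days = total_rain / rainy_days

print("Rainy days:", rainy_days)
print("Total rainfall (mm):", round(total_rain, 1))
print("Average on rainy days (mm):", round(average_on_rainy_days, 2))
```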

Data Type III – Locational Data Type

Locational Data

Locational data simply answers the question “where”. Charts that use locational data are often called “measles charts” or “concentration charts”. They can also be “heat maps” that show volume or concentration on a map.

Primary Scales of Measurement

There are four primary scales of measurement: nominal, ordinal, interval and ratio. These scales are summarized in Fig-2.

1. Nominal Scale: This is a figurative labeling scheme in which the numbers serve only as labels or tags for identifying and classifying objects. For example, the number assigned to a runner in a race is nominal: each number is assigned to only one runner, and the numbers are unique. Another example is the Social Security Number (SSN). The numbers in a nominal scale do not reflect the amount of the characteristic possessed by the object; a person with a higher SSN is not superior to a person with a lower one. The only mathematical operation permitted on a nominal scale is counting.

2. Ordinal Scale: An ordinal scale is a ranking scale in which numbers (ranks) are assigned to objects to indicate the relative extent to which the objects possess some characteristic. It indicates relative position, but not the magnitude of the difference between the objects. Along with counting, we can calculate percentiles, quartiles, the median, rank-order correlation and other rank-based summary statistics from ordinal data.

3. Interval Scale: In an interval scale, equal distances on the scale represent equal differences in the characteristic being measured, so the difference between any two adjacent scale values is identical to the difference between any other two adjacent values. The most important point is that the zero point of an interval scale is arbitrary rather than fixed. Temperature in degrees Celsius is a common example.

4. Ratio Scale: A ratio scale possesses all the properties of the nominal, ordinal and interval scales and, in addition, an absolute zero point. Common examples of ratio scales are height, weight, distance and age. All statistical techniques can be applied to ratio-scale data.
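
The practical difference between an interval scale and a ratio scale can be shown with a small sketch. It assumes temperature in degrees Celsius as an illustrative interval-scale measurement and weight as a ratio-scale one; the helper names and the sample values are hypothetical, and the kilogram-to-pound factor is approximate.

```python
def c_to_f(celsius):
    """Convert degrees Celsius to degrees Fahrenheit."""
    return celsius * 9 / 5 + 32


def kg_to_lb(kg):
    """Convert kilograms to pounds (approximate factor)."""
    return kg * 2.20462


# Interval scale: the zero point is arbitrary, so ratios depend on the unit.
ratio_celsius = 20.0 / 10.0
ratio_fahrenheit = c_to_f(20.0) / c_to_f(10.0)

# Ratio scale: the zero point is absolute, so ratios survive a unit change.
ratio_kg = 80.0 / 40.0
ratio_lb = kg_to_lb(80.0) / kg_to_lb(40.0)

print("Temperature ratio in C:", ratio_celsius, "in F:", ratio_fahrenheit)
print("Weight ratio in kg:", ratio_kg, "in lb:", ratio_lb)
```

The temperature ratio changes with the unit, so a statement like "twice as hot" is not meaningful on an interval scale, while "twice as heavy" holds in any unit.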

Distributions by Data Type

Continuous Data – Normal Distribution

Discrete Data – Binomial / Poisson Distribution
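
As a rough sketch of this pairing, the three distributions can be evaluated with the Python standard library alone. All parameters below (mean call time, late-cab probability, breakdown rate) are invented for illustration, and the helper functions are hypothetical names; whether real data actually follows one of these distributions has to be checked separately.

```python
import math
from statistics import NormalDist


def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent binary trials."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


def poisson_pmf(k, rate):
    """Probability of exactly k events when `rate` events are expected."""
    return math.exp(-rate) * rate**k / math.factorial(k)


# Continuous data: call closing time, assumed normal (mean 5 min, sd 1 min).
call_time = NormalDist(mu=5.0, sigma=1.0)
p_closed_within_6_min = call_time.cdf(6.0)

# Binary data: 10 cab trips, each assumed late with probability 0.1.
p_exactly_two_late = binomial_pmf(2, 10, 0.1)

# Count data: breakdowns per week, assumed to average 2 per week.
p_no_breakdowns = poisson_pmf(0, 2.0)

print(round(p_closed_within_6_min, 4))
print(round(p_exactly_two_late, 4))
print(round(p_no_breakdowns, 4))
```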

Basic Statistics for Continuous Data

Measures of Location

Mean:

• Arithmetic average of a set of values
• Reflects the influence of all values equally
• Strongly influenced by extreme values

Median:

• Reflects the 50% rank (the 50th percentile)
• The center number after a set of numbers has been sorted
• Is “robust” to extreme scores

Mode:

• The value that appears the maximum number of times in the data set
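
The three measures of location can be computed with Python's standard `statistics` module. The call times below are hypothetical; the second list adds one extreme value to show how the mean reacts while the median and mode barely move.

```python
import statistics

# Hypothetical call closing times in minutes.
call_minutes = [4.0, 5.0, 5.0, 6.0, 7.0]

mean_value = statistics.mean(call_minutes)
median_value = statistics.median(call_minutes)
mode_value = statistics.mode(call_minutes)

# Same data with one extreme call added.
with_extreme = call_minutes + [60.0]
mean_extreme = statistics.mean(with_extreme)
median_extreme = statistics.median(with_extreme)
mode_extreme = statistics.mode(with_extreme)

print("Without extreme value:", mean_value, median_value, mode_value)
print("With extreme value:   ", mean_extreme, median_extreme, mode_extreme)
```

In this sketch the single 60-minute call pulls the mean from 5.4 to 14.5, while the median only moves from 5.0 to 5.5 and the mode stays at 5.0.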

Continuous Data – Variability

Range

• Numerical distance between the highest and the lowest values in a data set.

Standard Deviation

• The square root of the variance. It is the most commonly used measure of variability.

The variance is the square of the standard deviation; it is usually used in capability calculations.
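
The measures of variability can be sketched the same way. The data values are hypothetical, and the sketch assumes they form the whole population, so it uses `statistics.pstdev` and `statistics.pvariance`; for a sample drawn from a larger population, `statistics.stdev` and `statistics.variance` would be the usual choice.

```python
import statistics

# Hypothetical measurements, treated here as the whole population.
values = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

# Range: distance between the highest and the lowest value.
data_range = max(values) - min(values)

# Population variance and standard deviation.
variance = statistics.pvariance(values)
std_dev = statistics.pstdev(values)

print("Range:", data_range)
print("Variance:", variance)
print("Standard deviation:", std_dev)
```

Squaring the standard deviation gives back the variance, which matches the relationship stated above.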