## A Guide to Metrics in Exploratory Data Analysis | by Esmaeil Alizadeh | Dec, 2020

[ad_1]

Estimates of location are measures of the central tendency of the data (where most of the data is located). In statistics, this is usually referred to as the first moment of a distribution.

## Mean

The *arithmetic mean*, or simply *mean *or *average *is probably the most popular estimate of location. There different variants of mean, such as *weighted mean* or *trimmed/truncated mean*. You can see how they can be computed below.

where *n* denotes the total number of observations (rows).

Weighted mean (equation 1.2) is a variant of mean that can be used in situations where the sample data does not represent different groups in a dataset. By assigning a larger weight to groups that are under-represented, the computed weighted mean will more accurately represent all groups in our dataset.

Extreme values can easily influence both themeanandweighted meansince neither one is a robust metric!

Another variant of mean is the *trimmed mean* (eq. 1.3) that is a robust estimate.

Robust estimate:A metric that is not sensitive to extreme values (outliers).

The trimmed mean is used in calculating the final score in many sports where a panel of judges will each give a score. Then the lowest and the highest scores are dropped and the mean of the remaining scores are computed as a part of the final score[2]. One such example is in the international diving score system.

In statistics,xˉrefers to asamplemean, whereasμrefers to thepopulationmean.

## A Use Case for the Weighted Mean

If you want to buy a smartphone or a smartwatch or any gadget where there are many options, you can use the following method to choose among various options available for a gadget.

Let’s assume you want to buy a smartphone, and the following features are important to you: 1) battery life, 2) camera quality, 3) price and 4) the phone design. Then, you give the following weights to each one:

Let’s say you have two options an iPhone and Google’s Pixel. You can give each feature a score of some value between 1 and 10 (1 being the worst and 10 being the best). After going over some reviews, you may give the following scores to the features of each phone.

So, which phone is better for you?

iPhone score=0.15×6+0.3×9+0.25×1+0.3×9=6.55

Google Pixel score=0.15×5+0.3×9.5+0.25×8+0.3×5=7.1

And based on your feature preferences, the Google Pixel might be the better option for you!

## Median

Median is the middle of a sorted list, and it’s a robust estimate. For an ordered sequence x_1, x_2, …, x_n, the median is computed as follows:

Read More …

[ad_2]