*This blog post is my attempt to 1) learn the material in the book Practical Statistics for Data Scientists by Bruce and Bruce and 2) get more familiar with the Python packages needed for statistics and data science.*

*I have made every attempt to convert the code supplied in the book from R to Python correctly. Any mistakes or errors are mine alone.*

In [9]:

```
import pandas as pd
from scipy import stats
import numpy as np
```

In [ ]:

```
state = pd.read_csv('../data/state.csv')
```

In [3]:

```
# See Table 1-2 pg 12
state.head(8)
```

Out[3]:

In [4]:

```
state["Population"].mean()
```

Out[4]:

In [7]:

```
# Trimmed mean: the mean after dropping the lowest 10% and the highest 10% of the sorted values
stats.trim_mean(state["Population"], 0.1)
```

Out[7]:
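
To see what the trimmed mean is doing, here is a minimal pure-Python sketch of the same idea. The helper name `trimmed_mean` and the sample values are made up for illustration; they are not from the book or from `state.csv`. The number of points cut from each end is rounded down, which is my understanding of how `stats.trim_mean` behaves.

```python
def trimmed_mean(values, proportion):
    """Mean after dropping `proportion` of the sorted values from each end."""
    ordered = sorted(values)
    # Number of points to drop per side, rounded down
    k = int(proportion * len(ordered))
    kept = ordered[k:len(ordered) - k]
    return sum(kept) / len(kept)

# Made-up data with one large outlier (hypothetical, not the state populations)
sample = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]

plain = sum(sample) / len(sample)    # pulled upward by the value 100
trimmed = trimmed_mean(sample, 0.1)  # drops 1 and 100 before averaging

print(plain, trimmed)
```

With ten values and a proportion of 0.1, one value is removed from each end, so the outlier no longer affects the result.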

In [8]:

```
state["Population"].median()
```

Out[8]:

In [16]:

```
# need to use NumPy's average function to get a weighted mean
np.average(state["Murder.Rate"], weights=state["Population"])
```

Out[16]:
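
The weighted mean is just the sum of each value times its weight, divided by the sum of the weights. A small sketch of that arithmetic, using made-up rates and populations (hypothetical numbers, not the book's data) and a hypothetical helper `weighted_mean`:

```python
def weighted_mean(values, weights):
    """Sum of value * weight, divided by the sum of the weights."""
    total_weight = sum(weights)
    return sum(v * w for v, w in zip(values, weights)) / total_weight

# Hypothetical rates for three regions and their populations
rates = [2.0, 4.0, 6.0]
populations = [1_000_000, 3_000_000, 6_000_000]

unweighted = sum(rates) / len(rates)
weighted = weighted_mean(rates, populations)

print(unweighted, weighted)
```

The weighted result sits closer to the rate of the most populous region, which is the reason for weighting a per-capita rate by population.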

In [17]:

```
# pandas returns the sample standard deviation by default (ddof=1, i.e. divides by n - 1)
state["Population"].std()
```

Out[17]:
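
One thing worth knowing when moving between packages: pandas' `std()` defaults to the sample standard deviation (dividing by n - 1), while NumPy's `np.std` defaults to the population version (dividing by n). The sketch below uses the standard library's `statistics` module on made-up numbers to show the size of that difference; it is only an illustration, not the book's code.

```python
import statistics

# Made-up sample (hypothetical values)
sample = [2, 4, 4, 4, 5, 5, 7, 9]

sample_sd = statistics.stdev(sample)       # divides by n - 1, like pandas' default
population_sd = statistics.pstdev(sample)  # divides by n, like NumPy's default

print(sample_sd, population_sd)
```

The sample version is always a little larger, and the gap shrinks as the number of observations grows.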

In [19]:

```
# First quartile is the 0.25 quantile, i.e. the 25th percentile
Q1 = state["Population"].quantile(0.25)
Q1
```

Out[19]:

In [21]:

```
# Third quartile is the 0.75 quantile, i.e. the 75th percentile
Q3 = state["Population"].quantile(0.75)
Q3
```

Out[21]:
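
When a quantile falls between two data points, pandas' `quantile` interpolates linearly between them by default. Here is a rough pure-Python sketch of that rule, with a hypothetical helper `quantile_linear` and made-up data, to show where the numbers come from:

```python
def quantile_linear(values, q):
    """Quantile with linear interpolation between the two nearest sorted values."""
    ordered = sorted(values)
    pos = q * (len(ordered) - 1)              # fractional position in the sorted data
    lower = int(pos)                          # index just below that position
    upper = min(lower + 1, len(ordered) - 1)  # index just above, clamped at the end
    frac = pos - lower
    return ordered[lower] + frac * (ordered[upper] - ordered[lower])

# Made-up data (hypothetical)
sample = [1, 2, 3, 4]

q1 = quantile_linear(sample, 0.25)
q3 = quantile_linear(sample, 0.75)

print(q1, q3)
```

For four values, the 0.25 quantile sits three quarters of the way from the first value to the second, so it lands between actual data points.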

In [23]:

```
# pandas does not have an IQR function but it is easy to compute
# it is just the difference between Q3 and Q1
IQR = Q3 - Q1
IQR
```

Out[23]:
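
As a cross-check on the Q3 - Q1 arithmetic, the standard library can compute quartiles too. This sketch uses `statistics.quantiles` with `method="inclusive"`, which as far as I can tell uses the same linear interpolation as pandas' default. The data is made up. SciPy also has a `stats.iqr` function that could be tried on the real column.

```python
import statistics

# Made-up data (hypothetical)
sample = [1, 2, 3, 4, 5, 6, 7, 8, 9]

# n=4 returns the three cut points that split the data into quarters
q1, q2, q3 = statistics.quantiles(sample, n=4, method="inclusive")
iqr = q3 - q1

print(q1, q3, iqr)
```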