Still feeling nervous for your exam? Check out the SQL Cram Kit for practice questions that emulate the difficulty that you'll face on your exam. Each question includes a thorough step-by-step explanation where I walk you through the entire problem solving process!

## Aggregate functions explained

**Aggregate functions** are a very simple concept to understand, so this'll be a piece of cake.

They're used to conduct calculations on a list of values.

Let's say that we've got the following list of numbers.

numbers |

1 |

2 |

3 |

We can use the following aggregate functions on these numbers and get the results shown in the examples.

Function | Description | Example w/ numbers |

AVG | Average of values | AVG(numbers) = (1 + 2 + 3) / 3 = 2 |

MIN | Minimum of values | MIN(numbers) = 1, 2, 3 = 1 |

MAX | Maximum of values | MAX(numbers) = 1, 2, 3 = 3 |

SUM | Sum of the values | SUM(numbers) = 1 + 2 + 3 = 6 |

COUNT | How many values | COUNT(numbers) = 3 |

A few things to note before moving on.

### The difference between SUM and COUNT

It's incredibly easy to mix up SUM and COUNT, so here's the difference:

SUM takes a list of values and **totals** them.

COUNT takes a list and determines **how many** values we have.

The reason why SUM(numbers) equals 6 is because 1 + 2 + 3 = 6.

The reason why COUNT(numbers) equals 3 is because we have 3 values.

### The difference between COUNT(numbers) and COUNT(*)

For the sake of this concept, let's throw in a null value at the end of numbers.

numbers |

1 |

2 |

3 |

(null) |

In this case, COUNT(numbers) will *still* return a value of 3. Why? Because…

When you specify the field in your COUNT function, it will not include **null** **values** in the count.

However, COUNT(*) will return a value of **4**. Why? Because…

**COUNT(*)** essentially tells SQL to return a count of the **rows** present. It doesn't pay attention to any **null** **values** in those rows.

So, since the null value still counts as a row, it's included in COUNT(*).

## GROUP BY explained

Let's say I give you the following list of ages for boys and girls…

Boys: 9, 10, 11

Girls: 10, 11, 12

…and asked you what the average age is for boys versus girls.

Easy money. Boys' average age is **10**. Girls' average age is **11**.

So, how'd you do this mathematically? You *grouped* each age value by gender and then calculated the average.

First, you took all the boys' ages and *grouped* them together. Then you summed them and divided by 3, resulting in 10, since (9 + 10 + 11) / 3 = 10.

Then, you took the girls' ages and did the same thing, resulting in 11, since (10 + 11 + 12) / 3 = 11.

In this situation, you accomplished what a GROUP BY statement does. You grouped the values by a field, gender, in order to calculate the average age for each value of that gender field.

GROUP BY statements place records in **groups** by **field** values. They then run aggregate functions on each group.

## How to code a GROUP BY

Here's the template for GROUP BY statements:

GROUP BY **field(s)**;

This will make no sense to you now, so let's learn the concept with an example.

Let's take the above situation of boys versus girls ages, expect attack it in SQL terms.

Here's our table that we'll be working with, children, which contains the ages of the boys and girls above.

children | |

gender | age |

M | 9 |

M | 10 |

M | 11 |

F | 10 |

F | 11 |

F | 12 |

In this situation, it helps to first identify the end result we're going for and then work backwards.

So, what do we want to see?

Results | |

gender | Average age |

M | 10 |

F | 11 |

This matches exactly what we stated above:

"Easy money. Boys' (M) average age is **10**. Girls' (F) average age is **11**."

So, how should we go about this displaying these results in SQL?

I'd recommend we first find the average age of *all* children, then group those ages by gender.

So, to find the average age of all children, all we need to do is place the AVG aggregate function in our SELECT statement…

SELECT AVG(___)

FROM **database_name***.***table_name**;

…with the field that we're taking the average of, ages, as its argument.

SELECT AVG(ages)

FROM **database_name***.***table_name**;

Next, let's nail down the **database_name** and **table_name**. We'll call the database people and the table children. (this really doesn't matter. Don't focus too much on it, we've just gotta put something in our FROM statement.)

SELECT AVG(ages)

FROM people.children;

When we execute this query, we get the following results:

Results |

AVG(ages) |

10.5 |

This makes sense, because the average of all the ages equals 10.5.

(9 + 10 + 11 + 10 + 11 + 12) / 6 = **10.5**

Now, we want to place the gender values to the left of the average age, like we have in our goal results.

Results | |

gender | AVG(ages) |

M | 10 |

F | 11 |

So, all we need to do is place gender in our SELECT statement, right?

SELECT gender, AVG(ages)

FROM people.children;

**NO.** No, no, no. This is not *all* we need to do. We forgot something: our GROUP BY statement.

Whenever you have **field(s)** and **aggregate function(s)** in your SELECT statement, you must include all **field(s)** in your GROUP BY statement.

So, do we have field(s) in our SELECT statement? Yes, we have gender.

SELECT gender, AVG(ages)

FROM people.children;

Do we have aggregate function(s) in our SELECT statement? Yes, we have "AVG(ages)".

SELECT gender, AVG(ages)

FROM people.children;

Therefore, we must place all field(s), gender, in a GROUP BY statement.

SELECT gender, AVG(ages)

FROM people.children

GROUP BY gender;

Now, when we execute this query, we get our desired results.

Results | |

gender | AVG(ages) |

M | 10 |

F | 11 |

You might be wondering, why is it so important to include this GROUP BY statement?

## Revisiting the purpose of GROUP BY

Let's rewind back to when I first gave you the list of ages for boys and girls.

Boys: 9, 10, 11

Girls: 10, 11, 12

It was pretty easy to calculate the average ages for boys versus girls in this situation, right? That's because I pre-grouped the ages based on gender for you.

Imagine if I had just given you this...

9, 10, 11, 10, 11, 12

...and told you to calculate the average ages for boys versus girls.

It'd be impossible! What ages are for boys and what ages are for girls?

From the perspective of SQL, not utilizing a GROUP BY statement would be exactly like this.

Here in lies the value of GROUP BY statements. They allow SQL to identify the *groups* of records to run the aggregate functions on.

If there is **one thing** I want you to take away from this article, or the *entire* "SQL in 30 Minutes" series for that matter, it's this:

Whenever you have **field(s)** and **aggregate function(s)** in your SELECT statement, you must include all **field(s)** in your GROUP BY statement.

That, right there, is your key to **mastering GROUP BY** statements in the **shortest amount of time**.

## SQL Follow-along Guide

It's no secret you retain info better when you *write* it down. That's why I created the **SQL Follow-along Guide** for you!

As you come upon key concepts highlighted in yellow, like this sentence here, you can **fill-in-the-blanks **on your Follow-along Guide so that you remember all the important stuff for later.

You can obtain the SQL Follow-along Guide for **FREE** by entering your email below!

Now that we know how to run calculations on records in SQL, let's move onto learning how to filter those mathematical results with a HAVING statement!

### Concepts

*Each exam concept broken down with relatable situations to your life!*

Tools | SQL Follow-along Guide (FREE) | |

Lesson | What is SQL? | |

Lesson | SELECT & FROM | |

Lesson | JOIN | |

Lesson | WHERE | |

Lesson | GROUP BY | |

Lesson | HAVING | |

Lesson | ORDER BY | |

Lesson | Troubleshooting errors |

### Practice Problems

*Include step-by-step explanations through each part of the problem!*

"The SQL Cram Kit was the best thing I could’ve done before my exam. It takes practice and more practice to be good at coding and that is what I needed."

**Andrew Green***October 29, 2020*

I’m a Miami University (OH) 2021 alumni who majored in Information Systems. At Miami, I tutored students in Python, SQL, JavaScript, and HTML for 2+ years. I’m a huge fantasy football fan, Marvel nerd, and love hanging out with my friends here in Chicago where I currently reside.