Window Functions

A window function performs a calculation over a group (“window”) of related rows.

Introduction

What are Window Functions?

Window functions are a type of analytical function in SQL and other programming languages that perform calculations across a set of rows related to the current row. Unlike regular aggregate functions, which return a single value for a group of rows, window functions return a value for each row, based on the rows within its "window."
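The difference between an aggregate and a window function can be sketched with Python's built-in sqlite3 module. This is an illustrative sketch, not part of the original page: it assumes the bundled SQLite is version 3.25 or newer (the first release with window functions), and the three-row sales table is a made-up sample.

```python
import sqlite3

# In-memory database with a small, hypothetical sales table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (id INTEGER, product_id INTEGER, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [(1, 1, 100), (2, 1, 150), (3, 2, 75)],
)

# Regular aggregate: one output row per group.
grouped = conn.execute(
    "SELECT product_id, SUM(amount) FROM sales "
    "GROUP BY product_id ORDER BY product_id"
).fetchall()

# Window function: one output row per input row, each carrying its group's sum.
windowed = conn.execute(
    "SELECT id, product_id, SUM(amount) OVER (PARTITION BY product_id) "
    "FROM sales ORDER BY id"
).fetchall()

print(grouped)   # [(1, 250), (2, 75)]
print(windowed)  # [(1, 1, 250), (2, 1, 250), (3, 2, 75)]
```

The grouped query collapses three rows into two, while the windowed query keeps all three and repeats the per-product total on each.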

Applications of Window Functions

Window functions are commonly used in various applications, such as:

  • Running totals

  • Cumulative sums or averages

  • Moving averages

  • Ranking or numbering rows

  • Calculating percentiles

Types of Window Functions

There are several types of window functions available in SQL and other programming languages:

  • Ranking functions: ROW_NUMBER()

  • Aggregate functions: SUM(), AVG(), MIN(), MAX(), and COUNT()
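As a rough illustration of both kinds of function, the sketch below numbers rows with ROW_NUMBER() and applies MAX() and COUNT() as window functions. It assumes SQLite 3.25 or newer through Python's sqlite3 module; the table contents are hypothetical sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (id INTEGER, product_id INTEGER, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [(1, 1, 100), (2, 1, 150), (3, 2, 75)],
)

rows = conn.execute(
    """
    SELECT id,
           -- Ranking function: position of the row when sorted by amount, largest first.
           ROW_NUMBER() OVER (ORDER BY amount DESC) AS amount_rank,
           -- Aggregate used as a window function: largest amount within the same product.
           MAX(amount) OVER (PARTITION BY product_id) AS product_max,
           -- Empty OVER (): the window is the whole result set.
           COUNT(*) OVER () AS total_rows
    FROM sales
    ORDER BY id
    """
).fetchall()

print(rows)  # [(1, 2, 150, 3), (2, 1, 150, 3), (3, 3, 75, 3)]
```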

Parameters in Window Functions

Window functions can be configured with several parameters that determine the range, ordering, and partitioning of the data within the window. Here, we'll discuss the key parameters and their usage in window functions.

  1. PARTITION BY: This parameter is used to divide the data into partitions to which the window function is applied. If you don't specify the PARTITION BY clause, the function will treat the whole result set as a single partition.

  2. ORDER BY: This parameter is used to specify the order in which the rows will be processed by the window function. It's important to define the order, especially when using ranking or numbering functions, as it directly impacts the output.

  3. ROWS BETWEEN: This parameter defines the window frame, that is, the range of rows around the current row that the function evaluates. A frame is written as ROWS BETWEEN <start> AND <end>, where each bound is one of the following:

    • UNBOUNDED PRECEDING: The first row of the partition. Used as the start bound, it makes the frame begin at the start of the partition.

    • UNBOUNDED FOLLOWING: The last row of the partition. Used as the end bound, it makes the frame extend to the end of the partition.

    • n PRECEDING: The row n rows before the current row.

    • n FOLLOWING: The row n rows after the current row.

    • CURRENT ROW: The current row itself.
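The frame options above can be tried out with a minimal sketch like the following. It assumes SQLite 3.25 or newer through Python's sqlite3 module; the readings table and its values are hypothetical and chosen only to make the arithmetic easy to follow.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (day INTEGER, value INTEGER)")
conn.executemany("INSERT INTO readings VALUES (?, ?)", [(1, 10), (2, 20), (3, 60)])

frames = conn.execute(
    """
    SELECT day,
           -- Two-row moving average: previous row plus current row.
           AVG(value) OVER (ORDER BY day
                            ROWS BETWEEN 1 PRECEDING AND CURRENT ROW) AS moving_avg,
           -- Running total: start of partition up to the current row.
           SUM(value) OVER (ORDER BY day
                            ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS running_sum,
           -- Remaining total: current row to the end of the partition.
           SUM(value) OVER (ORDER BY day
                            ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING) AS remaining_sum
    FROM readings
    ORDER BY day
    """
).fetchall()

print(frames)  # [(1, 10.0, 10, 90), (2, 15.0, 30, 80), (3, 40.0, 90, 60)]
```

No PARTITION BY is given here, so all three rows form a single partition, and only the frame changes which rows each calculation sees.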

Example

In this example, we want to calculate the running total of sales for each product, ordered by the date.

Consider the following sales table:

id  date        product_id  amount
1   2023-01-01  1           100
2   2023-01-02  1           150
3   2023-01-03  2           75
4   2023-01-04  2           200
5   2023-01-05  1           50
6   2023-01-06  2           300

Parameters

SUM(amount)

PARTITION BY product_id

ORDER BY date

The result would be:

id  date        product_id  amount  running_total
1   2023-01-01  1           100     100
2   2023-01-02  1           150     250
5   2023-01-05  1           50      300
3   2023-01-03  2           75      75
4   2023-01-04  2           200     275
6   2023-01-06  2           300     575

In this example, the SUM() window function is used and includes both the PARTITION BY and ORDER BY clauses. The PARTITION BY product_id clause ensures that the running total is calculated separately for each product, while the ORDER BY date clause sorts the rows within each partition by date.
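For readers who want to reproduce the result, the example can be expressed as a SQL query and run with Python's sqlite3 module. This is a sketch under the assumption that the bundled SQLite is version 3.25 or newer; the table and column names follow the sales table above, and the final ORDER BY is added only to make the output order match the result table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales (id INTEGER, date TEXT, product_id INTEGER, amount INTEGER)"
)
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?, ?)",
    [
        (1, "2023-01-01", 1, 100),
        (2, "2023-01-02", 1, 150),
        (3, "2023-01-03", 2, 75),
        (4, "2023-01-04", 2, 200),
        (5, "2023-01-05", 1, 50),
        (6, "2023-01-06", 2, 300),
    ],
)

running = conn.execute(
    """
    SELECT id, date, product_id, amount,
           -- Running total restarts for each product and accumulates in date order.
           SUM(amount) OVER (PARTITION BY product_id ORDER BY date) AS running_total
    FROM sales
    ORDER BY product_id, date
    """
).fetchall()

for row in running:
    print(row)
# (1, '2023-01-01', 1, 100, 100)
# (2, '2023-01-02', 1, 150, 250)
# (5, '2023-01-05', 1, 50, 300)
# (3, '2023-01-03', 2, 75, 75)
# (4, '2023-01-04', 2, 200, 275)
# (6, '2023-01-06', 2, 300, 575)
```

When ORDER BY is present and no frame is specified, standard SQL uses a default frame that runs from the start of the partition to the current row, which is why SUM() behaves as a running total here.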
