Drew Martin 08/12/2019

Pagination with Relative Cursors

When requesting multiple pages of records from a server the simple implementation is to have an incremental page number in the URL. Starting at page one, each subsequent request that’s sent has a page number that’s one greater than the previous. The problem is that incremental page numbers scale poorly—the bigger the page number, the slower the query. The simple solution is relative cursor pagination because it remembers where you were and continues from that point onwards instead.

The Problem

A common activity for third-party applications on Shopify to do is to sync the full catalogue of products. Some shops have more than 100,000 products and these can’t all be loaded in a single request as it would time out. Instead, the application would make multiple requests to Shopify for successive pages of products which look like this:

https:<your-shop-domain>.myshopify.com/admin/products.json?page=25&limit=100

This would generate a SQL query like this:

This query scales poorly because the bigger the offset, the slower the query. In the above example, the query needs to go through 2500 records and then discard the first 2400. Using a test shop with 14 million products, we ran some experiments loading pages of products at various offsets. Taking the average time over five runs at each offset, here are the results:

Offset

Time (ms)

10

6.54

100

7.72

1,000

8.46

10,000

79.82

100,000

2,221.60

Omitted from the table are tests with the 1,000,000th offset and above since they consistently timed out.

Not only do queries take a long time when a large offset is used, but there’s also a limited number of queries that can be run concurrently. If too many requests with large page numbers are made at the same time, they can pile up faster than they can be executed. This leads to unrelated, quick queries timing out while waiting to be run because all of the database connections are in use by these slow, large-offset queries.

It’s particularly problematic on large shops when third-party applications load all records for a particular model, be it products, collects, orders, or anything else. Such usage has ramifications outside of the shop they are being run on. Since multiple shops are run on the same database instances, a moderate volume of large-offset queries cause unrelated queries from shops that happen to share the same database instance to be slower or time out altogether. For the long-term health of our platform we couldn’t allow this situation to continue unchecked.

What is Relative Cursor Pagination?

Relative cursor pagination remembers where you were so that each request after the first continues from where the previous request left off. The downside is that you can no longer jump to a specific page. The easiest way to do this is remembering the id of the last record from the last page you’ve seen and continuing from that record, but it requires the results to be sorted by id. With a last id of 67890 this would looks like:

A good index set up can handle this query and will perform much better than using an offset, in this example, it’s the primary index on id. Using the same test shop, here’s how long it takes to get the same pages of records but this time using the last id:

Offset

Time using offset (ms)

Time using last id (ms)

Percentage improvement

10

6.54

5.32

18.65%

100

7.72

5.78

25.13%

1,000

8.46

5.76

31.91%

10,000

79.82

6.04

92.43%

100,000

2,221.60

5.24

99.76%

With an offset of 100,000 it’s over 400 times faster to use last id! It’s much faster, and it doesn’t matter how many pages you request, the last page takes around the same amount of time as the first.

Sorting and Skipping Records

Sorting by something other than id is possible by remembering the last value of the field being sorted on. For example, if you’re sorting by title, then the last value is the title of the last record in the page. If the sort value is not unique, then if we used it alone we would potentially be skipping records. For example, assume you have the following products:

Sorting by Title
Sorting by Title

Requesting a page size of two sorted by title would return product with ids 3 and 2. To request the next page, just querying by title > “Pants” would skip product 4 and start at product 1.

Sorting by Title - Product Skipped
Sorting by Title - Product Skipped

Whatever the use case of the client that requests these records, it’s likely to have problems if records are sometimes skipped. The solution is to set a secondary sort column on a unique value, like id, and then remembering both the last value and last id. In that case the query for the second page would look like this:

Querying in this way results in getting the expected products on the second page.

Sorting by Title - No Skipped Product
Sorting by Title - No Skipped Product

To ensure the query is performant as the number of records increases you’d need a database index set up on title and id. If an appropriate index is not set up then it could be even slower than using a page number.

Using the same test shop as before, here’s how long it takes to get the same pages of records but this time using both last value and last id:

Offset

Time using offset (ms)

Time using last id (ms)

Time using last value (ms)

Percentage improvement over offset

Percentage improvement over last id

10

6.54

5.32

6.64

-1.53%

-24.81%

100

7.72

5.78

6.22

19.43%

-7.61%

1,000

8.46

5.76

6.5

23.17%

-12.85%

10,000

79.82

6.04

9.18

88.50%

-51.99%

100,000

2,221.60

5.24

6

99.73%

-14.50%

Overall, it’s slower than using a last id alone, but still orders of magnitude faster than using an offset when the offset grows large.

Making it Easy for Clients to Use Relative Cursors

The field being sorted on might not be included in the response. For example, in the Shopify API pages of products sorted by total inventory can be requested. We don’t expose total inventory directly on the product, but it can be derived by adding up the inventory_quantity from the nested variants, which are included in the response. Rather than requiring clients do this calculation themselves we make it easy for them by generate URLs that can be used to request the next and previous page, and include them in a Link header in the response. If there’s both a next page and a previous page it looks like this: 

Conversion in Shopify

The problem of large offsets causing queries to be slow was well known within Shopify, as was the solution of using relative cursors. In our internal endpoints, we were making liberal use of them, but rolling relative cursors out to external clients is a much bigger effort. We just added API versioning to our REST API, so it’s reasonable to make such a large change as removing page numbers and switching everything to relative cursors.

As the responsibility for the different endpoints was spread across many different teams there was no clear owner of pagination as a whole. Though the problem wasn’t directly related to my team, Merchandising, our ownership of the products and collects APIs meant we were acutely aware of the problem. They’re two of the largest APIs in terms of both the volume of requests, and the number of records they deal with.

I wanted to fix the problem and no one else was tackling it, so I put together a proposal on how we could fix it across our platform and sent it to my lead and senior engineering leadership. They agreed with my solution and I got the green light to work on it. A couple more engineers joined me and together we put together the patterns all endpoints were to follow, along with the common code they would use, and a guide for how to migrate their endpoints. We made a list of all the endpoints that would need to be converted and pushed it out to the teams who owned them. Soon we had dozens of developers across the company working on it.

As third-party developers must opt in to use relative cursors for now, adoption is currently quite low and we don’t have much in the way of performance measures to share. Early usage of relative pagination on the /admin/products.json endpoint show it to be about 11 times faster on average than comparable requests using a page number. By July 2020 no endpoints will support page numbers on any API version and will need to use relative pagination. We’ll have to wait until then to see the full results of the change.


We’re always looking for awesome people of all backgrounds and experiences to join our team. Visit our Engineering career page to find out what we’re working on.