How to Avoid Docker Hub Rate Limits (and Improve Your Workflow)
Docker Hub has had a rate limit in place since November 2020 (see "Checking Your Current Docker Pull Rate Limits and Status" on the Docker blog). The limit is described in detail in "Docker Hub rate limit" in the Docker Docs.
So this is nothing new, and yet something has changed: I recently received an email from Azure stating that I should mitigate the effects of rate limiting by June 30th, 2024.
This date made me curious, so I did some investigating. First, I looked up precisely what the limits are.
Checking Local Rate Limits
If you don't have a paid Docker subscription, you might hit the rate limit for free accounts: 200 image pulls per 6 hours. Anonymous pulls are limited even further, to just 100 image pulls per 6 hours.
I never log in to Docker Hub, so I run under the anonymous tier, yet I have never hit this limit locally, even though I often run many builds on certain days, and 100 pulls per 6 hours doesn't sound like much. It turns out that a pull only counts against the limit if the image is not already cached on your system. That is good news, as you usually don't wipe your image cache each time.
Retrieving Rate Limits on Your System
Checking my locally pulled Docker images shows only one Python base image:
❯ docker images
REPOSITORY   TAG                IMAGE ID       CREATED         SIZE
python       3.11-slim-buster   db841a2e8ab3   11 months ago   120MB
Following that guidance, I retrieved an anonymous bearer token and queried the rate limit endpoint for my current limits.
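The check works as described in the documentation: request an anonymous token for the special ratelimitpreview/test repository and issue a HEAD request against its manifest; the limits come back as response headers.

TOKEN=$(curl -s "https://auth.docker.io/token?service=registry.docker.io&scope=repository:ratelimitpreview/test:pull" | jq -r .token)
curl --head -H "Authorization: Bearer $TOKEN" https://registry-1.docker.io/v2/ratelimitpreview/test/manifests/latest

For the anonymous tier, the relevant headers look like this (abridged; w=21600 is the 6-hour window in seconds):

ratelimit-limit: 100;w=21600
ratelimit-remaining: 100;w=21600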
I then pulled an additional ubuntu:latest image and re-executed the curl call against the rate limit endpoint. The output correctly showed that my remaining limit had been reduced by one.
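To reproduce this (the anonymous tokens expire after five minutes, so a fresh one may be needed):

docker pull ubuntu:latest
curl --head -H "Authorization: Bearer $TOKEN" https://registry-1.docker.io/v2/ratelimitpreview/test/manifests/latest
# ratelimit-remaining should now read 99;w=21600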
Pulling an image that is now locally cached again, in this case the same ubuntu:latest, doesn't affect the rate limit at all.
Azure & Azure DevOps MS Hosted Agents: Handling Rate Limits
But what about Azure and the Azure DevOps environments with Microsoft-hosted agents? Do they run into the same issue?
I created a simple test pipeline to see how Azure DevOps handles rate limits.
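A sketch of that pipeline (the "Check rate limit" step is the one referenced below; the exact YAML here is illustrative):

trigger: none

pool:
  vmImage: ubuntu-22.04

steps:
  - script: docker pull ubuntu:latest
    displayName: "Pull image"
  - script: |
      # Same anonymous token and HEAD request as on the local machine
      TOKEN=$(curl -s "https://auth.docker.io/token?service=registry.docker.io&scope=repository:ratelimitpreview/test:pull" | jq -r .token)
      curl --head -H "Authorization: Bearer $TOKEN" https://registry-1.docker.io/v2/ratelimitpreview/test/manifests/latest
    displayName: "Check rate limit"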
The Docker images pre-cached in the ubuntu:22.04 runner image are listed in runner-images/images/ubuntu/Ubuntu2204-Readme.md; none of the images I just pulled in the pipeline are among them.
The exciting thing about this pipeline run is that the output of the "Check rate limit" step doesn't show any rate-limit headers at all.
HTTP/1.1 200 OK
content-length: 2782
content-type: application/vnd.docker.distribution.manifest.v1+prettyjws
docker-content-digest: sha256:767a3815c34823b355bed31760d5fa3daca0aec2ce15b217c9cd83229e0e2020
docker-distribution-api-version: registry/2.0
etag: "sha256:767a3815c34823b355bed31760d5fa3daca0aec2ce15b217c9cd83229e0e2020"
date: Tue, 07 May 2024 10:05:56 GMT
strict-transport-security: max-age=31536000
docker-ratelimit-source: 13.79.31.179
I did the same for Azure: I opened a Cloud Shell in the Portal and entered the same rate limit query.
dariusz [ ~ ]$ TOKEN=$(curl "https://auth.docker.io/token?service=registry.docker.io&scope=repository:ratelimitpreview/test:pull" | jq -r .token)
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  5429    0  5429    0     0  20653      0 --:--:-- --:--:-- --:--:-- 20721
dariusz [ ~ ]$ curl --head -H "Authorization: Bearer $TOKEN" https://registry-1.docker.io/v2/ratelimitpreview/test/manifests/latest
HTTP/1.1 200 OK
content-length: 2782
content-type: application/vnd.docker.distribution.manifest.v1+prettyjws
docker-content-digest: sha256:767a3815c34823b355bed31760d5fa3daca0aec2ce15b217c9cd83229e0e2020
docker-distribution-api-version: registry/2.0
etag: "sha256:767a3815c34823b355bed31760d5fa3daca0aec2ce15b217c9cd83229e0e2020"
date: Tue, 07 May 2024 11:00:31 GMT
strict-transport-security: max-age=31536000
docker-ratelimit-source: 0.0.0.0
I inspected the retrieved token with https://jwt.ms; it confirms the anonymous tier, with an untouched pull limit of 100.
{
  "alg": "RS256",
  "typ": "JWT",
  "x5c": [
    "MIIEFj...Cx5Q3"
  ]
}.{
  "access": [
    {
      "actions": [
        "pull"
      ],
      "name": "ratelimitpreview/test",
      "parameters": {
        "pull_limit": "100",
        "pull_limit_interval": "21600"
      },
      "type": "repository"
    }
  ],
  "aud": "registry.docker.io",
  "exp": 1715079853,
  "iat": 1715079553,
  "iss": "auth.docker.io",
  "jti": "dckr_jti_CjpzWM-iQQaK_7QXf8wlUkz8kJI=",
  "nbf": 1715079253,
  "sub": ""
}.[Signature]
Based on this forum thread, Testing Dockerhub pull limits in Azure - Docker Community Forums, there must be some agreement (presumably between Microsoft and Docker) that was already known back in 2022 when the thread was written, and this agreement will probably run out by the end of June 2024.
Starting Rate Limit Mitigation
The rate limits will likely affect future CI pipeline runs. In our case, we have a CI pipeline that builds images on each push to the Azure DevOps Git repository, using the Azure Container Registry build task.
Current State: CI Pipeline Configuration in Azure DevOps
Below is a snippet from a pipeline step in Azure DevOps to build our image.
- task: AzureCLI@2
  displayName: "Build and push container image"
  inputs:
    azureSubscription: $(serviceConnection)
    addSpnToEnvironment: true
    workingDirectory: $(projectRoot)
    scriptType: bash
    scriptLocation: inlineScript
    inlineScript: |
      az acr build --registry "${REGISTRY}" --image "${REGISTRY}.azurecr.io/backend:dev-${BUILD_NUMBER}" --image "${REGISTRY}.azurecr.io/backend:dev-latest" --platform linux --file Dockerfile .
  env:
    REGISTRY: $(containerRegistryName)
    BUILD_NUMBER: $(Build.BuildNumber)
The corresponding Dockerfile still refers to the base image on Docker Hub.
FROM python:3.11-slim-buster
WORKDIR /code
COPY ./requirements.txt /code/requirements.txt
RUN pip install --no-cache-dir -r /code/requirements.txt
COPY app /code/app
CMD ["uvicorn", "app.main:app", "--host=0.0.0.0", "--port=8000"]
Using Azure Container Registry Caching
As an Azure Container Registry is already in place, we changed the behavior using the Azure Container Registry cache feature. This feature lets ACR pull and cache an image from another registry based on rules you define. For the Python base images, such a rule might look like this:
az acr cache create \
  --registry ${REGISTRY} \
  --name python-base-images \
  --source-repo docker.io/library/python \
  --target-repo python
We changed the Dockerfile to pull from our ACR instead of Docker Hub, where the base image originally lives.
ARG REGISTRY=your-acr-registry-name
FROM $REGISTRY.azurecr.io/python:3.11-slim-buster
WORKDIR /code
COPY ./requirements.txt /code/requirements.txt
RUN pip install --no-cache-dir -r /code/requirements.txt
COPY app /code/app
CMD ["uvicorn", "app.main:app", "--host=0.0.0.0", "--port=8000"]
The first time this runs, ACR pulls the base image from Docker Hub and caches it under the python repository, as specified by --target-repo in the cache rule above.
Once the pipeline runs and pulls the image through the Azure Container Registry, you can evaluate the cache rule and look up the cached repository.
az acr cache list --registry ${REGISTRY}
az acr repository show --name ${REGISTRY} --repository python
From now on, the Azure Container Registry cache mitigates the rate limits that might otherwise affect our pipeline runs on Microsoft-hosted agents, and it offers additional benefits: improved build performance, reduced network traffic, enhanced reliability, and simplified management.
Conclusion: Optimizing Development Workflows in the Face of Rate Limits
Docker Hub rate limits present an opportunity to optimize your development workflow and improve performance.
Take Action
- Explore caching strategies: Understand how effective caching can reduce your reliance on Docker Hub.
- Leverage your cloud provider's registry: Cloud registries often offer integrated solutions that streamline image management and can boost build speeds.
- Consider a paid Docker Hub subscription if you need a high volume of pulls: This ensures reliable access and support.
By proactively addressing Docker Hub's limits, you can enhance the efficiency and resilience of your CI/CD pipelines.