Token Bucket Algorithm: Rate Limiting Explained with Python & Go

In the world of high-performance systems, APIs, and network security, rate limiting is a powerful technique to control traffic, prevent abuse, and ensure fair resource distribution. One of the most commonly used algorithms for rate limiting is the Token Bucket Algorithm.

This article will cover:

  • What is the Token Bucket Algorithm?
  • How it works
  • Real-world use cases
  • Python implementation
  • Go implementation
  • Comparison with Leaky Bucket
  • Best practices

What is the Token Bucket Algorithm?

The Token Bucket Algorithm is a rate-limiting algorithm used to control the amount of data or requests a system can handle over time.

Imagine a bucket that fills with tokens at a fixed rate (e.g., 1 token every second). Every time a user wants to perform an action (like an API call), they must “pay” with a token. If there are no tokens in the bucket, the request is denied or delayed.

Key Characteristics

  • Burst-friendly: Allows short bursts of activity as long as there are tokens.
  • Flexible: You can adjust the bucket size (capacity) and refill rate.
  • Real-time: Each request is checked against the current token count, so decisions are made as requests arrive.
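
To make the burst behaviour concrete: a bucket with capacity b and refill rate r can never hand out more than b + r·T tokens in any window of T seconds, because at most b tokens are stored when the window opens and at most r·T are added during it. A minimal sketch of that arithmetic follows; the function name max_admitted and the example numbers are illustrative assumptions.

```python
def max_admitted(capacity: float, refill_rate: float, window_seconds: float) -> float:
    """Upper bound on tokens that can be spent in a window of the given length.

    Assumes the bucket may be full when the window opens.
    """
    return capacity + refill_rate * window_seconds


# Example: capacity 10, refill rate 1 token per second.
burst = max_admitted(10, 1, 0)     # instantaneous burst: at most 10 tokens
minute = max_admitted(10, 1, 60)   # any 60-second window: at most 70 tokens
```

Capacity therefore bounds the size of a burst, while the refill rate bounds the sustained rate.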

How Does It Work?

Initialization:

  • Set a bucket capacity (maximum number of tokens it can hold).
  • Set a token refill rate (tokens added per second).

Token Refill:

  • At each time interval, tokens are added to the bucket, up to its capacity.
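
Implementations often avoid a background timer and refill lazily instead: whenever the bucket is touched, they credit the tokens that would have accumulated since the last update. A minimal sketch of that calculation, assuming a hypothetical helper named refill and time measured in seconds:

```python
def refill(tokens: float, capacity: float, refill_rate: float, elapsed: float) -> float:
    """Return the token count after `elapsed` seconds, capped at `capacity`."""
    if elapsed < 0:
        # Guard against a clock that moved backwards: add nothing.
        elapsed = 0.0
    return min(capacity, tokens + refill_rate * elapsed)


partly_filled = refill(tokens=2, capacity=10, refill_rate=1.0, elapsed=3.5)
capped = refill(tokens=9, capacity=10, refill_rate=1.0, elapsed=5)
```

Here partly_filled is 5.5 (2 tokens plus 3.5 seconds at 1 token per second), and capped stops at the capacity of 10.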

Request Handling:

  • Each incoming request consumes one or more tokens.
  • If enough tokens are available, the request is processed.
  • If not, the request is rejected or delayed.
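
The three steps above can be sketched as a small single-threaded class. To keep the behaviour easy to follow, this version takes timestamps from the caller instead of reading a clock. The class name SimpleTokenBucket, the method name allow, and the choice to start with a full bucket are assumptions made for illustration.

```python
class SimpleTokenBucket:
    """Single-threaded token bucket driven by caller-supplied timestamps."""

    def __init__(self, capacity: float, refill_rate: float, now: float = 0.0):
        # Initialization: fix the capacity and the refill rate.
        self.capacity = capacity
        self.refill_rate = refill_rate  # tokens per second
        self.tokens = capacity          # assumption: the bucket starts full
        self.last = now

    def allow(self, now: float, cost: float = 1.0) -> bool:
        # Token refill: credit the time elapsed since the last call.
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_rate)
        self.last = now
        # Request handling: spend tokens only if enough are available.
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


bucket = SimpleTokenBucket(capacity=3, refill_rate=1.0)
# Four requests at t=0: the first three drain the bucket, the fourth is rejected.
results = [bucket.allow(now=0.0) for _ in range(4)]
# Two seconds later, two tokens have been refilled.
later = [bucket.allow(now=2.0) for _ in range(3)]
```

In this run, results is [True, True, True, False] and later is [True, True, False].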

Real-World Use Cases

  • API rate limiting – Prevent abuse of public APIs
  • Network traffic shaping – Control bandwidth usage
  • Login throttling – Slow down brute force attacks
  • Task scheduling – Ensure fair access to shared resources

Python Implementation of Token Bucket

Let’s implement a basic Token Bucket in Python using threading and time.
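
A minimal sketch of such an implementation is shown here. It assumes lazy refill on each call, a bucket that starts full, and a threading.Lock for thread safety. The names TokenBucket, consume, and simulate are illustrative choices.

```python
import threading
import time


class TokenBucket:
    """Thread-safe token bucket with lazy refill."""

    def __init__(self, capacity: float, refill_rate: float):
        self.capacity = capacity
        self.refill_rate = refill_rate        # tokens added per second
        self._tokens = float(capacity)        # assumption: start with a full bucket
        self._last_refill = time.monotonic()  # monotonic clock ignores wall-clock jumps
        self._lock = threading.Lock()

    def _refill(self) -> None:
        # The caller must hold the lock.
        now = time.monotonic()
        elapsed = now - self._last_refill
        self._tokens = min(self.capacity, self._tokens + elapsed * self.refill_rate)
        self._last_refill = now

    def consume(self, tokens: float = 1.0) -> bool:
        """Take `tokens` from the bucket; return False if too few are available."""
        with self._lock:
            self._refill()
            if self._tokens >= tokens:
                self._tokens -= tokens
                return True
            return False


def simulate(bucket: TokenBucket, requests: int = 10, delay: float = 0.3) -> list:
    """Send simulated requests, pausing `delay` seconds (time.sleep) between them."""
    outcomes = []
    for i in range(requests):
        allowed = bucket.consume()
        print(f"Request {i + 1}: {'allowed' if allowed else 'rejected'}")
        outcomes.append(allowed)
        time.sleep(delay)
    return outcomes
```

Calling simulate(TokenBucket(capacity=5, refill_rate=1)) sends ten requests 0.3 seconds apart. The exact pattern of allowed and rejected requests depends on timing, but once the initial tokens are spent, requests arrive faster than the bucket refills and some are rejected.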

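Try changing the delay between simulated requests (the value passed to time.sleep, 0.3 seconds here) and you will see that the algorithm lets only some of the requests through: the shorter the delay, the faster the bucket drains and the more requests are rejected.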

Golang Implementation of Token Bucket

Here’s a thread-safe token bucket in Go using time.Ticker and sync.Mutex.

Token Bucket vs. Leaky Bucket

Feature               | Token Bucket                | Leaky Bucket
Burst support         | ✅ Yes                      | ❌ No
Constant output rate  | ❌ No (bursts pass through) | ✅ Yes
Simplicity            | ✅ Simple                   | ✅ Simple
Real-time control     | ✅ Yes                      | ✅ Yes
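
The difference in burst handling can be illustrated by feeding the same burst of arrivals to both schemes, using plain timestamps rather than a real clock. In this sketch the leaky bucket is modelled as an unbounded queue drained at a constant rate, which is one common variant and an assumption of this example. The function names are hypothetical, and arrivals are assumed to be sorted and non-negative.

```python
def token_bucket_departures(arrivals, capacity, rate):
    """Times at which each request is admitted; None if it is rejected."""
    tokens, last, out = float(capacity), 0.0, []
    for t in arrivals:
        tokens = min(capacity, tokens + (t - last) * rate)
        last = t
        if tokens >= 1:
            tokens -= 1
            out.append(t)      # admitted immediately, so bursts pass through
        else:
            out.append(None)   # rejected
    return out


def leaky_bucket_departures(arrivals, rate):
    """Times at which each queued request leaves, at most `rate` per second."""
    interval = 1.0 / rate
    next_free, out = 0.0, []
    for t in arrivals:
        depart = max(t, next_free)  # wait until the previous request has drained
        out.append(depart)
        next_free = depart + interval
    return out


burst = [0.0, 0.0, 0.0, 0.0]  # four requests arriving at the same instant
tb = token_bucket_departures(burst, capacity=3, rate=1.0)
lb = leaky_bucket_departures(burst, rate=1.0)
```

With these inputs, tb is [0.0, 0.0, 0.0, None]: three requests pass at once and the fourth is rejected. lb is [0.0, 1.0, 2.0, 3.0]: every request is served, but spaced one second apart.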

Best Practices

  • Thread safety: Use locks or atomic operations.
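  • Precise timing: Use a monotonic clock, such as time.monotonic() in Python, so that adjustments to the system clock do not distort the refill calculation.
  • Tuning parameters: Capacity bounds the largest burst and the refill rate sets the sustained throughput; experiment with both against real traffic.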
  • Distributed use: Store token buckets in Redis or distributed caches for multi-node systems.

Conclusion

The Token Bucket Algorithm is a powerful yet simple way to limit the rate of actions across systems. Whether you’re building an API, controlling bandwidth, or preventing abuse, implementing this algorithm in Python or Go can greatly enhance the stability and fairness of your applications.

Want to go further? Add Redis to make it distributed, or use middleware in web frameworks like FastAPI or Gin to plug token buckets into API endpoints.
