Now that's a fine question! As it turns out, a token bucket is pretty well named, but I'll try to expand beyond that and not just tell you it's a bucket with tokens :P The idea behind the bucket is this:
I'll keep a bucket for you. You control two things: how big the bucket is, and how fast I refill it with tokens. Once you've told me those two things, you can ask me for N tokens and I'll give you some if they are available.
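To make that concrete, here's a minimal in-memory sketch of the idea. The names (TokenBucket, capacity, refill_per_sec, take) are made up for illustration and are not the service's API, and it assumes an all-or-nothing take: you get all N tokens or none.

```python
import time


class TokenBucket:
    """Hypothetical in-memory bucket: you pick the size and the refill rate."""

    def __init__(self, capacity, refill_per_sec, clock=time.monotonic):
        self.capacity = capacity              # how big the bucket is
        self.refill_per_sec = refill_per_sec  # how fast it refills
        self.clock = clock                    # injectable so it can be tested
        self.tokens = float(capacity)         # assumption: the bucket starts full
        self.last = clock()

    def _refill(self):
        # Add tokens for the time that has passed, never exceeding capacity.
        now = self.clock()
        elapsed = now - self.last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_sec)
        self.last = now

    def take(self, n=1):
        """Remove n tokens and return True if they are available, else False."""
        self._refill()
        if self.tokens >= n:
            self.tokens -= n
            return True
        return False
```

A caller would do something like bucket = TokenBucket(10, 1.0) and then check bucket.take(3) before doing the work.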
So what can I do with a token bucket?
Rate limiting, concurrency, only-once behavior, you name it. First, a quick rate limit example. If you are only allowed to hit the Twitter API 180 times per 15 minutes, you tell me to put 180 tokens in the bucket and refill it at a rate of 720 per hour. Then every time you go to hit the API, you request 1 token, and if one isn't available, you wait.
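Here's roughly what that looks like as a local sketch, using the numbers from the example: 180 tokens, refilled at 720 per hour (0.2 per second). RateLimiter and wait_for_token are hypothetical names, and the bucket lives in process memory rather than in a shared service.

```python
import time

CAPACITY = 180               # bucket size: 180 calls
REFILL_PER_SEC = 720 / 3600  # 720 tokens per hour = 0.2 tokens per second


class RateLimiter:
    """Hypothetical local limiter: request 1 token per API call, wait if empty."""

    def __init__(self, capacity=CAPACITY, refill_per_sec=REFILL_PER_SEC,
                 clock=time.monotonic, sleep=time.sleep):
        self.capacity, self.rate = capacity, refill_per_sec
        self.clock, self.sleep = clock, sleep
        self.tokens, self.last = float(capacity), clock()

    def wait_for_token(self):
        """Block until one token is available, take it, return seconds waited."""
        waited = 0.0
        while True:
            now = self.clock()
            self.tokens = min(self.capacity,
                              self.tokens + (now - self.last) * self.rate)
            self.last = now
            if self.tokens >= 1:
                self.tokens -= 1
                return waited
            # Not enough yet: sleep just long enough for one token to refill.
            pause = (1 - self.tokens) / self.rate
            self.sleep(pause)
            waited += pause
```

You'd call limiter.wait_for_token() right before each API request. The first 180 calls go straight through, and after that each call waits about 5 seconds for the next token.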
What was that about concurrency limiting?
Say you've got 20 workers distributed across 10 different machines. You've got some sort of race condition you want to avoid, so what do you need? Well, you need a distributed lock. We can solve this with a token bucket if we add just one thing to the concept described above: you can return a token when you're done. So you tell me to put 1 token in the bucket and, if you never return what you've taken, to refill it after 30 minutes. Then you take a token. While you've got it, nobody else can have it, and when you're done you give it right back.
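As a sketch of the mechanics, here's a one-token bucket with a 30-minute lease. LeaseBucket, take, and give_back are hypothetical names, and this version lives in one process, so it only illustrates the bookkeeping; a real distributed lock needs the bucket held somewhere all 20 workers can reach. It also doesn't stop a worker whose lease has expired from giving back a token that someone else now holds.

```python
import time


class LeaseBucket:
    """Hypothetical one-token bucket used as a lock.

    A token that is never returned comes back after `lease_secs`.
    """

    def __init__(self, lease_secs=30 * 60, clock=time.monotonic):
        self.lease_secs = lease_secs
        self.clock = clock
        self.taken_at = None  # None means the token is sitting in the bucket

    def take(self):
        """Return True if the caller got the token, False if someone holds it."""
        now = self.clock()
        expired = (self.taken_at is not None
                   and now - self.taken_at >= self.lease_secs)
        if self.taken_at is None or expired:
            self.taken_at = now
            return True
        return False

    def give_back(self):
        """Return the token so the next worker can take it right away."""
        self.taken_at = None
```

A worker would wrap its critical section like this: if lock.take(), do the work inside a try block and call lock.give_back() in the finally clause.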
What about Exactly Once behavior?
Would you like to send an event to MixPanel or Amplitude only once? Do you have tables full of "user A has seen and dismissed the popup modal" junk? Ask for a bucket with one token that never refills and boom, you've got once-and-only-once behavior.
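A tiny sketch of the never-refilling bucket, one per key. OnceBuckets and send_once are hypothetical names, the state is kept in memory, and the send callback stands in for whatever actually talks to your analytics provider. Note the token is taken before sending, so if the send fails the event is never retried.

```python
class OnceBuckets:
    """Hypothetical set of one-token buckets that never refill, one per key."""

    def __init__(self):
        self._spent = set()  # keys whose single token has already been taken

    def take(self, key):
        """Return True the first time a key is seen, False every time after."""
        if key in self._spent:
            return False
        self._spent.add(key)
        return True


def send_once(buckets, key, send):
    """Call send() only if this key's token is still available."""
    if buckets.take(key):
        send()
        return True
    return False
```

For example, send_once(buckets, "user-A:popup-dismissed", track_event) fires track_event on the first call and does nothing on every later call with the same key.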
So why use a service?
Totally fair question. You can absolutely do all of this in your local Memcached, local Redis, local Consul or ZooKeeper. The reasons you might prefer it as a service are: