Sentinel Source Code, Part 5: How FlowSlot Borrows Guava's Rate-Limiting Algorithms (II)

Outline

1. RateLimiter usage examples from Guava

2. Overview and design of Guava's RateLimiter

3. Source code of SmoothBursty, a RateLimiter subclass

4. Source code of SmoothWarmingUp, a RateLimiter subclass

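3. Source code of SmoothBursty, a RateLimiter subclass

(1) How SmoothBursty is initialized

(2) Variable values after SmoothBursty is initialized

(3) acquire() and tryAcquire() in SmoothBursty

(4) Token generation rules in SmoothBursty

(5) How SmoothRateLimiter handles pre-borrowed tokens

(6) A SmoothBursty case scenario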

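(1) How SmoothBursty is initialized

The token bucket algorithm can absorb bursts of traffic, and "Bursty" means exactly that. SmoothBursty can only absorb a burst under one precondition: the bucket must already hold stored tokens, and those tokens are the ones saved up while traffic was low. If the system stays under sustained high traffic and the bucket is empty, a burst can only be let through at the fixed rate.

For this reason, taking stored tokens from the bucket costs nothing extra in SmoothBursty. As long as the bucket can cover the number of tokens a request asks for, the thread is not blocked, which is how bursts are absorbed. The number of stored tokens is capped, and the cap is set through the constructor.

First, the call new SmoothBursty(stopwatch, 1.0) hard-codes the bucket capacity to at most 1 second's worth of tokens. If permitsPerSecond = 10 is passed in, 10 tokens are generated per second, so the bucket stores at most 10 tokens.

Second, the key step in initializing SmoothBursty is RateLimiter's setRate() method. It calls SmoothRateLimiter's doSetRate() method, which first calls SmoothRateLimiter's resync() method and then calls SmoothBursty's doSetRate() to set maxPermits and storedPermits.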

@Beta
@GwtIncompatible
@SuppressWarnings("GoodTime")
public abstract class RateLimiter {
    ...
    //Creates a RateLimiter with the specified stable throughput, 
    //given as "permits per second" (commonly referred to as QPS, queries per second).
    //The returned RateLimiter ensures that on average no more than permitsPerSecond are issued during any given second, 
    //with sustained requests being smoothly spread over each second.
    //When the incoming request rate exceeds permitsPerSecond the rate limiter will release one permit every (1.0 / permitsPerSecond) seconds. 
    //When the rate limiter is unused, bursts of up to permitsPerSecond permits will be allowed, 
    //with subsequent requests being smoothly limited at the stable rate of permitsPerSecond.
    
    //Public factory method exposed to callers
    //@param permitsPerSecond the rate of the returned RateLimiter, measured in how many permits become available per second.
    public static RateLimiter create(double permitsPerSecond) {
        //The default RateLimiter configuration can save the unused permits of up to one second. 
        //This is to avoid unnecessary stalls in situations like this: 
        //A RateLimiter of 1qps, and 4 threads, all calling acquire() at these moments:
        //T0 at 0 seconds、T1 at 1.05 seconds、T2 at 2 seconds、T3 at 3 seconds
        //Due to the slight delay of T1, T2 would have to sleep till 2.05 seconds, and T3 would also have to sleep till 3.05 seconds.
        
        //Delegates to the factory overload that takes the rate plus a SleepingStopwatch.
        //The SleepingStopwatch measures elapsed time relative to the moment it was created,
        //and every later time reading is an offset from that starting point.
        return create(permitsPerSecond, SleepingStopwatch.createFromSystemTimer());
    }
    
    @VisibleForTesting
    static RateLimiter create(double permitsPerSecond, SleepingStopwatch stopwatch) {
        //Hard-codes the bucket capacity to at most 1 second's worth of permits
        RateLimiter rateLimiter = new SmoothBursty(stopwatch, 1.0 /* maxBurstSeconds */);
        //Call RateLimiter.setRate()
        rateLimiter.setRate(permitsPerSecond);
        return rateLimiter;
    }
    
    //Updates the stable rate of this RateLimiter, 
    //that is, the permitsPerSecond argument provided in the factory method that constructed the RateLimiter. 
    //Currently throttled threads will not be awakened as a result of this invocation, 
    //thus they do not observe the new rate; only subsequent requests will.
    //Note though that, since each request repays (by waiting, if necessary) the cost of the previous request, 
    //this means that the very next request after an invocation to setRate() will not be affected by the new rate; 
    //it will pay the cost of the previous request, which is in terms of the previous rate.
    //The behavior of the RateLimiter is not modified in any other way, 
    //e.g. if the RateLimiter was configured with a warmup period of 20 seconds, 
    //it still has a warmup period of 20 seconds after this method invocation.

    //@param permitsPerSecond the new stable rate of this {@code RateLimiter}
    public final void setRate(double permitsPerSecond) {
        checkArgument(permitsPerSecond > 0.0 && !Double.isNaN(permitsPerSecond), "rate must be positive");
        //Set the rate inside a synchronized block
        synchronized (mutex()) {
            //Call SmoothRateLimiter.doSetRate()
            doSetRate(permitsPerSecond, stopwatch.readMicros());
        }
    }
    ...
}

@GwtIncompatible
abstract class SmoothRateLimiter extends RateLimiter {
    //The currently stored permits.
    double storedPermits;
    //The maximum number of stored permits.
    double maxPermits;
    //The interval between two unit requests, at our stable rate.
    //E.g., a stable rate of 5 permits per second has a stable interval of 200ms.
    double stableIntervalMicros;
    //The time when the next request (no matter its size) will be granted. 
    //After granting a request, this is pushed further in the future. Large requests push this further than small requests.
    private long nextFreeTicketMicros = 0L;//could be either in the past or future
    ...
    //This method can be called repeatedly.
    //The first call behaves a little differently from later calls; its purpose is to set maxPermits (the cap) and storedPermits (the permits currently stored).
    @Override
    final void doSetRate(double permitsPerSecond, long nowMicros) {
        //Call SmoothRateLimiter.resync() to recompute and synchronize the stored permits.
        resync(nowMicros);
        //Compute the stable interval between permits, in microseconds. E.g., at 5 QPS one permit is issued every 200ms.
        double stableIntervalMicros = SECONDS.toMicros(1L) / permitsPerSecond;
        this.stableIntervalMicros = stableIntervalMicros;
        //Call SmoothBursty.doSetRate() to set maxPermits and storedPermits
        doSetRate(permitsPerSecond, stableIntervalMicros);
    }
    
    //Updates storedPermits and nextFreeTicketMicros based on the current time.
    //Note: resync() is first called while SmoothBursty is being initialized. At that point:
    //coolDownIntervalMicros() = 0 (stableIntervalMicros is not set yet), nextFreeTicketMicros = 0, newPermits = infinity,
    //maxPermits = 0 (initial value, not yet recomputed), so the result is storedPermits = min(0, infinity) = 0.
    //nextFreeTicketMicros is also set to nowMicros, the "start time".
    void resync(long nowMicros) {
        //if nextFreeTicket is in the past, resync to now
        if (nowMicros > nextFreeTicketMicros) {
            double newPermits = (nowMicros - nextFreeTicketMicros) / coolDownIntervalMicros();
            storedPermits = min(maxPermits, storedPermits + newPermits);
            nextFreeTicketMicros = nowMicros;
        }
    }
    
    abstract void doSetRate(double permitsPerSecond, double stableIntervalMicros);
    ...
    
    //This implements a "bursty" RateLimiter, where storedPermits are translated to zero throttling.
    //The maximum number of permits that can be saved (when the RateLimiter is unused) is defined in terms of time, 
    //in this sense: if a RateLimiter is 2qps, and this time is specified as 10 seconds, we can save up to 2 * 10 = 20 permits.
    //Put differently: spending stored permits never adds wait time in SmoothBursty (storedPermitsToWaitTime() returns 0).
    static final class SmoothBursty extends SmoothRateLimiter {
        //The work (permits) of how many seconds can be saved up if this RateLimiter is unused?
        final double maxBurstSeconds;
        SmoothBursty(SleepingStopwatch stopwatch, double maxBurstSeconds) {
            super(stopwatch);
            this.maxBurstSeconds = maxBurstSeconds;
        }
      
        @Override
        void doSetRate(double permitsPerSecond, double stableIntervalMicros) {
            //On the first call, oldMaxPermits = 0.0
            double oldMaxPermits = this.maxPermits;
            //The new maxPermits = burst window in seconds (1 second) * permits per second.
            maxPermits = maxBurstSeconds * permitsPerSecond;
            if (oldMaxPermits == Double.POSITIVE_INFINITY) {
                //if we don't special-case this, we would get storedPermits == NaN, below
                storedPermits = maxPermits;
            } else {
                //While SmoothBursty is being initialized, storedPermits is 0 at this point
                storedPermits = (oldMaxPermits == 0.0) ? 0.0 : storedPermits * maxPermits / oldMaxPermits;
            }
        }
        
        @Override
        long storedPermitsToWaitTime(double storedPermits, double permitsToTake) {
            return 0L;
        }
        
        @Override
        double coolDownIntervalMicros() {
            return stableIntervalMicros;
        }
    }
    ...
}

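(2) Variable values after SmoothBursty is initialized

Once the SmoothBursty RateLimiter has been built, its initial state is as follows:

Note 1: maxBurstSeconds is 1 second. By default the burst window passed in is 1 second.

Note 2: storedPermits is 0. No tokens are pre-allocated because the limiter is still in its initial state.

Note 3: stableIntervalMicros is the interval between two consecutive tokens, derived from the given QPS.

Note 4: maxPermits is the maximum number of tokens that may be stored (= burst window * tokens per second). The burst window is fixed at 1 second here, so one second's worth of tokens can be stored.

Note 5: nextFreeTicketMicros is the earliest time at which the next token can be granted. It is initialized to the "start time", that is, the stopwatch reading at the moment setRate() first runs.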
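
The initial values listed above can be checked with a small stand-alone model. This is a sketch, not Guava code: BurstyInitModel is a hypothetical class that copies only the arithmetic of SmoothRateLimiter.doSetRate(), resync() and SmoothBursty.doSetRate() as quoted in the listing, with the clock passed in explicitly as nowMicros and no locking.

```java
// Hypothetical, single-threaded model of SmoothBursty's initialization arithmetic.
// It mirrors the formulas quoted in the listing above; it is not the Guava class itself.
final class BurstyInitModel {
    double storedPermits;        // Java default 0.0
    double maxPermits;           // 0.0 until the first setRate()
    double stableIntervalMicros; // 0.0 until the first setRate()
    long nextFreeTicketMicros;   // 0
    final double maxBurstSeconds;

    BurstyInitModel(double maxBurstSeconds) {
        this.maxBurstSeconds = maxBurstSeconds;
    }

    void setRate(double permitsPerSecond, long nowMicros) {
        // resync(): on the very first call the divisor is 0.0, so newPermits is
        // +Infinity, but min(maxPermits, ...) with maxPermits == 0.0 keeps storedPermits at 0.0.
        if (nowMicros > nextFreeTicketMicros) {
            double newPermits = (nowMicros - nextFreeTicketMicros) / stableIntervalMicros;
            storedPermits = Math.min(maxPermits, storedPermits + newPermits);
            nextFreeTicketMicros = nowMicros;
        }
        stableIntervalMicros = 1_000_000.0 / permitsPerSecond;

        // SmoothBursty.doSetRate(): rescale the stored permits to the new capacity.
        double oldMaxPermits = maxPermits;
        maxPermits = maxBurstSeconds * permitsPerSecond;
        if (oldMaxPermits == Double.POSITIVE_INFINITY) {
            storedPermits = maxPermits;
        } else {
            storedPermits = (oldMaxPermits == 0.0) ? 0.0 : storedPermits * maxPermits / oldMaxPermits;
        }
    }

    public static void main(String[] args) {
        BurstyInitModel m = new BurstyInitModel(1.0);
        m.setRate(10.0, 37L); // first call, 37 microseconds after the stopwatch started
        System.out.println(m.storedPermits + " " + m.maxPermits + " "
                + m.stableIntervalMicros + " " + m.nextFreeTicketMicros);

        m.setRate(20.0, 1_000_037L); // second call, after one idle second
        System.out.println(m.storedPermits + " " + m.maxPermits);
    }
}
```

After the first call the model holds storedPermits = 0, maxPermits = 10, stableIntervalMicros = 100000 and nextFreeTicketMicros = 37. The second call shows why the first call is special: by then the bucket has filled to 10, and the rescaling branch turns a full bucket of 10 into a full bucket of 20.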

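(3) acquire() and tryAcquire() in SmoothBursty

I. How RateLimiter performs rate limiting

II. Analysis of SmoothBursty's acquire() method

III. Analysis of SmoothBursty's tryAcquire() method

I. How RateLimiter performs rate limiting

The rate-limiting process of RateLimiter can be broken into four steps:

Step 1: produce tokens

Step 2: take tokens

Step 3: compute the blocking time

Step 4: block the thread

Since RateLimiter is an abstraction, it captures what the rate-limiting process has in common, and the common part is the logic that blocks the thread. RateLimiter's acquire() method factors out thread blocking and leaves the details of producing tokens, taking tokens, and computing the blocking time to subclasses. The important fields of its subclass SmoothRateLimiter are: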

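Field 1: nextFreeTicketMicros

The time at which the next request will be granted. When there are not enough tokens, the thread handling the current request lazily computes how many tokens must still be generated and how long that takes. Even so, the current thread does not block for them; it borrows the tokens in advance, and the cost of that advance is passed on to the next request. The goal is to reduce thread blocking.

Field 2: stableIntervalMicros

The number of microseconds needed to produce one token, derived from the permitsPerSecond passed to the factory method.

Field 3: maxPermits

The maximum number of tokens the bucket may hold.

Field 4: storedPermits

The number of unconsumed tokens currently cached in the bucket. When tokens are consumed more slowly than they are produced, they start piling up in the bucket, but storedPermits never exceeds maxPermits.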

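II. Analysis of SmoothBursty's acquire() method

acquire() in SmoothBursty synchronizes on the limiter's mutex object. Serializing concurrent requests with the synchronized lock is what makes the rate limit hold. The details of SmoothRateLimiter's reserveEarliestAvailable() method are as follows:

Note 1: this method produces tokens, takes tokens, and computes the blocking time

The total blocking time is split into two parts. The first is the cost of taking storedPermitsToSpend existing tokens from the bucket; the second is the cost of waiting for freshPermits new tokens to be generated.

The cost of generating fresh tokens is the same for every subclass; only the cost of taking existing tokens differs. That is why storedPermitsToWaitTime(), the abstract method for the wait incurred by taking tokens from the bucket, is implemented by the SmoothRateLimiter subclasses. In the SmoothBursty subclass it returns 0, meaning no wait.

Note 2: the blocking cost of taking tokens is passed on to the next request

If handling the current request would require waiting, that wait is borne by the next request. The goal is to reduce thread blocking: the arrival time of the next request is unknown and it may come much later, by which time enough fresh tokens have been generated to satisfy it without blocking.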

@Beta
@GwtIncompatible
@SuppressWarnings("GoodTime")
public abstract class RateLimiter {
    ...
    //Acquire that waits as long as necessary
    //Acquires the given number of permits from this RateLimiter, 
    //blocking until the request can be granted. 
    //Tells the amount of time slept, if any.
    //@param permits the number of permits to acquire
    //@return time spent sleeping to enforce rate, in seconds; 0.0 if not rate-limited
    @CanIgnoreReturnValue
    public double acquire(int permits) {
        //Call RateLimiter.reserve():
        //reserve the requested number of permits and get back how long the caller must wait
        long microsToWait = reserve(permits);
        //Sleep for microsToWait so the rate limit is honored (this sleeping logic is the shared part)
        stopwatch.sleepMicrosUninterruptibly(microsToWait);
        //Return the time spent sleeping, in seconds
        return 1.0 * microsToWait / SECONDS.toMicros(1L);
    }
        
    //Reserves the given number of permits from this RateLimiter for future use, 
    //returning the number of microseconds until the reservation can be consumed.
    //@return time in microseconds to wait until the resource can be acquired, never negative
    final long reserve(int permits) {
        checkPermits(permits);
        //Concurrent callers must be serialized, hence the synchronized block
        synchronized (mutex()) {
            //Call RateLimiter.reserveAndGetWaitLength()
            return reserveAndGetWaitLength(permits, stopwatch.readMicros());
        }
    }
    
    //Reserves next ticket and returns the wait time that the caller must wait for.
    final long reserveAndGetWaitLength(int permits, long nowMicros) {
        //Call SmoothRateLimiter.reserveEarliestAvailable()
        long momentAvailable = reserveEarliestAvailable(permits, nowMicros);
        return max(momentAvailable - nowMicros, 0);
    }
    
    //Reserves the requested number of permits and returns the time that those permits can be used (with one caveat).
    //Producing permits, taking permits, and computing the wait time are all left to subclasses
    //@return the time that the permits may be used, or, if the permits may be used immediately, an arbitrary past or present time
    abstract long reserveEarliestAvailable(int permits, long nowMicros);
    ...
}

@GwtIncompatible
abstract class SmoothRateLimiter extends RateLimiter {
    //The currently stored permits. 
    double storedPermits;
    //The maximum number of stored permits.
    double maxPermits;
    //The interval between two unit requests, at our stable rate.
    //E.g., a stable rate of 5 permits per second has a stable interval of 200ms.
    double stableIntervalMicros;
    //The time when the next request (no matter its size) will be granted. 
    //After granting a request, this is pushed further in the future. Large requests push this further than small requests.
    private long nextFreeTicketMicros = 0L;//could be either in the past or future
    ...
    @Override
    final long reserveEarliestAvailable(int requiredPermits, long nowMicros) {
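        //1. Use nextFreeTicketMicros to compute newly generated permits and update the unused permits in storedPermits.
        //When acquiring, the call to SmoothRateLimiter.resync() behaves differently from the call made during initialization:
        //permits that have "not been used yet" are stored here.
        //If nextFreeTicketMicros lies in the future, resync() does nothing.
        resync(nowMicros);
        //The time when the next request (no matter its size) will be granted; this value is what the method returns
        long returnValue = nextFreeTicketMicros;
        
        //2. Compute the wait time
        //2.1. Take unconsumed permits from the bucket first; if there are not enough, take as many as are available
        //Number of stored permits that can be spent
        double storedPermitsToSpend = min(requiredPermits, this.storedPermits);
        //2.2. Work out whether fresh permits are needed (they are when the bucket cannot cover the request) and, if so, how many
        //Permits that must be waited for: the fresh permits
        double freshPermits = requiredPermits - storedPermitsToSpend;
        //Compute the wait time in two parts:
        //waitMicros = cost of taking storedPermitsToSpend stored permits + cost of waiting for freshPermits fresh permits
        //Taking stored permits can also have a cost; storedPermitsToWaitTime() is abstract and implemented by SmoothBursty and SmoothWarmingUp
        //For SmoothBursty, storedPermitsToWaitTime() returns 0: stored permits require no waiting.
        //The cost of the fresh permits is freshPermits * stableIntervalMicros (the time per permit)
        long waitMicros = storedPermitsToWaitTime(this.storedPermits, storedPermitsToSpend) + (long) (freshPermits * stableIntervalMicros);
        
        //3. Update nextFreeTicketMicros
        //Fresh permits may have been consumed in advance, so nextFreeTicketMicros moves later to mark that span of time as already spent
        this.nextFreeTicketMicros = LongMath.saturatedAdd(nextFreeTicketMicros, waitMicros);
        
        //4. Deduct permits and update what is left in the bucket
        //Subtract the spendable amount computed above from the stored permits
        this.storedPermits -= storedPermitsToSpend;
        //Return the moment at which the permits can be used (a point in time, not a duration)
        //Note that returnValue holds the previous nextFreeTicketMicros, so the cost of this request is paid by the next request
        return returnValue;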
    }
    
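    //Updates storedPermits and nextFreeTicketMicros based on the current time.
    //Computes the permits generated between nextFreeTicketMicros and now; this is the lazy computation
    void resync(long nowMicros) {
        //if nextFreeTicket is in the past, resync to now
        //Usually the current time is later than the time the next request would be granted.
        //In that case the elapsed time is converted into permits and stored, capped at maxPermits.
        //After the RateLimiter is initialized there may be no traffic at first, or traffic may arrive after an idle stretch;
        //up to one second's worth of permits can have been saved by then, which is the burst capacity mentioned here.
        
        //If nextFreeTicketMicros is in the future, the if condition below is false,
        //and storedPermits and nextFreeTicketMicros are left unchanged.
        //This happens when permits have been "borrowed in advance".
        if (nowMicros > nextFreeTicketMicros) {
            //Elapsed time divided by the time needed for one fresh permit; coolDownIntervalMicros() is abstract and implemented by subclasses
            double newPermits = (nowMicros - nextFreeTicketMicros) / coolDownIntervalMicros();
            //Update the stored permits, capped at the maximum
            storedPermits = min(maxPermits, storedPermits + newPermits);
            //Move nextFreeTicketMicros to the current time
            nextFreeTicketMicros = nowMicros;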
        }
    }
    
    //Translates a specified portion of our currently stored permits which we want to spend/acquire, into a throttling time.
    //Conceptually, this evaluates the integral of the underlying function we use, for the range of [(storedPermits - permitsToTake), storedPermits].
    //This always holds: 0 <= permitsToTake <= storedPermits
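    //The cost of taking stored permits out of the bucket, implemented by subclasses.
    //SmoothBursty's implementation simply returns 0: permits that are already stored require no wait when acquired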
    abstract long storedPermitsToWaitTime(double storedPermits, double permitsToTake);
    
    //Returns the number of microseconds during cool down that we have to wait to get a new permit.
    //The time needed to generate one fresh permit, implemented by subclasses
    abstract double coolDownIntervalMicros();
    ...
    
    static final class SmoothBursty extends SmoothRateLimiter {
        ...
        @Override
        long storedPermitsToWaitTime(double storedPermits, double permitsToTake) {
            return 0L;
        }
        
        @Override
        double coolDownIntervalMicros() {
            return stableIntervalMicros;
        }
    }
    ...
}
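
The way the cost of a request is handed to the next caller is easier to see with concrete numbers. The sketch below is a hypothetical single-threaded model (BurstyReserveModel is not a Guava class): it reuses the arithmetic of resync() and reserveEarliestAvailable() shown above, takes the clock as a nowMicros argument, returns the wait instead of sleeping, and uses plain addition where Guava uses LongMath.saturatedAdd().

```java
// Hypothetical, single-threaded model of the SmoothBursty reservation logic.
// The clock is passed in as nowMicros, and nothing actually sleeps.
final class BurstyReserveModel {
    final double stableIntervalMicros;
    final double maxPermits;
    double storedPermits = 0.0;
    long nextFreeTicketMicros;

    BurstyReserveModel(double permitsPerSecond, long nowMicros) {
        this.stableIntervalMicros = 1_000_000.0 / permitsPerSecond;
        this.maxPermits = 1.0 * permitsPerSecond; // maxBurstSeconds = 1.0
        this.nextFreeTicketMicros = nowMicros;
    }

    // Same rule as resync(): convert idle time into stored permits, capped at maxPermits.
    void resync(long nowMicros) {
        if (nowMicros > nextFreeTicketMicros) {
            double newPermits = (nowMicros - nextFreeTicketMicros) / stableIntervalMicros;
            storedPermits = Math.min(maxPermits, storedPermits + newPermits);
            nextFreeTicketMicros = nowMicros;
        }
    }

    // Returns how long the caller would have to sleep, in microseconds.
    long reserve(int permits, long nowMicros) {
        resync(nowMicros);
        long momentAvailable = nextFreeTicketMicros;  // what THIS caller waits for
        double storedPermitsToSpend = Math.min(permits, storedPermits);
        double freshPermits = permits - storedPermitsToSpend;
        long waitMicros = (long) (freshPermits * stableIntervalMicros); // stored permits cost 0
        nextFreeTicketMicros += waitMicros;           // what the NEXT caller waits for
        storedPermits -= storedPermitsToSpend;
        return Math.max(momentAvailable - nowMicros, 0L);
    }

    public static void main(String[] args) {
        BurstyReserveModel limiter = new BurstyReserveModel(2.0, 0L); // 2 permits per second
        System.out.println(limiter.reserve(3, 0L));         // 0: borrows 3 permits, no wait
        System.out.println(limiter.reserve(1, 100_000L));   // 1400000: pays for the 3 borrowed permits
        System.out.println(limiter.reserve(1, 5_000_000L)); // 0: served from stored permits after idling
    }
}
```

The first request takes three permits from an empty bucket and still returns immediately; the 1.5 seconds it owes show up as the wait of the second request. By the time of the third request the debt is long paid and the bucket has refilled to its cap of 2, so it is served without waiting.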

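III. Analysis of SmoothBursty's tryAcquire() method

tryAcquire() adds one check on top of acquire(): if current time + timeout >= nextFreeTicketMicros, the call goes on to reserve tokens and wait for them; otherwise it returns false immediately.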

@Beta
@GwtIncompatible
@SuppressWarnings("GoodTime")
public abstract class RateLimiter {
    ...    
    //Acquire with a timeout
    //@param permits the number of permits to acquire
    //@param timeout the maximum time to wait for the permits. Negative values are treated as zero.
    //@param unit the time unit of the timeout argument
    //@return true if the permits were acquired, false otherwise
    public boolean tryAcquire(int permits, long timeout, TimeUnit unit) {
        long timeoutMicros = max(unit.toMicros(timeout), 0);
        checkPermits(permits);
        long microsToWait;
        synchronized (mutex()) {
            long nowMicros = stopwatch.readMicros();
            //Call RateLimiter.canAcquire() to check whether the wait would exceed the timeout
            if (!canAcquire(nowMicros, timeoutMicros)) {
                return false;
            } else {
                microsToWait = reserveAndGetWaitLength(permits, nowMicros);
            }
        }
        stopwatch.sleepMicrosUninterruptibly(microsToWait);
        return true;
    }
    
    private boolean canAcquire(long nowMicros, long timeoutMicros) {
        //SmoothRateLimiter.queryEarliestAvailable() returns nextFreeTicketMicros
        //If nowMicros + timeoutMicros >= nextFreeTicketMicros, the permits become available within the timeout, so the caller may go on to wait
        return queryEarliestAvailable(nowMicros) - timeoutMicros <= nowMicros;
    }
    
    //Returns the earliest time that permits are available (with one caveat).
    //@return the time that permits are available, or, if permits are available immediately, an arbitrary past or present time
    abstract long queryEarliestAvailable(long nowMicros);
    ...
}

@GwtIncompatible
abstract class SmoothRateLimiter extends RateLimiter {
    //The time when the next request (no matter its size) will be granted. 
    //After granting a request, this is pushed further in the future. Large requests push this further than small requests.
    private long nextFreeTicketMicros = 0L;//could be either in the past or future
    ...
    @Override
    final long queryEarliestAvailable(long nowMicros) {
        return nextFreeTicketMicros;
    }
    ...
}
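
The timeout check can be isolated into one comparison. The sketch below is a hypothetical helper (TryAcquireCheck is not part of Guava) that takes nextFreeTicketMicros as an explicit argument in place of queryEarliestAvailable(); the times used in main() are made-up values.

```java
// Hypothetical helper that isolates the timeout check used by tryAcquire().
final class TryAcquireCheck {
    // Same comparison as RateLimiter.canAcquire(), with queryEarliestAvailable()
    // replaced by an explicit nextFreeTicketMicros argument.
    static boolean canAcquire(long nextFreeTicketMicros, long nowMicros, long timeoutMicros) {
        return nextFreeTicketMicros - timeoutMicros <= nowMicros;
    }

    public static void main(String[] args) {
        long nextFree = 1_500_000L; // a previous request borrowed ahead up to t = 1.5s
        long now = 100_000L;        // the current request arrives at t = 0.1s

        System.out.println(canAcquire(nextFree, now, 1_000_000L)); // false: 1s is too short
        System.out.println(canAcquire(nextFree, now, 1_400_000L)); // true: exactly enough
        System.out.println(canAcquire(nextFree, now, 0L));         // false: no waiting allowed
        System.out.println(canAcquire(50_000L, now, 0L));          // true: nothing is owed
    }
}
```

Note that the number of requested permits does not appear in the check. Because the cost of a request is charged to the next caller, only the debt left by earlier requests decides whether tryAcquire() succeeds.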

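(4) Token generation rules in SmoothBursty

The following diagram illustrates how SmoothBursty consumes tokens:

Cyan marks tokens (permits) already stored; purple marks tokens (permits) produced when the timeline is resynchronized.

I. Time t1

At time t1 the request asks for permits_01 tokens. Some tokens are already stored (cyan), but not enough. Resynchronizing the timeline produces some new tokens (purple). The available tokens are recomputed as cyan plus purple; if the sum exceeds one burst window's worth, it is capped at maxPermits. After this request is served some tokens remain, shown as the cyan part at time t2, and by t2 nextFreeTicketMicros has already been set to t1.

II. Time t2

At time t2 the request asks for permits_02 tokens, and the stored tokens (cyan) are not enough. Resynchronizing the timeline produces the purple tokens, but the total is still short. The shortfall, freshPermits, is converted into a waiting interval, and nextFreeTicketMicros is moved to a point after t2; in other words, tokens have been borrowed in advance. If tryAcquire() is called before nextFreeTicketMicros arrives and the given timeout is too short, tryAcquire() returns false.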

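(5) How SmoothRateLimiter handles pre-borrowed tokens

"Smooth" means steady and even; "Bursty" means sudden, arriving in bursts.

A RateLimiter of the SmoothRateLimiter family, here the SmoothBursty built by the default factory method, can store at most 1 second's worth of unconsumed tokens to absorb bursts.

A further note on nextFreeTicketMicros, "the time the next request will be granted": in the usual case it is a value in the past, and tokens can then be accumulated. Right after initialization it equals nowMicros, so at that moment no burst can be absorbed from stored tokens.

At any moment SmoothRateLimiter can borrow tokens from future time. The time cost of the borrowed tokens is not paid by the current consumer but by the next one. See SmoothRateLimiter.reserveEarliestAvailable(): the current request does not wait; the next request does.

Every call to SmoothRateLimiter's reserveEarliestAvailable() first calls resync() to update nextFreeTicketMicros. If tokens have been borrowed in advance, resync() does not update it, because nextFreeTicketMicros is then in the future and greater than nowMicros. reserveEarliestAvailable() finally returns the value nextFreeTicketMicros had right after resync(), and the thread handling the current request sleeps until that point in time.

Because reserveEarliestAvailable() calls resync() first, as long as the previous request did not borrow ahead (nextFreeTicketMicros is smaller than nowMicros when the current request is handled), the current request does not wait even if the bucket cannot cover it: it borrows enough tokens from future time and returns immediately.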

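(6) A SmoothBursty case scenario

Suppose the program has finished initializing and then sees no traffic for a long time. An acquire(200) request now arrives. However many tokens are stored, the request can consume them, and if they are not enough it borrows tokens that will be produced later, without waiting.

The next request, acquire(250), may not be so lucky: it may have to wait for the tokens the previous request borrowed to be generated. For example, if the previous request borrowed 50 future tokens, the current request must first wait for the time it takes to generate those 50 tokens. If it still needs 100 more tokens after that wait, it borrows those 100 from future time in turn. Subsequent requests are handled the same way.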
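
The scenario can be replayed with simulated time. The sketch below is a hypothetical helper (BurstyScenario is not a Guava class) that applies the SmoothBursty rules to a list of requests and reports each request's wait. The rate of 100 permits per second, the arrival times and the third request are assumptions chosen to give round numbers; they differ from the 50 and 100 tokens used in the text above.

```java
// Hypothetical replay of a request sequence against the SmoothBursty rules.
// Time is simulated; the returned array holds each request's wait in microseconds.
final class BurstyScenario {
    static long[] replay(double permitsPerSecond, long[] arrivalMicros, int[] permits) {
        double stableIntervalMicros = 1_000_000.0 / permitsPerSecond;
        double maxPermits = permitsPerSecond;   // maxBurstSeconds = 1.0
        double storedPermits = 0.0;
        long nextFreeTicketMicros = 0L;         // limiter created at t = 0
        long[] waits = new long[permits.length];

        for (int i = 0; i < permits.length; i++) {
            long now = arrivalMicros[i];
            if (now > nextFreeTicketMicros) {   // resync(): bank idle time as permits
                double newPermits = (now - nextFreeTicketMicros) / stableIntervalMicros;
                storedPermits = Math.min(maxPermits, storedPermits + newPermits);
                nextFreeTicketMicros = now;
            }
            waits[i] = Math.max(nextFreeTicketMicros - now, 0L); // debt left by earlier requests
            double spend = Math.min(permits[i], storedPermits);
            double fresh = permits[i] - spend;
            nextFreeTicketMicros += (long) (fresh * stableIntervalMicros); // debt for later requests
            storedPermits -= spend;
        }
        return waits;
    }

    public static void main(String[] args) {
        // 100 permits per second; idle for 60s, then acquire(200), acquire(250), acquire(1).
        long[] arrivals = {60_000_000L, 60_200_000L, 60_300_000L};
        int[] permits = {200, 250, 1};
        long[] waits = replay(100.0, arrivals, permits);
        System.out.println(java.util.Arrays.toString(waits)); // [0, 800000, 3200000]
    }
}
```

With these numbers acquire(200) spends the 100 stored tokens, borrows another 100 (1 second of generation) and does not wait. acquire(250) arrives 0.2 seconds later and waits the remaining 0.8 seconds of that debt, then pushes its own 2.5 seconds onto the third request, which waits 3.2 seconds for a single token.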

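4. Source code of SmoothWarmingUp, a RateLimiter subclass

(1) Introduction to SmoothWarmingUp

(2) Initializing SmoothWarmingUp

(3) How maxPermits is computed in SmoothWarmingUp

(4) Acquiring tokens in SmoothWarmingUp

(5) SmoothBursty versus SmoothWarmingUp

(1) Introduction to SmoothWarmingUp

SmoothWarmingUp supports warm-up, which matters for a cold system. When traffic is low the system cools down: thread pools release surplus threads, connection pools release surplus connections, caches expire, and so on. If full-load or even burst traffic is let into a cold system, the pressure on it spikes.

A spike in pressure easily causes failures, and this is the weakness of SmoothBursty. In SmoothBursty, low traffic lets the bucket fill with stored tokens; when full-load or burst traffic then arrives, SmoothBursty lets it through and puts the system under heavy pressure. So admitting traffic cannot be decided simply by whether the bucket holds stored tokens; how cold or warm the system is must be taken into account.

Put simply: the lower the traffic, the more tokens pile up in the bucket (generation outpaces consumption) and the colder the system is, so the rate at which tokens are issued should be lower. That is what achieves the warm-up.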

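The variables in the figure above mean the following:

Variable 1: coldIntervalMicros

The time needed per token when the system is at its coldest; the per-token cost is at its maximum here.

Variable 2: stableIntervalMicros

The number of microseconds needed to generate one token in the stable phase, derived from the permitsPerSecond passed to the factory method.

Variable 3: maxPermits

The maximum number of tokens the bucket may hold.

Variable 4: storedPermits

The number of unconsumed tokens currently cached in the bucket. When tokens are consumed more slowly than they are produced, they pile up in the bucket, but never beyond maxPermits. The larger this value, the colder the system.

Variable 5: thresholdPermits

The threshold number of tokens at which the warm-up phase begins. When storedPermits is above it, the system has cooled down and is in the warm-up phase, where the time per token increases. When storedPermits is below it, the system is warm and tokens are issued at the normal rate. thresholdPermits is half the warm-up period divided by the stable interval, that is, 0.5 * warmupPeriodMicros / stableIntervalMicros. If it is too small, the warm-up phase starts too early and hurts performance; if it is too large, the system comes under pressure and the warm-up has little effect.

In the figure above, the horizontal axis is the number of tokens currently stored in the bucket and the vertical axis is the time needed per token. When the stored tokens exceed the threshold thresholdPermits, the system enters the warm-up phase and the per-token time on the vertical axis rises. When the stored tokens reach the cap maxPermits, the system is at its coldest and the per-token time is at its longest, which achieves the warm-up.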
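
The curve described above can be written as a small function. This is a sketch under stated assumptions: WarmupCurve is a hypothetical class, the parameter values in main() are made up, and the function only reproduces the shape of the curve (flat up to thresholdPermits, then linear up to coldIntervalMicros at maxPermits). It does not show how Guava converts the curve into an actual wait time.

```java
// Hypothetical sketch of the curve described above: time per token as a function
// of the number of stored tokens. The parameter values used in main() are made up.
final class WarmupCurve {
    static double intervalMicros(double storedPermits, double thresholdPermits, double maxPermits,
                                 double stableIntervalMicros, double coldIntervalMicros) {
        if (storedPermits <= thresholdPermits) {
            return stableIntervalMicros; // warm system: flat part of the curve
        }
        // Warm-up phase: interpolate linearly between the stable and the cold interval.
        double fraction = (storedPermits - thresholdPermits) / (maxPermits - thresholdPermits);
        return stableIntervalMicros + fraction * (coldIntervalMicros - stableIntervalMicros);
    }

    public static void main(String[] args) {
        double threshold = 20, max = 40, stable = 100_000, cold = 300_000;
        for (double stored : new double[] {10, 20, 30, 40}) {
            System.out.println(stored + " stored -> "
                    + intervalMicros(stored, threshold, max, stable, cold) + " us per token");
        }
    }
}
```

With these made-up values the per-token time stays at 100000 microseconds for 10 and 20 stored tokens, rises to 200000 at 30, and reaches the cold interval of 300000 at the cap of 40.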

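(2) Initializing SmoothWarmingUp

RateLimiter's create() method that builds a SmoothWarmingUp takes the following parameters:

Parameter 1: permitsPerSecond

The rate in the stable phase, that is, the number of tokens generated per second once warmed up.

Parameter 2: warmupPeriod

The warm-up time.

Parameter 3: unit

The time unit of warmupPeriod.

Parameter 4: coldFactor

The cold factor, fixed at 3.0 here; it determines the value of coldIntervalMicros.

Parameter 5: stopwatch

Think of it as a timer that records timing information for the limiter; token production and consumption are computed from it.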

@Beta
@GwtIncompatible
@SuppressWarnings("GoodTime")
public abstract class RateLimiter {
    ...
    //Creates a RateLimiter with the specified stable throughput, 
    //given as "permits per second" (commonly referred to as QPS, queries per second), 
    //and a warmup period, during which the RateLimiter smoothly ramps up its rate, 
    //until it reaches its maximum rate at the end of the period (as long as there are enough requests to saturate it). 
    //Similarly, if the RateLimiter is left unused for a duration of warmupPeriod, 
    //it will gradually return to its "cold" state, 
    //i.e. it will go through the same warming up process as when it was first created.
    
    //The returned RateLimiter is intended for cases where the resource that actually fulfills the requests (e.g., a remote server) needs "warmup" time, 
    //rather than being immediately accessed at the stable (maximum) rate.
    //The returned RateLimiter starts in a "cold" state (i.e. the warmup period will follow), 
    //and if it is left unused for long enough, it will return to that state.
    

    //@param permitsPerSecond the rate of the returned RateLimiter, measured in how many permits become available per second
    //@param warmupPeriod the duration of the period where the RateLimiter ramps up its rate, before reaching its stable (maximum) rate
    //@param unit the time unit of the warmupPeriod argument
    public static RateLimiter create(double permitsPerSecond, long warmupPeriod, TimeUnit unit) {
        checkArgument(warmupPeriod >= 0, "warmupPeriod must not be negative: %s", warmupPeriod);
        return create(permitsPerSecond, warmupPeriod, unit, 3.0, SleepingStopwatch.createFromSystemTimer());
    }
    
    @VisibleForTesting
    static RateLimiter create(double permitsPerSecond, long warmupPeriod, TimeUnit unit, double coldFactor, SleepingStopwatch stopwatch) {
        RateLimiter rateLimiter = new SmoothWarmingUp(stopwatch, warmupPeriod, unit, coldFactor);
        //Call RateLimiter.setRate()
        rateLimiter.setRate(permitsPerSecond);
        return rateLimiter;
    }
    
    //Updates the stable rate of this RateLimiter, 
    //that is, the permitsPerSecond argument provided in the factory method that constructed the RateLimiter. 
    //Currently throttled threads will not be awakened as a result of this invocation, 
    //thus they do not observe the new rate; only subsequent requests will.
    //Note though that, since each request repays (by waiting, if necessary) the cost of the previous request, 
    //this means that the very next request after an invocation to setRate() will not be affected by the new rate; 
    //it will pay the cost of the previous request, which is in terms of the previous rate.
    //The behavior of the RateLimiter is not modified in any other way, 
    //e.g. if the RateLimiter was configured with a warmup period of 20 seconds, 
    //it still has a warmup period of 20 seconds after this method invocation.

    //@param permitsPerSecond the new stable rate of this {@code RateLimiter}
    public final void setRate(double permitsPerSecond) {
        checkArgument(permitsPerSecond > 0.0 && !Double.isNaN(permitsPerSecond), "rate must be positive");
        //Set the rate inside a synchronized block
        synchronized (mutex()) {
            //Call SmoothRateLimiter.doSetRate()
            doSetRate(permitsPerSecond, stopwatch.readMicros());
        }
    }
    ...
}

@GwtIncompatible
abstract class SmoothRateLimiter extends RateLimiter {
    //The currently stored permits. 
    double storedPermits;
    //The maximum number of stored permits.
    double maxPermits;
    //The interval between two unit requests, at our stable rate.
    //E.g., a stable rate of 5 permits per second has a stable interval of 200ms.
    double stableIntervalMicros;
    //The time when the next request (no matter its size) will be granted. 
    //After granting a request, this is pushed further in the future. Large requests push this further than small requests.
    private long nextFreeTicketMicros = 0L;//could be either in the past or future
    ...
    
    //This method can be called repeatedly.
    //The first call behaves a little differently from later calls; its purpose is to set a new rate.
    @Override
    final void doSetRate(double permitsPerSecond, long nowMicros) {
        //Call SmoothRateLimiter.resync() to recompute and synchronize the stored permits.
        resync(nowMicros);
        //Compute the stable interval between permits, in microseconds. E.g., at 5 QPS one permit is issued every 200ms, i.e. 200,000us.
        double stableIntervalMicros = SECONDS.toMicros(1L) / permitsPerSecond;
        this.stableIntervalMicros = stableIntervalMicros;
        //Call SmoothWarmingUp.doSetRate() to set its internal parameters.
        doSetRate(permitsPerSecond, stableIntervalMicros);
    }
    
    //Updates storedPermits and nextFreeTicketMicros based on the current time.
    //Note: resync() is first called while the limiter is being initialized. At that point:
    //nextFreeTicketMicros = 0 and maxPermits = 0 (initial value, not yet recomputed),
    //so storedPermits is capped by maxPermits and stays 0 after this call.
    //nextFreeTicketMicros is also set to nowMicros, the "start time".
    void resync(long nowMicros) {
        //if nextFreeTicket is in the past, resync to now
        if (nowMicros > nextFreeTicketMicros) {
            double newPermits = (nowMicros - nextFreeTicketMicros) / coolDownIntervalMicros();
            storedPermits = min(maxPermits, storedPermits + newPermits);
            nextFreeTicketMicros = nowMicros;
        }
    }
    
    abstract void doSetRate(double permitsPerSecond, double stableIntervalMicros);
    ...
    
    static final class SmoothWarmingUp extends SmoothRateLimiter {
        private final long warmupPeriodMicros;
        //The slope of the line from the stable interval (when permits == 0), to the cold interval (when permits == maxPermits)
        private double slope;//slope of the warm-up line
        private double thresholdPermits;
        private double coldFactor;

        SmoothWarmingUp(SleepingStopwatch stopwatch, long warmupPeriod, TimeUnit timeUnit, double coldFactor) {
            super(stopwatch);
            //Convert warmupPeriod to microseconds and store it in warmupPeriodMicros
            this.warmupPeriodMicros = timeUnit.toMicros(warmupPeriod);
            this.coldFactor = coldFactor;
        }

        @Override
        void doSetRate(double permitsPerSecond, double stableIntervalMicros) {
            double oldMaxPermits = maxPermits;
            //stableIntervalMicros此时已由前面的SmoothRateLimiter.doSetRate()方法设为：1/qps
            //coldFactor的值默认会初始化为3
            //因此系统最冷时的令牌生成间隔：coldIntervalMicros等于3倍的普通间隔stableIntervalMicros
            double coldIntervalMicros = stableIntervalMicros * coldFactor;
            //warmupPeriodMicros是用户传入的预热时间
            //stableIntervalMicros是稳定期间令牌发放的间隔
            //进入预热阶段的临界令牌数thresholdPermits，默认就是：整个预热时间除以正常速率的一半
            //该值太小会过早进入预热阶段，影响性能；该值太大会对系统产生压力，没达到预热效果
            thresholdPermits = 0.5 * warmupPeriodMicros / stableIntervalMicros;
            //最大令牌数
            maxPermits = thresholdPermits + 2.0 * warmupPeriodMicros / (stableIntervalMicros + coldIntervalMicros);
            //斜率
            slope = (coldIntervalMicros - stableIntervalMicros) / (maxPermits - thresholdPermits);
            //设置当前桶内的存储令牌数
            //突发型的RateLimiter——SmoothBursty：
            //初始化时不会预生成令牌，因为storedPermits初始为0；
            //随着时间推移，则会产生新的令牌，这些令牌如果没有被消费，则会存储在storedPermits里；
            //预热型的RateLimiter——SmoothWarmingUp：
            //初始化时会预生成令牌，并且初始化时肯定是系统最冷的时候，所以桶内默认就是maxPermits
            if (oldMaxPermits == Double.POSITIVE_INFINITY) {
                //if we don't special-case this, we would get storedPermits == NaN, below
                storedPermits = 0.0;
            } else {
                //For SmoothWarmingUp the initial storedPermits is full, i.e. equal to maxPermits (the oldMaxPermits == 0.0 branch),
                //whereas for the bursty limiter SmoothBursty the initial storedPermits is 0
                storedPermits = (oldMaxPermits == 0.0) ? maxPermits : storedPermits * maxPermits / oldMaxPermits;
            }
        }
        ...
    }
    ...
}

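SmoothWarmingUp is initialized when the system is at its coldest, so the bucket starts with storedPermits equal to maxPermits. The doSetRate() method of SmoothWarmingUp involves the following variables:

Variable 1: stableIntervalMicros

The interval between two permits in the stable phase, i.e. 1 / QPS expressed in microseconds.

stableIntervalMicros = SECONDS.toMicros(1L) / permitsPerSecond

Variable 2: warmupPeriodMicros

The total warm-up time warmupPeriod passed to the constructor, converted to microseconds. Working in microseconds keeps the timing calculations precise.

Variable 3: coldIntervalMicros

The interval between two permits when the system is at its coldest.

coldIntervalMicros = stableIntervalMicros * coldFactor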

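Variable 4: thresholdPermits

The stored-permit count at which the limiter enters the warm-up phase.

thresholdPermits = 0.5 * warmupPeriodMicros / stableIntervalMicros

Variable 5: maxPermits

The maximum number of permits the bucket can hold.

maxPermits = permits of the stable phase + permits of the warm-up phase

The stable-phase permits are thresholdPermits. The warm-up phase lasts warmupPeriodMicros in total, and the interval changes linearly between stableIntervalMicros and coldIntervalMicros, so the average interval per permit during warm-up is:

(stableIntervalMicros + coldIntervalMicros) / 2

Dividing the warm-up time by that average interval gives the number of warm-up permits:

2.0 * warmupPeriodMicros / (stableIntervalMicros + coldIntervalMicros)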

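Variable 6: slope

The slope (gradient) of the warm-up line, in microseconds per permit. The interval changes at a constant rate during warm-up: each additional stored permit above thresholdPermits raises the interval by slope microseconds, so as warm-up permits are consumed, each permit costs slope microseconds less than the one before it. The interval at the start of warm-up (bucket full) is coldIntervalMicros, the interval at the end of warm-up is stableIntervalMicros, and the warm-up phase covers maxPermits - thresholdPermits permits, which gives:

slope = (coldIntervalMicros - stableIntervalMicros) / (maxPermits - thresholdPermits)

Variable 7: storedPermits

SmoothWarmingUp is initialized when the system is at its coldest, so initially storedPermits = maxPermits.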

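(3)Computing maxPermits in SmoothWarmingUp

I.Formula analysis

stableIntervalMicros = SECONDS.toMicros(1L) / permitsPerSecond;
thresholdPermits = 0.5 * warmupPeriodMicros / stableIntervalMicros;
maxPermits = thresholdPermits + 2.0 * warmupPeriodMicros / (stableIntervalMicros + coldIntervalMicros);

Rearranging:
maxPermits - thresholdPermits = 2.0 * warmupPeriodMicros / (stableIntervalMicros + coldIntervalMicros)

Rearranging again:
(stableIntervalMicros + coldIntervalMicros) * (maxPermits - thresholdPermits) / 2 = warmupPeriodMicros

The slope of the slanted side of the trapezoid is:
slope = (coldIntervalMicros - stableIntervalMicros) / (maxPermits - thresholdPermits);

Plot the permit interval against the number of stored permits (the interval differs depending on how many permits are stored). Between thresholdPermits and maxPermits the curve bounds a trapezoid: maxPermits - thresholdPermits is its height, and stableIntervalMicros + coldIntervalMicros is the sum of its two parallel sides. Given the area of the trapezoid, which is warmupPeriodMicros, maxPermits can be solved for. In other words, the warm-up time, the stable permit interval, and the cold factor together determine the maximum number of permits the bucket can hold.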

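II.Worked example

Suppose the QPS limit is 100 and the warm-up time is 5 seconds. Then:

stableIntervalMicros = 1s / 100 = 10ms
coldIntervalMicros = 10ms * 3 = 30ms

So during warm-up the slowest rate is one permit every 30ms, and the warm-up period is 5000ms. Since the trapezoid area is 5000ms = (top side 10ms + bottom side 30ms) * h / 2, h = 10000 / 40 = 250 permits.

Applying the following formula gives:

Formula: thresholdPermits = 0.5 * warmupPeriodMicros / stableIntervalMicros
Result: thresholdPermits = 0.5 * warm-up period 5000ms / stable interval 10ms = 250

In other words, the stable rate would issue 500 permits over the 5s warm-up period. SmoothWarmingUp treats half of that amount (250) as permits handed out at the stable rate of one per 10ms, and the other 250 as warm-up permits, which take the full 5s warm-up period to hand out.

From the maxPermits formula above:

maxPermits = thresholdPermits + 2.0 * warmupPeriodMicros / (stableIntervalMicros + coldIntervalMicros);

Because the cold interval is 3 times the stable interval:

stableIntervalMicros + coldIntervalMicros = 4 * stableIntervalMicros

the second term of the formula is also 0.5 * warm-up period 5000ms / stable interval 10ms = 250, so maxPermits = 250 + 250 = 500.

Summary: the warming-up limiter SmoothWarmingUp derives the maximum number of stored permits from the warm-up period and the stable interval. With the default coldFactor of 3, half of that maximum is handed out at the normal stable rate, and the other half is handed out at an average rate that is half the normal rate.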
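The arithmetic of the worked example can be checked with a small standalone sketch. The class WarmingUpParamsSketch and its method warmupAreaMicros() are hypothetical names introduced here for illustration; they are not part of Guava. The sketch only repeats the formulas from doSetRate() under the assumptions QPS = 100, warm-up period = 5s, coldFactor = 3.

```java
// Hypothetical helper (not a Guava class): recomputes the SmoothWarmingUp
// parameters with the same formulas as doSetRate().
final class WarmingUpParamsSketch {
    final double stableIntervalMicros;
    final double coldIntervalMicros;
    final double thresholdPermits;
    final double maxPermits;
    final double slope;

    WarmingUpParamsSketch(double permitsPerSecond, long warmupPeriodMicros, double coldFactor) {
        stableIntervalMicros = 1_000_000.0 / permitsPerSecond;
        coldIntervalMicros = stableIntervalMicros * coldFactor;
        thresholdPermits = 0.5 * warmupPeriodMicros / stableIntervalMicros;
        maxPermits = thresholdPermits
                + 2.0 * warmupPeriodMicros / (stableIntervalMicros + coldIntervalMicros);
        slope = (coldIntervalMicros - stableIntervalMicros) / (maxPermits - thresholdPermits);
    }

    // Area of the trapezoid between thresholdPermits and maxPermits;
    // by construction it should equal the warm-up period.
    double warmupAreaMicros() {
        return (stableIntervalMicros + coldIntervalMicros) * (maxPermits - thresholdPermits) / 2.0;
    }

    public static void main(String[] args) {
        // Assumed inputs: QPS = 100, warm-up = 5s, coldFactor = 3
        WarmingUpParamsSketch p = new WarmingUpParamsSketch(100.0, 5_000_000L, 3.0);
        System.out.println("thresholdPermits = " + p.thresholdPermits);   // 250.0
        System.out.println("maxPermits       = " + p.maxPermits);         // 500.0
        System.out.println("slope (us/permit)= " + p.slope);              // 80.0
        System.out.println("area (us)        = " + p.warmupAreaMicros()); // 5000000.0
    }
}
```

With these inputs the slope comes out as 80 microseconds per permit: the interval rises from 10ms at 250 stored permits to 30ms at 500 stored permits.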

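(4)Acquiring permits in SmoothWarmingUp

Calling acquire() on SmoothWarmingUp eventually reaches SmoothRateLimiter.reserveEarliestAvailable(), which computes the wait time incurred by the request. That wait has two parts: the cost of taking storedPermitsToSpend existing permits from the bucket, and the cost of waiting for freshPermits fresh permits to be generated.

The cost of taking storedPermitsToSpend existing permits is computed by SmoothWarmingUp.storedPermitsToWaitTime(). That cost is again split in two: the cost of the warm-up-phase permits and the cost of the stable-phase permits. Stable-phase permits are taken only when the warm-up-phase permits are not enough.

For example, a request asks for 4 permits while the bucket holds 22 and the warm-up threshold is 20. Then 20 of the stored permits belong to the stable phase and 2 to the warm-up phase. The request first takes both warm-up permits and then 2 of the stable permits.

Cost of the warm-up permits = (interval at the start + interval at the end) * number of permits / 2
Cost of the stable permits = fixed interval stableIntervalMicros * number of permits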
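The 22 / 20 / 4 example can be traced with the sketch below. It mirrors the logic of SmoothWarmingUp.storedPermitsToWaitTime() as a standalone static method. The class name StoredPermitsWaitSketch is hypothetical, and the values stableIntervalMicros = 10000 and slope = 1000 are assumptions chosen only to make the numbers easy to follow; the example in the text does not specify them.

```java
// Hypothetical standalone version of the wait-time calculation for stored permits.
final class StoredPermitsWaitSketch {

    static long storedPermitsToWaitTime(double storedPermits, double permitsToTake,
                                        double thresholdPermits, double stableIntervalMicros,
                                        double slope) {
        double availablePermitsAboveThreshold = storedPermits - thresholdPermits;
        long micros = 0;
        if (availablePermitsAboveThreshold > 0.0) {
            // Warm-up permits are taken first
            double permitsAboveThresholdToTake = Math.min(availablePermitsAboveThreshold, permitsToTake);
            // Interval at the start of the take plus interval at the end of the take
            double length = (stableIntervalMicros + availablePermitsAboveThreshold * slope)
                    + (stableIntervalMicros
                        + (availablePermitsAboveThreshold - permitsAboveThresholdToTake) * slope);
            // Trapezoid area: (start + end) * count / 2
            micros = (long) (permitsAboveThresholdToTake * length / 2.0);
            permitsToTake -= permitsAboveThresholdToTake;
        }
        // Remaining permits come from the stable part at a fixed interval
        micros += (long) (stableIntervalMicros * permitsToTake);
        return micros;
    }

    public static void main(String[] args) {
        // Bucket holds 22, threshold is 20, request takes 4.
        // Assumed: stable interval 10000us, slope 1000us per permit.
        long wait = storedPermitsToWaitTime(22.0, 4.0, 20.0, 10_000.0, 1_000.0);
        // Warm-up part: 2 * (12000 + 10000) / 2 = 22000; stable part: 2 * 10000 = 20000
        System.out.println(wait); // 42000
    }
}
```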

@Beta
@GwtIncompatible
@SuppressWarnings("GoodTime")
public abstract class RateLimiter {
    ...
    //Acquire, waiting indefinitely
    //Acquires the given number of permits from this RateLimiter, 
    //blocking until the request can be granted. 
    //Tells the amount of time slept, if any.
    //@param permits the number of permits to acquire
    //@return time spent sleeping to enforce rate, in seconds; 0.0 if not rate-limited
    @CanIgnoreReturnValue
    public double acquire(int permits) {
        //Call RateLimiter.reserve()
        //Reserve the permits in advance and get the blocking time: book `permits` permits and return how long to wait
        long microsToWait = reserve(permits);
        //Sleep for microsToWait so that the rate limit is honored (common to all subclasses)
        stopwatch.sleepMicrosUninterruptibly(microsToWait);
        //Return the time spent waiting, in seconds, to the caller
        return 1.0 * microsToWait / SECONDS.toMicros(1L);
    }
        
    //Reserves the given number of permits from this RateLimiter for future use, 
    //returning the number of microseconds until the reservation can be consumed.
    //@return time in microseconds to wait until the resource can be acquired, never negative
    final long reserve(int permits) {
        checkPermits(permits);
        //Concurrent callers share this state, so the update is guarded by synchronized
        synchronized (mutex()) {
            //Call RateLimiter.reserveAndGetWaitLength()
            return reserveAndGetWaitLength(permits, stopwatch.readMicros());
        }
    }
    
    //Reserves next ticket and returns the wait time that the caller must wait for.
    final long reserveAndGetWaitLength(int permits, long nowMicros) {
        //Call SmoothRateLimiter.reserveEarliestAvailable()
        long momentAvailable = reserveEarliestAvailable(permits, nowMicros);
        return max(momentAvailable - nowMicros, 0);
    }
    
    //Reserves the requested number of permits and returns the time that those permits can be used (with one caveat).
    //How permits are produced and taken, and how the blocking time is computed, is left to subclasses
    //@return the time that the permits may be used, or, if the permits may be used immediately, an arbitrary past or present time
    abstract long reserveEarliestAvailable(int permits, long nowMicros);
    ...
}

@GwtIncompatible
abstract class SmoothRateLimiter extends RateLimiter {
    //The currently stored permits. 
    //i.e. the permits held in the bucket that have not been consumed yet
    double storedPermits;
    //The maximum number of stored permits. 
    double maxPermits;
    //The interval between two unit requests, at our stable rate.
    //E.g., a stable rate of 5 permits per second has a stable interval of 200ms.
    double stableIntervalMicros;
    //The time when the next request (no matter its size) will be granted. 
    //After granting a request, this is pushed further in the future. Large requests push this further than small requests.
    private long nextFreeTicketMicros = 0L;//could be either in the past or future
    ...
    
    @Override
    final long reserveEarliestAvailable(int requiredPermits, long nowMicros) {
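        //1.Compute the permits generated since nextFreeTicketMicros and update the unused permits storedPermits
        //resync() behaves differently here than at initialization:
        //permits that have accrued since the last request are stored (capped at maxPermits).
        //If nextFreeTicketMicros lies in the future, nothing is done.
        resync(nowMicros);
        //The time at which the next request (whatever its size) is granted; this is the method's return value
        long returnValue = nextFreeTicketMicros;
        
        //2.Compute the wait time
        //2.1.Take unconsumed permits from the bucket first; if there are not enough, take as many as there are
        //Number of stored permits that can be spent
        double storedPermitsToSpend = min(requiredPermits, this.storedPermits);
        //2.2.If the stored permits do not cover the request, compute how many fresh permits must be waited for
        //Permits to wait for: fresh permits
        double freshPermits = requiredPermits - storedPermitsToSpend;
        //Compute the wait time in two parts:
        //waitMicros = cost of taking storedPermitsToSpend existing permits + cost of waiting for freshPermits fresh permits
        //Taking existing permits also has a cost; storedPermitsToWaitTime() is abstract and implemented by SmoothBursty and SmoothWarmingUp
        //In SmoothBursty, storedPermitsToWaitTime() returns 0: stored permits need no waiting.
        //The cost of fresh permits is: number of fresh permits freshPermits * interval per permit stableIntervalMicros
        long waitMicros = storedPermitsToWaitTime(this.storedPermits, storedPermitsToSpend) + (long) (freshPermits * stableIntervalMicros);
        
        //3.Update nextFreeTicketMicros
        //Fresh permits may have been consumed in advance, so nextFreeTicketMicros moves later to mark that time as already spent
        this.nextFreeTicketMicros = LongMath.saturatedAdd(nextFreeTicketMicros, waitMicros);
        
        //4.Deduct the permits and update what remains in the bucket
        //Subtract the spendable amount computed above from the stored permits
        this.storedPermits -= storedPermitsToSpend;
        //Return the moment at which this request may proceed (the caller turns it into a wait duration)
        //Note that returnValue holds the previous nextFreeTicketMicros: the cost of this request is paid by the next request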
        return returnValue;
    }
    
    //Updates storedPermits and nextFreeTicketMicros based on the current time.
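    //Computes the permits generated between nextFreeTicketMicros and now; this is lazy computation
    void resync(long nowMicros) {
        //if nextFreeTicket is in the past, resync to now
        //Usually the current time is later than the time the next request would be granted
        //In that case the elapsed time is converted into permits and stored, capped at maxPermits
        //This covers a freshly initialized RateLimiter with no traffic yet, or traffic arriving after an idle period
        //For SmoothBursty up to one second's worth of permits can be banked this way, which is its burst capacity
        
        //If nextFreeTicketMicros is in the future, the condition below is false
        //and storedPermits and nextFreeTicketMicros are left unchanged
        //This happens when permits have been "borrowed" in advance
        if (nowMicros > nextFreeTicketMicros) {
            //Elapsed time divided by the time to generate one permit; coolDownIntervalMicros() is abstract and implemented by subclasses
            double newPermits = (nowMicros - nextFreeTicketMicros) / coolDownIntervalMicros();
            //Update the stored permits, capped at the maximum
            storedPermits = min(maxPermits, storedPermits + newPermits);
            //Move nextFreeTicketMicros to the current time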
            nextFreeTicketMicros = nowMicros;
        }
    }
    
    //Translates a specified portion of our currently stored permits which we want to spend/acquire, into a throttling time.
    //Conceptually, this evaluates the integral of the underlying function we use, for the range of [(storedPermits - permitsToTake), storedPermits].
    //This always holds: 0 <= permitsToTake <= storedPermits
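    //The cost of taking already-stored permits out of the bucket, implemented by subclasses
    //SmoothBursty's implementation returns 0: permits that are already stored need no waiting when they are taken
    abstract long storedPermitsToWaitTime(double storedPermits, double permitsToTake);
    
    //Returns the number of microseconds during cool down that we have to wait to get a new permit.
    //The time to generate one fresh permit, implemented by subclasses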
    abstract double coolDownIntervalMicros();
    ...
    
    static final class SmoothWarmingUp extends SmoothRateLimiter {
        private final long warmupPeriodMicros;
        private double slope; //slope
        private double thresholdPermits;
        private double coldFactor;
        ...
        @Override
        long storedPermitsToWaitTime(double storedPermits, double permitsToTake) {
            //Number of stored permits above the warm-up threshold thresholdPermits
            double availablePermitsAboveThreshold = storedPermits - thresholdPermits;
            long micros = 0;
            //If the bucket holds more than thresholdPermits,
            //the system has cooled down and is in the warm-up phase, so the cost of the warm-up permits must be computed
            if (availablePermitsAboveThreshold > 0.0) {
                //How many of the permits above the threshold this request takes
                double permitsAboveThresholdToTake = min(availablePermitsAboveThreshold, permitsToTake);
                //Warm-up cost: the first permitsToTime() is the interval at the start of the take, the second is the interval at the end
                double length = permitsToTime(availablePermitsAboveThreshold) + permitsToTime(availablePermitsAboveThreshold - permitsAboveThresholdToTake);
                //Total cost = (start interval + end interval) * number of permits / 2
                micros = (long) (permitsAboveThresholdToTake * length / 2.0);
                permitsToTake -= permitsAboveThresholdToTake;
            }
            //Adding the cost of the stable-phase permits gives the total cost
            micros += (long) (stableIntervalMicros * permitsToTake);
            return micros;
        }
       
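        //Each stored permit above the threshold adds slope microseconds to the interval
        //so, starting from the base interval stableIntervalMicros, the interval with `permits` permits above the threshold is:
        private double permitsToTime(double permits) {
            return stableIntervalMicros + permits * slope;
        }
      
        @Override
        double coolDownIntervalMicros() {
            //Warm-up period / maximum number of permits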
            return warmupPeriodMicros / maxPermits;
        }
    }
    ...
}

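(5)Comparing SmoothBursty and SmoothWarmingUp

Both limiters pre-pay permits: the cost (blocking time) of the permits taken by the current thread is paid by the next thread. This lowers the chance that the current thread blocks, because it is unknown when the next request will arrive. If the next request arrives much later, the permits generated in the meantime already cover it, and it does not block either.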
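The pre-payment behaviour can be observed with a minimal simulation. PrepaidPermitSketch below is a hypothetical, simplified bursty limiter driven by an explicit clock value; it is not Guava's implementation, it only follows the same resync / reserveEarliestAvailable steps with stored permits costing nothing.

```java
// Hypothetical simplified bursty limiter with a caller-supplied clock.
final class PrepaidPermitSketch {
    private final double stableIntervalMicros;
    private final double maxPermits;
    private double storedPermits = 0.0;
    private long nextFreeTicketMicros = 0L;

    PrepaidPermitSketch(double permitsPerSecond, double maxBurstSeconds) {
        this.stableIntervalMicros = 1_000_000.0 / permitsPerSecond;
        this.maxPermits = maxBurstSeconds * permitsPerSecond;
    }

    // Returns how long a request arriving at nowMicros has to wait, in microseconds.
    long reserve(int permits, long nowMicros) {
        // resync: turn idle time into stored permits, capped at maxPermits
        if (nowMicros > nextFreeTicketMicros) {
            double newPermits = (nowMicros - nextFreeTicketMicros) / stableIntervalMicros;
            storedPermits = Math.min(maxPermits, storedPermits + newPermits);
            nextFreeTicketMicros = nowMicros;
        }
        // The request is granted at the previous nextFreeTicketMicros
        long momentAvailable = nextFreeTicketMicros;
        double storedPermitsToSpend = Math.min(permits, storedPermits);
        double freshPermits = permits - storedPermitsToSpend;
        // Stored permits are free here; only fresh permits cost time,
        // and that cost is pushed onto whoever comes next
        long waitMicros = (long) (freshPermits * stableIntervalMicros);
        nextFreeTicketMicros += waitMicros;
        storedPermits -= storedPermitsToSpend;
        return Math.max(momentAvailable - nowMicros, 0L);
    }

    public static void main(String[] args) {
        PrepaidPermitSketch limiter = new PrepaidPermitSketch(10.0, 1.0);
        System.out.println(limiter.reserve(5, 0L)); // 0: the first request does not wait
        System.out.println(limiter.reserve(1, 0L)); // 500000: it pays for the 5 permits before it
    }
}
```

If the second request arrived 2 seconds later instead, the elapsed time would already cover the debt and it would not wait.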

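I.In SmoothBursty

Stored permits can be used directly at no extra cost, which is how bursts are absorbed. Those stored permits were accumulated earlier, during low traffic.

If traffic stays at full load and no permits are left over, a burst is still rate-limited when it arrives.

By default the bucket holds at most the permits generated in 1 second. With QPS set to 10, the bucket stores at most 10 permits. When traffic arrives at 20 requests per second, the stored permits cover only about 1 second, after which requests are throttled again.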

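II.In SmoothWarmingUp

Stored permits cannot be used for free; taking them costs extra time. To make up for this shortcoming of SmoothBursty, it distinguishes two states of the system: hot and cold.

Full-load or burst traffic may do little harm to a hot system, whose thread pools, caches, and connection pools are fully warmed and can take the pressure. For a cold system, full-load and burst traffic add pressure and can cause all kinds of problems.

So a warm-up is added to control traffic into a cold system (waits are longer during warm-up), and how hot or cold the system is is judged from the number of unconsumed permits stored in the bucket. When the system cools down, i.e. traffic is low, permits are consumed more slowly and the stored permits pile up.

Once the stored permits exceed the warm-up threshold thresholdPermits, the system is in the warm-up phase: taking permits costs more time, and the interval grows by slope microseconds for each stored permit above the threshold.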
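The relation between stored permits and "coldness" can be sketched as follows. ColdnessSketch is a hypothetical helper, and its constants are assumptions taken from the QPS = 100, 5s warm-up, coldFactor = 3 example (threshold 250, maximum 500, slope 80 microseconds per permit). It shows the interval as a function of stored permits and how idle time refills the bucket at the cool-down interval warmupPeriodMicros / maxPermits.

```java
// Hypothetical illustration of how stored permits map to the permit interval.
final class ColdnessSketch {
    // Assumed values from the QPS = 100, warm-up = 5s, coldFactor = 3 example
    static final double STABLE_INTERVAL_MICROS = 10_000.0;
    static final double THRESHOLD_PERMITS = 250.0;
    static final double MAX_PERMITS = 500.0;
    static final double SLOPE = 80.0;
    static final double WARMUP_PERIOD_MICROS = 5_000_000.0;

    // Interval per permit when the bucket holds storedPermits:
    // flat up to the threshold, then rising linearly with slope.
    static double intervalMicrosAt(double storedPermits) {
        double aboveThreshold = Math.max(0.0, storedPermits - THRESHOLD_PERMITS);
        return STABLE_INTERVAL_MICROS + aboveThreshold * SLOPE;
    }

    // Stored permits after the limiter has been idle for idleMicros.
    static double storedAfterIdle(double storedPermits, double idleMicros) {
        double coolDownIntervalMicros = WARMUP_PERIOD_MICROS / MAX_PERMITS;
        return Math.min(MAX_PERMITS, storedPermits + idleMicros / coolDownIntervalMicros);
    }

    public static void main(String[] args) {
        System.out.println(intervalMicrosAt(100.0)); // 10000.0 (hot: below the threshold)
        System.out.println(intervalMicrosAt(375.0)); // 20000.0 (halfway through warm-up)
        System.out.println(intervalMicrosAt(500.0)); // 30000.0 (coldest: bucket full)
        System.out.println(storedAfterIdle(0.0, 5_000_000.0)); // 500.0
    }
}
```

Under these assumptions an empty bucket becomes completely cold again after exactly one warm-up period of idleness.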