Pyro tutorial: time series forecasting, univariate and heavy-tailed (Python, PyTorch; tutorial and examples with Brownian motion, bias, and covariate terms)

news2024/11/15 9:59:43

Forecasting I: univariate, heavy tailed

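This tutorial introduces the pyro.contrib.forecast module, a framework for forecasting with Pyro models. It covers only univariate models and simple likelihoods, and assumes the reader is already familiar with SVI (stochastic variational inference) and tensor shapes.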

See also:

  • Forecasting II: state space models

  • Forecasting III: hierarchical models

Summary

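  • To create a forecasting model:

    1. Create a subclass of the ForecastingModel class.

    2. Implement the .model(zero_data, covariates) method using standard Pyro syntax.

    3. Sample all time-local variables inside the self.time_plate context.

    4. Finally, call the .predict(noise_dist, prediction) method.

  • To train a forecasting model, create a Forecaster object.

    • Training can be flaky; you will need to tune hyperparameters and randomly restart.

    • Reparameterization can help learning, e.g. LocScaleReparam.

  • To forecast the future, draw samples from a Forecaster object conditioned on data and covariates.

  • To model seasonality, use the helpers periodic_features(), periodic_repeat(), and periodic_cumsum().

  • To model heavy-tailed data, use Stable distributions and StableReparam.

  • To evaluate results, use the backtest() helper or low-level loss functions.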

[1]:
import torch
import pyro
import pyro.distributions as dist
import pyro.poutine as poutine
from pyro.contrib.examples.bart import load_bart_od
from pyro.contrib.forecast import ForecastingModel, Forecaster, backtest, eval_crps
from pyro.infer.reparam import LocScaleReparam, StableReparam
from pyro.ops.tensor_utils import periodic_cumsum, periodic_repeat, periodic_features
from pyro.ops.stats import quantile
import matplotlib.pyplot as plt

%matplotlib inline
assert pyro.__version__.startswith('1.9.1')
pyro.set_rng_seed(20200221)
[2]:
dataset = load_bart_od()
print(dataset.keys())
print(dataset["counts"].shape)
print(" ".join(dataset["stations"]))
dict_keys(['stations', 'start_date', 'counts'])
torch.Size([78888, 50, 50])
12TH 16TH 19TH 24TH ANTC ASHB BALB BAYF BERY CAST CIVC COLM COLS CONC DALY DBRK DELN DUBL EMBR FRMT FTVL GLEN HAYW LAFY LAKE MCAR MLBR MLPT MONT NBRK NCON OAKL ORIN PCTR PHIL PITT PLZA POWL RICH ROCK SANL SBRN SFIA SHAY SSAN UCTY WARM WCRK WDUB WOAK

Intro to Pyro's forecasting framework

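Pyro's forecasting framework consists of:

  • a ForecastingModel base class, whose .model() method can be implemented for custom forecasting models,

  • a Forecaster class that trains and forecasts using ForecastingModels, and

  • a backtest() helper to evaluate models on a number of metrics.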

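Consider a simple univariate dataset, say the weekly total number of riders across all stations in the BART train network. This data is roughly logarithmic, so we model the log-transformed data.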

[3]:
T, O, D = dataset["counts"].shape
data = dataset["counts"][:T // (24 * 7) * 24 * 7].reshape(T // (24 * 7), -1).sum(-1).log()
data = data.unsqueeze(-1)
plt.figure(figsize=(9, 3))
plt.plot(data)
plt.title("Total weekly ridership")
plt.ylabel("log(# rides)")
plt.xlabel("Week after 2011-01-01")
plt.xlim(0, len(data));

_images/forecasting_i_4_0.png
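
The reshape-and-sum in the cell above can be hard to read at a glance. The following is a minimal sketch of the same aggregation on a tiny synthetic tensor, so the arithmetic can be checked by hand. The toy sizes and the name `HOURS_PER_WEEK` are assumptions made for illustration; only the indexing pattern mirrors the tutorial.

```python
import torch

HOURS_PER_WEEK = 24 * 7

# Hypothetical toy data: 2 full weeks plus 5 leftover hours, 2 origins and
# 2 destinations, with exactly one ride per hour per origin-destination pair.
T, O, D = 2 * HOURS_PER_WEEK + 5, 2, 2
counts = torch.ones(T, O, D)

# Drop the trailing partial week so the time axis divides evenly into weeks.
num_weeks = T // HOURS_PER_WEEK
trimmed = counts[:num_weeks * HOURS_PER_WEEK]

# Fold (hours in week, origin, destination) into one axis and sum it away.
weekly = trimmed.reshape(num_weeks, -1).sum(-1)

# Log-transform and add a trailing observation dimension of size 1.
log_weekly = weekly.log().unsqueeze(-1)

print(weekly)            # one total per week
print(log_weekly.shape)  # (num_weeks, 1)
```

Each toy week holds 168 hours times 4 station pairs, so every weekly total should be 672. The trailing dimension of size 1 is what makes the series "multivariate with one variable", which is the shape the forecasting framework expects.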

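Let's start with a simple log-linear regression model, with no trend or seasonality. Note that while this example is univariate, Pyro's forecasting framework is multivariate, so we will often need to reshape using .unsqueeze(-1), .expand([1]), and .to_event(1).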

[4]:

# First we need some boilerplate to create a class and define a .model() method.
class Model1(ForecastingModel):
    # We then implement the .model() method. Since this is a generative model, it shouldn't
    # look at data; however it is convenient to see the shape of data we're supposed to
    # generate, so this inputs a zeros_like(data) tensor instead of the actual data.
    def model(self, zero_data, covariates):
        data_dim = zero_data.size(-1)  # Should be 1 in this univariate tutorial.
        feature_dim = covariates.size(-1)

        # The first part of the model is a probabilistic program to create a prediction.
        # We use the zero_data as a template for the shape of the prediction.
        bias = pyro.sample("bias", dist.Normal(0, 10).expand([data_dim]).to_event(1))
        weight = pyro.sample("weight", dist.Normal(0, 0.1).expand([feature_dim]).to_event(1))
        prediction = bias + (weight * covariates).sum(-1, keepdim=True)
        # The prediction should have the same shape as zero_data (duration, obs_dim),
        # but may have additional sample dimensions on the left.
        assert prediction.shape[-2:] == zero_data.shape

        # The next part of the model creates a likelihood or noise distribution.
        # Again we'll be Bayesian and write this as a probabilistic program with
        # priors over parameters.
        noise_scale = pyro.sample("noise_scale", dist.LogNormal(-5, 5).expand([1]).to_event(1))
        noise_dist = dist.Normal(0, noise_scale)

        # The final step is to call the .predict() method.
        self.predict(noise_dist, prediction)
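
The shape bookkeeping in Model1 can be checked in isolation. The sketch below uses plain torch.distributions rather than Pyro: wrapping a distribution in `Independent(..., 1)` is used here as a stand-in for Pyro's `.to_event(1)`, and the toy sizes and values are assumptions chosen only to make the shapes visible.

```python
import torch
from torch.distributions import Independent, Normal

data_dim = 1  # univariate, as in this tutorial

# .expand([data_dim]) creates a batch of independent Normals ...
base = Normal(0.0, 10.0).expand([data_dim])
# ... and Independent(..., 1) reinterprets that batch dimension as an event
# dimension, so one draw is treated as a single vector-valued variable.
bias_dist = Independent(base, 1)
print(base.batch_shape, base.event_shape)
print(bias_dist.batch_shape, bias_dist.event_shape)

# Hypothetical toy covariates: 5 time steps, 3 features, all equal to one.
duration, feature_dim = 5, 3
covariates = torch.ones(duration, feature_dim)
weight = torch.tensor([0.1, 0.2, 0.3])
bias = torch.tensor([2.0])

# Same expression as in Model1: sum over features but keep a trailing
# dimension of size 1, so the result has shape (duration, data_dim).
prediction = bias + (weight * covariates).sum(-1, keepdim=True)
print(prediction.shape)
```

Without `keepdim=True` the sum would return shape `(duration,)`, and the shape assertion against `zero_data` in the model would fail.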

We can now train this model by creating a Forecaster object. We will split the data into [T0,T1) for training and [T1,T2) for testing.

[5]:
T0 = 0              # beginning
T2 = data.size(-2)  # end
T1 = T2 - 52        # train/test split
[6]:
%%time
pyro.set_rng_seed(1)
pyro.clear_param_store()
time = torch.arange(float(T2)) / 365
covariates = torch.stack([time], dim=-1)
forecaster = Forecaster(Model1(), data[:T1], covariates[:T1], learning_rate=0.1)
INFO     step    0 loss = 484401
INFO     step  100 loss = 0.609042
INFO     step  200 loss = -0.535144
INFO     step  300 loss = -0.605789
INFO     step  400 loss = -0.59744
INFO     step  500 loss = -0.596203
INFO     step  600 loss = -0.614217
INFO     step  700 loss = -0.612415
INFO     step  800 loss = -0.613236
INFO     step  900 loss = -0.59879
INFO     step 1000 loss = -0.601271
CPU times: user 4.37 s, sys: 30.4 ms, total: 4.4 s
Wall time: 4.4 s

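Next we can evaluate by drawing posterior samples from the forecaster, passing in the full covariates but only partial data. We will use Pyro's quantile() function to plot the median and an 80% interval (10th to 90th percentiles). To evaluate fit we will use eval_crps() to compute the Continuous Ranked Probability Score (CRPS), which is a good metric for assessing the distributional fit of a heavy-tailed distribution.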

[7]:
samples = forecaster(data[:T1], covariates, num_samples=1000)
p10, p50, p90 = quantile(samples, (0.1, 0.5, 0.9)).squeeze(-1)
crps = eval_crps(samples, data[T1:])
print(samples.shape, p10.shape)

plt.figure(figsize=(9, 3))
plt.fill_between(torch.arange(T1, T2), p10, p90, color="red", alpha=0.3)
plt.plot(torch.arange(T1, T2), p50, 'r-', label='forecast')
plt.plot(data, 'k-', label='truth')
plt.title("Total weekly ridership (CRPS = {:0.3g})".format(crps))
plt.ylabel("log(# rides)")
plt.xlabel("Week after 2011-01-01")
plt.xlim(0, None)
plt.legend(loc="best");
torch.Size([1000, 52, 1]) torch.Size([52])

_images/forecasting_i_11_1.png
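
To make the evaluation step less of a black box, here is a minimal sketch of a sample-based CRPS and of the quantile bands, written in plain torch. It is not Pyro's implementation: `crps_from_samples` is a hypothetical helper that uses the textbook identity CRPS = E|X - y| - 0.5 E|X - X'| with a simple quadratic-cost pairwise term, `torch.quantile` stands in for Pyro's quantile(), and averaging over all entries is an assumption about how to summarize the score.

```python
import torch


def crps_from_samples(samples, truth):
    """Sample-based CRPS, averaged over all observed entries.

    samples: (num_samples, duration, obs_dim), truth: (duration, obs_dim).
    """
    # E|X - y|: mean absolute error of each sample against the truth.
    abs_error = (samples - truth).abs().mean(0)
    # E|X - X'|: mean absolute difference between all pairs of samples.
    spread = (samples.unsqueeze(0) - samples.unsqueeze(1)).abs().mean((0, 1))
    return (abs_error - 0.5 * spread).mean()


torch.manual_seed(0)
truth = torch.zeros(4, 1)
sharp = 0.1 * torch.randn(200, 4, 1)   # tight forecast centered on the truth
vague = 10.0 * torch.randn(200, 4, 1)  # very wide forecast, same center

# Quantiles over the sample dimension give the plotted band and median.
probs = torch.tensor([0.1, 0.5, 0.9])
p10, p50, p90 = torch.quantile(sharp, probs, dim=0).squeeze(-1)

crps_sharp = crps_from_samples(sharp, truth).item()
crps_vague = crps_from_samples(vague, truth).item()
print(p10.shape, crps_sharp, crps_vague)
```

Lower CRPS is better. Both toy forecasts are centered on the truth, but the wide one is penalized for being uninformative, which is the behavior that makes CRPS useful for comparing whole predictive distributions rather than point forecasts.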

Zooming in to the forecast region, we see this model ignores seasonal behavior.

[8]:
plt.figure(figsize=(9, 3))
plt.fill_between(torch.arange(T1, T2), p10, p90, color="red", alpha=0.3)
plt.plot(torch.arange(T1, T2), p50, 'r-', label='forecast')
plt.plot(torch.arange(T1, T2), data[T1:], 'k-', label='truth')
plt.title("Total weekly ridership (CRPS = {:0.3g})".format(crps))
plt.ylabel("log(# rides)")
plt.xlabel("Week after 2011-01-01")
plt.xlim(T1, None)
plt.legend(loc="best");

_images/forecasting_i_13_0.png

We could add a yearly seasonal component simply by adding new covariates (note that the model already handles feature_dim > 1).

[9]:
%%time
pyro.set_rng_seed(1)
pyro.clear_param_store()
time = torch.arange(float(T2)) / 365
covariates = torch.cat([time.unsqueeze(-1),
                        periodic_features(T2, 365.25 / 7)], dim=-1)
forecaster = Forecaster(Model1(), data[:T1], covariates[:T1], learning_rate=0.1)
INFO     step    0 loss = 53174.4
INFO     step  100 loss = 0.519148
INFO     step  200 loss = -0.0264822
INFO     step  300 loss = -0.314983
INFO     step  400 loss = -0.413243
INFO     step  500 loss = -0.487756
INFO     step  600 loss = -0.472516
INFO     step  700 loss = -0.595866
INFO     step  800 loss = -0.500985
INFO     step  900 loss = -0.558623
INFO     step 1000 loss = -0.589603
CPU times: user 4.5 s, sys: 34.3 ms, total: 4.53 s
Wall time: 4.54 s
[10]:
samples = forecaster(data[:T1], covariates, num_samples=1000)
p10, p50, p90 = quantile(samples, (0.1, 0.5, 0.9)).squeeze(-1)
crps = eval_crps(samples, data[T1:])

plt.figure(figsize=(9, 3))
plt.fill_between(torch.arange(T1, T2), p10, p90, color="red", alpha=0.3)
plt.plot(torch.arange(T1, T2), p50, 'r-', label='forecast')
plt.plot(data, 'k-', label='truth')
plt.title("Total weekly ridership (CRPS = {:0.3g})".format(crps))
plt.ylabel("log(# rides)")
plt.xlabel("Week after 2011-01-01")
plt.xlim(0, None)
plt.legend(loc="best");

_images/forecasting_i_16_0.png

[11]:
plt.figure(figsize=(9, 3))
plt.fill_between(torch.arange(T1, T2), p10, p90, color="red", alpha=0.3)
plt.plot(torch.arange(T1, T2), p50, 'r-', label='forecast')
plt.plot(torch.arange(T1, T2), data[T1:], 'k-', label='truth')
plt.title("Total weekly ridership (CRPS = {:0.3g})".format(crps))
plt.ylabel("log(# rides)")
plt.xlabel("Week after 2011-01-01")
plt.xlim(T1, None)
plt.legend(loc="best");

_images/forecasting_i_17_0.png
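
The seasonal covariates used above are, in spirit, sinusoids at harmonics of the yearly period. The following is a minimal sketch of that idea in plain torch. It is not the code behind periodic_features(): `fourier_features` and its `num_harmonics` argument are hypothetical, and a period of exactly 52 weeks is assumed so that the periodicity is easy to verify.

```python
import math

import torch


def fourier_features(duration, period, num_harmonics):
    """Return a (duration, 2 * num_harmonics) tensor of cos/sin features."""
    t = torch.arange(float(duration)).unsqueeze(-1)              # (duration, 1)
    k = torch.arange(1, num_harmonics + 1, dtype=torch.float)    # (num_harmonics,)
    angle = 2 * math.pi * k * t / period                         # broadcast to 2D
    return torch.cat([angle.cos(), angle.sin()], dim=-1)


features = fourier_features(duration=104, period=52.0, num_harmonics=3)
print(features.shape)

# Rows one period apart are (numerically) identical, so a linear model on
# these columns can only express patterns that repeat every 52 steps.
print(torch.allclose(features[0], features[52], atol=1e-4))
```

Because the regression weights on these columns are learned, the model can fit an arbitrary smooth yearly shape while the linear time covariate continues to carry the trend.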

Time-local random variables: self.time_plate

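So far we have seen the ForecastingModel.model() method and self.predict(). The last piece of forecasting-specific syntax is the self.time_plate context for time-local variables. To see how this works, consider changing the global linear trend model above to a local level model. Note that the poutine.reparam() handler is a general Pyro inference trick, not specific to forecasting.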

[12]:
class Model2(ForecastingModel):
    def model(self, zero_data, covariates):
        data_dim = zero_data.size(-1)
        feature_dim = covariates.size(-1)
        bias = pyro.sample("bias", dist.Normal(0, 10).expand([data_dim]).to_event(1))
        weight = pyro.sample("weight", dist.Normal(0, 0.1).expand([feature_dim]).to_event(1))

        # We'll sample a time-global scale parameter outside the time plate,
        # then time-local iid noise inside the time plate.
        drift_scale = pyro.sample("drift_scale",
                                  dist.LogNormal(-20, 5).expand([1]).to_event(1))
        with self.time_plate:
            # We'll use a reparameterizer to improve variational fit. The model would still be
            # correct if you removed this context manager, but the fit appears to be worse.
            with poutine.reparam(config={"drift": LocScaleReparam()}):
                drift = pyro.sample("drift", dist.Normal(zero_data, drift_scale).to_event(1))

        # After we sample the iid "drift" noise we can combine it in any time-dependent way.
        # It is important to keep everything inside the plate independent and apply dependent
        # transforms outside the plate.
        motion = drift.cumsum(-2)  # A Brownian motion.

        # The prediction now includes three terms.
        prediction = motion + bias + (weight * covariates).sum(-1, keepdim=True)
        assert prediction.shape[-2:] == zero_data.shape

        # Construct the noise distribution and predict.
        noise_scale = pyro.sample("noise_scale", dist.LogNormal(-5, 5).expand([1]).to_event(1))
        noise_dist = dist.Normal(0, noise_scale)
        self.predict(noise_dist, prediction)
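
Two ideas in Model2 can be seen without any inference machinery: independent noise is drawn per time step, and the dependence across time is introduced afterwards with a cumulative sum. The sketch below shows this in plain torch, assuming toy sizes. It also writes the noise in a fully decentered loc-scale form (`loc + scale * z`), which is the kind of rewrite a loc-scale reparameterizer relies on; how LocScaleReparam chooses between centered and decentered forms is not reproduced here.

```python
import torch

torch.manual_seed(0)
duration, obs_dim = 6, 1
zero_data = torch.zeros(duration, obs_dim)
drift_scale = torch.tensor([0.5])  # hypothetical value, not a fitted one

# Decentered form of drift ~ Normal(zero_data, drift_scale):
# draw unit-scale noise, then shift and scale it deterministically.
z = torch.randn(duration, obs_dim)
drift = zero_data + drift_scale * z

# Independent increments inside the "plate"; dependence is added outside it
# by accumulating along the time dimension (dim -2).
motion = drift.cumsum(-2)

print(motion.shape)
print(torch.allclose(motion[1:] - motion[:-1], drift[1:], atol=1e-6))
```

The increments of `motion` recover the independent `drift` values, which is exactly what makes it a discrete-time random walk.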
[13]:
%%time
pyro.set_rng_seed(1)
pyro.clear_param_store()
time = torch.arange(float(T2)) / 365
covariates = periodic_features(T2, 365.25 / 7)
forecaster = Forecaster(Model2(), data[:T1], covariates[:T1], learning_rate=0.1,
                        time_reparam="dct")
INFO     step    0 loss = 1.73259e+09
INFO     step  100 loss = 0.935019
INFO     step  200 loss = -0.0290582
INFO     step  300 loss = -0.193718
INFO     step  400 loss = -0.292689
INFO     step  500 loss = -0.411964
INFO     step  600 loss = -0.291355
INFO     step  700 loss = -0.414344
INFO     step  800 loss = -0.472016
INFO     step  900 loss = -0.480997
INFO     step 1000 loss = -0.540629
CPU times: user 9.47 s, sys: 56.4 ms, total: 9.52 s
Wall time: 9.54 s
[14]:
samples = forecaster(data[:T1], covariates, num_samples=1000)
p10, p50, p90 = quantile(samples, (0.1, 0.5, 0.9)).squeeze(-1)
crps = eval_crps(samples, data[T1:])

plt.figure(figsize=(9, 3))
plt.fill_between(torch.arange(T1, T2), p10, p90, color="red", alpha=0.3)
plt.plot(torch.arange(T1, T2), p50, 'r-', label='forecast')
plt.plot(data, 'k-', label='truth')
plt.title("Total weekly ridership (CRPS = {:0.3g})".format(crps))
plt.ylabel("log(# rides)")
plt.xlabel("Week after 2011-01-01")
plt.xlim(0, None)
plt.legend(loc="best");

_images/forecasting_i_21_0.png

[15]:
plt.figure(figsize=(9, 3))
plt.fill_between(torch.arange(T1, T2), p10, p90, color="red", alpha=0.3)
plt.plot(torch.arange(T1, T2), p50, 'r-', label='forecast')
plt.plot(torch.arange(T1, T2), data[T1:], 'k-', label='truth')
plt.title("Total weekly ridership (CRPS = {:0.3g})".format(crps))
plt.ylabel("log(# rides)")
plt.xlabel("Week after 2011-01-01")
plt.xlim(T1, None)
plt.legend(loc="best");

_images/forecasting_i_22_0.png

Heavy-tailed noise

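Our final univariate model generalizes from Gaussian noise to heavy-tailed Stable noise. The only difference is the noise_dist, which now takes two new parameters: stability determines the tail weight, and skew determines the relative size of positive versus negative spikes.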

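The Stable distribution is a natural heavy-tailed generalization of the Normal distribution, but it is difficult to work with because its density function is intractable. Pyro implements auxiliary variable methods for working with Stable distributions. To tell Pyro to use those methods, we wrap the final line in a poutine.reparam() handler that applies the StableReparam transform to the implicit observe site named "residual". You can use Stable distributions at other sites by specifying config={"my_site_name": StableReparam()}.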

[16]:
class Model3(ForecastingModel):
    def model(self, zero_data, covariates):
        data_dim = zero_data.size(-1)
        feature_dim = covariates.size(-1)
        bias = pyro.sample("bias", dist.Normal(0, 10).expand([data_dim]).to_event(1))
        weight = pyro.sample("weight", dist.Normal(0, 0.1).expand([feature_dim]).to_event(1))

        drift_scale = pyro.sample("drift_scale", dist.LogNormal(-20, 5).expand([1]).to_event(1))
        with self.time_plate:
            with poutine.reparam(config={"drift": LocScaleReparam()}):
                drift = pyro.sample("drift", dist.Normal(zero_data, drift_scale).to_event(1))
        motion = drift.cumsum(-2)  # A Brownian motion.

        prediction = motion + bias + (weight * covariates).sum(-1, keepdim=True)
        assert prediction.shape[-2:] == zero_data.shape

        # The next part of the model creates a likelihood or noise distribution.
        # Again we'll be Bayesian and write this as a probabilistic program with
        # priors over parameters.
        stability = pyro.sample("noise_stability", dist.Uniform(1, 2).expand([1]).to_event(1))
        skew = pyro.sample("noise_skew", dist.Uniform(-1, 1).expand([1]).to_event(1))
        scale = pyro.sample("noise_scale", dist.LogNormal(-5, 5).expand([1]).to_event(1))
        noise_dist = dist.Stable(stability, skew, scale)

        # We need to use a reparameterizer to handle the Stable distribution.
        # Note "residual" is the name of Pyro's internal sample site in self.predict().
        with poutine.reparam(config={"residual": StableReparam()}):
            self.predict(noise_dist, prediction)
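
To get a feel for what the stability parameter controls, it helps to compare the two members of the Stable family that have closed-form quantiles: the Gaussian (stability 2) and the Cauchy (stability 1, skew 0). The sketch below uses only the standard library; `cauchy_quantile` is a hypothetical helper based on the standard Cauchy inverse CDF, and unit scale is assumed for both distributions.

```python
import math
from statistics import NormalDist


def cauchy_quantile(p, scale=1.0):
    """Inverse CDF of a zero-centered Cauchy distribution."""
    return scale * math.tan(math.pi * (p - 0.5))


normal = NormalDist()  # standard Normal, the light-tailed end of the family

# The further into the tail we look, the more the two distributions diverge.
for p in (0.9, 0.99, 0.999):
    print(p, round(normal.inv_cdf(p), 2), round(cauchy_quantile(p), 2))
```

The prior on noise_stability in Model3 is Uniform(1, 2), so the model can land anywhere between these two extremes. The fitted value of 1.937 printed in the next cell's output sits close to the Gaussian end.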
[17]:
%%time
pyro.set_rng_seed(2)
pyro.clear_param_store()
time = torch.arange(float(T2)) / 365
covariates = periodic_features(T2, 365.25 / 7)
forecaster = Forecaster(Model3(), data[:T1], covariates[:T1], learning_rate=0.1,
                        time_reparam="dct")
for name, value in forecaster.guide.median().items():
    if value.numel() == 1:
        print("{} = {:0.4g}".format(name, value.item()))
INFO     step    0 loss = 5.92061e+07
INFO     step  100 loss = 13.6553
INFO     step  200 loss = 3.18891
INFO     step  300 loss = 0.884046
INFO     step  400 loss = 0.27383
INFO     step  500 loss = -0.0354842
INFO     step  600 loss = -0.211247
INFO     step  700 loss = -0.311198
INFO     step  800 loss = -0.259799
INFO     step  900 loss = -0.326406
INFO     step 1000 loss = -0.306335
bias = 14.64
drift_scale = 3.234e-08
noise_stability = 1.937
noise_skew = 0.004095
noise_scale = 0.06038
CPU times: user 19.5 s, sys: 103 ms, total: 19.6 s
Wall time: 19.7 s
[18]:
samples = forecaster(data[:T1], covariates, num_samples=1000)
p10, p50, p90 = quantile(samples, (0.1, 0.5, 0.9)).squeeze(-1)
crps = eval_crps(samples, data[T1:])

plt.figure(figsize=(9, 3))
plt.fill_between(torch.arange(T1, T2), p10, p90, color="red", alpha=0.3)
plt.plot(torch.arange(T1, T2), p50, 'r-', label='forecast')
plt.plot(data, 'k-', label='truth')
plt.title("Total weekly ridership (CRPS = {:0.3g})".format(crps))
plt.ylabel("log(# rides)")
plt.xlabel("Week after 2011-01-01")
plt.xlim(0, None)
plt.legend(loc="best");

_images/forecasting_i_26_0.png

[19]:
plt.figure(figsize=(9, 3))
plt.fill_between(torch.arange(T1, T2), p10, p90, color="red", alpha=0.3)
plt.plot(torch.arange(T1, T2), p50, 'r-', label='forecast')
plt.plot(torch.arange(T1, T2), data[T1:], 'k-', label='truth')
plt.title("Total weekly ridership (CRPS = {:0.3g})".format(crps))
plt.ylabel("log(# rides)")
plt.xlabel("Week after 2011-01-01")
plt.xlim(T1, None)
plt.legend(loc="best");

_images/forecasting_i_27_0.png

Backtesting

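To compare our Gaussian Model2 and Stable Model3 we will use a simple backtest() helper. By default this helper evaluates three metrics: CRPS assesses the distributional accuracy of heavy-tailed data, MAE (mean absolute error) assesses the point accuracy of heavy-tailed data, and RMSE (root mean squared error) assesses the accuracy of Normal-tailed data. One nuance here is to set warm_start=True to reduce the need for random restarts.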

[20]:
%%time
pyro.set_rng_seed(1)
pyro.clear_param_store()
windows2 = backtest(data, covariates, Model2,
                    min_train_window=104, test_window=52, stride=26,
                    forecaster_options={"learning_rate": 0.1, "time_reparam": "dct",
                                        "log_every": 1000, "warm_start": True})
INFO     Training on window [0:104], testing on window [104:156]
INFO     step    0 loss = 3543.21
INFO     step 1000 loss = 0.140962
INFO     Training on window [0:130], testing on window [130:182]
INFO     step    0 loss = 0.27281
INFO     step 1000 loss = -0.227765
INFO     Training on window [0:156], testing on window [156:208]
INFO     step    0 loss = 0.622017
INFO     step 1000 loss = -0.0232647
INFO     Training on window [0:182], testing on window [182:234]
INFO     step    0 loss = 0.181045
INFO     step 1000 loss = -0.104492
INFO     Training on window [0:208], testing on window [208:260]
INFO     step    0 loss = 0.160061
INFO     step 1000 loss = -0.184363
INFO     Training on window [0:234], testing on window [234:286]
INFO     step    0 loss = 0.0414903
INFO     step 1000 loss = -0.207943
INFO     Training on window [0:260], testing on window [260:312]
INFO     step    0 loss = -0.00223408
INFO     step 1000 loss = -0.256718
INFO     Training on window [0:286], testing on window [286:338]
INFO     step    0 loss = -0.0552213
INFO     step 1000 loss = -0.277793
INFO     Training on window [0:312], testing on window [312:364]
INFO     step    0 loss = -0.141342
INFO     step 1000 loss = -0.36945
INFO     Training on window [0:338], testing on window [338:390]
INFO     step    0 loss = -0.148779
INFO     step 1000 loss = -0.332914
INFO     Training on window [0:364], testing on window [364:416]
INFO     step    0 loss = -0.27899
INFO     step 1000 loss = -0.462222
INFO     Training on window [0:390], testing on window [390:442]
INFO     step    0 loss = -0.328539
INFO     step 1000 loss = -0.463518
INFO     Training on window [0:416], testing on window [416:468]
INFO     step    0 loss = -0.400719
INFO     step 1000 loss = -0.494253
CPU times: user 1min 57s, sys: 502 ms, total: 1min 57s
Wall time: 1min 57s
[21]:
%%time
pyro.set_rng_seed(1)
pyro.clear_param_store()
windows3 = backtest(data, covariates, Model3,
                    min_train_window=104, test_window=52, stride=26,
                    forecaster_options={"learning_rate": 0.1, "time_reparam": "dct",
                                        "log_every": 1000, "warm_start": True})
INFO     Training on window [0:104], testing on window [104:156]
INFO     step    0 loss = 1852.88
INFO     step 1000 loss = 0.533988
INFO     Training on window [0:130], testing on window [130:182]
INFO     step    0 loss = 2.60906
INFO     step 1000 loss = 0.0715323
INFO     Training on window [0:156], testing on window [156:208]
INFO     step    0 loss = 2.60063
INFO     step 1000 loss = 0.110426
INFO     Training on window [0:182], testing on window [182:234]
INFO     step    0 loss = 1.99784
INFO     step 1000 loss = 0.020393
INFO     Training on window [0:208], testing on window [208:260]
INFO     step    0 loss = 1.63004
INFO     step 1000 loss = -0.0936131
INFO     Training on window [0:234], testing on window [234:286]
INFO     step    0 loss = 1.33227
INFO     step 1000 loss = -0.114948
INFO     Training on window [0:260], testing on window [260:312]
INFO     step    0 loss = 1.19163
INFO     step 1000 loss = -0.193086
INFO     Training on window [0:286], testing on window [286:338]
INFO     step    0 loss = 1.01131
INFO     step 1000 loss = -0.242592
INFO     Training on window [0:312], testing on window [312:364]
INFO     step    0 loss = 0.983859
INFO     step 1000 loss = -0.279851
INFO     Training on window [0:338], testing on window [338:390]
INFO     step    0 loss = 0.560554
INFO     step 1000 loss = -0.209488
INFO     Training on window [0:364], testing on window [364:416]
INFO     step    0 loss = 0.716816
INFO     step 1000 loss = -0.369162
INFO     Training on window [0:390], testing on window [390:442]
INFO     step    0 loss = 0.391474
INFO     step 1000 loss = -0.45527
INFO     Training on window [0:416], testing on window [416:468]
INFO     step    0 loss = 0.37326
INFO     step 1000 loss = -0.508014
CPU times: user 4min 1s, sys: 960 ms, total: 4min 2s
Wall time: 4min 2s
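
The window boundaries in the two logs above follow a simple expanding-window pattern: the training window always starts at 0 and grows by `stride`, and each test window covers the next `test_window` steps. The sketch below reproduces those boundaries in plain Python. `expanding_windows` is a hypothetical helper written to match the log, not Pyro's backtest() code, and the duration of 469 weeks is derived from the dataset shape printed earlier (78888 // (24 * 7)).

```python
def expanding_windows(duration, min_train_window, test_window, stride):
    """List ((train_start, train_end), (test_start, test_end)) index pairs."""
    windows = []
    t1 = min_train_window
    # Keep only windows whose test range fits entirely inside the data.
    while t1 + test_window <= duration:
        windows.append(((0, t1), (t1, t1 + test_window)))
        t1 += stride
    return windows


windows = expanding_windows(duration=469, min_train_window=104,
                            test_window=52, stride=26)
for (t0, t1), (_, t2) in windows:
    print("Training on window [{}:{}], testing on window [{}:{}]".format(
        t0, t1, t1, t2))
print(len(windows))
```

Because each training window extends the previous one, parameters learned on one window are a reasonable starting point for the next, which is why warm_start=True pairs naturally with this layout.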
[22]:
fig, axes = plt.subplots(3, figsize=(8, 6), sharex=True)
axes[0].set_title("Gaussian versus Stable accuracy over {} windows".format(len(windows2)))
axes[0].plot([w["crps"] for w in windows2], "b<", label="Gaussian")
axes[0].plot([w["crps"] for w in windows3], "r>", label="Stable")
axes[0].set_ylabel("CRPS")
axes[1].plot([w["mae"] for w in windows2], "b<", label="Gaussian")
axes[1].plot([w["mae"] for w in windows3], "r>", label="Stable")
axes[1].set_ylabel("MAE")
axes[2].plot([w["rmse"] for w in windows2], "b<", label="Gaussian")
axes[2].plot([w["rmse"] for w in windows3], "r>", label="Stable")
axes[2].set_ylabel("RMSE")
axes[0].legend(loc="best")
plt.tight_layout()

_images/forecasting_i_31_0.png

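Note that RMSE is a poor metric for evaluating heavy-tailed data. Our Stable model has such heavy tails that its variance is infinite, so we cannot expect RMSE to converge; this explains the occasional outlying points.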
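
A small numeric illustration of why RMSE reacts so strongly to rare large errors: the sketch below compares MAE and RMSE on a made-up list of forecast errors, with and without one extreme value. The error values are assumptions chosen for easy arithmetic, not results from the models above.

```python
import math


def mae(errors):
    """Mean absolute error."""
    return sum(abs(e) for e in errors) / len(errors)


def rmse(errors):
    """Root mean squared error."""
    return math.sqrt(sum(e * e for e in errors) / len(errors))


# Hypothetical errors: 99 typical misses, then the same plus one huge miss.
typical = [0.1] * 99
with_outlier = typical + [100.0]

mae_ratio = mae(with_outlier) / mae(typical)
rmse_ratio = rmse(with_outlier) / rmse(typical)
print(mae_ratio, rmse_ratio)
```

A single outlier inflates RMSE far more than MAE because the error is squared before averaging. With heavy-tailed noise such outliers keep arriving, so RMSE estimates stay unstable from window to window.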
