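Getting Started with Neural Networks on the ESP32

Abstract

This document describes how to train a simple neural network with Python and TensorFlow to predict the sine function, and how to deploy it to an ESP32 microcontroller.

References

Predicting the sine function on the ESP32 with Python and Arduino - Dapenson - 博客园 (cnblogs.com)

The simplest way to try TinyML and TensorFlow Lite: running machine learning on the ESP32 (full code) - CSDN博客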

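Install TensorFlow

Install Python 3.8 on Windows.

In cmd, run:

pip install tensorflow

Install a plotting library so the data can be inspected visually:

pip install matplotlib

Write the training script

First, the Python code that prepares the data and trains the model: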

import math
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers
# Matplotlib is a graphing library
import matplotlib.pyplot as plt


def get_model():
    SAMPLES = 1000
    np.random.seed(1337)
    x_values = np.random.uniform(low=0, high=2 * math.pi, size=SAMPLES)
    # shuffle and add noise
    np.random.shuffle(x_values)
    y_values = np.sin(x_values)
    y_values += 0.1 * np.random.randn(*y_values.shape)
    # Plot our data
    plt.plot(x_values, y_values, 'b.')
    plt.show()

    # split into train, validation, test
    TRAIN_SPLIT = int(0.6 * SAMPLES)
    TEST_SPLIT = int(0.2 * SAMPLES + TRAIN_SPLIT)
    x_train, x_test, x_validate = np.split(x_values, [TRAIN_SPLIT, TEST_SPLIT])
    y_train, y_test, y_validate = np.split(y_values, [TRAIN_SPLIT, TEST_SPLIT])

    # create a NN with 2 layers of 16 neurons
    model = tf.keras.Sequential()
    model.add(layers.Dense(16, activation='relu', input_shape=(1, )))
    model.add(layers.Dense(16, activation='relu'))
    model.add(layers.Dense(1))
    model.compile(optimizer='rmsprop', loss='mse', metrics=['mae'])
    model.fit(x_train,
              y_train,
              epochs=200,
              batch_size=16,
              validation_data=(x_validate, y_validate))
    return model

if __name__ == '__main__':
    model = get_model()

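The script defines a function named get_model that builds and trains a neural network. In detail:

It imports the required libraries and modules: math, numpy, tensorflow, and matplotlib.pyplot.

It defines a constant SAMPLES, the size of the dataset.

It sets the random seed with numpy's random.seed so that the generated data is repeatable.

It generates x_values, an array of SAMPLES uniformly distributed random numbers between 0 and 2π.

It shuffles x_values, computes y_values = sin(x_values), and adds Gaussian noise with a standard deviation of 0.1 to y_values so that the data resembles real measurements.

It plots x_values against y_values with matplotlib.pyplot's plot function.

It splits the dataset into training (60%), test (20%), and validation (20%) sets.

It creates a neural network with two hidden layers of 16 neurons each:

It creates the model with the Sequential class from tf.keras.

It adds two fully connected layers, each with 16 neurons and the relu activation function.

It adds an output layer with one neuron and a linear activation (the default when no activation is given).

It compiles the model with the compile method, using rmsprop as the optimizer, mse (mean squared error) as the loss function, and mae (mean absolute error) as the evaluation metric.

It trains the model with the fit method, passing the training data x_train and y_train together with the training parameters: the number of epochs, the batch size, and the validation data (x_validate and y_validate).

It returns the trained model.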
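To make the 60/20/20 split concrete, here is a minimal sketch that reproduces the index arithmetic with plain Python lists standing in for np.split. The helper name split_three_ways is hypothetical and not part of the training script. Note that with these cut points the middle segment goes to the test set and the last segment to the validation set, matching the order in which the script unpacks the result.

```python
def split_three_ways(values, train_frac=0.6, test_frac=0.2):
    """Cut a sequence at the same two indices the training script passes to np.split."""
    samples = len(values)
    train_split = int(train_frac * samples)
    test_split = int(test_frac * samples + train_split)
    return values[:train_split], values[train_split:test_split], values[test_split:]


# 1000 placeholder items stand in for the 1000 generated x values
train, test, validate = split_three_ways(list(range(1000)))
print(len(train), len(test), len(validate))  # 600 200 200
```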

    # create a NN with 2 layers of 16 neurons
    model = tf.keras.Sequential()
    model.add(layers.Dense(16, activation='relu', input_shape=(1, )))
    model.add(layers.Dense(16, activation='relu'))
    model.add(layers.Dense(1))

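This code builds a simple fully connected neural network with TensorFlow. In detail:

model = tf.keras.Sequential(): creates an empty sequential model. Sequential is a simple Keras interface in TensorFlow for building a network by stacking layers.

model.add(layers.Dense(16, activation='relu', input_shape=(1, ))): adds the first fully connected layer. layers.Dense is a fully connected layer, 16 is the number of neurons, and activation='relu' selects the ReLU activation function. input_shape=(1,) means each input sample has a single feature.

model.add(layers.Dense(16, activation='relu')): adds the second fully connected layer, with the same neuron count and activation as the first.

model.add(layers.Dense(1)): adds the output layer. It has one neuron, so the output is one-dimensional.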
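As a rough sanity check on how small this network is, the following sketch counts its trainable parameters by hand. It assumes only the standard dense-layer formula (one weight per input-unit pair plus one bias per unit); the helper name dense_params is made up for illustration.

```python
def dense_params(inputs, units):
    """Weights (inputs * units) plus one bias per unit."""
    return inputs * units + units


layer_sizes = [1, 16, 16, 1]  # input feature, two hidden layers, output
per_layer = [dense_params(i, o) for i, o in zip(layer_sizes, layer_sizes[1:])]
total = sum(per_layer)
print(per_layer, total)  # [32, 272, 17] 321
```

At four bytes per float32 value this is 1284 bytes of weights, which gives a feel for why the model is a reasonable fit for a microcontroller.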

    model.fit(x_train,
              y_train,
              epochs=200,
              batch_size=16,
              validation_data=(x_validate, y_validate))

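Once the model is defined, we can train it on the data. Training consists of passing an x value to the network, checking how far the network's output deviates from the expected y value, and adjusting the neurons' weights and biases so that the output is more likely to be correct next time. Training repeats this process over the full dataset many times; each complete pass is called an epoch, and the number of epochs is a parameter we can set.

Within each epoch the data is fed through the network in batches. For each batch, several samples are passed to the network to produce outputs; the correctness of those outputs is measured in aggregate, and the weights and biases are adjusted accordingly, once per batch. The batch size is also a parameter we can set.

The code above trains the model on the x and y values of the training data. It runs for 200 epochs with 16 samples per batch, and it also receives some data for validation.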
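The relationship between epochs, batches, and weight updates can be sketched without TensorFlow. The loop below only counts updates; it does not train anything. The helper iterate_batches is a hypothetical stand-in for the batching that model.fit performs internally.

```python
import math


def iterate_batches(data, batch_size):
    """Yield consecutive slices of at most batch_size items."""
    for start in range(0, len(data), batch_size):
        yield data[start:start + batch_size]


train_samples = list(range(600))  # stand-in for the 600 training x values
epochs, batch_size = 200, 16

updates = 0
for _ in range(epochs):
    for batch in iterate_batches(train_samples, batch_size):
        updates += 1  # a real trainer would adjust weights here, once per batch

batches_per_epoch = math.ceil(len(train_samples) / batch_size)
print(batches_per_epoch, updates)  # 38 7600
```

The 38 batches per epoch computed here are the same 38 steps that the Keras progress bar reports as 38/38.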

Visualization of the randomly generated data

Next, generate a TFLite model from the trained model:

if __name__ == '__main__':
    model = get_model()
    converter = tf.lite.TFLiteConverter.from_keras_model(model)
    converter.optimizations = [tf.lite.Optimize.OPTIMIZE_FOR_SIZE]
    tflite_model = converter.convert()

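This code converts a TensorFlow model to the TensorFlow Lite format. TensorFlow Lite is a lightweight TensorFlow runtime optimized for mobile and embedded devices; converting a model to this format lets it run faster on such devices.

tf.lite.TFLiteConverter is the TensorFlow Lite class that performs the conversion. Its from_keras_model method takes a Keras model and returns a TFLiteConverter object, which is then used to produce the TensorFlow Lite model.

converter.optimizations = [tf.lite.Optimize.OPTIMIZE_FOR_SIZE]: sets the optimization option so that the model size is reduced during conversion. OPTIMIZE_FOR_SIZE is a deprecated name that behaves the same as tf.lite.Optimize.DEFAULT, which the later listings use; it aims to shrink the model while keeping its accuracy as close as possible to the original.

tflite_model = converter.convert(): converts the TensorFlow model to a TensorFlow Lite model with the convert() method and stores the result in the tflite_model variable.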

Epoch 195/200
38/38 [==============================] - 0s 1ms/step - loss: 0.0149 - mae: 0.0969 - val_loss: 0.0127 - val_mae: 0.0886
Epoch 196/200
38/38 [==============================] - 0s 1ms/step - loss: 0.0151 - mae: 0.0985 - val_loss: 0.0144 - val_mae: 0.0963
Epoch 197/200
38/38 [==============================] - 0s 1ms/step - loss: 0.0145 - mae: 0.0955 - val_loss: 0.0115 - val_mae: 0.0813
Epoch 198/200
38/38 [==============================] - 0s 1ms/step - loss: 0.0147 - mae: 0.0953 - val_loss: 0.0110 - val_mae: 0.0816
Epoch 199/200
38/38 [==============================] - 0s 1ms/step - loss: 0.0149 - mae: 0.0972 - val_loss: 0.0137 - val_mae: 0.0920
Epoch 200/200
38/38 [==============================] - 0s 1ms/step - loss: 0.0142 - mae: 0.0946 - val_loss: 0.0113 - val_mae: 0.0850

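These are log lines printed during training. They show the model's loss and mean absolute error (mae) for each epoch. After each epoch the model is also evaluated on the validation set, and the validation loss and validation MAE are reported.

Epoch 195/200 means this is epoch 195 out of 200.

38/38 [==============================] - 0s 1ms/step means that all 38 batches of the epoch have been processed (600 training samples at 16 per batch, rounded up), that the epoch took less than one second, and that each step (one batch) took about 1 ms.

- loss: 0.0149 - mae: 0.0969 means that in this epoch the loss on the training set was 0.0149 and the mean absolute error was 0.0969.

- val_loss: 0.0127 - val_mae: 0.0886 means that in this epoch the loss on the validation set was 0.0127 and the mean absolute error was 0.0886.

The remaining epochs report the same kind of information.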
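For readers who want to see what the loss and mae columns measure, here is a small pure-Python sketch of mean squared error and mean absolute error. The sample values are invented for illustration and are not taken from the training run.

```python
def mse(y_true, y_pred):
    """Mean squared error: average of the squared differences."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def mae(y_true, y_pred):
    """Mean absolute error: average of the absolute differences."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Invented values, chosen only to make the arithmetic easy to follow
y_true = [0.0, 1.0, 0.0, -1.0]
y_pred = [0.1, 0.8, -0.1, -0.9]
print(round(mse(y_true, y_pred), 4), round(mae(y_true, y_pred), 4))  # 0.0175 0.125
```

Because the errors are squared, MSE is dominated by the largest deviations, while MAE stays in the same units as y and is easier to read as a typical error.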

Next, write the TensorFlow Lite model held in tflite_model to a file named sine_model.tflite:

if __name__ == '__main__':
    model = get_model()
    converter = tf.lite.TFLiteConverter.from_keras_model(model)
    converter.optimizations = [tf.lite.Optimize.DEFAULT]
    tflite_model = converter.convert()

    # # Save the model to disk
    open("sine_model.tflite", "wb").write(tflite_model)

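After running it, the file sine_model.tflite is created in the working directory.

Next, convert the TensorFlow Lite model (tflite_model) into a C array and write the result to a header file named sine_model.h.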

# Function: Convert some hex value into an array for C programming
def hex_to_c_array(hex_data, var_name):

    c_str = ''

    # Create header guard
    c_str += '#ifndef ' + var_name.upper() + '_H\n'
    c_str += '#define ' + var_name.upper() + '_H\n\n'

    # Add array length at top of file
    c_str += '\nunsigned int ' + var_name + '_len = ' + str(
        len(hex_data)) + ';\n'

    # Declare C variable
    c_str += 'unsigned char ' + var_name + '[] = {'
    hex_array = []
    for i, val in enumerate(hex_data):

        # Construct string from hex
        hex_str = format(val, '#04x')

        # Add formatting so each line stays within 80 characters
        if (i + 1) < len(hex_data):
            hex_str += ','
        if (i + 1) % 12 == 0:
            hex_str += '\n '
        hex_array.append(hex_str)

    # Add closing brace
    c_str += '\n ' + format(' '.join(hex_array)) + '\n};\n\n'

    # Close out header guard
    c_str += '#endif //' + var_name.upper() + '_H'

    return c_str


if __name__ == '__main__':
    model = get_model()
    converter = tf.lite.TFLiteConverter.from_keras_model(model)
    converter.optimizations = [tf.lite.Optimize.DEFAULT]
    tflite_model = converter.convert()

    # # Save the model to disk
    open("sine_model.tflite", "wb").write(tflite_model)

    # # Write TFLite model to a C source (or header) file
    c_model_name = 'sine_model'
    with open(c_model_name + '.h', 'w') as file:
        file.write(hex_to_c_array(tflite_model, c_model_name))
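The heart of hex_to_c_array is the '#04x' format specifier, which renders each byte as a zero-padded literal such as 0x1c. The sketch below isolates that step in a compact helper; bytes_to_c_initializer is a hypothetical name and is not meant to replace the function above. The sample bytes are the first eight bytes of the xxd listing shown later in this document.

```python
def bytes_to_c_initializer(data, per_line=12):
    """Format bytes as comma-separated 0x.. literals, per_line values per row."""
    literals = [format(b, '#04x') for b in data]
    rows = [', '.join(literals[i:i + per_line])
            for i in range(0, len(literals), per_line)]
    return ',\n'.join(rows)


sample = bytes([0x1c, 0x00, 0x00, 0x00, 0x54, 0x46, 0x4c, 0x33])
text = bytes_to_c_initializer(sample, per_line=4)
print(text)
# 0x1c, 0x00, 0x00, 0x00,
# 0x54, 0x46, 0x4c, 0x33
```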

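Note: the array will differ from one training run to the next. np.random.seed(1337) fixes the generated data, but TensorFlow's weight initialization and batch shuffling are not seeded here, so do not expect to get an identical array every time.

The model is now trained, but an ordinary trained deep learning model is generally too large to deploy on a microcontroller, so we use the TensorFlow Lite converter. The converter writes the model in a special, space-efficient format intended for memory-constrained devices. Since this model will run on a microcontroller, we want it to be as small as possible. Quantization is one technique for reducing model size: it lowers the precision of the model's weights, which saves memory.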
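The idea behind quantization can be illustrated with the usual affine mapping between floats and 8-bit integers. This is a minimal sketch, not the converter's actual implementation: the scale and zero point below are assumed example values chosen to cover roughly the range 0 to 2π, whereas real values are determined by the TensorFlow Lite converter.

```python
import math


def quantize(x, scale, zero_point):
    """Map a float to int8: q = round(x / scale) + zero_point, clamped to [-128, 127]."""
    q = round(x / scale) + zero_point
    return max(-128, min(127, q))


def dequantize(q, scale, zero_point):
    """Map an int8 value back to an approximate float."""
    return (q - zero_point) * scale


# Assumed example parameters; a converted model carries its own values
scale, zero_point = 2 * math.pi / 255, -128

x = math.pi / 2
q = quantize(x, scale, zero_point)
restored = dequantize(q, scale, zero_point)
print(q, abs(restored - x) <= scale / 2)  # -64 True
```

Each value is stored in one byte instead of four, at the cost of a rounding error of at most half a scale step for values inside the representable range.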

# Function: Post-conversion validation of the model
def validate_model():
    SAMPLES = 1000
    np.random.seed(1337)
    x_values = np.random.uniform(low=0, high=2 * math.pi, size=SAMPLES)
    # shuffle and add noise
    np.random.shuffle(x_values)
    y_values = np.sin(x_values)
    y_values += 0.1 * np.random.randn(*y_values.shape)

    # split into train, validation, test
    TRAIN_SPLIT = int(0.6 * SAMPLES)
    TEST_SPLIT = int(0.2 * SAMPLES + TRAIN_SPLIT)
    x_train, x_test, x_validate = np.split(x_values, [TRAIN_SPLIT, TEST_SPLIT])
    y_train, y_test, y_validate = np.split(y_values, [TRAIN_SPLIT, TEST_SPLIT])

    # create a NN with 2 layers of 16 neurons
    model = tf.keras.Sequential()
    model.add(layers.Dense(16, activation='relu', input_shape=(1, )))
    model.add(layers.Dense(16, activation='relu'))
    model.add(layers.Dense(1))
    model.compile(optimizer='rmsprop', loss='mse', metrics=['mae'])
    history_1 = model.fit(x_train,
              y_train,
              epochs=200,
              batch_size=16,
              validation_data=(x_validate, y_validate))
    # Make predictions based on our test dataset
    predictions = model.predict(x_test)

    # Convert the model to the TensorFlow Lite format without quantization
    converter = tf.lite.TFLiteConverter.from_keras_model(model)
    tflite_model = converter.convert()

    # Save the model to disk
    open("sine_model_new.tflite", "wb").write(tflite_model)

    # Convert the model to the TensorFlow Lite format with quantization
    converter = tf.lite.TFLiteConverter.from_keras_model(model)
    # Indicate that we want to perform the default optimizations,
    # which includes quantization
    converter.optimizations = [tf.lite.Optimize.DEFAULT]
    # Define a generator function that provides our test data's x values
    # as a representative dataset, and tell the converter to use it
    def representative_dataset_generator():
        for value in x_test:
            # Each scalar value must be inside of a 2D array that is wrapped in a list
            yield [np.array(value, dtype=np.float32, ndmin=2)]
    converter.representative_dataset = representative_dataset_generator
    # Convert the model
    tflite_model = converter.convert()

    # Save the model to disk
    open("sine_model_quantized.tflite", "wb").write(tflite_model)

    # Instantiate an interpreter for each model
    sine_model = tf.lite.Interpreter('sine_model_new.tflite')
    sine_model_quantized = tf.lite.Interpreter('sine_model_quantized.tflite')

    # Allocate memory for each model
    sine_model.allocate_tensors()
    sine_model_quantized.allocate_tensors()

    # Get indexes of the input and output tensors
    sine_model_input_index = sine_model.get_input_details()[0]["index"]
    sine_model_output_index = sine_model.get_output_details()[0]["index"]
    sine_model_quantized_input_index = sine_model_quantized.get_input_details()[0]["index"]
    sine_model_quantized_output_index = sine_model_quantized.get_output_details()[0]["index"]

    # Create arrays to store the results
    sine_model_predictions = []
    sine_model_quantized_predictions = []

    # Run each model's interpreter for each value and store the results in arrays
    for x_value in x_test:
        # Create a 2D tensor wrapping the current x value
        x_value_tensor = tf.convert_to_tensor([[x_value]], dtype=np.float32)
        # Write the value to the input tensor
        sine_model.set_tensor(sine_model_input_index, x_value_tensor)
        # Run inference
        sine_model.invoke()
        # Read the prediction from the output tensor
        sine_model_predictions.append(
            sine_model.get_tensor(sine_model_output_index)[0])
        # Do the same for the quantized model
        sine_model_quantized.set_tensor(sine_model_quantized_input_index, x_value_tensor)
        sine_model_quantized.invoke()
        sine_model_quantized_predictions.append(
            sine_model_quantized.get_tensor(sine_model_quantized_output_index)[0])


    # See how they line up with the data
    plt.clf()
    plt.title('Comparison of various models against actual values')
    plt.plot(x_test, y_test, 'bo', label='Actual')
    plt.plot(x_test, predictions, 'ro', label='Original predictions')
    plt.plot(x_test, sine_model_predictions, 'bx', label='Lite predictions')
    plt.plot(x_test, sine_model_quantized_predictions, 'gx', label='Lite quantized predictions')
    plt.legend()
    plt.show()

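This function trains the model once more, then converts it twice, once without and once with quantization, and compares the predictions of the original Keras model, the unquantized TFLite model, and the quantized TFLite model against the test data.

As the plot shows, there is almost no loss in accuracy.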
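Besides reading the plot, the agreement between two sets of predictions can be expressed as a single number. The sketch below computes the largest absolute difference between two flat lists. The numbers are made up for illustration and are not results from the trained models; predictions coming from model.predict or the interpreter would first need to be flattened into plain lists of floats.

```python
def max_abs_diff(a, b):
    """Largest absolute element-wise difference between two equal-length sequences."""
    return max(abs(x - y) for x, y in zip(a, b))


# Made-up numbers for illustration only, not output of the models above
original = [0.02, 0.84, 0.91, 0.14, -0.76]
quantized = [0.03, 0.83, 0.90, 0.15, -0.75]
worst = max_abs_diff(original, quantized)
print(worst)
```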

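Generating the array used in the firmware

We can also generate the array with the xxd command, but it does not produce a complete header file: there is no #ifndef/#define include guard.

PS E:\tensorflow> & 'D:\Program Files\Git\usr\bin\xxd.exe' -i .\sine_model.tflite > sine_modelxxd.h

xxd is a Linux command, but it ships with Git for Windows. Search for xxd.exe (for example with Everything) and add its directory to the PATH environment variable to use the xxd command directly.

This produces a file named sine_modelxxd.h that contains the model as a C array.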

unsigned char __sine_model_tflite[] = {
  0x1c, 0x00, 0x00, 0x00, 0x54, 0x46, 0x4c, 0x33, 0x14, 0x00, 0x20, 0x00,
  0x1c, 0x00, 0x18, 0x00, 0x14, 0x00, 0x10, 0x00, 0x0c, 0x00, 0x00, 0x00,
  0x08, 0x00, 0x04, 0x00, 0x14, 0x00, 0x00, 0x00, 0x1c, 0x00, 0x00, 0x00,
  0x90, 0x00, 0x00, 0x00, 0xe8, 0x00, 0x00, 0x00, 0x04, 0x07, 0x00, 0x00,
  ......
  ......
  ......
  0x01, 0x00, 0x00, 0x00, 0x01, 0x00, 0x00, 0x00, 0x10, 0x00, 0x00, 0x00,
  0x0c, 0x00, 0x0c, 0x00, 0x0b, 0x00, 0x00, 0x00, 0x00, 0x00, 0x04, 0x00,
  0x0c, 0x00, 0x00, 0x00, 0x09, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x09
};
unsigned int __sine_model_tflite_len = 3168;
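A quick way to check that an exported array really is a TensorFlow Lite model is to decode its first eight bytes. A .tflite file is a FlatBuffer: the first four bytes are a little-endian offset to the root table, and the next four are the file identifier, which is TFL3 for TensorFlow Lite. The sketch below decodes the first row of the listing above.

```python
import struct

# First eight bytes copied from the xxd listing
header = bytes([0x1c, 0x00, 0x00, 0x00, 0x54, 0x46, 0x4c, 0x33])

root_offset = struct.unpack('<I', header[:4])[0]  # little-endian uint32
identifier = header[4:8].decode('ascii')
print(root_offset, identifier)  # 28 TFL3
```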

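The Python code in the previous step already converts the model to a C array, so this step is optional.

Using the model

We now turn to the Arduino side: loading the trained model onto the ESP32 and predicting the sine function.

Install the runtime environment

The example program involves the following steps:

Include the library and header file. In the Arduino IDE we need the EloquentTinyML library and the model header file. The library provides the functionality to load and run TensorFlow Lite models on the ESP32.

Note: to use a generic ESP32 chip, version 0.0.10 of the library is required.

Write the Arduino sketch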

#include <Arduino.h>

// https://github.com/eloquentarduino/EloquentTinyML  Version:0.0.10
#include <EloquentTinyML.h>
// sine_model.h contains the array you exported from the previous step with xxd or tinymlgen
#include "sine_model.h"

#define NUMBER_OF_INPUTS 1
#define NUMBER_OF_OUTPUTS 1
// in future projects you may need to tweak this value: it's a trial and error process
#define TENSOR_ARENA_SIZE 2 * 1024

Eloquent::TinyML::TfLite<NUMBER_OF_INPUTS, NUMBER_OF_OUTPUTS, TENSOR_ARENA_SIZE> ml;


void setup() {
  // put your setup code here, to run once:
  Serial.begin(115200);
  ml.begin(sine_model);
}

void loop() {
  // pick up a random x and predict its sine
  float x = 3.14 * random(100) / 100;
  float y = sin(x);
  float input[1] = {x};
  float predicted = ml.predict(input);

  Serial.print("sin(");
  Serial.print(x);
  Serial.print(") = ");
  Serial.print(y);
  Serial.print("\t predicted: ");
  Serial.println(predicted);
  delay(100);
}

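This is the Arduino code for the ESP32. It loads and runs the model that was trained in Python and converted to the TensorFlow Lite format, and uses it to predict the sine function.

The code in detail:

Include the required library and header files: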

#include <Arduino.h>
#include <EloquentTinyML.h>
#include "sine_model.h"

Define the number of inputs and outputs:

#define NUMBER_OF_INPUTS 1
#define NUMBER_OF_OUTPUTS 1

Define the size of the tensor arena, the memory region allocated to the model:

#define TENSOR_ARENA_SIZE 2 * 1024

Create the TinyML object; the model data is passed to it in setup():

Eloquent::TinyML::TfLite<NUMBER_OF_INPUTS, NUMBER_OF_OUTPUTS, TENSOR_ARENA_SIZE> ml;

In setup(), initialize serial communication and the TinyML model:

void setup() {
  // put your setup code here, to run once:
  Serial.begin(115200);
  ml.begin(sine_model);
}

In loop(), run predictions repeatedly:

void loop() {
  // pick up a random x and predict its sine
  float x = 3.14 * random(100) / 100;
  float y = sin(x);
  float input[1] = {x};
  float predicted = ml.predict(input);

  Serial.print("sin(");
  Serial.print(x);
  Serial.print(") = ");
  Serial.print(y);
  Serial.print("\t predicted: ");
  Serial.print(predicted);
  Serial.print("\t ");
  Serial.print(abs(y-predicted)*100/y);
  Serial.print("%\n");
  delay(500);
}
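One detail of the percentage printed in loop() is worth noting: it divides by y, and sin(x) is exactly 0 when x is 0, so the first value of the sweep has no meaningful relative error. The Python sketch below shows one way to guard against that. The function name percent_error and the eps threshold are assumptions made for illustration; the same check could be ported to the sketch.

```python
import math


def percent_error(actual, predicted, eps=1e-6):
    """Relative error in percent, or None when actual is too close to zero."""
    if abs(actual) < eps:
        return None
    return abs(actual - predicted) * 100 / abs(actual)


print(percent_error(math.sin(0.0), 0.02))  # None
print(percent_error(0.5, 0.45))
```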

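The output looks like this (screenshot not included).

Using the latest version

The latest version of the library requires an ESP32-S3 module.

Its dependency libraries must also be installed.

Write the Arduino sketch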

/**
 * Run a TensorFlow model to predict the sine function
 * For a complete guide, visit
 * https://eloquentarduino.com/tensorflow-lite-esp32
 */
// replace with your own model
// include BEFORE <eloquent_tinyml.h>!
#include "sine_model.h"
// include the runtime specific for your board
// either tflm_esp32 or tflm_cortexm
#include <tflm_esp32.h>
// now you can include the eloquent tinyml wrapper
#include <eloquent_tinyml.h>

// this is trial-and-error process
// when developing a new model, start with a high value
// (e.g. 10000), then decrease until the model stops
// working as expected
#define ARENA_SIZE 2000
#define TF_NUM_OPS 2

Eloquent::TF::Sequential<TF_NUM_OPS, ARENA_SIZE> tf;



void setup() {
  // put your setup code here, to run once:
  Serial.begin(115200);
  delay(3000);
  Serial.println("__TENSORFLOW SINE__");

  // configure input/output
  // (not mandatory if you generated the .h model
  // using the everywhereml Python package)
  tf.setNumInputs(1);
  tf.setNumOutputs(1);
  // add required ops
  // (not mandatory if you generated the .h model
  // using the everywhereml Python package)
  // tf.resolver.AddFullyConnected();
  // tf.resolver.AddSoftmax();

  while (!tf.begin(sine_model).isOk())
    Serial.println(tf.exception.toString());
}

void loop() {
  // pick up a random x and predict its sine
  float x = 3.14 * random(100) / 100;
  float y = sin(x);
  float input[1] = { x };

  Serial.print("sin(");
  Serial.print(x);
  Serial.print(") = ");
  Serial.print(y);
  if (!tf.predict(input).isOk()) {
    Serial.println(tf.exception.toString());
    return;
  }
  Serial.print("\t predicted: ");
  Serial.println(tf.classification);
  // This picks the maximum of the output values; to read every value, use tf.outputs, which is a float*

  // how long does it take to run a single prediction?
  Serial.print("It takes ");
  Serial.print(tf.benchmark.microseconds());
  Serial.println("us for a single prediction");
  delay(100);
}

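Run screenshot missing; it will be added once the S3 board is available.

Advanced

Next we refine the training procedure so that the predictions become more accurate.

Plot the training, validation, and test sets: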

    # split into train, validation, test
    TRAIN_SPLIT = int(0.6 * SAMPLES)
    TEST_SPLIT = int(0.2 * SAMPLES + TRAIN_SPLIT)
    x_train, x_test, x_validate = np.split(x_values, [TRAIN_SPLIT, TEST_SPLIT])
    y_train, y_test, y_validate = np.split(y_values, [TRAIN_SPLIT, TEST_SPLIT])

    plt.plot(x_train, y_train, 'b.', label="Train")
    plt.plot(x_validate, y_validate, 'y.', label="Validate")
    plt.plot(x_test, y_test, 'r.', label="Test")
    plt.legend()
    plt.show()

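During training, the model's performance is measured continuously against both the training data and the validation data we set aside earlier. Training produces a log that tells us how the model's performance changed over the course of training.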

    history_1 = model.fit(x_train,
              y_train,
              epochs=200,
              batch_size=16,
              validation_data=(x_validate, y_validate))
    
    # Draw a graph of the loss, which is the distance between
    # the predicted and actual values during training and validation.
    loss = history_1.history['loss']
    val_loss = history_1.history['val_loss']

    epochs = range(1, len(loss) + 1)

    plt.plot(epochs, loss, 'g.', label='Training loss')
    plt.plot(epochs, val_loss, 'b', label='Validation loss')
    plt.title('Training and validation loss')
    plt.xlabel('Epochs')
    plt.ylabel('Loss')
    plt.legend()
    plt.show()

Zooming in on part of the curve:

    # Exclude the first few epochs so the graph is easier to read
    SKIP = 100

    plt.plot(epochs[SKIP:], loss[SKIP:], 'g.', label='Training loss')
    plt.plot(epochs[SKIP:], val_loss[SKIP:], 'b.', label='Validation loss')
    plt.title('Training and validation loss')
    plt.xlabel('Epochs')
    plt.ylabel('Loss')
    plt.legend()
    plt.show()

Next we plot the mean absolute error (MAE), another way of measuring how far the network's predictions are from the actual values:

    # Draw a graph of mean absolute error, which is another way of
    # measuring the amount of error in the prediction.
    mae = history_1.history['mae']
    val_mae = history_1.history['val_mae']

    plt.plot(epochs[SKIP:], mae[SKIP:], 'g.', label='Training MAE')
    plt.plot(epochs[SKIP:], val_mae[SKIP:], 'b.', label='Validation MAE')
    plt.title('Training and validation mean absolute error')
    plt.xlabel('Epochs')
    plt.ylabel('MAE')
    plt.legend()
    plt.show()

As the plot shows, after 200 epochs the final MAE is roughly 30%.

Let us plot the fitted curve:

    # Use the model to make predictions from our training data
    predictions = model.predict(x_train)

    # Plot the predictions along with the test data
    plt.clf()
    plt.title('Training data predicted vs actual values')
    plt.plot(x_test, y_test, 'b.', label='Actual')
    plt.plot(x_train, predictions, 'r.', label='Predicted')
    plt.legend()
    plt.show()

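After changing the number of training epochs to 1000:

As the plot shows, after 1000 epochs the final MAE is below roughly 11%.

Let us plot the fitted curve:

This graph makes it clear that our network has learned to approximate the sine function only in a very limited way. The predictions are highly linear and fit the data only very roughly. The rigidity of the fit suggests that the model does not have enough capacity to learn the full complexity of the sine wave, so it can only approximate it in an overly simple way. Making the model larger should improve its performance.

We can also try reducing the network to two layers, that is, a single hidden layer plus the output layer:

    # create a NN with a single hidden layer of 16 neurons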
    model = tf.keras.Sequential()
    model.add(layers.Dense(16, activation='relu', input_shape=(1, )))
    model.add(layers.Dense(1))

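The resulting plot is much worse.

Using the TensorFlowLite_ESP32 library

Install the library

An ESP32-S3 target module is required, and the file TensorFlowLite_ESP32\src\tensorflow\lite\micro\compatibility.h must be modified as follows: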

#ifdef TF_LITE_STATIC_MEMORY
#define TF_LITE_REMOVE_VIRTUAL_DELETE \
  // void operator delete(void* p) {}
#else
#define TF_LITE_REMOVE_VIRTUAL_DELETE
#endif

#endif  // TENSORFLOW_LITE_MICRO_COMPATIBILITY_H_

Sketch (.ino) code

#include <TensorFlowLite_ESP32.h>
#include <tensorflow/lite/micro/all_ops_resolver.h>

#include "sine_model.h"
#include "tensorflow/lite/micro/micro_error_reporter.h"
#include "tensorflow/lite/micro/micro_interpreter.h"
#include "tensorflow/lite/micro/system_setup.h"
#include "tensorflow/lite/schema/schema_generated.h"

// This is a small number so that it's easy to read the logs
const int kInferencesPerCycle = 20;

// kXrange is the range of x values the model was trained on, 2 * π. π is written as a
// numeric literal here to avoid requiring additional libraries.
const float kXrange = 2.f * 3.14159265359f;

// Globals, used for compatibility with Arduino-style sketches.
namespace
{
tflite::ErrorReporter* error_reporter = nullptr;
const tflite::Model* model = nullptr;
tflite::MicroInterpreter* interpreter = nullptr;
TfLiteTensor* input = nullptr;
TfLiteTensor* output = nullptr;
int inference_count = 0;

constexpr int kTensorArenaSize = 2000;
uint8_t tensor_arena[kTensorArenaSize];
}  // namespace

void setup()
{
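    // Set up logging.
    // The NOLINTNEXTLINE marker below is a directive for lint tools such as cpplint (not for the
    // compiler); it suppresses the warning about the global variable on the line that follows it.
    // NOLINTNEXTLINE(runtime-global-variables)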
    static tflite::MicroErrorReporter micro_error_reporter;
    error_reporter = &micro_error_reporter;

    // Map the model (sine_model) into a usable data structure. This is a very lightweight operation that involves no copying or parsing.
    model = tflite::GetModel(sine_model);
    if (model->version() != TFLITE_SCHEMA_VERSION) {
        TF_LITE_REPORT_ERROR(error_reporter,
                             "Model provided is schema version %d not equal "
                             "to supported version %d.",
                             model->version(), TFLITE_SCHEMA_VERSION);
        return;
    }

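    // Resolves the TensorFlow Lite operation implementations.
    // NOLINTNEXTLINE(runtime-global-variables)
    static tflite::AllOpsResolver resolver;

    // Build the TensorFlow Lite interpreter that runs the model.
    static tflite::MicroInterpreter static_interpreter(model, resolver, tensor_arena, kTensorArenaSize, error_reporter);
    interpreter = &static_interpreter;

    // Allocate memory from tensor_arena for the model's tensors; this must succeed
    // before the input and output tensors can be used.
    if (interpreter->AllocateTensors() != kTfLiteOk) {
        TF_LITE_REPORT_ERROR(error_reporter, "AllocateTensors() failed");
        return;
    }

    // Get pointers to the model's input and output tensors; 0 is the tensor index.
    input = interpreter->input(0);
    output = interpreter->output(0);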

    // Keep track of how many inferences we have performed.
    inference_count = 0;
}

void loop()
{
    // Compute the x value to feed into the model. inference_count divided by the number of inferences
    // per cycle gives our position within the range of x values the model was trained on.
    float position = static_cast<float>(inference_count) / static_cast<float>(kInferencesPerCycle);
    float x = position * kXrange;

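    // Quantize the float x to an integer: divide by the input tensor's scale and add its zero point.
    // This assumes the model was converted with int8-quantized input and output tensors.
    int8_t x_quantized = x / input->params.scale + input->params.zero_point;
    // Place the quantized value in the model's input tensor
    input->data.int8[0] = x_quantized;

    // Run inference, and report any error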
    TfLiteStatus invoke_status = interpreter->Invoke();
    if (invoke_status != kTfLiteOk) {
        TF_LITE_REPORT_ERROR(error_reporter, "Invoke failed on x: %f\n", static_cast<double>(x));
        return;
    }

    // Read the quantized output value from the model's output tensor.
    int8_t y_quantized = output->data.int8[0];
    // Dequantize it to a float: subtract the output tensor's zero point, then multiply by its scale.
    float y = (y_quantized - output->params.zero_point) * output->params.scale;

    // Output the results. A custom HandleOutput function can be implemented
    // for each supported hardware target.
    // Log the current X and Y values
    TF_LITE_REPORT_ERROR(error_reporter, "x_value: %f, y_value: %f\n", static_cast<double>(x), static_cast<double>(y));

    // Increment the inference_counter, and reset it if we have reached
    // the total number per cycle
    inference_count += 1;
    if (inference_count >= kInferencesPerCycle) inference_count = 0;
}
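The x values that loop() feeds to the model can be previewed on the desktop. The Python sketch below mirrors the position arithmetic of the sketch above, using the same two constants under Python-style names; it does not run the model, it only lists the inputs for one cycle.

```python
import math

K_INFERENCES_PER_CYCLE = 20   # mirrors kInferencesPerCycle
K_X_RANGE = 2 * math.pi       # mirrors kXrange

# One x value per inference: position in [0, 1) scaled to [0, 2*pi)
xs = [(count / K_INFERENCES_PER_CYCLE) * K_X_RANGE
      for count in range(K_INFERENCES_PER_CYCLE)]

print(len(xs), xs[0], round(xs[-1], 4))  # 20 0.0 5.969
```

Each cycle therefore sweeps 20 evenly spaced points across one full period and never reaches 2π itself, because the counter resets before position reaches 1.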

Run screenshot missing; it will be added once the S3 board is available.