本笔记主要记录tf.GradientTape和tf.gradient的用法
import tensorflow as tf
import numpy as nptf.__version__#要计算梯度的所有参数计算过程必须放到gradient tape中
#with tf.GradientTape as tape:
w tf.constant(1.)
x tf.constant(2.)with tf.GradientTape() as tap…
Reinforcement Learning from Human Feedback
基于Google Vertex AI 和 Llama 2进行RLHF训练和评估
课程地址:https://www.deeplearning.ai/short-courses/reinforcement-learning-from-human-feedback/
Topic:
Get a conceptual understanding of Reinforcemen…