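Background: our Unity project uses Google Protobuf 3.21.12. While profiling deserialization I found several GC allocation points and fixed the most severe ones. There are already analyses of this online, so I won't go through the basics here and will assume the reader is roughly familiar with them.
If anything below is wrong, please leave a comment and I will fix it!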
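GC Point 1
Every time a message is deserialized, the Stream is passed to MessageParser.cs and then on to MessageExtensions.cs, where a new CodedInputStream() is created on every call. That allocation is the garbage (see Figures 1 and 2 below).
The fix is to turn it into a singleton instance: every new is replaced by fetching the singleton and calling Reset on it. Part of the code is shown below; replacing the call sites is omitted here (just search for references).
One pitfall: for the Stream-backed variants, Reset must be passed (0, 0) for bufferPos and bufferSize. I tried (0, bytes.Length) and it threw an error. bufferSize is the number of valid bytes currently in the buffer, not its capacity, and a Stream-backed buffer starts out empty; the CodedInputStream constructor itself also passes (0, 0).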
// Shared instance for parsing from a byte[]; created once, then reused via Reset.
private static CodedInputStream _bytesInstance;
public static CodedInputStream GetBytesInstance(byte[] buffer, int bufferPos, int bufferSize)
{
    if (_bytesInstance == null)
    {
        _bytesInstance = new CodedInputStream(buffer, bufferPos, bufferSize);
    }
    else
    {
        _bytesInstance.Reset(buffer, bufferPos, bufferSize, true);
    }
    return _bytesInstance;
}

// Scratch buffer reused by the Stream-backed instance.
private static byte[] bytes = new byte[BufferSize];
private static CodedInputStream _streamInstance;
public static CodedInputStream GetSteamInstance(Stream input)
{
    if (_streamInstance == null)
    {
        _streamInstance = new CodedInputStream(input);
    }
    else
    {
        // (0, 0): the buffer holds no valid bytes until it is refilled from the stream.
        _streamInstance.Reset(bytes, 0, 0, false, ProtoPreconditions.CheckNotNull(input, "input"));
    }
    return _streamInstance;
}

// Shared instance for a Stream with a caller-supplied buffer.
private static CodedInputStream _streamBytesInstance;
public static CodedInputStream GetSteamBytesInstance(Stream input, byte[] buffer)
{
    if (_streamBytesInstance == null)
    {
        _streamBytesInstance = new CodedInputStream(input, buffer);
    }
    else
    {
        _streamBytesInstance.Reset(buffer, 0, 0, false, ProtoPreconditions.CheckNotNull(input, "input"));
    }
    return _streamBytesInstance;
}
...
...
...
// Added to CodedInputStream: restores the same state a constructor would set up,
// so one instance can be reused for the next parse.
public void Reset(byte[] buffer, int bufferPos, int bufferSize, bool leaveOpen, Stream input = null)
{
    this.input = input;
    this.buffer = buffer;
    this.state = default;
    this.state.bufferPos = bufferPos;
    this.state.bufferSize = bufferSize;
    this.state.sizeLimit = DefaultSizeLimit;
    this.state.recursionLimit = DefaultRecursionLimit;
    SegmentedBufferHelper.Initialize(this, out this.state.segmentedBufferHelper);
    this.leaveOpen = leaveOpen;
    this.state.currentLimit = int.MaxValue;
}
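To make the (0, 0) pitfall concrete, here is a minimal, self-contained sketch. ReusableReader and ReaderDemo are hypothetical stand-ins, not Protobuf classes; they only model the buffer bookkeeping (position, valid-byte count, refill from the stream) under the assumption that CodedInputStream treats bufferSize the same way.

```csharp
using System;
using System.IO;

// Hypothetical stand-in for CodedInputStream; only the buffer bookkeeping is modeled.
public sealed class ReusableReader
{
    private Stream _input;
    private byte[] _buffer;
    private int _pos;   // index of the next byte to read
    private int _size;  // number of valid bytes in _buffer, not its capacity

    public void Reset(byte[] buffer, int pos, int size, Stream input = null)
    {
        _buffer = buffer;
        _pos = pos;
        _size = size;
        _input = input;
    }

    // Returns the next byte, or -1 once buffer and stream are exhausted.
    public int ReadByte()
    {
        if (_pos == _size && !Refill()) return -1;
        return _buffer[_pos++];
    }

    private bool Refill()
    {
        if (_input == null) return false;
        int read = _input.Read(_buffer, 0, _buffer.Length);
        if (read <= 0) return false;
        _pos = 0;
        _size = read;
        return true;
    }
}

public static class ReaderDemo
{
    private static readonly ReusableReader Shared = new ReusableReader();
    private static readonly byte[] Scratch = new byte[16];

    // Counts how many bytes the shared reader yields for this stream.
    public static int Count(Stream input, int initialSize)
    {
        Shared.Reset(Scratch, 0, initialSize, input);
        int count = 0;
        for (int b = Shared.ReadByte(); b != -1; b = Shared.ReadByte()) count++;
        return count;
    }
}
```

With a 3-byte stream, Count(stream, 0) yields 3, while Count(stream, 16) yields 19: the reader first hands out 16 stale bytes from the scratch buffer before it ever touches the stream. In a real parser those stale bytes would be interpreted as tags and values, which is consistent with the error described above.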
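GC Point 2
The C# code that protoc.exe generates for each proto message exposes a Parser, which business code uses to deserialize a data stream (see figure below).
Now look closely at the generated code (see figure below). _parser is static readonly, so it is constructed once at initialization and stays resident in memory. But object creation is deferred: the lambda () => new ToyTrackingSurvivorData() is passed through to MessageParser as a factory.
Let's see what MessageParser does with it (see figure below).
ParseFrom here is what our business code calls, so every single deserialization invokes factory() once. That is clearly a GC point. The problem is found, so how do we fix it?
My first idea was to use a singleton here too: instead of calling factory() each time, reset one shared instance and return it. That failed, because when a field in the .proto is repeated or a map, several objects of the same type have to exist at the same time, so a singleton cannot work. An object pool it is.
Thoughts on the object pool design:
- The Protobuf source needs a pool. Every object handed out in place of factory() must be returned to the pool once business code is done with it, and the next request should be served from the pool first.
- On every MergeFrom (think of it as every time business code takes an object out of the pool), the data members of the pooled object must be reset to default or cleared. Value types are set to default; repeated and map fields are reference types and need Clear. Note that a proto can contain repeated<message> nested in repeated<message> nested in repeated<int>, so the cleanup has to be recursive.
- The .cs file containing the Parser is generated by protoc.exe, so the generator itself has to change, which means modifying the protoc.exe source.
- Design the recycling policy on the business side, that is, when business code is done with an object and returns it to the pool.
On the first point I hit a small snag. Since every message has a different type, I assumed I needed a Dictionary<className, MObjectPool> of pools, and I implemented that. Then I noticed the dictionary never held more than one entry, and only then understood the design behind this line: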
private static readonly pb::MessageParser<ToyTrackingSurvivorData> _parser = new pb::MessageParser<ToyTrackingSurvivorData>(() => new ToyTrackingSurvivorData());
Through the generic MessageParser<T>, every message class gets its own _parser, one-to-one. So no Dictionary is needed: each Parser simply carries its own MObjectPool, and the code becomes much simpler.
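The reason this works is that static members of a generic class exist once per closed type. Below is a minimal sketch of that property; PerTypePool, MsgA and MsgB are hypothetical names used only for illustration and are not part of Protobuf or of the pool shown next.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical message types, standing in for protoc-generated classes.
public sealed class MsgA { }
public sealed class MsgB { }

// Static fields of a generic class exist once per closed type,
// so PerTypePool<MsgA> and PerTypePool<MsgB> never share a stack.
public static class PerTypePool<T> where T : class, new()
{
    private static readonly Stack<T> Free = new Stack<T>();

    public static int FreeCount { get { return Free.Count; } }

    // Reuses an idle object if there is one, otherwise creates a new one.
    public static T Get()
    {
        return Free.Count > 0 ? Free.Pop() : new T();
    }

    public static void Recycle(T item)
    {
        if (item != null) Free.Push(item);
    }
}
```

Recycling a MsgA leaves PerTypePool<MsgB> untouched, which is exactly the per-type separation that a Dictionary keyed by class name would otherwise have to provide.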
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.IO;
using System.Text;
namespace Google.Protobuf
{
    internal interface IObjectPool
    {
        int countAll { get; }       // total number of objects owned by the pool
        int countActive { get; }    // objects currently handed out
        int countInActive { get; }  // idle objects in the queue, ready for reuse
    }
    internal class MObjectPool<T> : IObjectPool
    {
        private const int LimitNum = 1024;
        private readonly Queue<T> _queue = new Queue<T>(LimitNum);
        private readonly Func<T> _create;
        private readonly Action<T> _get;
        private readonly Action<T> _release;
        private readonly Action<T> _destroy;
        public int countAll { get; private set; }
        public int countActive { get { return countAll - countInActive; } }
        public int countInActive { get { return _queue.Count; } }

        public MObjectPool(Func<T> create, Action<T> get = null, Action<T> release = null, Action<T> destroy = null)
        {
            _create = create;
            _get = get;
            _release = release;
            _destroy = destroy;
        }

        // Returns an idle object if one exists, otherwise creates a new one.
        public T Get()
        {
            T t;
            if (_queue.Count == 0)
            {
                t = _create();
                countAll++;
            }
            else
            {
                t = _queue.Dequeue();
            }
            _get?.Invoke(t);
            return t;
        }

        // Returns an object to the pool. The caller must not recycle the same object twice.
        public void Recycle(T t)
        {
            if (t == null) return;
            _release?.Invoke(t);
            if (countInActive < LimitNum)
            {
                _queue.Enqueue(t);
            }
            else
            {
                // Pool is full: drop the object instead of keeping it.
                _destroy?.Invoke(t);
                countAll--;
            }
        }

        // Empties the pool, running the destroy callback on every idle object.
        public void Destroy()
        {
            if (_destroy != null)
            {
                while (_queue.Count > 0)
                {
                    _destroy(_queue.Dequeue());
                }
            }
            _queue.Clear();
            countAll = 0;
        }
    }
    // One manager (and therefore one pool) per message type T.
    public class MObjcetPoolMgr<T> where T : IMessage<T>
    {
        private MObjectPool<T> _pool;
        private static MObjcetPoolMgr<T> _instance;
        public static MObjcetPoolMgr<T> Instance
        {
            get
            {
                if (_instance == null)
                {
                    _instance = new MObjcetPoolMgr<T>();
                }
                return _instance;
            }
        }

        // The callbacks are only used the first time, when the pool is created.
        public T Get(Func<T> create, Action<T> get = null, Action<T> release = null, Action<T> clear = null)
        {
            if (_pool == null)
            {
                _pool = new MObjectPool<T>(create, get, release, clear);
            }
            var t = _pool.Get();
            //log("Get");
            return t;
        }

        public void Recycle(T t)
        {
            // Guard against Recycle being called before the first Get.
            if (_pool == null) return;
            _pool.Recycle(t);
            //log("Recycle");
        }

        public void Destroy()
        {
            if (_pool == null) return;
            _pool.Destroy();
            //log("Destroy");
        }

        private static StringBuilder str = new StringBuilder();
        private void log(string op)
        {
            str.Clear();
            str.Append($"[{nameof(MObjcetPoolMgr<T>)}][{op}] {typeof(T).Name} countAll:{_pool.countAll.ToString()} countActive:{_pool.countActive.ToString()} countInActive:{_pool.countInActive.ToString()}");
            UnityEngine.Debug.Log(str.ToString());
        }
    }
}
MessageParser then calls into the pool as follows (simplified). With that, the pool that replaces factory() is done!
public new T ParseFrom(CodedInputStream input)
{
    //T message = factory();
    T message = _poolGet();
    MergeFrom(message, input);
    return message;
}

private T _poolGet()
{
    return MObjcetPoolMgr<T>.Instance.Get(factory);
}

public void PoolRecycle(T t)
{
    if (t == null) return;
    MObjcetPoolMgr<T>.Instance.Recycle(t);
}

public void PoolDestroy()
{
    MObjcetPoolMgr<T>.Instance.Destroy();
}
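Points 2 and 3 above are really one problem: how to modify the code that protoc.exe generates. The references online are fragmentary, and I pieced this together from them. The idea is to add a MessageClear method to every message class that clears the data of a pooled object, and then call MessageClear() each time the object is reused. Here are my changes:
csharp_message.cc
First change: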
WriteGeneratedCodeAttributes(printer);
printer->Print("public void MessageClear()\n{\n");
for (int i = 0; i < descriptor_->field_count(); i++) {
  const FieldDescriptor* fieldDescriptor = descriptor_->field(i);
  std::string fieldName = UnderscoresToCamelCase(fieldDescriptor->name(), false);
  if (fieldDescriptor->type() == FieldDescriptor::Type::TYPE_MESSAGE || fieldDescriptor->type() == FieldDescriptor::Type::TYPE_GROUP) {
    if (fieldDescriptor->is_repeated()) {
      if (fieldDescriptor->is_map()) {
        if (fieldDescriptor->message_type()->map_value()->type() == FieldDescriptor::Type::TYPE_MESSAGE || fieldDescriptor->message_type()->map_value()->type() == FieldDescriptor::Type::TYPE_GROUP) {
          // A map cannot be indexed by position, so iterate its entries instead.
          // Note: this foreach may itself allocate an enumerator.
          printer->Print(" if($field_name$_ != null) { foreach (var kv in $field_name$_) { kv.Value.MessageClear(); } $field_name$_.Clear(); }\n", "field_name", fieldName);
        } else {
          printer->Print(" if($field_name$_ != null) $field_name$_.Clear();\n", "field_name", fieldName);
        }
      } else {
        // repeated message: clear every element recursively, then the list itself.
        printer->Print(" if($field_name$_ != null) { for (int i = 0; i < $field_name$_.Count; i++) { $field_name$_[i].MessageClear(); } $field_name$_.Clear(); }\n", "field_name", fieldName);
      }
    } else {
      printer->Print(" if($field_name$_ != null) $field_name$_.MessageClear();\n", "field_name", fieldName);
    }
  }
  else if (fieldDescriptor->type() == FieldDescriptor::Type::TYPE_BYTES) {
    if (fieldDescriptor->is_repeated()) {
      printer->Print(" if($field_name$_ != null) $field_name$_.Clear();\n", "field_name", fieldName);
    } else {
      printer->Print(" if($field_name$_.Length != 0) $field_name$_ = pb::ByteString.Empty;\n", "field_name", fieldName);
    }
  }
  else if (fieldDescriptor->type() == FieldDescriptor::Type::TYPE_ENUM) {
    if (fieldDescriptor->is_repeated()) {
      printer->Print(" if($field_name$_ != null) $field_name$_.Clear();\n", "field_name", fieldName);
    } else {
      printer->Print(
          " $field_name$_ = $field_type$.$default_value$;\n", "field_type", GetClassName(fieldDescriptor->enum_type()), "field_name", fieldName, "default_value", GetEnumValueName(fieldDescriptor->default_value_enum()->type()->name(), fieldDescriptor->default_value_enum()->name()));
    }
  }
  else if (fieldDescriptor->type() == FieldDescriptor::Type::TYPE_STRING) {
    if (fieldDescriptor->is_repeated()) {
      printer->Print(" if($field_name$_ != null) $field_name$_.Clear();\n", "field_name", fieldName);
    } else {
      // Assuming proto3: generated code initializes string fields to "" and reads
      // .Length on them, so reset to "" rather than default (which would be null).
      printer->Print(" $field_name$_ = \"\";\n", "field_name", fieldName);
    }
  }
  else {
    if (fieldDescriptor->is_repeated()) {
      printer->Print(" if($field_name$_ != null) $field_name$_.Clear();\n", "field_name", fieldName);
    } else {
      printer->Print(
          " $field_name$_ = $default_value$;\n", "field_name", fieldName, "default_value", "default");
    }
  }
}
printer->Print("}\n");
csharp_message.cc
Second change:
printer->Print("MessageClear();\n");
csharp_message.cc
Third change:
printer->Indent();
printer->Print("MessageClear();\n");
printer->Outdent();
With that, the code generated by protoc.exe is fixed, which takes care of points 2 and 3!
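For orientation, this is roughly the shape of C# that such a generator change would be expected to emit. It is a hand-written sketch, not actual protoc output: Item and Bag are hypothetical messages, and List<T> stands in for RepeatedField<T> so the example is self-contained.

```csharp
using System.Collections.Generic;

// Hypothetical message: int32 id = 1; repeated int32 tags = 2;
public sealed class Item
{
    private int id_;
    private readonly List<int> tags_ = new List<int>();
    public int Id { get { return id_; } set { id_ = value; } }
    public List<int> Tags { get { return tags_; } }

    public void MessageClear()
    {
        id_ = default;                      // scalar: back to its default value
        if (tags_ != null) tags_.Clear();   // repeated scalar: just clear the list
    }
}

// Hypothetical message: int32 level = 1; repeated Item items = 2;
public sealed class Bag
{
    private int level_;
    private readonly List<Item> items_ = new List<Item>();
    public int Level { get { return level_; } set { level_ = value; } }
    public List<Item> Items { get { return items_; } }

    public void MessageClear()
    {
        level_ = default;
        if (items_ != null)
        {
            // repeated message: clear each element recursively, then the list
            for (int i = 0; i < items_.Count; i++) { items_[i].MessageClear(); }
            items_.Clear();
        }
    }
}
```

Calling MessageClear() on a Bag resets its own scalars, recursively resets every nested Item, and empties the list, which is the recursive cleanup required for repeated<message> nested in repeated<message>.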
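Next is point 4, the recycling policy in business code. This depends heavily on the project and many places need manual changes, but fortunately there is a pattern to follow: we generate our protocol code with the ProtoGen.exe tool, so it is enough to recycle each protocol message into the pool once it has been used.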
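One possible recycling policy, sketched under assumptions: the message is handed to a handler and returned to the pool as soon as the handler finishes, even if it throws. LoginReply, ReplyPool and Dispatcher are hypothetical names for illustration; in the real project the pool calls would go through the parser's PoolRecycle instead.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical message; Clear() plays the role of MessageClear().
public sealed class LoginReply
{
    public int Code;
    public void Clear() { Code = 0; }
}

// Hypothetical minimal pool for this one message type.
public static class ReplyPool
{
    private static readonly Queue<LoginReply> Free = new Queue<LoginReply>();
    public static int FreeCount { get { return Free.Count; } }

    public static LoginReply Get()
    {
        LoginReply m = Free.Count > 0 ? Free.Dequeue() : new LoginReply();
        m.Clear(); // reset on the way out, before the object is reused
        return m;
    }

    public static void Recycle(LoginReply m)
    {
        if (m != null) Free.Enqueue(m);
    }
}

public static class Dispatcher
{
    // Hands the message to the handler, then returns it to the pool
    // even if the handler throws. The handler must not keep a reference.
    public static void Dispatch(LoginReply msg, Action<LoginReply> handler)
    {
        try { handler(msg); }
        finally { ReplyPool.Recycle(msg); }
    }
}
```

The important constraint with this policy is that handlers copy out whatever they need: once Dispatch returns, the same object may be handed to the next parse and overwritten.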