Implementing Your Own JSON Parser in C# (LALR(1) + miniDFA)

2025/3/25 21:09:52

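JSON is a widely used data format with a simple grammar. This article shows how to use bitParser (see "Own Your Own Parser: a generator, implemented in C#, of LALR(1) syntax parsers and miniDFA lexical analyzers") to quickly build a simple and efficient JSON parser.

Readers can view and download the complete code here (https://gitee.com/bitzhuwei/bitParser-demos/tree/master/bitzhuwei.JsonFormat.TestConsole).

The grammar of the JSON format

A detailed description of the JSON format can be found in ECMA-404 (https://ecma-international.org/wp-content/uploads/ECMA-404_2nd_edition_december_2017.pdf ). From it we obtain the following grammar: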

// Json grammar according to ECMA-404 2nd Edition / December 2017
Json = Object | Array ;
Object = '{' '}' | '{' Members '}' ;
Array = '[' ']' | '[' Elements ']' ;
Members = Members ',' Member | Member ;
Elements = Elements ',' Element | Element ;
Member = 'string' ':' Value ;
Element = Value ;
Value = 'null' | 'true' | 'false' | 'number' | 'string'
      | Object | Array ;

%%"([^"\\\u0000-\u001F]|\\["\\/bfnrt]|\\u[0-9A-Fa-f]{4})*"%% 'string'
%%[-]?(0|[1-9][0-9]*)([.][0-9]+)?([eE][+-]?[0-9]+)?%% 'number'

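In fact, I drafted this grammar with an AI and then tidied it up myself.

What the grammar says:

  1. A Json is either an Object or an Array. (ECMA-404 itself allows any value at the top level; this grammar restricts the top level to these two.)

  2. An Object contains zero or more key-value pairs ("key" : value), enclosed in { }.

  3. An Array contains zero or more values, enclosed in [ ].

  4. A value is one of the following types: null, true, false, number, string, Object, Array.

Where:

null, true and false are matched literally, so their token definitions can be omitted. Written out explicitly in the grammar, they would look like this: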

%%null%% 'null'
%%true%% 'true'
%%false%% 'false'

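The punctuation tokens { } [ ] , : are likewise matched literally, so their token definitions can be omitted as well. Written out explicitly in the grammar, they would look like this: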

%%\{%% '{'
%%}%% '}'
%%\[%% '['
%%]%% ']'
%%,%% ','
%%:%% ':'

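number can be described by the following figure:

[image: diagram of the number token]

The figure shows at a glance that the regular expression of the number token consists of 4 parts in sequence: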

[-]?  (0|[1-9][0-9]*)  ([.][0-9]+)?  ([eE][+-]?[0-9]+)?
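
As a quick sanity check of these four parts, the pattern can be tried with .NET's own regex engine. This is only a sketch for experimenting: the class and method names (NumberPatternDemo, IsJsonNumber) are made up for this example, and bitParser does not use System.Text.RegularExpressions; it compiles the pattern into the DFA shown later.

```csharp
using System.Text.RegularExpressions;

static class NumberPatternDemo {
    // The four parts of the 'number' pattern, anchored with \A and \z so that
    // the whole input must match. \z is used instead of $ because $ in .NET
    // also matches just before a trailing '\n'.
    private static readonly Regex number = new Regex(
        @"\A[-]?(0|[1-9][0-9]*)([.][0-9]+)?([eE][+-]?[0-9]+)?\z");

    // Hypothetical helper: true if text is exactly one JSON number.
    public static bool IsJsonNumber(string text) => number.IsMatch(text);
}
```

With this helper, IsJsonNumber("-12.5e+3") and IsJsonNumber("0") return true, while "01" (leading zero), "1." (no digit after the point), ".5" (no integer part) and "+1" (leading plus) are all rejected.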

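string can be described by the following figure:

[image: diagram of the string token]

The figure shows at a glance that the regular expression of the string token is a sequence of certain characters or escape sequences wrapped in double quotes ("):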

" (  [^"\\\u0000-\u001F]  |  \\["\\/bfnrt]  |  \\u[0-9A-Fa-f]{4}  )*  "
/*
Meaning of the three alternatives:
any character other than ", \ and the control characters (\u0000-\u001F)
\", \\, \/, \b, \f, \n, \r, \t
\uNNNN
*/
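
The three alternatives can also be tried out directly in .NET. The sketch below is only for experimenting with the pattern: StringPatternDemo and IsJsonString are names invented for this example, and the input is the raw token text including its surrounding quotes.

```csharp
using System.Text.RegularExpressions;

static class StringPatternDemo {
    // In a C# verbatim string a double quote is written as "", so the regex
    // seen by the engine is:
    //   \A"([^"\\\u0000-\u001F]|\\["\\/bfnrt]|\\u[0-9A-Fa-f]{4})*"\z
    private static readonly Regex jsonString = new Regex(
        @"\A""([^""\\\u0000-\u001F]|\\[""\\/bfnrt]|\\u[0-9A-Fa-f]{4})*""\z");

    // Hypothetical helper: true if text is exactly one JSON string token.
    public static bool IsJsonString(string text) => jsonString.IsMatch(text);
}
```

For example, the token "a\nb" written with a backslash followed by n is accepted, whereas the same token containing a real line feed character is rejected because U+000A is a control character; "\x" and "\u12G4" are rejected as invalid escapes.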

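The Object and Array alternatives of Value (Value = ... | Object | Array ;) show that data in JSON can be nested.

Feed this grammar to bitParser as input, and it generates in one step the JSON parser code and documentation described in the following sections.

The generated lexical analyzer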

DFA

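The DFA folder contains every lexical state of the lexical analyzer, generated according to the theory of deterministic finite automata.

The initial state, lexicalState0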

using System;
using bitzhuwei.Compiler;

namespace bitzhuwei.JsonFormat {
    partial class CompilerJson {
        private static readonly Action<LexicalContext, char, CurrentStateWrap> lexicalState0 =
        static (context, c, wrap) => {
            if (false) { /* for simpler code generation purpose. */ }
            /* user-input condition code */
            /* [1-9] */
            else if (/* possible Vt : 'number' */
            /* no possible signal */
            /* [xxx] scope */
            '1'/*'\u0031'(49)*/ <= c && c <= '9'/*'\u0039'(57)*/) {
                BeginToken(context);
                ExtendToken(context, st.@number);
                wrap.currentState = lexicalState1;
            }
            /* user-input condition code */
            /* 0 */
            else if (/* possible Vt : 'number' */
            /* no possible signal */
            /* single char */
            c == '0'/*'\u0030'(48)*/) {
                BeginToken(context);
                ExtendToken(context, st.@number);
                wrap.currentState = lexicalState2;
            }
            /* user-input condition code */
            /* [-] */
            else if (/* possible Vt : 'number' */
            /* no possible signal */
            /* [xxx] scope */
            c == '-'/*'\u002D'(45)*/) {
                BeginToken(context);
                wrap.currentState = lexicalState3;
            }
            /* user-input condition code */
            /* " */
            else if (/* possible Vt : 'string' */
            /* no possible signal */
            /* single char */
            c == '"'/*'\u0022'(34)*/) {
                BeginToken(context);
                wrap.currentState = lexicalState4;
            }
            /* user-input condition code */
            /* f */
            else if (/* possible Vt : 'false' */
            /* no possible signal */
            /* single char */
            c == 'f'/*'\u0066'(102)*/) {
                BeginToken(context);
                wrap.currentState = lexicalState5;
            }
            /* user-input condition code */
            /* t */
            else if (/* possible Vt : 'true' */
            /* no possible signal */
            /* single char */
            c == 't'/*'\u0074'(116)*/) {
                BeginToken(context);
                wrap.currentState = lexicalState6;
            }
            /* user-input condition code */
            /* n */
            else if (/* possible Vt : 'null' */
            /* no possible signal */
            /* single char */
            c == 'n'/*'\u006E'(110)*/) {
                BeginToken(context);
                wrap.currentState = lexicalState7;
            }
            /* user-input condition code */
            /* : */
            else if (/* possible Vt : ':' */
            /* no possible signal */
            /* single char */
            c == ':'/*'\u003A'(58)*/) {
                BeginToken(context);
                ExtendToken(context, st.@Colon符);
                wrap.currentState = lexicalState8;
            }
            /* user-input condition code */
            /* , */
            else if (/* possible Vt : ',' */
            /* no possible signal */
            /* single char */
            c == ','/*'\u002C'(44)*/) {
                BeginToken(context);
                ExtendToken(context, st.@Comma符);
                wrap.currentState = lexicalState9;
            }
            /* user-input condition code */
            /* ] */
            else if (/* possible Vt : ']' */
            /* no possible signal */
            /* single char */
            c == ']'/*'\u005D'(93)*/) {
                BeginToken(context);
                ExtendToken(context, st.@RightBracket符);
                wrap.currentState = lexicalState10;
            }
            /* user-input condition code */
            /* \[ */
            else if (/* possible Vt : '[' */
            /* no possible signal */
            /* single char */
            c == '['/*'\u005B'(91)*/) {
                BeginToken(context);
                ExtendToken(context, st.@LeftBracket符);
                wrap.currentState = lexicalState11;
            }
            /* user-input condition code */
            /* } */
            else if (/* possible Vt : '}' */
            /* no possible signal */
            /* single char */
            c == '}'/*'\u007D'(125)*/) {
                BeginToken(context);
                ExtendToken(context, st.@RightBrace符);
                wrap.currentState = lexicalState12;
            }
            /* user-input condition code */
            /* \{ */
            else if (/* possible Vt : '{' */
            /* no possible signal */
            /* single char */
            c == '{'/*'\u007B'(123)*/) {
                BeginToken(context);
                ExtendToken(context, st.@LeftBrace符);
                wrap.currentState = lexicalState13;
            }
            /* deal with everything else. */
            else if (c == ' ' || c == '\r' || c == '\n' || c == '\t' || c == '\0') {
                wrap.currentState = lexicalState0; // skip them.
            }
            else { // unexpected char.
                BeginToken(context);
                ExtendToken(context);
                AcceptToken(st.Error错, context);
                wrap.currentState = lexicalState0;
            }
        };
    }
}

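The implementation in the DFA folder is the original and most intuitive one. It has since been superseded by more efficient implementations, and the folder is now kept only for study and reference. I therefore changed the extension of its C# files from cs to cs_ so that they are not compiled.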
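
The one-function-per-state idea behind lexicalState0 may be easier to see in a much smaller toy. The sketch below is not bitParser's generated code or API; ToyLexer and all of its members are hypothetical, and it only recognizes unsigned integers and commas. Each state is a method, the current state is a delegate, and a character that cannot extend the current token causes the token to be accepted and the character to be handled again in the initial state.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

sealed class ToyLexer {
    private Action<char> state;                       // the current lexical state
    private readonly StringBuilder text = new StringBuilder();
    private readonly List<string> tokens = new List<string>();

    private ToyLexer() { state = State0; }

    // State 0: between tokens (plays the role of lexicalState0).
    private void State0(char c) {
        if ('0' <= c && c <= '9') { text.Append(c); state = InNumber; }
        else if (c == ',') { tokens.Add(","); }
        else if (c == ' ' || c == '\0') { /* skip blanks and the end sentinel */ }
        else { tokens.Add("error:" + c); }            // unexpected char
    }

    // State 1: inside a number.
    private void InNumber(char c) {
        if ('0' <= c && c <= '9') { text.Append(c); }
        else {
            tokens.Add(text.ToString());              // accept the pending number
            text.Clear();
            state = State0;
            State0(c);                                // handle c again in state 0
        }
    }

    public static List<string> Tokenize(string source) {
        var lexer = new ToyLexer();
        foreach (char c in source) { lexer.state(c); }
        lexer.state('\0');                            // sentinel flushes a pending token
        return lexer.tokens;
    }
}
```

Calling ToyLexer.Tokenize("12, 7") yields the three tokens 12, a comma, and 7.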

miniDFA

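The miniDFA folder contains every lexical state of the minimized finite automaton obtained with Hopcroft's algorithm. The only difference from the DFA is that the number of lexical states may be smaller.

This is the second implementation, and it too has been superseded by more efficient ones. The folder is now kept only for study and reference, so I changed the extension of its C# files from cs to cs_ so that they are not compiled.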
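
To give a feel for what minimization does, here is a small sketch that merges equivalent states by repeated partition refinement. Note that this is the simpler Moore-style refinement, not Hopcroft's worklist algorithm, and it is not taken from bitParser; both approaches arrive at the same minimal partition, Hopcroft's just gets there faster on large automata. The names DfaMinimizer and Minimize are made up for this example. A real lexer would also start from a partition that separates accepting states by the token they accept; the sketch only distinguishes accepting from non-accepting.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class DfaMinimizer {
    // transitions[s][a]: target of state s on symbol index a, or -1 if there is none.
    // accepting[s]: whether state s is accepting.
    // Returns block[s], the index of the equivalence class that state s belongs to.
    public static int[] Minimize(int[][] transitions, bool[] accepting) {
        int n = transitions.Length;
        // Initial partition: accepting states versus non-accepting states.
        int[] block = accepting.Select(a => a ? 1 : 0).ToArray();
        while (true) {
            // Two states stay together only if they are in the same block
            // and all of their successors are in the same blocks.
            var ids = new Dictionary<string, int>();
            var next = new int[n];
            for (int s = 0; s < n; s++) {
                string signature = block[s] + ":" + string.Join(",",
                    transitions[s].Select(t => t < 0 ? -1 : block[t]));
                if (!ids.TryGetValue(signature, out int id)) {
                    id = ids.Count;
                    ids.Add(signature, id);
                }
                next[s] = id;
            }
            // Refinement can only split blocks, so an unchanged count means it is stable.
            if (ids.Count == block.Distinct().Count()) { return next; }
            block = next;
        }
    }
}
```

For a three-state automaton over a single symbol with transitions 0 to 1, 1 to 2, 2 to 1 and accepting states 1 and 2, states 1 and 2 end up in the same class, so the minimized automaton needs only two states.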

tableDFA

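The tableDFA folder contains the miniDFA in two-dimensional array form (ElseIf[][]). It represents the same content as the miniDFA; the difference is that the miniDFA represents each lexical state as a function (Action<LexicalContext, char, CurrentStateWrap>), whereas the tableDFA represents each lexical state as an array (ElseIf[]). This reduces memory usage.

The miniDFA in two-dimensional array form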

using System;
using bitzhuwei.Compiler;

namespace bitzhuwei.JsonFormat {
    partial class CompilerJson {
        private static readonly ElseIf[] omitChars = new ElseIf[] {
                new('\u0000'/*(0)*/, nextStateId: 0, Acts.None),
                new('\t'/*'\u0009'(9)*/, '\n'/*'\u000A'(10)*/, nextStateId: 0, Acts.None),
                new('\r'/*'\u000D'(13)*/, nextStateId: 0, Acts.None),
                new(' '/*'\u0020'(32)*/, nextStateId: 0, Acts.None),

        };
        private static readonly ElseIf[][] lexiStates = new ElseIf[47][];
        static void InitializeLexiTable() {
            ElseIf segment_48_48_25_3_ints_number = new('0'/*'\u0030'(48)*/, 25, Acts.Begin | Acts.Extend, st.@number);//refered 2 times
            ElseIf segment_49_57_24_3_ints_number = new('1'/*'\u0031'(49)*/, '9'/*'\u0039'(57)*/, 24, Acts.Begin | Acts.Extend, st.@number);//refered 2 times
            ElseIf segment_48_57_37_2_ints_number = new('0'/*'\u0030'(48)*/, '9'/*'\u0039'(57)*/, 37, Acts.Extend, st.@number);//refered 3 times
            ElseIf segment_48_57_38_2_ints_number = new('0'/*'\u0030'(48)*/, '9'/*'\u0039'(57)*/, 38, Acts.Extend, st.@number);//refered 2 times
            ElseIf segment_48_57_44_2_ints_number = new('0'/*'\u0030'(48)*/, '9'/*'\u0039'(57)*/, 44, Acts.Extend, st.@number);//refered 3 times
            ElseIf segment_48_57_45_2_ints_number = new('0'/*'\u0030'(48)*/, '9'/*'\u0039'(57)*/, 45, Acts.Extend, st.@number);//refered 2 times
            ElseIf segment_46_46_8_0 = new('.'/*'\u002E'(46)*/, 8, Acts.None);//refered 9 times
            ElseIf segment_48_48_33_2_ints_number = new('0'/*'\u0030'(48)*/, 33, Acts.Extend, st.@number);//refered 2 times
            ElseIf segment_49_57_32_2_ints_number = new('1'/*'\u0031'(49)*/, '9'/*'\u0039'(57)*/, 32, Acts.Extend, st.@number);//refered 2 times
            ElseIf segment_69_69_7_0 = new('E'/*'\u0045'(69)*/, 7, Acts.None);//refered 11 times
            ElseIf segment_101_101_7_0 = new('e'/*'\u0065'(101)*/, 7, Acts.None);//refered 11 times
            ElseIf segment_0_65535_0_4_ints_number = new('\u0000'/*(0)*/, '\uFFFF'/*�(65535)*/, 0, Acts.Accept, st.@number);//refered 13 times
            ElseIf segment_48_48_40_2_ints_number = new('0'/*'\u0030'(48)*/, 40, Acts.Extend, st.@number);//refered 3 times
            ElseIf segment_49_57_39_2_ints_number = new('1'/*'\u0031'(49)*/, '9'/*'\u0039'(57)*/, 39, Acts.Extend, st.@number);//refered 3 times
            ElseIf segment_48_57_41_2_ints_number = new('0'/*'\u0030'(48)*/, '9'/*'\u0039'(57)*/, 41, Acts.Extend, st.@number);//refered 2 times
            lexiStates[0] = new ElseIf[] {
            // possible Vt: 'string'
            /*0*/new('"'/*'\u0022'(34)*/, 2, Acts.Begin),
            // possible Vt: ','
            /*1*/new(','/*'\u002C'(44)*/, 27, Acts.Begin | Acts.Extend, st.@Comma符),
            // possible Vt: 'number'
            /*2*/new('-'/*'\u002D'(45)*/, 1, Acts.Begin),
            // possible Vt: 'number'
            /*3*///new('0'/*'\u0030'(48)*/, 25, Acts.Begin | Acts.Extend, st.@number),
            /*3*/segment_48_48_25_3_ints_number,
            // possible Vt: 'number'
            /*4*///new('1'/*'\u0031'(49)*/, '9'/*'\u0039'(57)*/, 24, Acts.Begin | Acts.Extend, st.@number),
            /*4*/segment_49_57_24_3_ints_number,
            // possible Vt: ':'
            /*5*/new(':'/*'\u003A'(58)*/, 26, Acts.Begin | Acts.Extend, st.@Colon符),
            // possible Vt: '['
            /*6*/new('['/*'\u005B'(91)*/, 29, Acts.Begin | Acts.Extend, st.@LeftBracket符),
            // possible Vt: ']'
            /*7*/new(']'/*'\u005D'(93)*/, 28, Acts.Begin | Acts.Extend, st.@RightBracket符),
            // possible Vt: 'false'
            /*8*/new('f'/*'\u0066'(102)*/, 3, Acts.Begin),
            // possible Vt: 'null'
            /*9*/new('n'/*'\u006E'(110)*/, 5, Acts.Begin),
            // possible Vt: 'true'
            /*10*/new('t'/*'\u0074'(116)*/, 4, Acts.Begin),
            // possible Vt: '{'
            /*11*/new('{'/*'\u007B'(123)*/, 31, Acts.Begin | Acts.Extend, st.@LeftBrace符),
            // possible Vt: '}'
            /*12*/new('}'/*'\u007D'(125)*/, 30, Acts.Begin | Acts.Extend, st.@RightBrace符),
            };
            lexiStates[1] = new ElseIf[] {
            // possible Vt: 'number'
            //new('0'/*'\u0030'(48)*/, 25, Acts.Begin | Acts.Extend, st.@number),
            segment_48_48_25_3_ints_number,
            // possible Vt: 'number'
            //new('1'/*'\u0031'(49)*/, '9'/*'\u0039'(57)*/, 24, Acts.Begin | Acts.Extend, st.@number),
            segment_49_57_24_3_ints_number,
            };
            lexiStates[2] = new ElseIf[] {
            // possible Vt: 'string'
            new(' '/*'\u0020'(32)*/, '!'/*'\u0021'(33)*/, 2, Acts.None),
            // possible Vt: 'string'
            new('"'/*'\u0022'(34)*/, 36, Acts.Extend, st.@string),
            // possible Vt: 'string'
            new('#'/*'\u0023'(35)*/, '['/*'\u005B'(91)*/, 2, Acts.None),
            // possible Vt: 'string'
            new('\\'/*'\u005C'(92)*/, 9, Acts.None),
            // possible Vt: 'string'
            new(']'/*'\u005D'(93)*/, '\uFFFF'/*�(65535)*/, 2, Acts.None),
            };
            lexiStates[3] = new ElseIf[] {
            // possible Vt: 'false'
            new('a'/*'\u0061'(97)*/, 10, Acts.None),
            };
            lexiStates[4] = new ElseIf[] {
            // possible Vt: 'true'
            new('r'/*'\u0072'(114)*/, 6, Acts.None),
            };
            lexiStates[5] = new ElseIf[] {
            // possible Vt: 'null'
            new('u'/*'\u0075'(117)*/, 11, Acts.None),
            };
            lexiStates[6] = new ElseIf[] {
            // possible Vt: 'true'
            new('u'/*'\u0075'(117)*/, 18, Acts.None),
            };
            lexiStates[7] = new ElseIf[] {
            // possible Vt: 'number'
            new('+'/*'\u002B'(43)*/, 12, Acts.None),
            // possible Vt: 'number'
            new('-'/*'\u002D'(45)*/, 12, Acts.None),
            // possible Vt: 'number'
            //new('0'/*'\u0030'(48)*/, '9'/*'\u0039'(57)*/, 37, Acts.Extend, st.@number),
            segment_48_57_37_2_ints_number,
            };
            lexiStates[8] = new ElseIf[] {
            // possible Vt: 'number'
            //new('0'/*'\u0030'(48)*/, '9'/*'\u0039'(57)*/, 38, Acts.Extend, st.@number),
            segment_48_57_38_2_ints_number,
            };
            lexiStates[9] = new ElseIf[] {
            // possible Vt: 'string'
            new('"'/*'\u0022'(34)*/, 2, Acts.None),
            // possible Vt: 'string'
            new('/'/*'\u002F'(47)*/, 2, Acts.None),
            // possible Vt: 'string'
            new('\\'/*'\u005C'(92)*/, 2, Acts.None),
            // possible Vt: 'string'
            new('b'/*'\u0062'(98)*/, 2, Acts.None),
            // possible Vt: 'string'
            new('f'/*'\u0066'(102)*/, 2, Acts.None),
            // possible Vt: 'string'
            new('n'/*'\u006E'(110)*/, 2, Acts.None),
            // possible Vt: 'string'
            new('r'/*'\u0072'(114)*/, 2, Acts.None),
            // possible Vt: 'string'
            new('t'/*'\u0074'(116)*/, 2, Acts.None),
            // possible Vt: 'string'
            new('u'/*'\u0075'(117)*/, 13, Acts.None),
            };
            lexiStates[10] = new ElseIf[] {
            // possible Vt: 'false'
            new('l'/*'\u006C'(108)*/, 17, Acts.None),
            };
            lexiStates[11] = new ElseIf[] {
            // possible Vt: 'null'
            new('l'/*'\u006C'(108)*/, 19, Acts.None),
            };
            lexiStates[12] = new ElseIf[] {
            // possible Vt: 'number'
            //new('0'/*'\u0030'(48)*/, '9'/*'\u0039'(57)*/, 37, Acts.Extend, st.@number),
            segment_48_57_37_2_ints_number,
            };
            lexiStates[13] = new ElseIf[] {
            // possible Vt: 'string'
            new('0'/*'\u0030'(48)*/, '9'/*'\u0039'(57)*/, 14, Acts.None),
            // possible Vt: 'string'
            new('A'/*'\u0041'(65)*/, 'F'/*'\u0046'(70)*/, 14, Acts.None),
            // possible Vt: 'string'
            new('a'/*'\u0061'(97)*/, 'f'/*'\u0066'(102)*/, 14, Acts.None),
            };
            lexiStates[14] = new ElseIf[] {
            // possible Vt: 'string'
            new('0'/*'\u0030'(48)*/, '9'/*'\u0039'(57)*/, 15, Acts.None),
            // possible Vt: 'string'
            new('A'/*'\u0041'(65)*/, 'F'/*'\u0046'(70)*/, 15, Acts.None),
            // possible Vt: 'string'
            new('a'/*'\u0061'(97)*/, 'f'/*'\u0066'(102)*/, 15, Acts.None),
            };
            lexiStates[15] = new ElseIf[] {
            // possible Vt: 'string'
            new('0'/*'\u0030'(48)*/, '9'/*'\u0039'(57)*/, 16, Acts.None),
            // possible Vt: 'string'
            new('A'/*'\u0041'(65)*/, 'F'/*'\u0046'(70)*/, 16, Acts.None),
            // possible Vt: 'string'
            new('a'/*'\u0061'(97)*/, 'f'/*'\u0066'(102)*/, 16, Acts.None),
            };
            lexiStates[16] = new ElseIf[] {
            // possible Vt: 'string'
            new('0'/*'\u0030'(48)*/, '9'/*'\u0039'(57)*/, 2, Acts.None),
            // possible Vt: 'string'
            new('A'/*'\u0041'(65)*/, 'F'/*'\u0046'(70)*/, 2, Acts.None),
            // possible Vt: 'string'
            new('a'/*'\u0061'(97)*/, 'f'/*'\u0066'(102)*/, 2, Acts.None),
            };
            lexiStates[17] = new ElseIf[] {
            // possible Vt: 'false'
            new('s'/*'\u0073'(115)*/, 22, Acts.None),
            };
            lexiStates[18] = new ElseIf[] {
            // possible Vt: 'true'
            new('e'/*'\u0065'(101)*/, 42, Acts.Extend, st.@true),
            };
            lexiStates[19] = new ElseIf[] {
            // possible Vt: 'null'
            new('l'/*'\u006C'(108)*/, 43, Acts.Extend, st.@null),
            };
            lexiStates[20] = new ElseIf[] {
            // possible Vt: 'number'
            new('+'/*'\u002B'(43)*/, 23, Acts.None),
            // possible Vt: 'number'
            new('-'/*'\u002D'(45)*/, 23, Acts.None),
            // possible Vt: 'number'
            //new('0'/*'\u0030'(48)*/, '9'/*'\u0039'(57)*/, 44, Acts.Extend, st.@number),
            segment_48_57_44_2_ints_number,
            };
            lexiStates[21] = new ElseIf[] {
            // possible Vt: 'number'
            //new('0'/*'\u0030'(48)*/, '9'/*'\u0039'(57)*/, 45, Acts.Extend, st.@number),
            segment_48_57_45_2_ints_number,
            };
            lexiStates[22] = new ElseIf[] {
            // possible Vt: 'false'
            new('e'/*'\u0065'(101)*/, 46, Acts.Extend, st.@false),
            };
            lexiStates[23] = new ElseIf[] {
            // possible Vt: 'number'
            //new('0'/*'\u0030'(48)*/, '9'/*'\u0039'(57)*/, 44, Acts.Extend, st.@number),
            segment_48_57_44_2_ints_number,
            };
            lexiStates[24] = new ElseIf[] {
            // possible Vt: 'number'
            //new('.'/*'\u002E'(46)*/, 8, Acts.None),
            segment_46_46_8_0,
            // possible Vt: 'number'
            //new('0'/*'\u0030'(48)*/, 33, Acts.Extend, st.@number),
            segment_48_48_33_2_ints_number,
            // possible Vt: 'number'
            //new('1'/*'\u0031'(49)*/, '9'/*'\u0039'(57)*/, 32, Acts.Extend, st.@number),
            segment_49_57_32_2_ints_number,
            // possible Vt: 'number'
            //new('E'/*'\u0045'(69)*/, 7, Acts.None),
            segment_69_69_7_0,
            // possible Vt: 'number'
            //new('e'/*'\u0065'(101)*/, 7, Acts.None),
            segment_101_101_7_0,
            // possible Vt: 'number'
            //new('\u0000'/*(0)*/, '\uFFFF'/*�(65535)*/, 0, Acts.Accept, st.@number),
            segment_0_65535_0_4_ints_number,
            };
            lexiStates[25] = new ElseIf[] {
            // possible Vt: 'number'
            //new('.'/*'\u002E'(46)*/, 8, Acts.None),
            segment_46_46_8_0,
            // possible Vt: 'number'
            new('0'/*'\u0030'(48)*/, 35, Acts.Extend, st.@number),
            // possible Vt: 'number'
            new('1'/*'\u0031'(49)*/, '9'/*'\u0039'(57)*/, 34, Acts.Extend, st.@number),
            // possible Vt: 'number'
            //new('E'/*'\u0045'(69)*/, 7, Acts.None),
            segment_69_69_7_0,
            // possible Vt: 'number'
            //new('e'/*'\u0065'(101)*/, 7, Acts.None),
            segment_101_101_7_0,
            // possible Vt: 'number'
            //new('\u0000'/*(0)*/, '\uFFFF'/*�(65535)*/, 0, Acts.Accept, st.@number),
            segment_0_65535_0_4_ints_number,
            };
            lexiStates[26] = new ElseIf[] {
            // possible Vt: ':'
            new('\u0000'/*(0)*/, '\uFFFF'/*�(65535)*/, 0, Acts.Accept, st.@Colon符),
            };
            lexiStates[27] = new ElseIf[] {
            // possible Vt: ','
            new('\u0000'/*(0)*/, '\uFFFF'/*�(65535)*/, 0, Acts.Accept, st.@Comma符),
            };
            lexiStates[28] = new ElseIf[] {
            // possible Vt: ']'
            new('\u0000'/*(0)*/, '\uFFFF'/*�(65535)*/, 0, Acts.Accept, st.@RightBracket符),
            };
            lexiStates[29] = new ElseIf[] {
            // possible Vt: '['
            new('\u0000'/*(0)*/, '\uFFFF'/*�(65535)*/, 0, Acts.Accept, st.@LeftBracket符),
            };
            lexiStates[30] = new ElseIf[] {
            // possible Vt: '}'
            new('\u0000'/*(0)*/, '\uFFFF'/*�(65535)*/, 0, Acts.Accept, st.@RightBrace符),
            };
            lexiStates[31] = new ElseIf[] {
            // possible Vt: '{'
            new('\u0000'/*(0)*/, '\uFFFF'/*�(65535)*/, 0, Acts.Accept, st.@LeftBrace符),
            };
            lexiStates[32] = new ElseIf[] {
            // possible Vt: 'number'
            //new('.'/*'\u002E'(46)*/, 8, Acts.None),
            segment_46_46_8_0,
            // possible Vt: 'number'
            //new('0'/*'\u0030'(48)*/, 40, Acts.Extend, st.@number),
            segment_48_48_40_2_ints_number,
            // possible Vt: 'number'
            //new('1'/*'\u0031'(49)*/, '9'/*'\u0039'(57)*/, 39, Acts.Extend, st.@number),
            segment_49_57_39_2_ints_number,
            // possible Vt: 'number'
            //new('E'/*'\u0045'(69)*/, 7, Acts.None),
            segment_69_69_7_0,
            // possible Vt: 'number'
            //new('e'/*'\u0065'(101)*/, 7, Acts.None),
            segment_101_101_7_0,
            // possible Vt: 'number'
            //new('\u0000'/*(0)*/, '\uFFFF'/*�(65535)*/, 0, Acts.Accept, st.@number),
            segment_0_65535_0_4_ints_number,
            };
            lexiStates[33] = new ElseIf[] {
            // possible Vt: 'number'
            //new('.'/*'\u002E'(46)*/, 8, Acts.None),
            segment_46_46_8_0,
            // possible Vt: 'number'
            //new('0'/*'\u0030'(48)*/, 33, Acts.Extend, st.@number),
            segment_48_48_33_2_ints_number,
            // possible Vt: 'number'
            //new('1'/*'\u0031'(49)*/, '9'/*'\u0039'(57)*/, 32, Acts.Extend, st.@number),
            segment_49_57_32_2_ints_number,
            // possible Vt: 'number'
            //new('E'/*'\u0045'(69)*/, 7, Acts.None),
            segment_69_69_7_0,
            // possible Vt: 'number'
            //new('e'/*'\u0065'(101)*/, 7, Acts.None),
            segment_101_101_7_0,
            // possible Vt: 'number'
            //new('\u0000'/*(0)*/, '\uFFFF'/*�(65535)*/, 0, Acts.Accept, st.@number),
            segment_0_65535_0_4_ints_number,
            };
            lexiStates[34] = new ElseIf[] {
            // possible Vt: 'number'
            //new('.'/*'\u002E'(46)*/, 8, Acts.None),
            segment_46_46_8_0,
            // possible Vt: 'number'
            //new('0'/*'\u0030'(48)*/, '9'/*'\u0039'(57)*/, 41, Acts.Extend, st.@number),
            segment_48_57_41_2_ints_number,
            // possible Vt: 'number'
            //new('E'/*'\u0045'(69)*/, 7, Acts.None),
            segment_69_69_7_0,
            // possible Vt: 'number'
            //new('e'/*'\u0065'(101)*/, 7, Acts.None),
            segment_101_101_7_0,
            // possible Vt: 'number'
            //new('\u0000'/*(0)*/, '\uFFFF'/*�(65535)*/, 0, Acts.Accept, st.@number),
            segment_0_65535_0_4_ints_number,
            };
            lexiStates[35] = new ElseIf[] {
            // possible Vt: 'number'
            //new('.'/*'\u002E'(46)*/, 8, Acts.None),
            segment_46_46_8_0,
            // possible Vt: 'number'
            //new('E'/*'\u0045'(69)*/, 7, Acts.None),
            segment_69_69_7_0,
            // possible Vt: 'number'
            //new('e'/*'\u0065'(101)*/, 7, Acts.None),
            segment_101_101_7_0,
            // possible Vt: 'number'
            //new('\u0000'/*(0)*/, '\uFFFF'/*�(65535)*/, 0, Acts.Accept, st.@number),
            segment_0_65535_0_4_ints_number,
            };
            lexiStates[36] = new ElseIf[] {
            // possible Vt: 'string'
            new('\u0000'/*(0)*/, '\uFFFF'/*�(65535)*/, 0, Acts.Accept, st.@string),
            };
            lexiStates[37] = new ElseIf[] {
            // possible Vt: 'number'
            //new('0'/*'\u0030'(48)*/, '9'/*'\u0039'(57)*/, 37, Acts.Extend, st.@number),
            segment_48_57_37_2_ints_number,
            // possible Vt: 'number'
            new('E'/*'\u0045'(69)*/, 20, Acts.None),
            // possible Vt: 'number'
            new('e'/*'\u0065'(101)*/, 20, Acts.None),
            // possible Vt: 'number'
            //new('\u0000'/*(0)*/, '\uFFFF'/*�(65535)*/, 0, Acts.Accept, st.@number),
            segment_0_65535_0_4_ints_number,
            };
            lexiStates[38] = new ElseIf[] {
            // possible Vt: 'number'
            new('.'/*'\u002E'(46)*/, 21, Acts.None),
            // possible Vt: 'number'
            //new('0'/*'\u0030'(48)*/, '9'/*'\u0039'(57)*/, 38, Acts.Extend, st.@number),
            segment_48_57_38_2_ints_number,
            // possible Vt: 'number'
            //new('E'/*'\u0045'(69)*/, 7, Acts.None),
            segment_69_69_7_0,
            // possible Vt: 'number'
            //new('e'/*'\u0065'(101)*/, 7, Acts.None),
            segment_101_101_7_0,
            // possible Vt: 'number'
            //new('\u0000'/*(0)*/, '\uFFFF'/*�(65535)*/, 0, Acts.Accept, st.@number),
            segment_0_65535_0_4_ints_number,
            };
            lexiStates[39] = new ElseIf[] {
            // possible Vt: 'number'
            //new('.'/*'\u002E'(46)*/, 8, Acts.None),
            segment_46_46_8_0,
            // possible Vt: 'number'
            //new('0'/*'\u0030'(48)*/, 40, Acts.Extend, st.@number),
            segment_48_48_40_2_ints_number,
            // possible Vt: 'number'
            //new('1'/*'\u0031'(49)*/, '9'/*'\u0039'(57)*/, 39, Acts.Extend, st.@number),
            segment_49_57_39_2_ints_number,
            // possible Vt: 'number'
            //new('E'/*'\u0045'(69)*/, 7, Acts.None),
            segment_69_69_7_0,
            // possible Vt: 'number'
            //new('e'/*'\u0065'(101)*/, 7, Acts.None),
            segment_101_101_7_0,
            // possible Vt: 'number'
            //new('\u0000'/*(0)*/, '\uFFFF'/*�(65535)*/, 0, Acts.Accept, st.@number),
            segment_0_65535_0_4_ints_number,
            };
            lexiStates[40] = new ElseIf[] {
            // possible Vt: 'number'
            //new('.'/*'\u002E'(46)*/, 8, Acts.None),
            segment_46_46_8_0,
            // possible Vt: 'number'
            //new('0'/*'\u0030'(48)*/, 40, Acts.Extend, st.@number),
            segment_48_48_40_2_ints_number,
            // possible Vt: 'number'
            //new('1'/*'\u0031'(49)*/, '9'/*'\u0039'(57)*/, 39, Acts.Extend, st.@number),
            segment_49_57_39_2_ints_number,
            // possible Vt: 'number'
            //new('E'/*'\u0045'(69)*/, 7, Acts.None),
            segment_69_69_7_0,
            // possible Vt: 'number'
            //new('e'/*'\u0065'(101)*/, 7, Acts.None),
            segment_101_101_7_0,
            // possible Vt: 'number'
            //new('\u0000'/*(0)*/, '\uFFFF'/*�(65535)*/, 0, Acts.Accept, st.@number),
            segment_0_65535_0_4_ints_number,
            };
            lexiStates[41] = new ElseIf[] {
            // possible Vt: 'number'
            //new('.'/*'\u002E'(46)*/, 8, Acts.None),
            segment_46_46_8_0,
            // possible Vt: 'number'
            //new('0'/*'\u0030'(48)*/, '9'/*'\u0039'(57)*/, 41, Acts.Extend, st.@number),
            segment_48_57_41_2_ints_number,
            // possible Vt: 'number'
            //new('E'/*'\u0045'(69)*/, 7, Acts.None),
            segment_69_69_7_0,
            // possible Vt: 'number'
            //new('e'/*'\u0065'(101)*/, 7, Acts.None),
            segment_101_101_7_0,
            // possible Vt: 'number'
            //new('\u0000'/*(0)*/, '\uFFFF'/*�(65535)*/, 0, Acts.Accept, st.@number),
            segment_0_65535_0_4_ints_number,
            };
            lexiStates[42] = new ElseIf[] {
            // possible Vt: 'true'
            new('\u0000'/*(0)*/, '\uFFFF'/*�(65535)*/, 0, Acts.Accept, st.@true),
            };
            lexiStates[43] = new ElseIf[] {
            // possible Vt: 'null'
            new('\u0000'/*(0)*/, '\uFFFF'/*�(65535)*/, 0, Acts.Accept, st.@null),
            };
            lexiStates[44] = new ElseIf[] {
            // possible Vt: 'number'
            //new('0'/*'\u0030'(48)*/, '9'/*'\u0039'(57)*/, 44, Acts.Extend, st.@number),
            segment_48_57_44_2_ints_number,
            // possible Vt: 'number'
            //new('\u0000'/*(0)*/, '\uFFFF'/*�(65535)*/, 0, Acts.Accept, st.@number),
            segment_0_65535_0_4_ints_number,
            };
            lexiStates[45] = new ElseIf[] {
            // possible Vt: 'number'
            //new('0'/*'\u0030'(48)*/, '9'/*'\u0039'(57)*/, 45, Acts.Extend, st.@number),
            segment_48_57_45_2_ints_number,
            // possible Vt: 'number'
            //new('E'/*'\u0045'(69)*/, 7, Acts.None),
            segment_69_69_7_0,
            // possible Vt: 'number'
            //new('e'/*'\u0065'(101)*/, 7, Acts.None),
            segment_101_101_7_0,
            // possible Vt: 'number'
            //new('\u0000'/*(0)*/, '\uFFFF'/*�(65535)*/, 0, Acts.Accept, st.@number),
            segment_0_65535_0_4_ints_number,
            };
            lexiStates[46] = new ElseIf[] {
            // possible Vt: 'false'
            new('\u0000'/*(0)*/, '\uFFFF'/*�(65535)*/, 0, Acts.Accept, st.@false),
            };
        }
    }
}

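This is the third implementation, and it too has been superseded by more efficient ones. The folder is now kept only for study and reference, so I changed the extension of its C# files from cs to cs_ so that they are not compiled.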
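
How a table of this shape might be walked can be sketched with a tiny stand-in. Nothing below is bitParser's actual driver: Segment is a hypothetical simplification of ElseIf that keeps only the character range and the next state (no Acts, no token kind), and the three-row table recognizes just the integer part 0|[1-9][0-9]* of a number. Segments are tried in order and the first match wins, like an else-if chain, which matters when a catch-all range such as '\u0000' to '\uFFFF' is listed last.

```csharp
using System;

// Hypothetical stand-in for ElseIf: a character range and the state to go to.
readonly struct Segment {
    public readonly char Min, Max;
    public readonly int Next;
    public Segment(char min, char max, int next) { Min = min; Max = max; Next = next; }
}

static class TableDfa {
    // rows[i] lists the outgoing segments of state i.
    private static readonly Segment[][] rows = {
        /*0*/ new[] { new Segment('0', '0', 1), new Segment('1', '9', 2) },
        /*1*/ new Segment[0],                        // after a lone 0 nothing may follow
        /*2*/ new[] { new Segment('0', '9', 2) },
    };
    private static readonly bool[] accepting = { false, true, true };

    // Returns the target of the first segment containing c, or -1 if none does.
    private static int Step(int state, char c) {
        foreach (var segment in rows[state]) {
            if (segment.Min <= c && c <= segment.Max) { return segment.Next; }
        }
        return -1;
    }

    public static bool Matches(string text) {
        int state = 0;
        foreach (char c in text) {
            state = Step(state, c);
            if (state < 0) { return false; }
        }
        return accepting[state];
    }
}
```

Here TableDfa.Matches("120") and TableDfa.Matches("0") return true, while "01", "1a" and the empty string are rejected. The states are plain data, so the same driver loop serves every row.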

Json.LexiTable.gen.bin

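Here the miniDFA in two-dimensional array form (ElseIf[][]) is written to a binary file. When the JSON parser is loaded, it reads this file to obtain the ElseIf[][] table. The whole ElseIf[][] therefore no longer needs to be hard-coded in the source, which further reduces memory usage.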
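
The general idea of writing such a table out and reading it back can be sketched with BinaryWriter and BinaryReader. The byte layout below is invented for this example and is not the actual format of Json.LexiTable.gen.bin; Seg and LexiTableFile are likewise hypothetical names, and Seg keeps only a character range and a next state.

```csharp
using System.IO;

// Hypothetical simplified segment: character range plus next state.
readonly struct Seg {
    public readonly char Min, Max;
    public readonly int Next;
    public Seg(char min, char max, int next) { Min = min; Max = max; Next = next; }
}

static class LexiTableFile {
    // Assumed layout: row count, then for each row its segment count
    // followed by (min, max, next) for every segment.
    public static byte[] Save(Seg[][] rows) {
        using var stream = new MemoryStream();
        using (var writer = new BinaryWriter(stream)) {
            writer.Write(rows.Length);
            foreach (var row in rows) {
                writer.Write(row.Length);
                foreach (var seg in row) {
                    // Chars are stored as ushort: Write(char) would go through
                    // a text encoding and cannot store lone surrogates.
                    writer.Write((ushort)seg.Min);
                    writer.Write((ushort)seg.Max);
                    writer.Write(seg.Next);
                }
            }
        }
        return stream.ToArray();   // ToArray still works after the stream is closed
    }

    public static Seg[][] Load(byte[] data) {
        using var reader = new BinaryReader(new MemoryStream(data));
        var rows = new Seg[reader.ReadInt32()][];
        for (int i = 0; i < rows.Length; i++) {
            rows[i] = new Seg[reader.ReadInt32()];
            for (int j = 0; j < rows[i].Length; j++) {
                char min = (char)reader.ReadUInt16();
                char max = (char)reader.ReadUInt16();
                int next = reader.ReadInt32();
                rows[i][j] = new Seg(min, max, next);
            }
        }
        return rows;
    }
}
```

A table saved with Save and read back with Load has the same rows and segments. In a real project the bytes would be written to and read from a file rather than kept in memory.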

For easier debugging and reference, I also provide a corresponding text format:

Json.LexiTable.gen.txt

ElseIf
4 omit chars:
0('\u0000'/*(0)*/->'\u0000'/*(0)*/)=>None,0
0('\t'/*'\u0009'(9)*/->'\n'/*'\u000A'(10)*/)=>None,0
0('\r'/*'\u000D'(13)*/->'\r'/*'\u000D'(13)*/)=>None,0
0(' '/*'\u0020'(32)*/->' '/*'\u0020'(32)*/)=>None,0

0 re-used int[] Vts:
0 re-used IfVt ifVt:
0 re-used IfVt[] ifVts:
15 re-used ElseIf2 segment:
25('0'/*'\u0030'(48)*/->'0'/*'\u0030'(48)*/)=>Begin, Extend,11
24('1'/*'\u0031'(49)*/->'9'/*'\u0039'(57)*/)=>Begin, Extend,11
37('0'/*'\u0030'(48)*/->'9'/*'\u0039'(57)*/)=>Extend,11
38('0'/*'\u0030'(48)*/->'9'/*'\u0039'(57)*/)=>Extend,11
44('0'/*'\u0030'(48)*/->'9'/*'\u0039'(57)*/)=>Extend,11
45('0'/*'\u0030'(48)*/->'9'/*'\u0039'(57)*/)=>Extend,11
8('.'/*'\u002E'(46)*/->'.'/*'\u002E'(46)*/)=>None,0
33('0'/*'\u0030'(48)*/->'0'/*'\u0030'(48)*/)=>Extend,11
32('1'/*'\u0031'(49)*/->'9'/*'\u0039'(57)*/)=>Extend,11
7('E'/*'\u0045'(69)*/->'E'/*'\u0045'(69)*/)=>None,0
7('e'/*'\u0065'(101)*/->'e'/*'\u0065'(101)*/)=>None,0
0('\u0000'/*(0)*/->'\uFFFF'/*�(65535)*/)=>Accept,11
40('0'/*'\u0030'(48)*/->'0'/*'\u0030'(48)*/)=>Extend,11
39('1'/*'\u0031'(49)*/->'9'/*'\u0039'(57)*/)=>Extend,11
41('0'/*'\u0030'(48)*/->'9'/*'\u0039'(57)*/)=>Extend,11
47 ElseIf2[] row:
LexiTable.Rows[0] has 13 segments:
2('"'/*'\u0022'(34)*/->'"'/*'\u0022'(34)*/)=>Begin,0
27(','/*'\u002C'(44)*/->','/*'\u002C'(44)*/)=>Begin, Extend,5
1('-'/*'\u002D'(45)*/->'-'/*'\u002D'(45)*/)=>Begin,0
-1
-2
26(':'/*'\u003A'(58)*/->':'/*'\u003A'(58)*/)=>Begin, Extend,7
29('['/*'\u005B'(91)*/->'['/*'\u005B'(91)*/)=>Begin, Extend,3
28(']'/*'\u005D'(93)*/->']'/*'\u005D'(93)*/)=>Begin, Extend,4
3('f'/*'\u0066'(102)*/->'f'/*'\u0066'(102)*/)=>Begin,0
5('n'/*'\u006E'(110)*/->'n'/*'\u006E'(110)*/)=>Begin,0
4('t'/*'\u0074'(116)*/->'t'/*'\u0074'(116)*/)=>Begin,0
31('{'/*'\u007B'(123)*/->'{'/*'\u007B'(123)*/)=>Begin, Extend,1
30('}'/*'\u007D'(125)*/->'}'/*'\u007D'(125)*/)=>Begin, Extend,2

LexiTable.Rows[1] has 2 segments:
-1
-2

LexiTable.Rows[2] has 5 segments:
2(' '/*'\u0020'(32)*/->'!'/*'\u0021'(33)*/)=>None,0
36('"'/*'\u0022'(34)*/->'"'/*'\u0022'(34)*/)=>Extend,6
2('#'/*'\u0023'(35)*/->'['/*'\u005B'(91)*/)=>None,0
9('\\'/*'\u005C'(92)*/->'\\'/*'\u005C'(92)*/)=>None,0
2(']'/*'\u005D'(93)*/->'\uFFFF'/*�(65535)*/)=>None,0

LexiTable.Rows[3] has 1 segments:
10('a'/*'\u0061'(97)*/->'a'/*'\u0061'(97)*/)=>None,0

LexiTable.Rows[4] has 1 segments:
6('r'/*'\u0072'(114)*/->'r'/*'\u0072'(114)*/)=>None,0

LexiTable.Rows[5] has 1 segments:
11('u'/*'\u0075'(117)*/->'u'/*'\u0075'(117)*/)=>None,0

LexiTable.Rows[6] has 1 segments:
18('u'/*'\u0075'(117)*/->'u'/*'\u0075'(117)*/)=>None,0

LexiTable.Rows[7] has 3 segments:
12('+'/*'\u002B'(43)*/->'+'/*'\u002B'(43)*/)=>None,0
12('-'/*'\u002D'(45)*/->'-'/*'\u002D'(45)*/)=>None,0
-3

LexiTable.Rows[8] has 1 segments:
-4

LexiTable.Rows[9] has 9 segments:
2('"'/*'\u0022'(34)*/->'"'/*'\u0022'(34)*/)=>None,0
2('/'/*'\u002F'(47)*/->'/'/*'\u002F'(47)*/)=>None,0
2('\\'/*'\u005C'(92)*/->'\\'/*'\u005C'(92)*/)=>None,0
2('b'/*'\u0062'(98)*/->'b'/*'\u0062'(98)*/)=>None,0
2('f'/*'\u0066'(102)*/->'f'/*'\u0066'(102)*/)=>None,0
2('n'/*'\u006E'(110)*/->'n'/*'\u006E'(110)*/)=>None,0
2('r'/*'\u0072'(114)*/->'r'/*'\u0072'(114)*/)=>None,0
2('t'/*'\u0074'(116)*/->'t'/*'\u0074'(116)*/)=>None,0
13('u'/*'\u0075'(117)*/->'u'/*'\u0075'(117)*/)=>None,0

LexiTable.Rows[10] has 1 segments:
17('l'/*'\u006C'(108)*/->'l'/*'\u006C'(108)*/)=>None,0

LexiTable.Rows[11] has 1 segments:
19('l'/*'\u006C'(108)*/->'l'/*'\u006C'(108)*/)=>None,0

LexiTable.Rows[12] has 1 segments:
-3

LexiTable.Rows[13] has 3 segments:
14('0'/*'\u0030'(48)*/->'9'/*'\u0039'(57)*/)=>None,0
14('A'/*'\u0041'(65)*/->'F'/*'\u0046'(70)*/)=>None,0
14('a'/*'\u0061'(97)*/->'f'/*'\u0066'(102)*/)=>None,0

LexiTable.Rows[14] has 3 segments:
15('0'/*'\u0030'(48)*/->'9'/*'\u0039'(57)*/)=>None,0
15('A'/*'\u0041'(65)*/->'F'/*'\u0046'(70)*/)=>None,0
15('a'/*'\u0061'(97)*/->'f'/*'\u0066'(102)*/)=>None,0

LexiTable.Rows[15] has 3 segments:
16('0'/*'\u0030'(48)*/->'9'/*'\u0039'(57)*/)=>None,0
16('A'/*'\u0041'(65)*/->'F'/*'\u0046'(70)*/)=>None,0
16('a'/*'\u0061'(97)*/->'f'/*'\u0066'(102)*/)=>None,0

LexiTable.Rows[16] has 3 segments:
2('0'/*'\u0030'(48)*/->'9'/*'\u0039'(57)*/)=>None,0
2('A'/*'\u0041'(65)*/->'F'/*'\u0046'(70)*/)=>None,0
2('a'/*'\u0061'(97)*/->'f'/*'\u0066'(102)*/)=>None,0

LexiTable.Rows[17] has 1 segments:
22('s'/*'\u0073'(115)*/->'s'/*'\u0073'(115)*/)=>None,0

LexiTable.Rows[18] has 1 segments:
42('e'/*'\u0065'(101)*/->'e'/*'\u0065'(101)*/)=>Extend,9

LexiTable.Rows[19] has 1 segments:
43('l'/*'\u006C'(108)*/->'l'/*'\u006C'(108)*/)=>Extend,8

LexiTable.Rows[20] has 3 segments:
23('+'/*'\u002B'(43)*/->'+'/*'\u002B'(43)*/)=>None,0
23('-'/*'\u002D'(45)*/->'-'/*'\u002D'(45)*/)=>None,0
-5

LexiTable.Rows[21] has 1 segments:
-6

LexiTable.Rows[22] has 1 segments:
46('e'/*'\u0065'(101)*/->'e'/*'\u0065'(101)*/)=>Extend,10

LexiTable.Rows[23] has 1 segments:
-5

LexiTable.Rows[24] has 6 segments:
-7
-8
-9
-10
-11
-12

LexiTable.Rows[25] has 6 segments:
-7
35('0'/*'\u0030'(48)*/->'0'/*'\u0030'(48)*/)=>Extend,11
34('1'/*'\u0031'(49)*/->'9'/*'\u0039'(57)*/)=>Extend,11
-10
-11
-12

LexiTable.Rows[26] has 1 segments:
0('\u0000'/*(0)*/->'\uFFFF'/*�(65535)*/)=>Accept,7

LexiTable.Rows[27] has 1 segments:
0('\u0000'/*(0)*/->'\uFFFF'/*�(65535)*/)=>Accept,5

LexiTable.Rows[28] has 1 segments:
0('\u0000'/*(0)*/->'\uFFFF'/*�(65535)*/)=>Accept,4

LexiTable.Rows[29] has 1 segments:
0('\u0000'/*(0)*/->'\uFFFF'/*�(65535)*/)=>Accept,3

LexiTable.Rows[30] has 1 segments:
0('\u0000'/*(0)*/->'\uFFFF'/*�(65535)*/)=>Accept,2

LexiTable.Rows[31] has 1 segments:
0('\u0000'/*(0)*/->'\uFFFF'/*�(65535)*/)=>Accept,1

LexiTable.Rows[32] has 6 segments:
-7
-13
-14
-10
-11
-12

LexiTable.Rows[33] has 6 segments:
-7
-8
-9
-10
-11
-12

LexiTable.Rows[34] has 5 segments:
-7
-15
-10
-11
-12

LexiTable.Rows[35] has 4 segments:
-7
-10
-11
-12

LexiTable.Rows[36] has 1 segments:
0('\u0000'/*(0)*/->'\uFFFF'/*�(65535)*/)=>Accept,6

LexiTable.Rows[37] has 4 segments:
-3
20('E'/*'\u0045'(69)*/->'E'/*'\u0045'(69)*/)=>None,0
20('e'/*'\u0065'(101)*/->'e'/*'\u0065'(101)*/)=>None,0
-12

LexiTable.Rows[38] has 5 segments:
21('.'/*'\u002E'(46)*/->'.'/*'\u002E'(46)*/)=>None,0
-4
-10
-11
-12

LexiTable.Rows[39] has 6 segments:
-7
-13
-14
-10
-11
-12

LexiTable.Rows[40] has 6 segments:
-7
-13
-14
-10
-11
-12

LexiTable.Rows[41] has 5 segments:
-7
-15
-10
-11
-12

LexiTable.Rows[42] has 1 segments:
0('\u0000'/*(0)*/->'\uFFFF'/*�(65535)*/)=>Accept,9

LexiTable.Rows[43] has 1 segments:
0('\u0000'/*(0)*/->'\uFFFF'/*�(65535)*/)=>Accept,8

LexiTable.Rows[44] has 2 segments:
-5
-12

LexiTable.Rows[45] has 4 segments:
-6
-10
-11
-12

LexiTable.Rows[46] has 1 segments:
0('\u0000'/*(0)*/->'\uFFFF'/*�(65535)*/)=>Accept,10

This is the fourth implementation and the one currently in use. To simplify the load path, I moved it from the Json.gen\LexicalAnalyzer folder to the Json.gen folder.
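To make the text dump easier to follow, here is a small sketch of how such a table can drive a scanner. It rests on my reading of the dump, which is an assumption: each line is taken as nextState(min->max)=>actions,Vt, and negative entries appear to point into the list of re-used segments. The `Segment`, `Act`, and `TableLexerSketch` names are hypothetical, and only the rows on the path for null (rows 0, 5, 11, 19 and 43) are copied in.

```csharp
using System;
using System.Collections.Generic;

[Flags]
public enum Act { None = 0, Begin = 1, Extend = 2, Accept = 4 }

// One character range of a row: on a char in [Min, Max], run Acts and go to Next.
public readonly record struct Segment(char Min, char Max, int Next, Act Acts, int Vt);

public static class TableLexerSketch {
    // Rows 0, 5, 11, 19 and 43 of the dump, reduced to the 'null' path.
    public static readonly Dictionary<int, Segment[]> Rows = new() {
        [0]  = new[] { new Segment('n', 'n', 5, Act.Begin, 0) },
        [5]  = new[] { new Segment('u', 'u', 11, Act.None, 0) },
        [11] = new[] { new Segment('l', 'l', 19, Act.None, 0) },
        [19] = new[] { new Segment('l', 'l', 43, Act.Extend, 8) },
        [43] = new[] { new Segment('\u0000', '\uFFFF', 0, Act.Accept, 8) },
    };

    // Returns (Vt, lexeme) of the first token, or (-1, "") when no segment matches.
    public static (int Vt, string Text) ScanFirst(string source) {
        int state = 0, start = 0, end = -1;
        for (int i = 0; i <= source.Length; i++) {
            char c = i < source.Length ? source[i] : '\0'; // sentinel char at end of input
            Segment? hit = null;
            foreach (var seg in Rows[state]) {
                if (seg.Min <= c && c <= seg.Max) { hit = seg; break; }
            }
            if (hit is not Segment s) { return (-1, ""); }
            if (s.Acts.HasFlag(Act.Begin)) { start = i; }  // token starts here
            if (s.Acts.HasFlag(Act.Extend)) { end = i; }   // token may end here
            if (s.Acts.HasFlag(Act.Accept)) {
                return (s.Vt, source.Substring(start, end - start + 1));
            }
            state = s.Next;
        }
        return (-1, "");
    }
}
```

Calling `TableLexerSketch.ScanFirst("null")` walks states 0, 5, 11, 19, 43 and yields Vt 8 with the text null, while an incomplete input such as "nul" finds no matching segment and is rejected.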

Json.LexicalScripts.gen.cs

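These are the functions that any lexical state may call. They fall into 3 kinds: Begin, Extend, and Accept. Begin records where a token starts, Extend records where it ends, and Accept sets its type, line, column and other information and adds it to the List<Token> tokens list.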

Json.LexicalScripts.gen.cs

using System;
using System.Collections.Generic;
using bitzhuwei.Compiler;

namespace bitzhuwei.JsonFormat {
    partial class CompilerJson {
        // this is where new <see cref="Token"/> starts.
        private static void BeginToken(LexicalContext context) {
            if (context.analyzingToken.type != AnalyzingToken.NotYet) {
                context.analyzingToken.Reset(index: context.result.Count, start: context.cursor);
            }
        }

        // extend value of current token(<see cref="LexicalContext.analyzingToken"/>)
        private static void ExtendToken(LexicalContext context, int Vt) {
            context.analyzingToken.ends[Vt] = context.cursor;
        }
        private static void ExtendToken2(LexicalContext context, params int[] Vts) {
            for (int i = 0; i < Vts.Length; i++) {
                var Vt = Vts[i];
                context.analyzingToken.ends[Vt] = context.cursor;
            }
        }
        private static void ExtendToken3(LexicalContext context, params IfVt[] ifVts) {
            for (int i = 0; i < ifVts.Length; i++) {
                var Vt = ifVts[i].Vt;
                context.analyzingToken.ends[Vt] = context.cursor;
            }
        }

        // accept current Token
        // set Token.type and neutralize the last LexicalContext.MoveForward()
        private static void AcceptToken(LexicalContext context, int Vt) {
            var startIndex = context.analyzingToken.start.index;
            var end = context.analyzingToken.ends[Vt];
            context.analyzingToken.value = context.sourceCode.Substring(
                startIndex, end.index - startIndex + 1);
            context.analyzingToken.type = Vt;

            // cancel forward steps for post-regex
            var backStep = context.cursor.index - end.index;
            if (backStep > 0) { context.MoveBack(backStep); }
            // next operation: LexicalContext.MoveForward();

            var token = context.analyzingToken.Dump(
#if DEBUG
                context.stArray,
#endif
                end);
            context.result.Add(token);
            // no comment to skip
            context.lastSyntaxValidToken = token;
            if (token.type == st.Error错) {
                context.result.token2ErrorInfo.Add(token,
                    new TokenErrorInfo(token, "token type unrecognized!"));
            }
        }
        private static void AcceptToken2(LexicalContext context, params int[] Vts) {
            AcceptToken(context, Vts[0]);
        }
        private static void AcceptToken3(LexicalContext context, params IfVt[] ifVts) {
            var typeSet = false;
            int lastType = st.@终;
            if (context.lastSyntaxValidToken != null) {
                lastType = context.lastSyntaxValidToken.type;
            }
            for (var i = 0; i < ifVts.Length; i++) {
                var ifVt = ifVts[i];
                if (ifVt.signalCondition == context.signalCondition
                 // if preVt is string.Empty, let's use the first type.
                 // otherwise, preVt must be the lastType.
                 && (ifVt.preVt == st.@终 // default preVt
                  || ifVt.preVt == lastType)) { // <'Vt'>
                    context.analyzingToken.type = ifVt.Vt;
                    if (ifVt.nextSignal != null) { context.signalCondition = ifVt.nextSignal; }
                    typeSet = true;
                    break;
                }
            }
            if (!typeSet) {
                for (var i = 0; i < ifVts.Length; i++) {
                    var ifVt = ifVts[i];
                    if (// ignore signal condition and try to assign a type.
                        // if preVt is string.Empty, let's use the first type.
                        // otherwise, preVt must be the lastType.
                        (ifVt.preVt == st.@终 // default preVt
                      || ifVt.preVt == lastType)) { // <'Vt'>
                        context.analyzingToken.type = ifVt.Vt;
                        context.signalCondition = LexicalContext.defaultSignal;
                        typeSet = true;
                        break;
                    }
                }
            }

            var startIndex = context.analyzingToken.start.index;
            var end = context.analyzingToken.start;
            if (!typeSet) {
                // we failed to assign type according to lexi statements.
                // this indicates token error in source code or inappropriate lexi statements.
                //throw new Exception("Algorithm error: token type not set!");
                context.analyzingToken.type = st.Error错;
                context.signalCondition = LexicalContext.defaultSignal;
                // choose longest value
                for (int i = 0; i < context.analyzingToken.ends.Length; i++) {
                    var item = context.analyzingToken.ends[i];
                    if (end.index < item.index) { end = item; }
                }
            }
            else { end = context.analyzingToken.ends[context.analyzingToken.type]; }
            context.analyzingToken.value = context.sourceCode.Substring(startIndex, end.index - startIndex + 1);

            // cancel forward steps for post-regex
            var backStep = context.cursor.index - end.index;
            if (backStep > 0) { context.MoveBack(backStep); }
            // next operation: context.MoveForward();

            var token = context.analyzingToken.Dump(
#if DEBUG
                context.stArray,
#endif
                end);
            context.result.Add(token);
            // no comment to skip
            context.lastSyntaxValidToken = token;
            if (token.type == st.Error错) {
                context.result.token2ErrorInfo.Add(token,
                    new TokenErrorInfo(token, "token type unrecognized!"));
            }
        }
    }
}
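The index arithmetic at the end of AcceptToken is easy to misread, so here is a minimal sketch of it in isolation. `BackStepSketch` and its parameters are hypothetical names for this example. It assumes, as the comments in the listing suggest, that a MoveForward() follows every step, so the cursor is first pulled back to the last Extend position and then advanced by one.

```csharp
using System;

public static class BackStepSketch {
    // start:  index recorded by Begin
    // end:    index recorded by the last Extend for the accepted type
    // cursor: index of the character that triggered Accept
    public static (string Text, int ResumeAt) Accept(string source, int start, int end, int cursor) {
        // inclusive range [start, end], hence the + 1
        string text = source.Substring(start, end - start + 1);

        // cancel the forward steps taken past the end of the token
        int backStep = cursor - end;
        if (backStep > 0) { cursor -= backStep; }

        // the MoveForward() that follows: scanning resumes right after the token
        cursor += 1;
        return (text, cursor);
    }
}
```

For the source "12," the number token begins at index 0, is last extended at index 1, and is accepted when the cursor sits on the comma at index 2. The sketch returns the text 12 and resumes at index 2, so the comma is scanned again from the initial state instead of being swallowed.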

Json.LexicalReservedWords.gen.cs

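This file lists all the reserved words of the Json grammar (the counterpart of keywords in a programming language), namely {, }, [, ], the comma, the colon, null, true, and false. It is only an auxiliary file and needs no further attention.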

Json.LexicalReservedWords.gen.cs

using System;
using bitzhuwei.Compiler;

namespace bitzhuwei.JsonFormat {
    partial class CompilerJson {

        public static class reservedWord {
            /// <summary>
            /// {
            /// </summary>
            public const string @LeftBrace符 = "{";
            /// <summary>
            /// }
            /// </summary>
            public const string @RightBrace符 = "}";
            /// <summary>
            /// [
            /// </summary>
            public const string @LeftBracket符 = "[";
            /// <summary>
            /// ]
            /// </summary>
            public const string @RightBracket符 = "]";
            /// <summary>
            /// ,
            /// </summary>
            public const string @Comma符 = ",";
            /// <summary>
            /// :
            /// </summary>
            public const string @Colon符 = ":";
            /// <summary>
            /// null
            /// </summary>
            public const string @null = "null";
            /// <summary>
            /// true
            /// </summary>
            public const string @true = "true";
            /// <summary>
            /// false
            /// </summary>
            public const string @false = "false";

        }

        /// <summary>
        /// if <paramref name="token"/> is a reserved word, assign the corresponding type and return true.
        /// <para>otherwise, return false.</para>
        /// </summary>
        /// <param name="token"></param>
        /// <returns></returns>
        private static bool CheckReservedWord(AnalyzingToken token) {
            bool isReservedWord = true;
            switch (token.value) {
            case reservedWord.@LeftBrace符: token.type = st.@LeftBrace符; break;
            case reservedWord.@RightBrace符: token.type = st.@RightBrace符; break;
            case reservedWord.@LeftBracket符: token.type = st.@LeftBracket符; break;
            case reservedWord.@RightBracket符: token.type = st.@RightBracket符; break;
            case reservedWord.@Comma符: token.type = st.@Comma符; break;
            case reservedWord.@Colon符: token.type = st.@Colon符; break;
            case reservedWord.@null: token.type = st.@null; break;
            case reservedWord.@true: token.type = st.@true; break;
            case reservedWord.@false: token.type = st.@false; break;

            default: isReservedWord = false; break;
            }

            return isReservedWord;
        }
    }
}

README.gen.md

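This is the documentation for the lexical analyzer. It uses mermaid to draw the state machine of each token and the overall state machine of the whole grammar, as shown in the figure below.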

image

I know you can't make it out. Neither can I. Find a big screen and read the README.gen.md file directly.

Generated syntax analyzer

image

Dictionary<int, LRParseAction>

Json.Dict.LALR(1).gen.cs_ is the LALR(1) syntax-analysis state machine. Each syntax state is a Dictionary<int, LRParseAction> object.

Json.Dict.LALR(1).gen.cs_

using System;
using System.Collections.Generic;
using bitzhuwei.Compiler;

namespace bitzhuwei.JsonFormat {
    partial class CompilerJson {

        private static Dictionary<int, LRParseAction>[] InitializeSyntaxStates() {
            const int syntaxStateCount = 29;
            var states = new Dictionary<int, LRParseAction>[syntaxStateCount];
            // 102 actions
            // conflicts(0)=not solved(0)+solved(0)(0 warnings)
            #region create objects of syntax states
            states[0] = new(capacity: 5);
            states[1] = new(capacity: 1);
            states[2] = new(capacity: 1);
            states[3] = new(capacity: 1);
            states[4] = new(capacity: 4);
            states[5] = new(capacity: 13);
            states[6] = new(capacity: 4);
            states[7] = new(capacity: 2);
            states[8] = new(capacity: 2);
            states[9] = new(capacity: 1);
            states[10] = new(capacity: 4);
            states[11] = new(capacity: 2);
            states[12] = new(capacity: 2);
            states[13] = new(capacity: 2);
            states[14] = new(capacity: 3);
            states[15] = new(capacity: 3);
            states[16] = new(capacity: 3);
            states[17] = new(capacity: 3);
            states[18] = new(capacity: 3);
            states[19] = new(capacity: 3);
            states[20] = new(capacity: 3);
            states[21] = new(capacity: 4);
            states[22] = new(capacity: 2);
            states[23] = new(capacity: 10);
            states[24] = new(capacity: 4);
            states[25] = new(capacity: 11);
            states[26] = new(capacity: 2);
            states[27] = new(capacity: 2);
            states[28] = new(capacity: 2);
            #endregion create objects of syntax states

            #region re-used actions
            LRParseAction aShift4 = new(LRParseAction.Kind.Shift, states[4]);// referred 4 times
            LRParseAction aShift5 = new(LRParseAction.Kind.Shift, states[5]);// referred 4 times
            LRParseAction aShift9 = new(LRParseAction.Kind.Shift, states[9]);// referred 2 times
            LRParseAction aGoto13 = new(LRParseAction.Kind.Goto, states[13]);// referred 2 times
            LRParseAction aShift14 = new(LRParseAction.Kind.Shift, states[14]);// referred 3 times
            LRParseAction aShift15 = new(LRParseAction.Kind.Shift, states[15]);// referred 3 times
            LRParseAction aShift16 = new(LRParseAction.Kind.Shift, states[16]);// referred 3 times
            LRParseAction aShift17 = new(LRParseAction.Kind.Shift, states[17]);// referred 3 times
            LRParseAction aShift18 = new(LRParseAction.Kind.Shift, states[18]);// referred 3 times
            LRParseAction aGoto19 = new(LRParseAction.Kind.Goto, states[19]);// referred 3 times
            LRParseAction aGoto20 = new(LRParseAction.Kind.Goto, states[20]);// referred 3 times
            LRParseAction aReduce2 = new(regulations[2]);// referred 4 times
            LRParseAction aReduce7 = new(regulations[7]);// referred 2 times
            LRParseAction aReduce4 = new(regulations[4]);// referred 4 times
            LRParseAction aReduce9 = new(regulations[9]);// referred 2 times
            LRParseAction aReduce11 = new(regulations[11]);// referred 2 times
            LRParseAction aReduce12 = new(regulations[12]);// referred 3 times
            LRParseAction aReduce13 = new(regulations[13]);// referred 3 times
            LRParseAction aReduce14 = new(regulations[14]);// referred 3 times
            LRParseAction aReduce15 = new(regulations[15]);// referred 3 times
            LRParseAction aReduce16 = new(regulations[16]);// referred 3 times
            LRParseAction aReduce17 = new(regulations[17]);// referred 3 times
            LRParseAction aReduce18 = new(regulations[18]);// referred 3 times
            LRParseAction aReduce3 = new(regulations[3]);// referred 4 times
            LRParseAction aReduce5 = new(regulations[5]);// referred 4 times
            LRParseAction aReduce6 = new(regulations[6]);// referred 2 times
            LRParseAction aReduce10 = new(regulations[10]);// referred 2 times
            LRParseAction aReduce8 = new(regulations[8]);// referred 2 times
            #endregion re-used actions

            // 102 actions
            // conflicts(0)=not solved(0)+solved(0)(0 warnings)
            #region init actions of syntax states
            // syntaxStates[0]:
            // [-1] Json' : ⏳ Json ;☕ '¥' 
            // [0] Json : ⏳ Object ;☕ '¥' 
            // [1] Json : ⏳ Array ;☕ '¥' 
            // [2] Object : ⏳ '{' '}' ;☕ '¥' 
            // [3] Object : ⏳ '{' Members '}' ;☕ '¥' 
            // [4] Array : ⏳ '[' ']' ;☕ '¥' 
            // [5] Array : ⏳ '[' Elements ']' ;☕ '¥' 
            /*0*/states[0].Add(st.Json枝, new(LRParseAction.Kind.Goto, states[1]));
            /*1*/states[0].Add(st.Object枝, new(LRParseAction.Kind.Goto, states[2]));
            /*2*/states[0].Add(st.Array枝, new(LRParseAction.Kind.Goto, states[3]));
            /*3*/states[0].Add(st.@LeftBrace符, aShift4);
            /*4*/states[0].Add(st.@LeftBracket符, aShift5);
            // syntaxStates[1]:
            // [-1] Json' : Json ⏳ ;☕ '¥' 
            /*5*/states[1].Add(st.@终, LRParseAction.accept);
            // syntaxStates[2]:
            // [0] Json : Object ⏳ ;☕ '¥' 
            /*6*/states[2].Add(st.@终, new(regulations[0]));
            // syntaxStates[3]:
            // [1] Json : Array ⏳ ;☕ '¥' 
            /*7*/states[3].Add(st.@终, new(regulations[1]));
            // syntaxStates[4]:
            // [2] Object : '{' ⏳ '}' ;☕ ',' ']' '}' '¥' 
            // [3] Object : '{' ⏳ Members '}' ;☕ ',' ']' '}' '¥' 
            // [6] Members : ⏳ Members ',' Member ;☕ ',' '}' 
            // [7] Members : ⏳ Member ;☕ ',' '}' 
            // [10] Member : ⏳ 'string' ':' Value ;☕ ',' '}' 
            /*8*/states[4].Add(st.@RightBrace符, new(LRParseAction.Kind.Shift, states[6]));
            /*9*/states[4].Add(st.Members枝, new(LRParseAction.Kind.Goto, states[7]));
            /*10*/states[4].Add(st.Member枝, new(LRParseAction.Kind.Goto, states[8]));
            /*11*/states[4].Add(st.@string, aShift9);
            // syntaxStates[5]:
            // [4] Array : '[' ⏳ ']' ;☕ ',' ']' '}' '¥' 
            // [5] Array : '[' ⏳ Elements ']' ;☕ ',' ']' '}' '¥' 
            // [8] Elements : ⏳ Elements ',' Element ;☕ ',' ']' 
            // [9] Elements : ⏳ Element ;☕ ',' ']' 
            // [11] Element : ⏳ Value ;☕ ',' ']' 
            // [12] Value : ⏳ 'null' ;☕ ',' ']' 
            // [13] Value : ⏳ 'true' ;☕ ',' ']' 
            // [14] Value : ⏳ 'false' ;☕ ',' ']' 
            // [15] Value : ⏳ 'number' ;☕ ',' ']' 
            // [16] Value : ⏳ 'string' ;☕ ',' ']' 
            // [17] Value : ⏳ Object ;☕ ',' ']' 
            // [18] Value : ⏳ Array ;☕ ',' ']' 
            // [2] Object : ⏳ '{' '}' ;☕ ',' ']' 
            // [3] Object : ⏳ '{' Members '}' ;☕ ',' ']' 
            // [4] Array : ⏳ '[' ']' ;☕ ',' ']' 
            // [5] Array : ⏳ '[' Elements ']' ;☕ ',' ']' 
            /*12*/states[5].Add(st.@RightBracket符, new(LRParseAction.Kind.Shift, states[10]));
            /*13*/states[5].Add(st.Elements枝, new(LRParseAction.Kind.Goto, states[11]));
            /*14*/states[5].Add(st.Element枝, new(LRParseAction.Kind.Goto, states[12]));
            /*15*/states[5].Add(st.Value枝, aGoto13);
            /*16*/states[5].Add(st.@null, aShift14);
            /*17*/states[5].Add(st.@true, aShift15);
            /*18*/states[5].Add(st.@false, aShift16);
            /*19*/states[5].Add(st.@number, aShift17);
            /*20*/states[5].Add(st.@string, aShift18);
            /*21*/states[5].Add(st.Object枝, aGoto19);
            /*22*/states[5].Add(st.Array枝, aGoto20);
            /*23*/states[5].Add(st.@LeftBrace符, aShift4);
            /*24*/states[5].Add(st.@LeftBracket符, aShift5);
            // syntaxStates[6]:
            // [2] Object : '{' '}' ⏳ ;☕ ',' ']' '}' '¥' 
            /*25*/states[6].Add(st.@Comma符, aReduce2);
            /*26*/states[6].Add(st.@RightBracket符, aReduce2);
            /*27*/states[6].Add(st.@RightBrace符, aReduce2);
            /*28*/states[6].Add(st.@终, aReduce2);
            // syntaxStates[7]:
            // [3] Object : '{' Members ⏳ '}' ;☕ ',' ']' '}' '¥' 
            // [6] Members : Members ⏳ ',' Member ;☕ ',' '}' 
            /*29*/states[7].Add(st.@RightBrace符, new(LRParseAction.Kind.Shift, states[21]));
            /*30*/states[7].Add(st.@Comma符, new(LRParseAction.Kind.Shift, states[22]));
            // syntaxStates[8]:
            // [7] Members : Member ⏳ ;☕ ',' '}' 
            /*31*/states[8].Add(st.@Comma符, aReduce7);
            /*32*/states[8].Add(st.@RightBrace符, aReduce7);
            // syntaxStates[9]:
            // [10] Member : 'string' ⏳ ':' Value ;☕ ',' '}' 
            /*33*/states[9].Add(st.@Colon符, new(LRParseAction.Kind.Shift, states[23]));
            // syntaxStates[10]:
            // [4] Array : '[' ']' ⏳ ;☕ ',' ']' '}' '¥' 
            /*34*/states[10].Add(st.@Comma符, aReduce4);
            /*35*/states[10].Add(st.@RightBracket符, aReduce4);
            /*36*/states[10].Add(st.@RightBrace符, aReduce4);
            /*37*/states[10].Add(st.@终, aReduce4);
            // syntaxStates[11]:
            // [5] Array : '[' Elements ⏳ ']' ;☕ ',' ']' '}' '¥' 
            // [8] Elements : Elements ⏳ ',' Element ;☕ ',' ']' 
            /*38*/states[11].Add(st.@RightBracket符, new(LRParseAction.Kind.Shift, states[24]));
            /*39*/states[11].Add(st.@Comma符, new(LRParseAction.Kind.Shift, states[25]));
            // syntaxStates[12]:
            // [9] Elements : Element ⏳ ;☕ ',' ']' 
            /*40*/states[12].Add(st.@Comma符, aReduce9);
            /*41*/states[12].Add(st.@RightBracket符, aReduce9);
            // syntaxStates[13]:
            // [11] Element : Value ⏳ ;☕ ',' ']' 
            /*42*/states[13].Add(st.@Comma符, aReduce11);
            /*43*/states[13].Add(st.@RightBracket符, aReduce11);
            // syntaxStates[14]:
            // [12] Value : 'null' ⏳ ;☕ ',' ']' '}' 
            /*44*/states[14].Add(st.@Comma符, aReduce12);
            /*45*/states[14].Add(st.@RightBracket符, aReduce12);
            /*46*/states[14].Add(st.@RightBrace符, aReduce12);
            // syntaxStates[15]:
            // [13] Value : 'true' ⏳ ;☕ ',' ']' '}' 
            /*47*/states[15].Add(st.@Comma符, aReduce13);
            /*48*/states[15].Add(st.@RightBracket符, aReduce13);
            /*49*/states[15].Add(st.@RightBrace符, aReduce13);
            // syntaxStates[16]:
            // [14] Value : 'false' ⏳ ;☕ ',' ']' '}' 
            /*50*/states[16].Add(st.@Comma符, aReduce14);
            /*51*/states[16].Add(st.@RightBracket符, aReduce14);
            /*52*/states[16].Add(st.@RightBrace符, aReduce14);
            // syntaxStates[17]:
            // [15] Value : 'number' ⏳ ;☕ ',' ']' '}' 
            /*53*/states[17].Add(st.@Comma符, aReduce15);
            /*54*/states[17].Add(st.@RightBracket符, aReduce15);
            /*55*/states[17].Add(st.@RightBrace符, aReduce15);
            // syntaxStates[18]:
            // [16] Value : 'string' ⏳ ;☕ ',' ']' '}' 
            /*56*/states[18].Add(st.@Comma符, aReduce16);
            /*57*/states[18].Add(st.@RightBracket符, aReduce16);
            /*58*/states[18].Add(st.@RightBrace符, aReduce16);
            // syntaxStates[19]:
            // [17] Value : Object ⏳ ;☕ ',' ']' '}' 
            /*59*/states[19].Add(st.@Comma符, aReduce17);
            /*60*/states[19].Add(st.@RightBracket符, aReduce17);
            /*61*/states[19].Add(st.@RightBrace符, aReduce17);
            // syntaxStates[20]:
            // [18] Value : Array ⏳ ;☕ ',' ']' '}' 
            /*62*/states[20].Add(st.@Comma符, aReduce18);
            /*63*/states[20].Add(st.@RightBracket符, aReduce18);
            /*64*/states[20].Add(st.@RightBrace符, aReduce18);
            // syntaxStates[21]:
            // [3] Object : '{' Members '}' ⏳ ;☕ ',' ']' '}' '¥' 
            /*65*/states[21].Add(st.@Comma符, aReduce3);
            /*66*/states[21].Add(st.@RightBracket符, aReduce3);
            /*67*/states[21].Add(st.@RightBrace符, aReduce3);
            /*68*/states[21].Add(st.@终, aReduce3);
            // syntaxStates[22]:
            // [6] Members : Members ',' ⏳ Member ;☕ ',' '}' 
            // [10] Member : ⏳ 'string' ':' Value ;☕ ',' '}' 
            /*69*/states[22].Add(st.Member枝, new(LRParseAction.Kind.Goto, states[26]));
            /*70*/states[22].Add(st.@string, aShift9);
            // syntaxStates[23]:
            // [10] Member : 'string' ':' ⏳ Value ;☕ ',' '}' 
            // [12] Value : ⏳ 'null' ;☕ ',' '}' 
            // [13] Value : ⏳ 'true' ;☕ ',' '}' 
            // [14] Value : ⏳ 'false' ;☕ ',' '}' 
            // [15] Value : ⏳ 'number' ;☕ ',' '}' 
            // [16] Value : ⏳ 'string' ;☕ ',' '}' 
            // [17] Value : ⏳ Object ;☕ ',' '}' 
            // [18] Value : ⏳ Array ;☕ ',' '}' 
            // [2] Object : ⏳ '{' '}' ;☕ ',' '}' 
            // [3] Object : ⏳ '{' Members '}' ;☕ ',' '}' 
            // [4] Array : ⏳ '[' ']' ;☕ ',' '}' 
            // [5] Array : ⏳ '[' Elements ']' ;☕ ',' '}' 
            /*71*/states[23].Add(st.Value枝, new(LRParseAction.Kind.Goto, states[27]));
            /*72*/states[23].Add(st.@null, aShift14);
            /*73*/states[23].Add(st.@true, aShift15);
            /*74*/states[23].Add(st.@false, aShift16);
            /*75*/states[23].Add(st.@number, aShift17);
            /*76*/states[23].Add(st.@string, aShift18);
            /*77*/states[23].Add(st.Object枝, aGoto19);
            /*78*/states[23].Add(st.Array枝, aGoto20);
            /*79*/states[23].Add(st.@LeftBrace符, aShift4);
            /*80*/states[23].Add(st.@LeftBracket符, aShift5);
            // syntaxStates[24]:
            // [5] Array : '[' Elements ']' ⏳ ;☕ ',' ']' '}' '¥' 
            /*81*/states[24].Add(st.@Comma符, aReduce5);
            /*82*/states[24].Add(st.@RightBracket符, aReduce5);
            /*83*/states[24].Add(st.@RightBrace符, aReduce5);
            /*84*/states[24].Add(st.@终, aReduce5);
            // syntaxStates[25]:
            // [8] Elements : Elements ',' ⏳ Element ;☕ ',' ']' 
            // [11] Element : ⏳ Value ;☕ ',' ']' 
            // [12] Value : ⏳ 'null' ;☕ ',' ']' 
            // [13] Value : ⏳ 'true' ;☕ ',' ']' 
            // [14] Value : ⏳ 'false' ;☕ ',' ']' 
            // [15] Value : ⏳ 'number' ;☕ ',' ']' 
            // [16] Value : ⏳ 'string' ;☕ ',' ']' 
            // [17] Value : ⏳ Object ;☕ ',' ']' 
            // [18] Value : ⏳ Array ;☕ ',' ']' 
            // [2] Object : ⏳ '{' '}' ;☕ ',' ']' 
            // [3] Object : ⏳ '{' Members '}' ;☕ ',' ']' 
            // [4] Array : ⏳ '[' ']' ;☕ ',' ']' 
            // [5] Array : ⏳ '[' Elements ']' ;☕ ',' ']' 
            /*85*/states[25].Add(st.Element枝, new(LRParseAction.Kind.Goto, states[28]));
            /*86*/states[25].Add(st.Value枝, aGoto13);
            /*87*/states[25].Add(st.@null, aShift14);
            /*88*/states[25].Add(st.@true, aShift15);
            /*89*/states[25].Add(st.@false, aShift16);
            /*90*/states[25].Add(st.@number, aShift17);
            /*91*/states[25].Add(st.@string, aShift18);
            /*92*/states[25].Add(st.Object枝, aGoto19);
            /*93*/states[25].Add(st.Array枝, aGoto20);
            /*94*/states[25].Add(st.@LeftBrace符, aShift4);
            /*95*/states[25].Add(st.@LeftBracket符, aShift5);
            // syntaxStates[26]:
            // [6] Members : Members ',' Member ⏳ ;☕ ',' '}' 
            /*96*/states[26].Add(st.@Comma符, aReduce6);
            /*97*/states[26].Add(st.@RightBrace符, aReduce6);
            // syntaxStates[27]:
            // [10] Member : 'string' ':' Value ⏳ ;☕ ',' '}' 
            /*98*/states[27].Add(st.@Comma符, aReduce10);
            /*99*/states[27].Add(st.@RightBrace符, aReduce10);
            // syntaxStates[28]:
            // [8] Elements : Elements ',' Element ⏳ ;☕ ',' ']' 
            /*100*/states[28].Add(st.@Comma符, aReduce8);
            /*101*/states[28].Add(st.@RightBracket符, aReduce8);
            #endregion init actions of syntax states

            return states;
        }
    }
}

The other 3 Json.Dict.*.gen.cs_ files are the syntax-analysis state machines for LR(0), SLR(1), and LR(1). They are not covered further here.

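This is the original and most intuitive implementation, and it has been superseded by a more efficient one. The folder is now kept only for study and reference, so I changed the extension of its C# files from cs to cs_ to keep them from being compiled.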
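For readers who have not seen such a table in action, the following sketch shows how one Dictionary per state can drive a shift-reduce loop. It is a simplified illustration, not bitParser's driver: `ParseAction`, `DictLrSketch`, and the symbol codes are assumptions, and only states 0, 1, 3, 5 and 10 of the listing are kept, restricted to what the input [] needs.

```csharp
using System;
using System.Collections.Generic;

public enum Kind { Shift, Goto, Reduce, Accept }

// Hypothetical stand-in for LRParseAction: Target is a state index for
// Shift/Goto and a rule index for Reduce.
public readonly record struct ParseAction(Kind Kind, int Target);

public static class DictLrSketch {
    // Symbol codes assumed for this sketch only.
    public const int End = 0, LeftBracket = 3, RightBracket = 4, JsonNt = 100, ArrayNt = 101;

    // (left-hand side, length of right-hand side) of rules
    // [1] Json : Array and [4] Array : '[' ']'.
    static readonly Dictionary<int, (int Left, int Length)> Rules = new() {
        [1] = (JsonNt, 1),
        [4] = (ArrayNt, 2),
    };

    static readonly Dictionary<int, Dictionary<int, ParseAction>> States = new() {
        [0] = new() {
            [JsonNt] = new(Kind.Goto, 1),
            [ArrayNt] = new(Kind.Goto, 3),
            [LeftBracket] = new(Kind.Shift, 5),
        },
        [1] = new() { [End] = new(Kind.Accept, 0) },
        [3] = new() { [End] = new(Kind.Reduce, 1) },
        [5] = new() { [RightBracket] = new(Kind.Shift, 10) },
        [10] = new() { [End] = new(Kind.Reduce, 4) },
    };

    public static bool Parse(IReadOnlyList<int> tokens) {
        var stack = new Stack<int>();
        stack.Push(0);
        int i = 0;
        while (true) {
            int lookahead = i < tokens.Count ? tokens[i] : End;
            // no entry for the lookahead means a syntax error
            if (!States[stack.Peek()].TryGetValue(lookahead, out var action)) { return false; }
            switch (action.Kind) {
            case Kind.Shift:
                stack.Push(action.Target);
                i++;
                break;
            case Kind.Reduce:
                var (left, length) = Rules[action.Target];
                for (int k = 0; k < length; k++) { stack.Pop(); }
                // after popping the right-hand side, follow the Goto for the left-hand side
                if (!States[stack.Peek()].TryGetValue(left, out var go)) { return false; }
                stack.Push(go.Target);
                break;
            case Kind.Accept:
                return true;
            default:
                return false;
            }
        }
    }
}
```

`DictLrSketch.Parse(new[] { 3, 4 })` shifts both brackets, reduces by rules 4 and 1, and accepts. An unclosed `[` finds no entry for the end marker in state 5 and is rejected.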

int[]+LRParseAction[]

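Json.Table.LALR(1).gen.cs_ is the LALR(1) syntax-analysis state machine in which each syntax state is an object holding an int[] and an LRParseAction[]. Each pair int[t] and LRParseAction[t] together replaces one key/value pair of the Dictionary<int, LRParseAction> object, which reduces memory usage and slightly improves running speed.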
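The parallel-array idea can be sketched as below. This is an assumption-laden illustration rather than the real LRParseState: `ParallelArrayState` is a hypothetical class, strings stand in for LRParseAction, and the binary search over sorted keys is my choice for the example (the real lookup strategy is not shown in this article).

```csharp
using System;

public sealed class ParallelArrayState {
    private readonly int[] keys;      // symbol codes, sorted ascending
    private readonly string[] values; // values[t] belongs to keys[t]

    public ParallelArrayState(int[] sortedKeys, string[] values) {
        if (sortedKeys.Length != values.Length) {
            throw new ArgumentException("keys and values must have the same length");
        }
        this.keys = sortedKeys;
        this.values = values;
    }

    // Same contract as Dictionary.TryGetValue, but backed by two flat arrays.
    public bool TryGetValue(int key, out string value) {
        int t = Array.BinarySearch(keys, key);
        if (t >= 0) { value = values[t]; return true; }
        value = "";
        return false;
    }
}
```

Two flat arrays avoid the per-dictionary buckets and entry records, which is where the memory saving comes from when a state holds only a handful of actions.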

Json.Table.LALR(1).gen.cs_

using System;
using bitzhuwei.Compiler;

namespace bitzhuwei.JsonFormat {
    partial class CompilerJson {

        private static LRParseState[] InitializeSyntaxStates() {
            const int syntaxStateCount = 29;
            var states = new LRParseState[syntaxStateCount];
            // 102 actions
            // conflicts(0)=not solved(0)+solved(0)(0 warnings)
            for (var i = 0; i < syntaxStateCount; i++) { states[i] = new(); }

            #region re-used actions
            LRParseAction aShift4 = new(LRParseAction.Kind.Shift, states[4]);// referred 4 times
            LRParseAction aShift5 = new(LRParseAction.Kind.Shift, states[5]);// referred 4 times
            LRParseAction aShift9 = new(LRParseAction.Kind.Shift, states[9]);// referred 2 times
            LRParseAction aGoto13 = new(LRParseAction.Kind.Goto, states[13]);// referred 2 times
            LRParseAction aShift14 = new(LRParseAction.Kind.Shift, states[14]);// referred 3 times
            LRParseAction aShift15 = new(LRParseAction.Kind.Shift, states[15]);// referred 3 times
            LRParseAction aShift16 = new(LRParseAction.Kind.Shift, states[16]);// referred 3 times
            LRParseAction aShift17 = new(LRParseAction.Kind.Shift, states[17]);// referred 3 times
            LRParseAction aShift18 = new(LRParseAction.Kind.Shift, states[18]);// referred 3 times
            LRParseAction aGoto19 = new(LRParseAction.Kind.Goto, states[19]);// referred 3 times
            LRParseAction aGoto20 = new(LRParseAction.Kind.Goto, states[20]);// referred 3 times
            LRParseAction aReduce2 = new(regulations[2]);// referred 4 times
            LRParseAction aReduce7 = new(regulations[7]);// referred 2 times
            LRParseAction aReduce4 = new(regulations[4]);// referred 4 times
            LRParseAction aReduce9 = new(regulations[9]);// referred 2 times
            LRParseAction aReduce11 = new(regulations[11]);// referred 2 times
            LRParseAction aReduce12 = new(regulations[12]);// referred 3 times
            LRParseAction aReduce13 = new(regulations[13]);// referred 3 times
            LRParseAction aReduce14 = new(regulations[14]);// referred 3 times
            LRParseAction aReduce15 = new(regulations[15]);// referred 3 times
            LRParseAction aReduce16 = new(regulations[16]);// referred 3 times
            LRParseAction aReduce17 = new(regulations[17]);// referred 3 times
            LRParseAction aReduce18 = new(regulations[18]);// referred 3 times
            LRParseAction aReduce3 = new(regulations[3]);// referred 4 times
            LRParseAction aReduce5 = new(regulations[5]);// referred 4 times
            LRParseAction aReduce6 = new(regulations[6]);// referred 2 times
            LRParseAction aReduce10 = new(regulations[10]);// referred 2 times
            LRParseAction aReduce8 = new(regulations[8]);// referred 2 times
            #endregion re-used actions

            // 102 actions
            // conflicts(0)=not sovled(0)+solved(0)(0 warnings)
            #region init actions of syntax states
            // syntaxStates[0]:
            // [-1] Json' : ⏳ Json ;☕ '¥' 
            // [0] Json : ⏳ Object ;☕ '¥' 
            // [1] Json : ⏳ Array ;☕ '¥' 
            // [2] Object : ⏳ '{' '}' ;☕ '¥' 
            // [3] Object : ⏳ '{' Members '}' ;☕ '¥' 
            // [4] Array : ⏳ '[' ']' ;☕ '¥' 
            // [5] Array : ⏳ '[' Elements ']' ;☕ '¥' 
            states[0].nodes = new int[] {
                /*0*/st.@LeftBrace符, // (1) -> aShift4
                /*1*/st.@LeftBracket符, // (3) -> aShift5
                /*2*/st.Json枝, // (12) -> new(LRParseAction.Kind.Goto, states[1])
                /*3*/st.Object枝, // (13) -> new(LRParseAction.Kind.Goto, states[2])
                /*4*/st.Array枝, // (14) -> new(LRParseAction.Kind.Goto, states[3])
            };
            states[0].actions = new LRParseAction[] {
                /*0*//* st.@LeftBrace符(1), */aShift4,
                /*1*//* st.@LeftBracket符(3), */aShift5,
                /*2*//* st.Json枝(12), */new(LRParseAction.Kind.Goto, states[1]),
                /*3*//* st.Object枝(13), */new(LRParseAction.Kind.Goto, states[2]),
                /*4*//* st.Array枝(14), */new(LRParseAction.Kind.Goto, states[3]),
            };
            // syntaxStates[1]:
            // [-1] Json' : Json ⏳ ;☕ '¥' 
            states[1].nodes = new int[] {
                /*5*/st.@终, // (0) -> LRParseAction.accept
            };
            states[1].actions = new LRParseAction[] {
                /*5*//* st.@终(0), */LRParseAction.accept,
            };
            // syntaxStates[2]:
            // [0] Json : Object ⏳ ;☕ '¥' 
            states[2].nodes = new int[] {
                /*6*/st.@终, // (0) -> new(regulations[0])
            };
            states[2].actions = new LRParseAction[] {
                /*6*//* st.@终(0), */new(regulations[0]),
            };
            // syntaxStates[3]:
            // [1] Json : Array ⏳ ;☕ '¥' 
            states[3].nodes = new int[] {
                /*7*/st.@终, // (0) -> new(regulations[1])
            };
            states[3].actions = new LRParseAction[] {
                /*7*//* st.@终(0), */new(regulations[1]),
            };
            // syntaxStates[4]:
            // [2] Object : '{' ⏳ '}' ;☕ ',' ']' '}' '¥' 
            // [3] Object : '{' ⏳ Members '}' ;☕ ',' ']' '}' '¥' 
            // [6] Members : ⏳ Members ',' Member ;☕ ',' '}' 
            // [7] Members : ⏳ Member ;☕ ',' '}' 
            // [10] Member : ⏳ 'string' ':' Value ;☕ ',' '}' 
            states[4].nodes = new int[] {
                /*8*/st.@RightBrace符, // (2) -> new(LRParseAction.Kind.Shift, states[6])
                /*9*/st.@string, // (6) -> aShift9
                /*10*/st.Members枝, // (15) -> new(LRParseAction.Kind.Goto, states[7])
                /*11*/st.Member枝, // (17) -> new(LRParseAction.Kind.Goto, states[8])
            };
            states[4].actions = new LRParseAction[] {
                /*8*//* st.@RightBrace符(2), */new(LRParseAction.Kind.Shift, states[6]),
                /*9*//* st.@string(6), */aShift9,
                /*10*//* st.Members枝(15), */new(LRParseAction.Kind.Goto, states[7]),
                /*11*//* st.Member枝(17), */new(LRParseAction.Kind.Goto, states[8]),
            };
            // syntaxStates[5]:
            // [4] Array : '[' ⏳ ']' ;☕ ',' ']' '}' '¥' 
            // [5] Array : '[' ⏳ Elements ']' ;☕ ',' ']' '}' '¥' 
            // [8] Elements : ⏳ Elements ',' Element ;☕ ',' ']' 
            // [9] Elements : ⏳ Element ;☕ ',' ']' 
            // [11] Element : ⏳ Value ;☕ ',' ']' 
            // [12] Value : ⏳ 'null' ;☕ ',' ']' 
            // [13] Value : ⏳ 'true' ;☕ ',' ']' 
            // [14] Value : ⏳ 'false' ;☕ ',' ']' 
            // [15] Value : ⏳ 'number' ;☕ ',' ']' 
            // [16] Value : ⏳ 'string' ;☕ ',' ']' 
            // [17] Value : ⏳ Object ;☕ ',' ']' 
            // [18] Value : ⏳ Array ;☕ ',' ']' 
            // [2] Object : ⏳ '{' '}' ;☕ ',' ']' 
            // [3] Object : ⏳ '{' Members '}' ;☕ ',' ']' 
            // [4] Array : ⏳ '[' ']' ;☕ ',' ']' 
            // [5] Array : ⏳ '[' Elements ']' ;☕ ',' ']' 
            states[5].nodes = new int[] {
                /*12*/st.@LeftBrace符, // (1) -> aShift4
                /*13*/st.@LeftBracket符, // (3) -> aShift5
                /*14*/st.@RightBracket符, // (4) -> new(LRParseAction.Kind.Shift, states[10])
                /*15*/st.@string, // (6) -> aShift18
                /*16*/st.@null, // (8) -> aShift14
                /*17*/st.@true, // (9) -> aShift15
                /*18*/st.@false, // (10) -> aShift16
                /*19*/st.@number, // (11) -> aShift17
                /*20*/st.Object枝, // (13) -> aGoto19
                /*21*/st.Array枝, // (14) -> aGoto20
                /*22*/st.Elements枝, // (16) -> new(LRParseAction.Kind.Goto, states[11])
                /*23*/st.Element枝, // (18) -> new(LRParseAction.Kind.Goto, states[12])
                /*24*/st.Value枝, // (19) -> aGoto13
            };
            states[5].actions = new LRParseAction[] {
                /*12*//* st.@LeftBrace符(1), */aShift4,
                /*13*//* st.@LeftBracket符(3), */aShift5,
                /*14*//* st.@RightBracket符(4), */new(LRParseAction.Kind.Shift, states[10]),
                /*15*//* st.@string(6), */aShift18,
                /*16*//* st.@null(8), */aShift14,
                /*17*//* st.@true(9), */aShift15,
                /*18*//* st.@false(10), */aShift16,
                /*19*//* st.@number(11), */aShift17,
                /*20*//* st.Object枝(13), */aGoto19,
                /*21*//* st.Array枝(14), */aGoto20,
                /*22*//* st.Elements枝(16), */new(LRParseAction.Kind.Goto, states[11]),
                /*23*//* st.Element枝(18), */new(LRParseAction.Kind.Goto, states[12]),
                /*24*//* st.Value枝(19), */aGoto13,
            };
            // syntaxStates[6]:
            // [2] Object : '{' '}' ⏳ ;☕ ',' ']' '}' '¥' 
            states[6].nodes = new int[] {
                /*25*/st.@终, // (0) -> aReduce2
                /*26*/st.@RightBrace符, // (2) -> aReduce2
                /*27*/st.@RightBracket符, // (4) -> aReduce2
                /*28*/st.@Comma符, // (5) -> aReduce2
            };
            states[6].actions = new LRParseAction[] {
                /*25*//* st.@终(0), */aReduce2,
                /*26*//* st.@RightBrace符(2), */aReduce2,
                /*27*//* st.@RightBracket符(4), */aReduce2,
                /*28*//* st.@Comma符(5), */aReduce2,
            };
            // syntaxStates[7]:
            // [3] Object : '{' Members ⏳ '}' ;☕ ',' ']' '}' '¥' 
            // [6] Members : Members ⏳ ',' Member ;☕ ',' '}' 
            states[7].nodes = new int[] {
                /*29*/st.@RightBrace符, // (2) -> new(LRParseAction.Kind.Shift, states[21])
                /*30*/st.@Comma符, // (5) -> new(LRParseAction.Kind.Shift, states[22])
            };
            states[7].actions = new LRParseAction[] {
                /*29*//* st.@RightBrace符(2), */new(LRParseAction.Kind.Shift, states[21]),
                /*30*//* st.@Comma符(5), */new(LRParseAction.Kind.Shift, states[22]),
            };
            // syntaxStates[8]:
            // [7] Members : Member ⏳ ;☕ ',' '}' 
            states[8].nodes = new int[] {
                /*31*/st.@RightBrace符, // (2) -> aReduce7
                /*32*/st.@Comma符, // (5) -> aReduce7
            };
            states[8].actions = new LRParseAction[] {
                /*31*//* st.@RightBrace符(2), */aReduce7,
                /*32*//* st.@Comma符(5), */aReduce7,
            };
            // syntaxStates[9]:
            // [10] Member : 'string' ⏳ ':' Value ;☕ ',' '}' 
            states[9].nodes = new int[] {
                /*33*/st.@Colon符, // (7) -> new(LRParseAction.Kind.Shift, states[23])
            };
            states[9].actions = new LRParseAction[] {
                /*33*//* st.@Colon符(7), */new(LRParseAction.Kind.Shift, states[23]),
            };
            // syntaxStates[10]:
            // [4] Array : '[' ']' ⏳ ;☕ ',' ']' '}' '¥' 
            states[10].nodes = new int[] {
                /*34*/st.@终, // (0) -> aReduce4
                /*35*/st.@RightBrace符, // (2) -> aReduce4
                /*36*/st.@RightBracket符, // (4) -> aReduce4
                /*37*/st.@Comma符, // (5) -> aReduce4
            };
            states[10].actions = new LRParseAction[] {
                /*34*//* st.@终(0), */aReduce4,
                /*35*//* st.@RightBrace符(2), */aReduce4,
                /*36*//* st.@RightBracket符(4), */aReduce4,
                /*37*//* st.@Comma符(5), */aReduce4,
            };
            // syntaxStates[11]:
            // [5] Array : '[' Elements ⏳ ']' ;☕ ',' ']' '}' '¥' 
            // [8] Elements : Elements ⏳ ',' Element ;☕ ',' ']' 
            states[11].nodes = new int[] {
                /*38*/st.@RightBracket符, // (4) -> new(LRParseAction.Kind.Shift, states[24])
                /*39*/st.@Comma符, // (5) -> new(LRParseAction.Kind.Shift, states[25])
            };
            states[11].actions = new LRParseAction[] {
                /*38*//* st.@RightBracket符(4), */new(LRParseAction.Kind.Shift, states[24]),
                /*39*//* st.@Comma符(5), */new(LRParseAction.Kind.Shift, states[25]),
            };
            // syntaxStates[12]:
            // [9] Elements : Element ⏳ ;☕ ',' ']' 
            states[12].nodes = new int[] {
                /*40*/st.@RightBracket符, // (4) -> aReduce9
                /*41*/st.@Comma符, // (5) -> aReduce9
            };
            states[12].actions = new LRParseAction[] {
                /*40*//* st.@RightBracket符(4), */aReduce9,
                /*41*//* st.@Comma符(5), */aReduce9,
            };
            // syntaxStates[13]:
            // [11] Element : Value ⏳ ;☕ ',' ']' 
            states[13].nodes = new int[] {
                /*42*/st.@RightBracket符, // (4) -> aReduce11
                /*43*/st.@Comma符, // (5) -> aReduce11
            };
            states[13].actions = new LRParseAction[] {
                /*42*//* st.@RightBracket符(4), */aReduce11,
                /*43*//* st.@Comma符(5), */aReduce11,
            };
            // syntaxStates[14]:
            // [12] Value : 'null' ⏳ ;☕ ',' ']' '}' 
            states[14].nodes = new int[] {
                /*44*/st.@RightBrace符, // (2) -> aReduce12
                /*45*/st.@RightBracket符, // (4) -> aReduce12
                /*46*/st.@Comma符, // (5) -> aReduce12
            };
            states[14].actions = new LRParseAction[] {
                /*44*//* st.@RightBrace符(2), */aReduce12,
                /*45*//* st.@RightBracket符(4), */aReduce12,
                /*46*//* st.@Comma符(5), */aReduce12,
            };
            // syntaxStates[15]:
            // [13] Value : 'true' ⏳ ;☕ ',' ']' '}' 
            states[15].nodes = new int[] {
                /*47*/st.@RightBrace符, // (2) -> aReduce13
                /*48*/st.@RightBracket符, // (4) -> aReduce13
                /*49*/st.@Comma符, // (5) -> aReduce13
            };
            states[15].actions = new LRParseAction[] {
                /*47*//* st.@RightBrace符(2), */aReduce13,
                /*48*//* st.@RightBracket符(4), */aReduce13,
                /*49*//* st.@Comma符(5), */aReduce13,
            };
            // syntaxStates[16]:
            // [14] Value : 'false' ⏳ ;☕ ',' ']' '}' 
            states[16].nodes = new int[] {
                /*50*/st.@RightBrace符, // (2) -> aReduce14
                /*51*/st.@RightBracket符, // (4) -> aReduce14
                /*52*/st.@Comma符, // (5) -> aReduce14
            };
            states[16].actions = new LRParseAction[] {
                /*50*//* st.@RightBrace符(2), */aReduce14,
                /*51*//* st.@RightBracket符(4), */aReduce14,
                /*52*//* st.@Comma符(5), */aReduce14,
            };
            // syntaxStates[17]:
            // [15] Value : 'number' ⏳ ;☕ ',' ']' '}' 
            states[17].nodes = new int[] {
                /*53*/st.@RightBrace符, // (2) -> aReduce15
                /*54*/st.@RightBracket符, // (4) -> aReduce15
                /*55*/st.@Comma符, // (5) -> aReduce15
            };
            states[17].actions = new LRParseAction[] {
                /*53*//* st.@RightBrace符(2), */aReduce15,
                /*54*//* st.@RightBracket符(4), */aReduce15,
                /*55*//* st.@Comma符(5), */aReduce15,
            };
            // syntaxStates[18]:
            // [16] Value : 'string' ⏳ ;☕ ',' ']' '}' 
            states[18].nodes = new int[] {
                /*56*/st.@RightBrace符, // (2) -> aReduce16
                /*57*/st.@RightBracket符, // (4) -> aReduce16
                /*58*/st.@Comma符, // (5) -> aReduce16
            };
            states[18].actions = new LRParseAction[] {
                /*56*//* st.@RightBrace符(2), */aReduce16,
                /*57*//* st.@RightBracket符(4), */aReduce16,
                /*58*//* st.@Comma符(5), */aReduce16,
            };
            // syntaxStates[19]:
            // [17] Value : Object ⏳ ;☕ ',' ']' '}' 
            states[19].nodes = new int[] {
                /*59*/st.@RightBrace符, // (2) -> aReduce17
                /*60*/st.@RightBracket符, // (4) -> aReduce17
                /*61*/st.@Comma符, // (5) -> aReduce17
            };
            states[19].actions = new LRParseAction[] {
                /*59*//* st.@RightBrace符(2), */aReduce17,
                /*60*//* st.@RightBracket符(4), */aReduce17,
                /*61*//* st.@Comma符(5), */aReduce17,
            };
            // syntaxStates[20]:
            // [18] Value : Array ⏳ ;☕ ',' ']' '}' 
            states[20].nodes = new int[] {
                /*62*/st.@RightBrace符, // (2) -> aReduce18
                /*63*/st.@RightBracket符, // (4) -> aReduce18
                /*64*/st.@Comma符, // (5) -> aReduce18
            };
            states[20].actions = new LRParseAction[] {
                /*62*//* st.@RightBrace符(2), */aReduce18,
                /*63*//* st.@RightBracket符(4), */aReduce18,
                /*64*//* st.@Comma符(5), */aReduce18,
            };
            // syntaxStates[21]:
            // [3] Object : '{' Members '}' ⏳ ;☕ ',' ']' '}' '¥' 
            states[21].nodes = new int[] {
                /*65*/st.@终, // (0) -> aReduce3
                /*66*/st.@RightBrace符, // (2) -> aReduce3
                /*67*/st.@RightBracket符, // (4) -> aReduce3
                /*68*/st.@Comma符, // (5) -> aReduce3
            };
            states[21].actions = new LRParseAction[] {
                /*65*//* st.@终(0), */aReduce3,
                /*66*//* st.@RightBrace符(2), */aReduce3,
                /*67*//* st.@RightBracket符(4), */aReduce3,
                /*68*//* st.@Comma符(5), */aReduce3,
            };
            // syntaxStates[22]:
            // [6] Members : Members ',' ⏳ Member ;☕ ',' '}' 
            // [10] Member : ⏳ 'string' ':' Value ;☕ ',' '}' 
            states[22].nodes = new int[] {
                /*69*/st.@string, // (6) -> aShift9
                /*70*/st.Member枝, // (17) -> new(LRParseAction.Kind.Goto, states[26])
            };
            states[22].actions = new LRParseAction[] {
                /*69*//* st.@string(6), */aShift9,
                /*70*//* st.Member枝(17), */new(LRParseAction.Kind.Goto, states[26]),
            };
            // syntaxStates[23]:
            // [10] Member : 'string' ':' ⏳ Value ;☕ ',' '}' 
            // [12] Value : ⏳ 'null' ;☕ ',' '}' 
            // [13] Value : ⏳ 'true' ;☕ ',' '}' 
            // [14] Value : ⏳ 'false' ;☕ ',' '}' 
            // [15] Value : ⏳ 'number' ;☕ ',' '}' 
            // [16] Value : ⏳ 'string' ;☕ ',' '}' 
            // [17] Value : ⏳ Object ;☕ ',' '}' 
            // [18] Value : ⏳ Array ;☕ ',' '}' 
            // [2] Object : ⏳ '{' '}' ;☕ ',' '}' 
            // [3] Object : ⏳ '{' Members '}' ;☕ ',' '}' 
            // [4] Array : ⏳ '[' ']' ;☕ ',' '}' 
            // [5] Array : ⏳ '[' Elements ']' ;☕ ',' '}' 
            states[23].nodes = new int[] {
                /*71*/st.@LeftBrace符, // (1) -> aShift4
                /*72*/st.@LeftBracket符, // (3) -> aShift5
                /*73*/st.@string, // (6) -> aShift18
                /*74*/st.@null, // (8) -> aShift14
                /*75*/st.@true, // (9) -> aShift15
                /*76*/st.@false, // (10) -> aShift16
                /*77*/st.@number, // (11) -> aShift17
                /*78*/st.Object枝, // (13) -> aGoto19
                /*79*/st.Array枝, // (14) -> aGoto20
                /*80*/st.Value枝, // (19) -> new(LRParseAction.Kind.Goto, states[27])
            };
            states[23].actions = new LRParseAction[] {
                /*71*//* st.@LeftBrace符(1), */aShift4,
                /*72*//* st.@LeftBracket符(3), */aShift5,
                /*73*//* st.@string(6), */aShift18,
                /*74*//* st.@null(8), */aShift14,
                /*75*//* st.@true(9), */aShift15,
                /*76*//* st.@false(10), */aShift16,
                /*77*//* st.@number(11), */aShift17,
                /*78*//* st.Object枝(13), */aGoto19,
                /*79*//* st.Array枝(14), */aGoto20,
                /*80*//* st.Value枝(19), */new(LRParseAction.Kind.Goto, states[27]),
            };
            // syntaxStates[24]:
            // [5] Array : '[' Elements ']' ⏳ ;☕ ',' ']' '}' '¥' 
            states[24].nodes = new int[] {
                /*81*/st.@终, // (0) -> aReduce5
                /*82*/st.@RightBrace符, // (2) -> aReduce5
                /*83*/st.@RightBracket符, // (4) -> aReduce5
                /*84*/st.@Comma符, // (5) -> aReduce5
            };
            states[24].actions = new LRParseAction[] {
                /*81*//* st.@终(0), */aReduce5,
                /*82*//* st.@RightBrace符(2), */aReduce5,
                /*83*//* st.@RightBracket符(4), */aReduce5,
                /*84*//* st.@Comma符(5), */aReduce5,
            };
            // syntaxStates[25]:
            // [8] Elements : Elements ',' ⏳ Element ;☕ ',' ']' 
            // [11] Element : ⏳ Value ;☕ ',' ']' 
            // [12] Value : ⏳ 'null' ;☕ ',' ']' 
            // [13] Value : ⏳ 'true' ;☕ ',' ']' 
            // [14] Value : ⏳ 'false' ;☕ ',' ']' 
            // [15] Value : ⏳ 'number' ;☕ ',' ']' 
            // [16] Value : ⏳ 'string' ;☕ ',' ']' 
            // [17] Value : ⏳ Object ;☕ ',' ']' 
            // [18] Value : ⏳ Array ;☕ ',' ']' 
            // [2] Object : ⏳ '{' '}' ;☕ ',' ']' 
            // [3] Object : ⏳ '{' Members '}' ;☕ ',' ']' 
            // [4] Array : ⏳ '[' ']' ;☕ ',' ']' 
            // [5] Array : ⏳ '[' Elements ']' ;☕ ',' ']' 
            states[25].nodes = new int[] {
                /*85*/st.@LeftBrace符, // (1) -> aShift4
                /*86*/st.@LeftBracket符, // (3) -> aShift5
                /*87*/st.@string, // (6) -> aShift18
                /*88*/st.@null, // (8) -> aShift14
                /*89*/st.@true, // (9) -> aShift15
                /*90*/st.@false, // (10) -> aShift16
                /*91*/st.@number, // (11) -> aShift17
                /*92*/st.Object枝, // (13) -> aGoto19
                /*93*/st.Array枝, // (14) -> aGoto20
                /*94*/st.Element枝, // (18) -> new(LRParseAction.Kind.Goto, states[28])
                /*95*/st.Value枝, // (19) -> aGoto13
            };
            states[25].actions = new LRParseAction[] {
                /*85*//* st.@LeftBrace符(1), */aShift4,
                /*86*//* st.@LeftBracket符(3), */aShift5,
                /*87*//* st.@string(6), */aShift18,
                /*88*//* st.@null(8), */aShift14,
                /*89*//* st.@true(9), */aShift15,
                /*90*//* st.@false(10), */aShift16,
                /*91*//* st.@number(11), */aShift17,
                /*92*//* st.Object枝(13), */aGoto19,
                /*93*//* st.Array枝(14), */aGoto20,
                /*94*//* st.Element枝(18), */new(LRParseAction.Kind.Goto, states[28]),
                /*95*//* st.Value枝(19), */aGoto13,
            };
            // syntaxStates[26]:
            // [6] Members : Members ',' Member ⏳ ;☕ ',' '}' 
            states[26].nodes = new int[] {
                /*96*/st.@RightBrace符, // (2) -> aReduce6
                /*97*/st.@Comma符, // (5) -> aReduce6
            };
            states[26].actions = new LRParseAction[] {
                /*96*//* st.@RightBrace符(2), */aReduce6,
                /*97*//* st.@Comma符(5), */aReduce6,
            };
            // syntaxStates[27]:
            // [10] Member : 'string' ':' Value ⏳ ;☕ ',' '}' 
            states[27].nodes = new int[] {
                /*98*/st.@RightBrace符, // (2) -> aReduce10
                /*99*/st.@Comma符, // (5) -> aReduce10
            };
            states[27].actions = new LRParseAction[] {
                /*98*//* st.@RightBrace符(2), */aReduce10,
                /*99*//* st.@Comma符(5), */aReduce10,
            };
            // syntaxStates[28]:
            // [8] Elements : Elements ',' Element ⏳ ;☕ ',' ']' 
            states[28].nodes = new int[] {
                /*100*/st.@RightBracket符, // (4) -> aReduce8
                /*101*/st.@Comma符, // (5) -> aReduce8
            };
            states[28].actions = new LRParseAction[] {
                /*100*//* st.@RightBracket符(4), */aReduce8,
                /*101*//* st.@Comma符(5), */aReduce8,
            };
            #endregion init actions of syntax states

            return states;
        }
    }
}

The other 4 Json.Table.*.gen.cs_ files are the syntax-parsing state machines for LL(1), LR(0), SLR(1) and LR(1) respectively, so I will not go through them here.

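This was the second implementation, and it too has been replaced by a more efficient one. The folder is now kept for study and reference only, so I changed the C# files' extension from cs to cs_ to keep them from being compiled.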

Json.Table.*.gen.bin

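As with the lexical analyzer, the syntax-parsing table in array form (int[]+LRParseAction[]) is written to a binary file. When the Json parser is loaded, reading this file restores the table in the same array form. The whole table therefore no longer needs to be hard-coded in the source, which reduces memory usage further.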
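As a rough illustration of the write-then-load idea, the sketch below round-trips one int[] through a binary stream. The layout (a length prefix followed by the values) and the names `TableBinarySketch`, `WriteNodes`, `ReadNodes` and `RoundTrip` are assumptions made for this example only. The real layout of the .gen.bin files is not described in this article.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical sketch: the real .gen.bin layout is not documented here.
public static class TableBinarySketch {
    // Writes the array as a 32-bit length followed by the 32-bit values.
    public static void WriteNodes(BinaryWriter writer, int[] nodes) {
        writer.Write(nodes.Length);
        foreach (var node in nodes) { writer.Write(node); }
    }

    // Reads back an array written by WriteNodes.
    public static int[] ReadNodes(BinaryReader reader) {
        var count = reader.ReadInt32();
        var nodes = new int[count];
        for (var i = 0; i < count; i++) { nodes[i] = reader.ReadInt32(); }
        return nodes;
    }

    // Writes to an in-memory stream, rewinds it, and reads the array back.
    public static int[] RoundTrip(int[] nodes) {
        using var stream = new MemoryStream();
        using (var writer = new BinaryWriter(stream, Encoding.UTF8, leaveOpen: true)) {
            WriteNodes(writer, nodes);
        }
        stream.Position = 0;
        using var reader = new BinaryReader(stream);
        return ReadNodes(reader);
    }
}
```

A real loader would open the file with a FileStream instead of a MemoryStream and would also have to rebuild the LRParseAction[] side of each state.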

To make debugging and reference easier, I also provide a matching text format. For example, the LALR(1) syntax-parsing table:

Json.Table.LALR(1).gen.txt

conflicts(0)=not sovled(0)+solved(0)(0 warnings)

29 states.
28 re-used actions
[0]:Shift[4] [1]:Shift[5] [2]:Shift[9] [3]:Goto[13] 
[4]:Shift[14] [5]:Shift[15] [6]:Shift[16] [7]:Shift[17] [8]:Shift[18] 
[9]:Goto[19] [10]:Goto[20] [11]:Reduce[2] [12]:Reduce[7] [13]:Reduce[4] 
[14]:Reduce[9] [15]:Reduce[11] [16]:Reduce[12] [17]:Reduce[13] [18]:Reduce[14] 
[19]:Reduce[15] [20]:Reduce[16] [21]:Reduce[17] [22]:Reduce[18] [23]:Reduce[3] 
[24]:Reduce[5] [25]:Reduce[6] [26]:Reduce[10] [27]:Reduce[8] 
states[0].nodes[5]:
1 3 12 13 14 
states[0].actions[5]:
-4(0)Shift[4] -4(1)Shift[5] Goto[1] Goto[2] Goto[3] 
states[1].nodes[1]:
0 
states[1].actions[1]:
Accept[0] 
states[2].nodes[1]:
0 
states[2].actions[1]:
Reduce[0] 
states[3].nodes[1]:
0 
states[3].actions[1]:
Reduce[1] 
states[4].nodes[4]:
2 6 15 17 
states[4].actions[4]:
Shift[6] -2(2)Shift[9] Goto[7] Goto[8] 
states[5].nodes[13]:
1 3 4 6 8 9 10 11 13 14 16 18 19 
states[5].actions[13]:
-4(0)Shift[4] -4(1)Shift[5] Shift[10] -3(8)Shift[18] -3(4)Shift[14] -3(5)Shift[15] -3(6)Shift[16] -3(7)Shift[17] -3(9)Goto[19] -3(10)Goto[20] Goto[11] Goto[12] -2(3)Goto[13] 
states[6].nodes[4]:
0 2 4 5 
states[6].actions[4]:
-4(11)Reduce[2] -4(11)Reduce[2] -4(11)Reduce[2] -4(11)Reduce[2] 
states[7].nodes[2]:
2 5 
states[7].actions[2]:
Shift[21] Shift[22] 
states[8].nodes[2]:
2 5 
states[8].actions[2]:
-2(12)Reduce[7] -2(12)Reduce[7] 
states[9].nodes[1]:
7 
states[9].actions[1]:
Shift[23] 
states[10].nodes[4]:
0 2 4 5 
states[10].actions[4]:
-4(13)Reduce[4] -4(13)Reduce[4] -4(13)Reduce[4] -4(13)Reduce[4] 
states[11].nodes[2]:
4 5 
states[11].actions[2]:
Shift[24] Shift[25] 
states[12].nodes[2]:
4 5 
states[12].actions[2]:
-2(14)Reduce[9] -2(14)Reduce[9] 
states[13].nodes[2]:
4 5 
states[13].actions[2]:
-2(15)Reduce[11] -2(15)Reduce[11] 
states[14].nodes[3]:
2 4 5 
states[14].actions[3]:
-3(16)Reduce[12] -3(16)Reduce[12] -3(16)Reduce[12] 
states[15].nodes[3]:
2 4 5 
states[15].actions[3]:
-3(17)Reduce[13] -3(17)Reduce[13] -3(17)Reduce[13] 
states[16].nodes[3]:
2 4 5 
states[16].actions[3]:
-3(18)Reduce[14] -3(18)Reduce[14] -3(18)Reduce[14] 
states[17].nodes[3]:
2 4 5 
states[17].actions[3]:
-3(19)Reduce[15] -3(19)Reduce[15] -3(19)Reduce[15] 
states[18].nodes[3]:
2 4 5 
states[18].actions[3]:
-3(20)Reduce[16] -3(20)Reduce[16] -3(20)Reduce[16] 
states[19].nodes[3]:
2 4 5 
states[19].actions[3]:
-3(21)Reduce[17] -3(21)Reduce[17] -3(21)Reduce[17] 
states[20].nodes[3]:
2 4 5 
states[20].actions[3]:
-3(22)Reduce[18] -3(22)Reduce[18] -3(22)Reduce[18] 
states[21].nodes[4]:
0 2 4 5 
states[21].actions[4]:
-4(23)Reduce[3] -4(23)Reduce[3] -4(23)Reduce[3] -4(23)Reduce[3] 
states[22].nodes[2]:
6 17 
states[22].actions[2]:
-2(2)Shift[9] Goto[26] 
states[23].nodes[10]:
1 3 6 8 9 10 11 13 14 19 
states[23].actions[10]:
-4(0)Shift[4] -4(1)Shift[5] -3(8)Shift[18] -3(4)Shift[14] -3(5)Shift[15] -3(6)Shift[16] -3(7)Shift[17] -3(9)Goto[19] -3(10)Goto[20] Goto[27] 
states[24].nodes[4]:
0 2 4 5 
states[24].actions[4]:
-4(24)Reduce[5] -4(24)Reduce[5] -4(24)Reduce[5] -4(24)Reduce[5] 
states[25].nodes[11]:
1 3 6 8 9 10 11 13 14 18 19 
states[25].actions[11]:
-4(0)Shift[4] -4(1)Shift[5] -3(8)Shift[18] -3(4)Shift[14] -3(5)Shift[15] -3(6)Shift[16] -3(7)Shift[17] -3(9)Goto[19] -3(10)Goto[20] Goto[28] -2(3)Goto[13] 
states[26].nodes[2]:
2 5 
states[26].actions[2]:
-2(25)Reduce[6] -2(25)Reduce[6] 
states[27].nodes[2]:
2 5 
states[27].actions[2]:
-2(26)Reduce[10] -2(26)Reduce[10] 
states[28].nodes[2]:
4 5 
states[28].actions[2]:
-2(27)Reduce[8] -2(27)Reduce[8] 

This is the third implementation and the one currently in use. To keep the load path simple, I moved it from the Json.gen\SyntaxParser folder to the Json.gen folder.

Json.Regulations.gen.cs_

This is an array that records every rule of the Json grammar:

Json.Regulations.gen.cs_

using System;
using System.Collections.Generic; // for IReadOnlyList<T>
using bitzhuwei.Compiler;

namespace bitzhuwei.JsonFormat {
    partial class CompilerJson {
        public static readonly IReadOnlyList<Regulation> regulations = new Regulation[] {
            // [0] Json = Object ;
            new(0, st.Json枝, st.Object枝), 
            // [1] Json = Array ;
            new(1, st.Json枝, st.Array枝), 
            // [2] Object = '{' '}' ;
            new(2, st.Object枝, st.@LeftBrace符, st.@RightBrace符), 
            // [3] Object = '{' Members '}' ;
            new(3, st.Object枝, st.@LeftBrace符, st.Members枝, st.@RightBrace符), 
            // [4] Array = '[' ']' ;
            new(4, st.Array枝, st.@LeftBracket符, st.@RightBracket符), 
            // [5] Array = '[' Elements ']' ;
            new(5, st.Array枝, st.@LeftBracket符, st.Elements枝, st.@RightBracket符), 
            // [6] Members = Members ',' Member ;
            new(6, st.Members枝, st.Members枝, st.@Comma符, st.Member枝), 
            // [7] Members = Member ;
            new(7, st.Members枝, st.Member枝), 
            // [8] Elements = Elements ',' Element ;
            new(8, st.Elements枝, st.Elements枝, st.@Comma符, st.Element枝), 
            // [9] Elements = Element ;
            new(9, st.Elements枝, st.Element枝), 
            // [10] Member = 'string' ':' Value ;
            new(10, st.Member枝, st.@string, st.@Colon符, st.Value枝), 
            // [11] Element = Value ;
            new(11, st.Element枝, st.Value枝), 
            // [12] Value = 'null' ;
            new(12, st.Value枝, st.@null), 
            // [13] Value = 'true' ;
            new(13, st.Value枝, st.@true), 
            // [14] Value = 'false' ;
            new(14, st.Value枝, st.@false), 
            // [15] Value = 'number' ;
            new(15, st.Value枝, st.@number), 
            // [16] Value = 'string' ;
            new(16, st.Value枝, st.@string), 
            // [17] Value = Object ;
            new(17, st.Value枝, st.Object枝), 
            // [18] Value = Array ;
            new(18, st.Value枝, st.Array枝), 
        };
    }
}

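To reduce memory usage, this hard-coded implementation has also been replaced by a binary file (Json.Regulations.gen.bin). The source version is now kept for study and reference only, so I changed the C# file's extension from cs to cs_ to keep it from being compiled.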

Text format corresponding to Json.Regulations.gen.bin

19
12 = 1 (13)
12 = 1 (14)
13 = 2 (1 | 2)
13 = 3 (1 | 15 | 2)
14 = 2 (3 | 4)
14 = 3 (3 | 16 | 4)
15 = 3 (15 | 5 | 17)
15 = 1 (17)
16 = 3 (16 | 5 | 18)
16 = 1 (18)
17 = 3 (6 | 7 | 19)
18 = 1 (19)
19 = 1 (8)
19 = 1 (9)
19 = 1 (10)
19 = 1 (11)
19 = 1 (6)
19 = 1 (13)
19 = 1 (14)
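Comparing the lines above with the rule list, each line appears to read as "left node = number of right nodes (right nodes separated by |)". For instance, `13 = 3 (1 | 15 | 2)` matches `[3] Object = '{' Members '}'`. The sketch below parses one such line under that reading. `RegulationTextSketch` and `ParseLine` are hypothetical names and are not part of bitParser.

```csharp
using System;
using System.Linq;

// Hypothetical parser for one line of the text format shown above.
public static class RegulationTextSketch {
    // Parses a line such as "13 = 3 (1 | 15 | 2)" into its left node
    // and the array of right nodes.
    public static (int left, int[] right) ParseLine(string line) {
        var eq = line.IndexOf('=');
        var open = line.IndexOf('(');
        var close = line.LastIndexOf(')');
        if (eq < 0 || open < eq || close < open) {
            throw new FormatException("unexpected line: " + line);
        }
        var left = int.Parse(line.Substring(0, eq).Trim());
        var count = int.Parse(line.Substring(eq + 1, open - eq - 1).Trim());
        var right = line.Substring(open + 1, close - open - 1)
            .Split('|')
            .Select(part => int.Parse(part.Trim()))
            .ToArray();
        // The declared count must agree with the number of listed nodes.
        if (right.Length != count) {
            throw new FormatException("right-part count mismatch: " + line);
        }
        return (left, right);
    }
}
```

The first line of the file (19) would be read separately as the number of rules, after which `ParseLine` could be applied to each remaining line.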

In summary, the result looks like this:

image

The generated extractor

Extraction means visiting every node of the syntax tree in post-order and pulling out the semantic information at each visit.

For example, the syntax tree of { "a": 0.3, "b": true, "a": "again" } looks like this:

R[0] Json = Object ;⛪T[0->12]
 └─R[3] Object = '{' Members '}' ;⛪T[0->12]
    ├─T[0]='{' {
    ├─R[6] Members = Members ',' Member ;⛪T[1->11]
    │  ├─R[6] Members = Members ',' Member ;⛪T[1->7]
    │  │  ├─R[7] Members = Member ;⛪T[1->3]
    │  │  │  └─R[10] Member = 'string' ':' Value ;⛪T[1->3]
    │  │  │     ├─T[1]='string' "a"
    │  │  │     ├─T[2]=':' :
    │  │  │     └─R[15] Value = 'number' ;⛪T[3]
    │  │  │        └─T[3]='number' 0.3
    │  │  ├─T[4]=',' ,
    │  │  └─R[10] Member = 'string' ':' Value ;⛪T[5->7]
    │  │     ├─T[5]='string' "b"
    │  │     ├─T[6]=':' :
    │  │     └─R[13] Value = 'true' ;⛪T[7]
    │  │        └─T[7]='true' true
    │  ├─T[8]=',' ,
    │  └─R[10] Member = 'string' ':' Value ;⛪T[9->11]
    │     ├─T[9]='string' "a"
    │     ├─T[10]=':' :
    │     └─R[16] Value = 'string' ;⛪T[11]
    │        └─T[11]='string' "again"
    └─T[12]='}' }

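Following the post-order, the extractor visits T[0], T[1], T[2] and T[3] in turn and pushes them onto the stack. It then visits R[15] Value = 'number' ;⛪T[3], where it should do the following: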


// [15] Value = 'number' ;
var r0 = (Token)context.rightStack.Pop();// pop T[3]
var left = new JsonValue(JsonValue.Kind.Number, r0.value);
context.rightStack.Push(left);// push the Value

Next it visits R[10] Member = 'string' ':' Value ;⛪T[1->3], where it should do the following:

// [10] Member = 'string' ':' Value ;
var r0 = (JsonValue)context.rightStack.Pop();// pop the Value
var r1 = (Token)context.rightStack.Pop();// pop T[2]
var r2 = (Token)context.rightStack.Pop();// pop T[1]
var left = new JsonMember(key: r2.value, value: r0);
context.rightStack.Push(left);// push the Member

Step by step the traversal reaches the root node R[0] Json = Object ;⛪T[0->12], where it should do the following:

// [0] Json = Object ;
var r0 = (List<JsonMember>)context.rightStack.Pop();// pop the Member list
var left = new Json(r0);
context.rightStack.Push(left);// push the Json

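The whole syntax tree has now been visited, and the stack context.rightStack holds exactly one object: the final Json. At this point it should do the following: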


// [-1] Json' = Json ;
context.result = (Json)context.rightStack.Pop();
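The steps above can be tied together in a small self-contained sketch. It uses a toy tree type and a recursive traversal, both of which are assumptions for illustration: `ToyNode`, `PostOrderSketch`, `Visit` and `Demo` are hypothetical names, tokens are plain strings, and a `KeyValuePair<string, object>` stands in for JsonMember. It covers only rules 10 and 15, and the real extractor may well traverse the tree differently.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

// Toy syntax-tree node (hypothetical; not bitParser's LRNode).
public sealed class ToyNode {
    public readonly int ruleIndex;      // -1 marks a leaf that holds a token
    public readonly string text;        // token text, used by leaves only
    public readonly ToyNode[] children;

    private ToyNode(int ruleIndex, string text, ToyNode[] children) {
        this.ruleIndex = ruleIndex; this.text = text; this.children = children;
    }
    public static ToyNode Leaf(string text) => new ToyNode(-1, text, Array.Empty<ToyNode>());
    public static ToyNode Rule(int ruleIndex, params ToyNode[] children) => new ToyNode(ruleIndex, "", children);
}

public static class PostOrderSketch {
    // Visits the children first, then the node itself (post-order).
    public static void Visit(ToyNode node, Stack<object> rightStack) {
        foreach (var child in node.children) { Visit(child, rightStack); }
        switch (node.ruleIndex) {
        case -1: // a token: push it unchanged
            rightStack.Push(node.text);
            break;
        case 15: { // [15] Value = 'number' ;
            var r0 = (string)rightStack.Pop();
            rightStack.Push(double.Parse(r0, CultureInfo.InvariantCulture));
        }
        break;
        case 10: { // [10] Member = 'string' ':' Value ;
            var value = rightStack.Pop();       // the Value
            rightStack.Pop();                   // the ':' token
            var key = (string)rightStack.Pop(); // the 'string' token
            rightStack.Push(new KeyValuePair<string, object>(key, value));
        }
        break;
        default: throw new NotImplementedException();
        }
    }

    // Builds the subtree for "a": 0.3 and extracts it.
    public static KeyValuePair<string, object> Demo() {
        var member = ToyNode.Rule(10,
            ToyNode.Leaf("\"a\""), ToyNode.Leaf(":"),
            ToyNode.Rule(15, ToyNode.Leaf("0.3")));
        var rightStack = new Stack<object>();
        Visit(member, rightStack);
        return (KeyValuePair<string, object>)rightStack.Pop();
    }
}
```

Because children are always handled before their parent, each rule finds exactly its own right-hand side on top of the stack, in reverse order, when its turn comes.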

Full code of the extractor: InitializeExtractorItems

using System;
using System.Collections.Generic; // for List<T>
using bitzhuwei.Compiler;

namespace bitzhuwei.JsonFormat {
    partial class CompilerJson {
        /// <summary>
        /// <see cref="LRNode.type"/> -&gt; <see cref="Action{LRNode, TContext{Json}}"/>
        /// </summary>
        private static readonly Action<LRNode, TContext<Json>>?[]
            @jsonExtractorItems = new Action<LRNode, TContext<Json>>[1/*'¥'*/ + 8/*Vn*/];

        /// <summary>
        /// initialize dict for extractor.
        /// </summary>
        private static void InitializeExtractorItems() {
            var extractorItems = @jsonExtractorItems;

            #region obsolete
            //extractorDict.Add(st.NotYet,
            //(node, context) => {
            // not needed.
            //});
            //extractorDict.Add(st.Error,
            //(node, context) => {
            // nothing to do.
            //});
            //extractorDict.Add(st.blockComment,
            //(node, context) => {
            // not needed.
            //});
            //extractorDict.Add(st.inlineComment,
            //(node, context) => {
            // not needed.
            //});
            #endregion obsolete

            extractorItems[st.@终/*0*/] = static (node, context) => {
                // [-1] Json' = Json ;
                // dumped by user-defined extractor
                context.result = (Json)context.rightStack.Pop();
            }; // end of extractorItems[st.@终/*0*/] = (node, context) => { ... };
            const int lexiVtCount = 11;
            extractorItems[st.Json枝/*12*/ - lexiVtCount] = static (node, context) => {
                switch (node.regulation.index) {
                case 0: { // [0] Json = Object ;
                    // dumped by user-defined extractor
                    var r0 = (List<JsonMember>)context.rightStack.Pop();
                    var left = new Json(r0);
                    context.rightStack.Push(left);
                }
                break;
                case 1: { // [1] Json = Array ;
                    // dumped by user-defined extractor
                    var r0 = (List<JsonValue>)context.rightStack.Pop();
                    var left = new Json(r0);
                    context.rightStack.Push(left);
                }
                break;
                default: throw new NotImplementedException();
                }
            }; // end of extractorItems[st.Json枝/*12*/ - lexiVtCount] = (node, context) => { ... };
            extractorItems[st.Object枝/*13*/ - lexiVtCount] = static (node, context) => {
                switch (node.regulation.index) {
                case 2: { // [2] Object = '{' '}' ;
                    // dumped by user-defined extractor
                    var r0 = (Token)context.rightStack.Pop();// reserved word is omissible
                    var r1 = (Token)context.rightStack.Pop();// reserved word is omissible
                    var left = new List<JsonMember>();
                    context.rightStack.Push(left);
                }
                break;
                case 3: { // [3] Object = '{' Members '}' ;
                    // dumped by user-defined extractor
                    var r0 = (Token)context.rightStack.Pop();// reserved word is omissible
                    var r1 = (List<JsonMember>)context.rightStack.Pop();
                    var r2 = (Token)context.rightStack.Pop();// reserved word is omissible
                    var left = r1;
                    context.rightStack.Push(left);
                }
                break;
                default: throw new NotImplementedException();
                }
            }; // end of extractorItems[st.Object枝/*13*/ - lexiVtCount] = (node, context) => { ... };
            extractorItems[st.Array枝/*14*/ - lexiVtCount] = static (node, context) => {
                switch (node.regulation.index) {
                case 4: { // [4] Array = '[' ']' ;
                    // dumped by user-defined extractor
                    var r0 = (Token)context.rightStack.Pop();// reserved word is omissible
                    var r1 = (Token)context.rightStack.Pop();// reserved word is omissible
                    var left = new List<JsonValue>();
                    context.rightStack.Push(left);
                }
                break;
                case 5: { // [5] Array = '[' Elements ']' ;
                    // dumped by user-defined extractor
                    var r0 = (Token)context.rightStack.Pop();// reserved word is omissible
                    var r1 = (List<JsonValue>)context.rightStack.Pop();
                    var r2 = (Token)context.rightStack.Pop();// reserved word is omissible
                    var left = r1;
                    context.rightStack.Push(left);
                }
                break;
                default: throw new NotImplementedException();
                }
            }; // end of extractorItems[st.Array枝/*14*/ - lexiVtCount] = (node, context) => { ... };
            extractorItems[st.Members枝/*15*/ - lexiVtCount] = static (node, context) => {
                switch (node.regulation.index) {
                case 6: { // [6] Members = Members ',' Member ;
                    // dumped by user-defined extractor
                    var r0 = (JsonMember)context.rightStack.Pop();
                    var r1 = (Token)context.rightStack.Pop();// reserved word is omissible
                    var r2 = (List<JsonMember>)context.rightStack.Pop();
                    var left = r2;
                    left.Add(r0);
                    context.rightStack.Push(left);
                }
                break;
                case 7: { // [7] Members = Member ;
                    // dumped by user-defined extractor
                    var r0 = (JsonMember)context.rightStack.Pop();
                    var left = new List<JsonMember>();
                    left.Add(r0);
                    context.rightStack.Push(left);
                }
                break;
                default: throw new NotImplementedException();
                }
            }; // end of extractorItems[st.Members枝/*15*/ - lexiVtCount] = (node, context) => { ... };
            extractorItems[st.Elements枝/*16*/ - lexiVtCount] = static (node, context) => {
                switch (node.regulation.index) {
                case 8: { // [8] Elements = Elements ',' Element ;
                    // dumped by user-defined extractor
                    var r0 = (JsonValue)context.rightStack.Pop();
                    var r1 = (Token)context.rightStack.Pop();// reserved word is omissible
                    var r2 = (List<JsonValue>)context.rightStack.Pop();
                    var left = r2;
                    left.Add(r0);
                    context.rightStack.Push(left);
                }
                break;
                case 9: { // [9] Elements = Element ;
                    // dumped by user-defined extractor
                    var r0 = (JsonValue)context.rightStack.Pop();
                    var left = new List<JsonValue>();
                    left.Add(r0);
                    context.rightStack.Push(left);
                }
                break;
                default: throw new NotImplementedException();
                }
            }; // end of extractorItems[st.Elements枝/*16*/ - lexiVtCount] = (node, context) => { ... };
            extractorItems[st.Member枝/*17*/ - lexiVtCount] = static (node, context) => {
                switch (node.regulation.index) {
                case 10: { // [10] Member = 'string' ':' Value ;
                    // dumped by user-defined extractor
                    var r0 = (JsonValue)context.rightStack.Pop();
                    var r1 = (Token)context.rightStack.Pop();// reserved word is omissible
                    var r2 = (Token)context.rightStack.Pop();
                    var left = new JsonMember(key: r2.value, value: r0);
                    context.rightStack.Push(left);
                }
                break;
                default: throw new NotImplementedException();
                }
            }; // end of extractorItems[st.Member枝/*17*/ - lexiVtCount] = (node, context) => { ... };
            /*
            extractorItems[st.Element枝(18) - lexiVtCount] = static (node, context) => {
                switch (node.regulation.index) {
                case 11: { // [11] Element = Value ;
                    // dumped by DefaultExtractor
                    // var r0 = (VnValue)context.rightStack.Pop();
                    // var left = new VnElement(r0);
                    // context.rightStack.Push(left);
                }
                break;
                default: throw new NotImplementedException();
                }
            }; // end of extractorItems[st.Element枝(18) - lexiVtCount] = (node, context) => { ... };
            */
            extractorItems[st.Value枝/*19*/ - lexiVtCount] = static (node, context) => {
                switch (node.regulation.index) {
                case 12: { // [12] Value = 'null' ;
                    // dumped by user-defined extractor
                    var r0 = (Token)context.rightStack.Pop();
                    var left = new JsonValue(JsonValue.Kind.Null, r0.value);
                    context.rightStack.Push(left);
                }
                break;
                case 13: { // [13] Value = 'true' ;
                    // dumped by user-defined extractor
                    var r0 = (Token)context.rightStack.Pop();
                    var left = new JsonValue(JsonValue.Kind.True, r0.value);
                    context.rightStack.Push(left);
                }
                break;
                case 14: { // [14] Value = 'false' ;
                    // dumped by user-defined extractor
                    var r0 = (Token)context.rightStack.Pop();
                    var left = new JsonValue(JsonValue.Kind.False, r0.value);
                    context.rightStack.Push(left);
                }
                break;
                case 15: { // [15] Value = 'number' ;
                    // dumped by user-defined extractor
                    var r0 = (Token)context.rightStack.Pop();
                    var left = new JsonValue(JsonValue.Kind.Number, r0.value);
                    context.rightStack.Push(left);
                }
                break;
                case 16: { // [16] Value = 'string' ;
                    // dumped by user-defined extractor
                    var r0 = (Token)context.rightStack.Pop();
                    var left = new JsonValue(JsonValue.Kind.String, r0.value);
                    context.rightStack.Push(left);
                }
                break;
                case 17: { // [17] Value = Object ;
                    // dumped by user-defined extractor
                    var r0 = (List<JsonMember>)context.rightStack.Pop();
                    var left = new JsonValue(r0);
                    context.rightStack.Push(left);
                }
                break;
                case 18: { // [18] Value = Array ;
                    // dumped by user-defined extractor
                    var r0 = (List<JsonValue>)context.rightStack.Pop();
                    var left = new JsonValue(r0);
                    context.rightStack.Push(left);
                }
                break;
                default: throw new NotImplementedException();
                }
            }; // end of extractorItems[st.Value枝/*19*/ - lexiVtCount] = (node, context) => { ... };

        }
    }
}
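The extractor above constructs `Json`, `JsonValue` and `JsonMember` objects, but their definitions are not shown in this section. The following is a minimal sketch of a data model that would satisfy those constructor calls. Everything in it is an assumption inferred from the call sites: the field names, the `Object` and `Array` members of `Kind`, and the choice of fields over properties are all hypothetical, and the real classes in the demo repository may differ.

```csharp
#nullable enable
using System.Collections.Generic;

// Hypothetical sketch: one value node covering scalars, objects and arrays.
public class JsonValue {
    // Null/True/False/Number/String appear in the extractor; Object/Array are assumed here.
    public enum Kind { Null, True, False, Number, String, Object, Array }

    public readonly Kind kind;
    public readonly string? text;               // raw token text for scalar kinds
    public readonly List<JsonMember>? members;  // set when kind == Kind.Object
    public readonly List<JsonValue>? elements;  // set when kind == Kind.Array

    public JsonValue(Kind kind, string text) { this.kind = kind; this.text = text; }
    public JsonValue(List<JsonMember> members) { this.kind = Kind.Object; this.members = members; }
    public JsonValue(List<JsonValue> elements) { this.kind = Kind.Array; this.elements = elements; }
}

// Hypothetical sketch: a key/value pair; parameter names match the named arguments used above.
public class JsonMember {
    public readonly string key;
    public readonly JsonValue value;
    public JsonMember(string key, JsonValue value) { this.key = key; this.value = value; }
}

// Hypothetical sketch: the root is either an object (member list) or an array (element list).
public class Json {
    public readonly List<JsonMember>? members;
    public readonly List<JsonValue>? elements;
    public Json(List<JsonMember> members) { this.members = members; }
    public Json(List<JsonValue> elements) { this.elements = elements; }
    public bool IsObject => members != null;
}
```

Keeping scalars as raw token text, as the extractor's `r0.value` arguments suggest, defers number and string decoding until a caller actually needs the value, which fits the stated goal of parsing as efficiently as possible.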

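Different application scenarios call for different semantic information, so the extractor code that bitParser generates in one click does not look like the code above. Instead, it merely flattens the syntax tree while retaining as much source-code information as possible, as shown below:

One-click generated extractor code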

using System;
using bitzhuwei.Compiler;

namespace bitzhuwei.JsonFormat {
    partial class CompilerJson {
        /// <summary>
        /// <see cref="LRNode.type"/> -&gt; <see cref="Action{LRNode, TContext{Json}}"/>
        /// </summary>
        private static readonly Action<LRNode, TContext<Json>>?[]
            @jsonExtractorItems = new Action<LRNode, TContext<Json>>[1/*'¥'*/ + 8/*Vn*/];

        /// <summary>
        /// initialize dict for extractor.
        /// </summary>
        private static void InitializeExtractorItems() {
            var extractorItems = @jsonExtractorItems;

            #region obsolete
            //extractorDict.Add(st.NotYet,
            //(node, context) => {
            // not needed.
            //});
            //extractorDict.Add(st.Error,
            //(node, context) => {
            // nothing to do.
            //});
            //extractorDict.Add(st.blockComment,
            //(node, context) => {
            // not needed.
            //});
            //extractorDict.Add(st.inlineComment,
            //(node, context) => {
            // not needed.
            //});
            #endregion obsolete

            extractorItems[st.@终/*0*/] = static (node, context) => {
                // [-1] Json' = Json ;
                // dumped by ExternalExtractor
                var @final = (VnJson)context.rightStack.Pop();
                var left = new Json(@final);
                context.result = left; // final step, no need to push into stack.
            }; // end of extractorItems[st.@终/*0*/] = (node, context) => { ... };
            const int lexiVtCount = 11;
            extractorItems[st.Json枝/*12*/ - lexiVtCount] = static (node, context) => {
                switch (node.regulation.index) {
                case 0: { // [0] Json = Object ;
                    // dumped by InheritExtractor
                    // class VnObject : VnJson
                    var r0 = (VnObject)context.rightStack.Pop();
                    var left = r0;
                    context.rightStack.Push(left);
                }
                break;
                case 1: { // [1] Json = Array ;
                    // dumped by InheritExtractor
                    // class VnArray : VnJson
                    var r0 = (VnArray)context.rightStack.Pop();
                    var left = r0;
                    context.rightStack.Push(left);
                }
                break;
                default: throw new NotImplementedException();
                }
            }; // end of extractorItems[st.Json枝/*12*/ - lexiVtCount] = (node, context) => { ... };
            extractorItems[st.Object枝/*13*/ - lexiVtCount] = static (node, context) => {
                switch (node.regulation.index) {
                case 2: { // [2] Object = '{' '}' ;
                    // dumped by DefaultExtractor
                    var r0 = (Token)context.rightStack.Pop();// reserved word is omissible
                    var r1 = (Token)context.rightStack.Pop();// reserved word is omissible
                    var left = new VnObject(r1, r0);
                    context.rightStack.Push(left);
                }
                break;
                case 3: { // [3] Object = '{' Members '}' ;
                    // dumped by DefaultExtractor
                    var r0 = (Token)context.rightStack.Pop();// reserved word is omissible
                    var r1 = (VnMembers)context.rightStack.Pop();
                    var r2 = (Token)context.rightStack.Pop();// reserved word is omissible
                    var left = new VnObject(r2, r1, r0);
                    context.rightStack.Push(left);
                }
                break;
                default: throw new NotImplementedException();
                }
            }; // end of extractorItems[st.Object枝/*13*/ - lexiVtCount] = (node, context) => { ... };
            extractorItems[st.Array枝/*14*/ - lexiVtCount] = static (node, context) => {
                switch (node.regulation.index) {
                case 4: { // [4] Array = '[' ']' ;
                    // dumped by DefaultExtractor
                    var r0 = (Token)context.rightStack.Pop();// reserved word is omissible
                    var r1 = (Token)context.rightStack.Pop();// reserved word is omissible
                    var left = new VnArray(r1, r0);
                    context.rightStack.Push(left);
                }
                break;
                case 5: { // [5] Array = '[' Elements ']' ;
                    // dumped by DefaultExtractor
                    var r0 = (Token)context.rightStack.Pop();// reserved word is omissible
                    var r1 = (VnElements)context.rightStack.Pop();
                    var r2 = (Token)context.rightStack.Pop();// reserved word is omissible
                    var left = new VnArray(r2, r1, r0);
                    context.rightStack.Push(left);
                }
                break;
                default: throw new NotImplementedException();
                }
            }; // end of extractorItems[st.Array枝/*14*/ - lexiVtCount] = (node, context) => { ... };
            extractorItems[st.Members枝/*15*/ - lexiVtCount] = static (node, context) => {
                switch (node.regulation.index) {
                case 6: { // [6] Members = Members ',' Member ;
                    // dumped by ListExtractor 2
                    var r0 = (VnMember)context.rightStack.Pop();
                    var r1 = (Token)context.rightStack.Pop();// reserved word is omissible
                    var r2 = (VnMembers)context.rightStack.Pop();
                    var left = r2;
                    left.Add(r1, r0);
                    context.rightStack.Push(left);
                }
                break;
                case 7: { // [7] Members = Member ;
                    // dumped by ListExtractor 1
                    var r0 = (VnMember)context.rightStack.Pop();
                    var left = new VnMembers(r0);
                    context.rightStack.Push(left);
                }
                break;
                default: throw new NotImplementedException();
                }
            }; // end of extractorItems[st.Members枝/*15*/ - lexiVtCount] = (node, context) => { ... };
            extractorItems[st.Elements枝/*16*/ - lexiVtCount] = static (node, context) => {
                switch (node.regulation.index) {
                case 8: { // [8] Elements = Elements ',' Element ;
                    // dumped by ListExtractor 2
                    var r0 = (VnElement)context.rightStack.Pop();
                    var r1 = (Token)context.rightStack.Pop();// reserved word is omissible
                    var r2 = (VnElements)context.rightStack.Pop();
                    var left = r2;
                    left.Add(r1, r0);
                    context.rightStack.Push(left);
                }
                break;
                case 9: { // [9] Elements = Element ;
                    // dumped by ListExtractor 1
                    var r0 = (VnElement)context.rightStack.Pop();
                    var left = new VnElements(r0);
                    context.rightStack.Push(left);
                }
                break;
                default: throw new NotImplementedException();
                }
            }; // end of extractorItems[st.Elements枝/*16*/ - lexiVtCount] = (node, context) => { ... };
            extractorItems[st.Member枝/*17*/ - lexiVtCount] = static (node, context) => {
                switch (node.regulation.index) {
                case 10: { // [10] Member = 'string' ':' Value ;
                    // dumped by DefaultExtractor
                    var r0 = (VnValue)context.rightStack.Pop();
                    var r1 = (Token)context.rightStack.Pop();// reserved word is omissible
                    var r2 = (Token)context.rightStack.Pop();
                    var left = new VnMember(r2, r1, r0);
                    context.rightStack.Push(left);
                }
                break;
                default: throw new NotImplementedException();
                }
            }; // end of extractorItems[st.Member枝/*17*/ - lexiVtCount] = (node, context) => { ... };
            extractorItems[st.Element枝/*18*/ - lexiVtCount] = static (node, context) => {
                switch (node.regulation.index) {
                case 11: { // [11] Element = Value ;
                    // dumped by DefaultExtractor
                    var r0 = (VnValue)context.rightStack.Pop();
                    var left = new VnElement(r0);
                    context.rightStack.Push(left);
                }
                break;
                default: throw new NotImplementedException();
                }
            }; // end of extractorItems[st.Element枝/*18*/ - lexiVtCount] = (node, context) => { ... };
            extractorItems[st.Value枝/*19*/ - lexiVtCount] = static (node, context) => {
                switch (node.regulation.index) {
                case 12: { // [12] Value = 'null' ;
                    // dumped by DefaultExtractor
                    var r0 = (Token)context.rightStack.Pop();// reserved word is omissible
                    var left = new VnValue(r0);
                    context.rightStack.Push(left);
                }
                break;
                case 13: { // [13] Value = 'true' ;
                    // dumped by DefaultExtractor
                    var r0 = (Token)context.rightStack.Pop();// reserved word is omissible
                    var left = new VnValue(r0);
                    context.rightStack.Push(left);
                }
                break;
                case 14: { // [14] Value = 'false' ;
                    // dumped by DefaultExtractor
                    var r0 = (Token)context.rightStack.Pop();// reserved word is omissible
                    var left = new VnValue(r0);
                    context.rightStack.Push(left);
                }
                break;
                case 15: { // [15] Value = 'number' ;
                    // dumped by DefaultExtractor
                    var r0 = (Token)context.rightStack.Pop();
                    var left = new VnValue(r0);
                    context.rightStack.Push(left);
                }
                break;
                case 16: { // [16] Value = 'string' ;
                    // dumped by DefaultExtractor
                    var r0 = (Token)context.rightStack.Pop();
                    var left = new VnValue(r0);
                    context.rightStack.Push(left);
                }
                break;
                case 17: { // [17] Value = Object ;
                    // dumped by DefaultExtractor
                    var r0 = (VnObject)context.rightStack.Pop();
                    var left = new VnValue(r0);
                    context.rightStack.Push(left);
                }
                break;
                case 18: { // [18] Value = Array ;
                    // dumped by DefaultExtractor
                    var r0 = (VnArray)context.rightStack.Pop();
                    var left = new VnValue(r0);
                    context.rightStack.Push(left);
                }
                break;
                default: throw new NotImplementedException();
                }
            }; // end of extractorItems[st.Value枝/*19*/ - lexiVtCount] = (node, context) => { ... };

        }
    }
}

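The generated code is deliberately conservative and takes the smallest possible step. Programmers can build on it, or write their own extraction actions for each node type. Since the goal in this scenario is to parse Json text files as efficiently as possible, the extraction actions for every node type here were written entirely by hand.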
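Both versions of InitializeExtractorItems index the action table with expressions such as `st.Json枝/*12*/ - lexiVtCount`. The sketch below spells out that mapping. It assumes a numbering inferred from the inline comments rather than confirmed by the source: 0 is the end marker, 1 to 11 are the 11 terminals, and 12 to 19 are the 8 nonterminals. `ExtractorTableDemo` and `SlotOf` are hypothetical names.

```csharp
using System;

public static class ExtractorTableDemo {
    public const int LexiVtCount = 11;          // same value as lexiVtCount in the source
    public const int VnCount = 8;               // Json, Object, Array, Members, Elements, Member, Element, Value
    public const int TableLength = 1 + VnCount; // slot 0 for the end marker, then one slot per nonterminal

    // Maps a node type to its slot in the action table.
    // Terminals (assumed to be types 1..11) have no slot in this table.
    public static int SlotOf(int nodeType) {
        if (nodeType == 0) { return 0; }        // end marker
        int slot = nodeType - LexiVtCount;      // 12 -> 1, ..., 19 -> 8
        if (slot < 1 || slot >= TableLength) {
            throw new ArgumentOutOfRangeException(nameof(nodeType));
        }
        return slot;
    }
}
```

Under these assumptions `SlotOf(12)` is 1 and `SlotOf(19)` is 8, which is why an array of `1 + 8` entries is enough. Skipping the terminals keeps the table small, and an array lookup avoids the hashing that the commented-out `extractorDict` approach would involve.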

Tests

Test case 0

{}

Test case 1

[]

Test case 2

{ "a": 0.3 }

Test case 3

{
  "a": 0.3,
  "b": true
}

Test case 4

{
  "a": 0.3,
  "b": true,
  "a": "again"
}

Test case 5

{
  "a": 0.3,
  "b": true,
  "a": "again",
  "array": [
    1,
    true,
    null,
    "str",
    {
      "t": 100,
      "array2": [ false, 3.14, "tmp" ]
    }
  ]
}

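All of the test cases above are parsed correctly by the Json parser; they can also be checked at (https://jsonlint.com/). Test cases 4 and 5 repeat the key "a": ECMA-404 does not require names to be unique, so the grammar accepts them, although a stricter validator may still warn about the duplicate.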
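Because the hand-written extractor collects members in a list, both "a" entries of test case 4 survive parsing, and deciding which one wins is left to the application. The sketch below shows one possible policy, "last one wins". It uses plain `KeyValuePair<string, string>` entries in place of the real member type, and `DuplicateKeyDemo`, `Case4` and `FindLast` are hypothetical names, not part of the demo project.

```csharp
#nullable enable
using System.Collections.Generic;

public static class DuplicateKeyDemo {
    // The members of test case 4, in source order (keys shown without their quotes).
    public static List<KeyValuePair<string, string>> Case4() {
        return new List<KeyValuePair<string, string>> {
            new KeyValuePair<string, string>("a", "0.3"),
            new KeyValuePair<string, string>("b", "true"),
            new KeyValuePair<string, string>("a", "\"again\""),
        };
    }

    // Returns the value of the last member with the given key, or null if there is none.
    public static string? FindLast(List<KeyValuePair<string, string>> members, string key) {
        for (int i = members.Count - 1; i >= 0; i--) {
            if (members[i].Key == key) { return members[i].Value; }
        }
        return null;
    }
}
```

A `Dictionary<string, string>` would reject the second "a" on `Add`, or silently overwrite the first through its indexer. Keeping a list preserves exactly what the source text said.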

The code that calls the Json parser is as follows:

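using System.IO;

var compiler = new bitzhuwei.JsonFormat.CompilerJson();
var sourceCode = File.ReadAllText("xxx.json");
// lexical analysis: source text -> tokens
var tokens = compiler.Analyze(sourceCode);
// syntax analysis: tokens -> syntax tree
var syntaxTree = compiler.Parse(tokens);
// semantic extraction: syntax tree -> Json object
var json = compiler.Extract(syntaxTree.root, tokens, sourceCode);
// use json ...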

Reposted from: BIT祝威

Original article: Implementing Your Own Json Parser in C# (LALR(1)+miniDFA) - BIT祝威 - 博客园 (cnblogs)
