Table of Contents
- I. BLEU-N Score (Bilingual Evaluation Understudy)
  - 1. Definition
  - 2. Calculation
    - N=1
    - N=2
    - BLEU-N Score
  - 3. Program
Given the generated sequence "The cat sat on the mat" and the two reference sequences "The cat is on the mat" and "The bird sat on the bush", compute the BLEU-N and ROUGE-N scores for N=1 and N=2.
- Generated sequence $\mathbf{x}=\text{the cat sat on the mat}$
- Reference sequences
  - $\mathbf{s}^{(1)}=\text{the cat is on the mat}$
  - $\mathbf{s}^{(2)}=\text{the bird sat on the bush}$
I. BLEU-N Score (Bilingual Evaluation Understudy)
1. Definition
Let $\mathbf{x}$ be the candidate sequence generated by the model, $\mathbf{s}^{(1)}, \cdots, \mathbf{s}^{(K)}$ a set of reference sequences, and $\mathcal{W}$ the set of all N-grams extracted from the generated candidate sequence. The BLEU precision is defined as:

$$
P_N(\mathbf{x}) = \frac{\sum_{w \in \mathcal{W}} \min\left(c_w(\mathbf{x}), \max_{k=1}^{K} c_w(\mathbf{s}^{(k)})\right)}{\sum_{w \in \mathcal{W}} c_w(\mathbf{x})}
$$

where $c_w(\mathbf{x})$ is the number of times the N-gram $w$ occurs in the generated sequence $\mathbf{x}$, and $c_w(\mathbf{s}^{(k)})$ is the number of times $w$ occurs in the reference sequence $\mathbf{s}^{(k)}$. The $\min$ clips each N-gram's count at the largest count found in any single reference, so repeating a matching N-gram more often than the references do earns no extra credit.
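As a minimal sketch of this definition, the clipped precision could be written in Python as below. The helper names `ngram_counts` and `modified_precision` are illustrative, and whitespace tokenization is an assumption; neither comes from a particular library.

```python
from collections import Counter


def ngram_counts(tokens, n):
    """Count every n-gram (as a tuple of tokens) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def modified_precision(candidate, references, n):
    """Clipped n-gram precision P_N(x) of a candidate against K references."""
    cand = ngram_counts(candidate, n)
    refs = [ngram_counts(ref, n) for ref in references]
    # Numerator: each candidate count is clipped by its maximum reference count.
    clipped = sum(min(c, max(r[w] for r in refs)) for w, c in cand.items())
    total = sum(cand.values())
    # Assumption: a candidate with no n-grams of this length scores 0.
    return clipped / total if total else 0.0


x = "the cat sat on the mat".split()
s1 = "the cat is on the mat".split()
s2 = "the bird sat on the bush".split()
print(modified_precision(x, [s1, s2], 1))  # 1.0
print(modified_precision(x, [s1, s2], 2))  # 0.8
```

Representing N-grams as token tuples keeps the counting at the token level, so a word is never matched inside a longer word.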
To handle the case where the generated sequence is shorter than the references, a length penalty factor $b(\mathbf{x})$ is introduced:

$$
b(\mathbf{x}) = \begin{cases} 1 & \text{if } l_x > l_s \\ \exp\left(1 - \frac{l_s}{l_x}\right) & \text{if } l_x \leq l_s \end{cases}
$$

where $l_x$ is the length of the generated sequence and $l_s$ is the length of the shortest reference sequence.
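The penalty can be sketched in a few lines of Python. The function name `brevity_penalty` and the handling of an empty candidate are assumptions made for illustration. Taking $l_s$ as the shortest reference length follows the definition given here; other BLEU implementations may choose the reference length differently.

```python
import math


def brevity_penalty(candidate_len, reference_lens):
    """Length penalty b(x), using the shortest reference length as l_s."""
    if candidate_len == 0:
        return 0.0  # assumption: an empty candidate gets no credit
    shortest = min(reference_lens)
    if candidate_len > shortest:
        return 1.0
    return math.exp(1 - shortest / candidate_len)


print(brevity_penalty(6, [6, 6]))  # 1.0, since exp(1 - 6/6) = exp(0)
print(brevity_penalty(3, [6, 6]))  # exp(1 - 6/3) = exp(-1), about 0.368
```

When the candidate is exactly as long as the shortest reference, the second branch applies and still gives 1, so only strictly shorter candidates are penalized.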
BLEU computes the precision for N-grams of different lengths and combines them with a weighted geometric mean to obtain the final BLEU score:

$$
\text{BLEU-N}(\mathbf{x}) = b(\mathbf{x}) \times \exp\left( \sum_{N=1}^{N'} \alpha_N \log P_N(\mathbf{x})\right)
$$

where $N'$ is the length of the longest N-gram considered and $\alpha_N$ is the weight given to each N-gram length, usually set to $1/N'$.
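The weighted geometric mean can be sketched as follows. The function name `combine_bleu` is hypothetical, the input values are arbitrary numbers chosen only to make the arithmetic easy to check, and returning 0 when any precision is 0 is an assumption (the formula itself is undefined there because of $\log 0$).

```python
import math


def combine_bleu(precisions, penalty, weights=None):
    """b(x) * exp(sum_N alpha_N * log P_N(x)) for precisions P_1..P_N'."""
    if weights is None:
        weights = [1 / len(precisions)] * len(precisions)  # alpha_N = 1/N'
    if min(precisions) == 0:
        return 0.0  # assumption: log(0) is treated as a score of zero
    log_mean = sum(a * math.log(p) for a, p in zip(weights, precisions))
    return penalty * math.exp(log_mean)


# Arbitrary illustrative values: P_1 = 1.0, P_2 = 0.25, b(x) = 1.
print(round(combine_bleu([1.0, 0.25], 1.0), 4))  # 0.5, i.e. sqrt(1.0 * 0.25)
```

With uniform weights the result is the ordinary geometric mean of the precisions, scaled by the length penalty.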
2. Calculation
N=1
- Generated sequence $\mathbf{x}=\text{the cat sat on the mat}$
- Reference sequences
  - $\mathbf{s}^{(1)}=\text{the cat is on the mat}$
  - $\mathbf{s}^{(2)}=\text{the bird sat on the bush}$
- $\mathcal{W}=\{\text{the, cat, sat, on, mat}\}$
- $w=\text{the}$
  - $c_w(\mathbf{x})=2,\ c_w(\mathbf{s}^{(1)})=2,\ c_w(\mathbf{s}^{(2)})=2$
  - $\max_{k=1}^{K} c_w(\mathbf{s}^{(k)})=2$
  - $\min(c_w(\mathbf{x}), \max_{k=1}^{K} c_w(\mathbf{s}^{(k)}))=2$
- $w=\text{cat}$
  - $c_w(\mathbf{x})=1,\ c_w(\mathbf{s}^{(1)})=1,\ c_w(\mathbf{s}^{(2)})=0$
  - $\max_{k=1}^{K} c_w(\mathbf{s}^{(k)})=1$
  - $\min(c_w(\mathbf{x}), \max_{k=1}^{K} c_w(\mathbf{s}^{(k)}))=1$
- $w=\text{sat}$
  - $c_w(\mathbf{x})=1,\ c_w(\mathbf{s}^{(1)})=0,\ c_w(\mathbf{s}^{(2)})=1$
  - $\max_{k=1}^{K} c_w(\mathbf{s}^{(k)})=1$
  - $\min(c_w(\mathbf{x}), \max_{k=1}^{K} c_w(\mathbf{s}^{(k)}))=1$
- $w=\text{on}$
  - $c_w(\mathbf{x})=1,\ c_w(\mathbf{s}^{(1)})=1,\ c_w(\mathbf{s}^{(2)})=1$
  - $\max_{k=1}^{K} c_w(\mathbf{s}^{(k)})=1$
  - $\min(c_w(\mathbf{x}), \max_{k=1}^{K} c_w(\mathbf{s}^{(k)}))=1$
- $w=\text{mat}$
  - $c_w(\mathbf{x})=1,\ c_w(\mathbf{s}^{(1)})=1,\ c_w(\mathbf{s}^{(2)})=0$
  - $\max_{k=1}^{K} c_w(\mathbf{s}^{(k)})=1$
  - $\min(c_w(\mathbf{x}), \max_{k=1}^{K} c_w(\mathbf{s}^{(k)}))=1$
- $\sum_{w \in \mathcal{W}} \min(c_w(\mathbf{x}), \max_{k=1}^{K} c_w(\mathbf{s}^{(k)}))=2+1+1+1+1=6$
- $\sum_{w \in \mathcal{W}} c_w(\mathbf{x})=2+1+1+1+1=6$
- $P_1(\mathbf{x}) = \frac{\sum_{w \in \mathcal{W}} \min(c_w(\mathbf{x}), \max_{k=1}^{K} c_w(\mathbf{s}^{(k)}))}{\sum_{w \in \mathcal{W}} c_w(\mathbf{x})}= \frac{6}{6}=1$
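The per-word counts above can be reproduced with a short script. This is only a sketch for checking the arithmetic; the variable names are illustrative and whitespace tokenization is assumed.

```python
from collections import Counter

x = Counter("the cat sat on the mat".split())
refs = [Counter("the cat is on the mat".split()),
        Counter("the bird sat on the bush".split())]

clipped = {}
for w, c in x.items():
    best_ref = max(r[w] for r in refs)  # max_k c_w(s^(k))
    clipped[w] = min(c, best_ref)       # clipped count for w
    print(f"{w}: c_w(x)={c}, max_ref={best_ref}, clipped={clipped[w]}")

p1 = sum(clipped.values()) / sum(x.values())
print(p1)  # 1.0
```

Note that "the" contributes 2 to both the numerator and the denominator, because it occurs twice in the generated sequence and twice in each reference.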
N=2
- Generated sequence $\mathbf{x}=\text{the cat sat on the mat}$
- Reference sequences
  - $\mathbf{s}^{(1)}=\text{the cat is on the mat}$
  - $\mathbf{s}^{(2)}=\text{the bird sat on the bush}$
- $\mathcal{W}=\{\text{the cat, cat sat, sat on, on the, the mat}\}$

$w$ | $c_w(\mathbf{x})$ | $c_w(\mathbf{s}^{(1)})$ | $c_w(\mathbf{s}^{(2)})$ | $\max_{k=1}^{K} c_w(\mathbf{s}^{(k)})$ | $\min(c_w(\mathbf{x}), \max_{k=1}^{K} c_w(\mathbf{s}^{(k)}))$ |
---|---|---|---|---|---|
the cat | 1 | 1 | 0 | 1 | 1 |
cat sat | 1 | 0 | 0 | 0 | 0 |
sat on | 1 | 0 | 1 | 1 | 1 |
on the | 1 | 1 | 1 | 1 | 1 |
the mat | 1 | 1 | 0 | 1 | 1 |
- $\sum_{w \in \mathcal{W}} \min(c_w(\mathbf{x}), \max_{k=1}^{K} c_w(\mathbf{s}^{(k)}))=1+0+1+1+1=4$
- $\sum_{w \in \mathcal{W}} c_w(\mathbf{x})=1+1+1+1+1=5$
- $P_2(\mathbf{x}) = \frac{\sum_{w \in \mathcal{W}} \min(c_w(\mathbf{x}), \max_{k=1}^{K} c_w(\mathbf{s}^{(k)}))}{\sum_{w \in \mathcal{W}} c_w(\mathbf{x})}= \frac{4}{5}$
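The bigram result can be checked the same way. The sketch below uses `fractions.Fraction` to keep the result exact and also lists the bigrams that no reference contains; the helper name `bigrams` is illustrative.

```python
from collections import Counter
from fractions import Fraction


def bigrams(text):
    """Count adjacent token pairs in a whitespace-tokenized string."""
    tokens = text.split()
    return Counter(zip(tokens, tokens[1:]))


x = bigrams("the cat sat on the mat")
refs = [bigrams("the cat is on the mat"), bigrams("the bird sat on the bush")]

matched = sum(min(c, max(r[w] for r in refs)) for w, c in x.items())
p2 = Fraction(matched, sum(x.values()))
unmatched = [" ".join(w) for w in x if max(r[w] for r in refs) == 0]
print(p2)         # 4/5
print(unmatched)  # ['cat sat']
```

The only bigram missing from both references is "cat sat", which accounts for the single lost point in the numerator.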
BLEU-N Score
To handle the case where the generated sequence is shorter than the references, the length penalty factor $b(\mathbf{x})$ is applied:

$$
b(\mathbf{x}) = \begin{cases} 1 & \text{if } l_x > l_s \\ \exp\left(1 - \frac{l_s}{l_x}\right) & \text{if } l_x \leq l_s \end{cases}
$$

where $l_x$ is the length of the generated sequence and $l_s$ is the length of the shortest reference sequence.
Here $l_x=l_{s^{(1)}}=l_{s^{(2)}}=6$, so $l_x \leq l_s$ and $b(\mathbf{x}) = \exp\left(1 - \frac{l_s}{l_x}\right)=e^0=1$.
BLEU computes the precision for N-grams of different lengths and combines them with a weighted geometric mean to obtain the final BLEU score:

$$
\text{BLEU-N}(\mathbf{x}) = b(\mathbf{x}) \times \exp\left( \sum_{N=1}^{N'} \alpha_N \log P_N(\mathbf{x})\right)
$$

where $N'$ is the length of the longest N-gram considered and $\alpha_N$ is the weight given to each N-gram length, usually set to $1/N'$.
With $N'=2$ and $\alpha_N=\frac{1}{2}$:

$$
\begin{aligned}
\text{BLEU-N}(\mathbf{x}) &= 1 \times\exp\left( \sum_{N=1}^{2} \frac{1}{2} \log P_N(\mathbf{x})\right)\\
&=\exp\left(\frac{1}{2}\log P_1(\mathbf{x})+\frac{1}{2}\log P_2(\mathbf{x})\right)\\
&=\exp\left(\frac{1}{2}\log 1+\frac{1}{2}\log \frac{4}{5}\right)\\
&=\exp\left(0+\log \sqrt{\frac{4}{5}}\right)\\
&=\sqrt{\frac{4}{5}} \approx 0.894
\end{aligned}
$$
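As a quick numerical check of this last step, the values $P_1=1$, $P_2=\frac{4}{5}$ and $b(\mathbf{x})=1$ from the calculation can be plugged in directly. The variable names are illustrative.

```python
import math

p1, p2 = 1.0, 4 / 5  # precisions from the N=1 and N=2 steps
b = 1.0              # length penalty, since l_x = l_s = 6
score = b * math.exp(0.5 * math.log(p1) + 0.5 * math.log(p2))
print(round(score, 4))  # 0.8944
```

The score equals $\sqrt{4/5}$ rather than $4/5$, because the weight $\frac{1}{2}$ on $\log P_2$ turns into a square root once the exponential is applied.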
3. Program
```python
from collections import Counter

main_string = 'the cat sat on the mat'
string1 = 'the cat is on the mat'
string2 = 'the bird sat on the bush'

# Unigrams: count whole tokens (str.count would also match substrings)
main_words = Counter(main_string.split())
words1, words2 = Counter(string1.split()), Counter(string2.split())
total_occurrences, matching_occurrences = 0, 0
for word, count_main_string in main_words.items():
    total_occurrences += count_main_string
    # Clip by the largest count of this word in any single reference
    matching_occurrences += min(count_main_string, max(words1[word], words2[word]))
similarity_word = matching_occurrences / total_occurrences
print(f"N=1: {similarity_word}")


# Bigrams: count adjacent token pairs
def bigram_counts(text):
    tokens = text.split()
    return Counter(zip(tokens, tokens[1:]))


main_bigrams = bigram_counts(main_string)
bigrams1, bigrams2 = bigram_counts(string1), bigram_counts(string2)
total_occurrences, matching_occurrences = 0, 0
for bigram, count_main_string in main_bigrams.items():
    total_occurrences += count_main_string
    matching_occurrences += min(count_main_string, max(bigrams1[bigram], bigrams2[bigram]))
similarity_bigram = matching_occurrences / total_occurrences
print(f"N=2: {similarity_bigram}")
```
Output:
N=1: 1.0
N=2: 0.8
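A caveat when counting N-grams on raw strings in Python: `str.count` counts substring occurrences, not tokens. The example sentences here happen to be unaffected, but a short word can be found inside a longer one and inflate the match count. The snippet below is a small illustration using the hypothetical candidate word "he", which does not appear in the example.

```python
from collections import Counter

reference = "the cat is on the mat"

substring_hits = reference.count("he")         # matches inside "the" twice
token_hits = Counter(reference.split())["he"]  # no token equals "he"
print(substring_hits)  # 2
print(token_hits)      # 0
```

Counting over the token list, for example with `collections.Counter`, avoids these false matches.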