Lecture 14 Context-Free Grammar

news2025/1/9 19:03:54

Contents

    • Context-Free Grammar
      • Basics of Context-Free Grammars
      • CFG Parsing
    • Constituents
      • Syntactic Constituents
      • Constituents and Phrases
      • Example: A Simple CFG for English and generating sentences
      • CFG Trees
    • CYK Algorithm
      • CYK Algorithm
      • Convert to Chomsky Normal Form
      • The CYK Parsing Algorithm
    • Representing English with CFGs
      • From Toy Grammars to Real Grammars
      • Key Constituents in Penn Treebank
      • Basic English Sentence Structures
      • English Noun Phrases
      • English Verb Phrases
      • Other Constituents

Context-Free Grammar

Basics of Context-Free Grammars

  • Symbols:

    • Terminal: word such as book
    • Non-terminal: syntactic label such as NP and VP
    • convention:
      • lowercase for terminals
      • uppercase for non-terminals
  • Productions:

    • W -> X Y Z
    • Exactly one non-terminal on the LHS
    • An ordered list of symbols on the RHS, can be terminals or non-terminals
  • Start symbol: S

  • Context-Free:

    • Which production applies depends only on the LHS non-terminal, not on its ancestors or neighbors
      • Analogous to a Markov chain
      • Behaviour at each step depends only on the current state
    • Context-free languages are more general than regular languages: they allow recursive nesting

CFG Parsing

  • Given production rules: E.g.

    • S -> a S b
    • S -> a b
  • And a string: aaabbb

  • Produce a valid parse tree:

    [Figure: parse tree for aaabbb]

  • If English can be represented with CFG:

    • First develop the production rules
    • Can then build a parser to automatically judge whether a sentence is grammatical
  • CFGs strike a good balance:

    • CFGs cover most syntactic patterns
    • CFG parsing is computationally efficient
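As a minimal sketch of parsing with the toy grammar above (S -> a S b, S -> a b), the following recursive function returns a parse tree as nested tuples, or None if the string is not in the language. The function name `parse_anbn` and the tuple encoding of the tree are assumptions made for illustration, not part of the lecture.

```python
def parse_anbn(s):
    """Return a nested-tuple parse tree for s under S -> a S b | a b, or None."""
    # Base rule: S -> a b
    if s == "ab":
        return ("S", "a", "b")
    # Recursive rule: S -> a S b (strip the outer a...b and parse the middle)
    if len(s) >= 4 and s[0] == "a" and s[-1] == "b":
        inner = parse_anbn(s[1:-1])
        if inner is not None:
            return ("S", "a", inner, "b")
    return None


tree = parse_anbn("aaabbb")
print(tree)
```

This hand-written recognizer only works because the grammar is so small; a general CFG needs a general algorithm such as CYK.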

Constituents

Syntactic Constituents

  • Sentences are broken into constituents

    • Word sequences that function as coherent units for linguistic analysis
    • Helps build CFG production rules
  • Constituents have certain key properties:

    • movement: Constituents can be moved around sentences:
      • Abigail gave [her brother] [a fish]
      • Abigail gave [a fish] to [her brother]
      • Contrast: [gave her] and [brother a]
    • substitution: Constituents can be substituted by other phrases of the same type:
      • Max thanked [his older sister]
      • Max thanked [her]
      • Contrast: [Max thanked] and [thanked his]
    • coordination: Constituents can be conjoined with coordinators such as "and" and "or":
      • [Abigail] and [her young brother] brought a fish
      • Abigail [bought a fish] and [gave it to Max]
      • Abigail [bought] and [greedily ate] a fish

Constituents and Phrases

  • Once constituents are identified, we use phrases to describe them

  • Phrases are determined by their head word:

    • Noun phrase: her younger brother
    • Verb phrase: greedily ate it

Example: A Simple CFG for English and generating sentences

  • Terminal Symbols: rat, the, ate, cheese

  • Non-terminal symbols: S, NP, VP, DT, VBD, NN

  • Productions:

    • S -> NP VP
    • NP -> DT NN
    • VP -> VBD NP
    • DT -> the
    • NN -> rat
    • NN -> cheese
    • VBD -> ate
  • Generating Sentences with CFGs:

    [Figure: generating a sentence with the CFG above]
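Generation with the simple grammar above can be sketched in a few lines. The snippet below enumerates every sentence the grammar derives; the dictionary layout `GRAMMAR` and the helper `expand` are illustrative assumptions, and the approach only terminates because this grammar has no recursion.

```python
from itertools import product

# Productions of the toy grammar; any symbol without an entry is a terminal
GRAMMAR = {
    "S": [["NP", "VP"]],
    "NP": [["DT", "NN"]],
    "VP": [["VBD", "NP"]],
    "DT": [["the"]],
    "NN": [["rat"], ["cheese"]],
    "VBD": [["ate"]],
}


def expand(symbol):
    """Yield every word list derivable from symbol (non-recursive grammars only)."""
    if symbol not in GRAMMAR:  # terminal: an actual word
        yield [symbol]
        return
    for rhs in GRAMMAR[symbol]:
        # Combine the expansions of each RHS symbol, preserving their order
        for parts in product(*(list(expand(sym)) for sym in rhs)):
            yield [word for part in parts for word in part]


sentences = [" ".join(words) for words in expand("S")]
for sentence in sentences:
    print(sentence)
```

Note that the grammar happily generates "the cheese ate the rat": a CFG checks syntactic structure, not meaning.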

CFG Trees

  • Generation corresponds to a syntactic tree

  • Non-terminals are internal nodes

  • Terminals are leaves

  • CFG parsing is the reverse process

  • E.g.:

    [Figure: example CFG tree]
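To make the tree view concrete, here is a small sketch that stores a syntactic tree as nested tuples, with non-terminals as internal nodes and words as leaves. The sentence "the rat ate the cheese", the tuple encoding, and the helper names `leaves` and `bracketed` are assumptions chosen for illustration; the original figure may show a different example.

```python
# (label, child, child, ...) for internal nodes; plain strings for leaves
TREE = ("S",
        ("NP", ("DT", "the"), ("NN", "rat")),
        ("VP", ("VBD", "ate"),
               ("NP", ("DT", "the"), ("NN", "cheese"))))


def leaves(tree):
    """Collect the terminals (words) left to right."""
    if isinstance(tree, str):
        return [tree]
    words = []
    for child in tree[1:]:
        words.extend(leaves(child))
    return words


def bracketed(tree):
    """Render the tree in bracketed notation."""
    if isinstance(tree, str):
        return tree
    return "(" + tree[0] + " " + " ".join(bracketed(c) for c in tree[1:]) + ")"


print(" ".join(leaves(TREE)))
print(bracketed(TREE))
```

Reading the leaves recovers the generated sentence; parsing goes the other way, from the word sequence back to such a tree.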

CYK Algorithm

CYK Algorithm

  • Bottom-up parsing

  • Tests whether a string is valid given a CFG, without enumerating all possible parses

  • Core idea: Form small constituents first, and merge them into larger constituents

  • Requirement: the CFG must be in Chomsky Normal Form

Convert to Chomsky Normal Form

  • Change the grammar so that all rules have one of the forms:

    • A -> B C: Non-terminal LHS to two non-terminals RHS
    • A -> a: Non-terminal LHS to one terminal RHS
  • To meet these requirements:

    • convert rules of form A -> B c into:

      • A -> B X and X -> c
    • convert rules of form A -> B C D into:

      • A -> B Y and Y -> C D
  • CNF disallows unary rules like A -> B to avoid infinite loops

    • Replace the RHS non-terminal with its productions:
      • A -> B, B -> cat, B -> dog becomes A -> cat, A -> dog
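The three conversions above can be sketched as follows. This is a minimal illustration, not a complete CNF converter: it assumes the notes' convention that lowercase symbols are terminals, it ignores empty productions, and the generated names (`T_...`, `X1`, `X2`, ...) are hypothetical and assumed not to clash with existing symbols.

```python
def is_terminal(symbol):
    # Convention from the notes: lowercase = terminal, uppercase = non-terminal
    return symbol.islower()


def to_cnf(rules):
    """rules: list of (lhs, rhs_tuple). Returns a set of rules in CNF."""
    staged = []
    counter = 0
    for lhs, rhs in rules:
        rhs = list(rhs)
        # A -> B c  becomes  A -> B T_C and T_C -> c
        if len(rhs) >= 2:
            for i, sym in enumerate(rhs):
                if is_terminal(sym):
                    pre = "T_" + sym.upper()
                    staged.append((pre, (sym,)))
                    rhs[i] = pre
        # A -> B C D  becomes  A -> B X1 and X1 -> C D
        while len(rhs) > 2:
            counter += 1
            new = f"X{counter}"
            staged.append((new, tuple(rhs[-2:])))
            rhs = rhs[:-2] + [new]
        staged.append((lhs, tuple(rhs)))

    def is_unary(rhs):
        return len(rhs) == 1 and not is_terminal(rhs[0])

    kept = {(l, r) for l, r in staged if not is_unary(r)}
    unary = {(l, r[0]) for l, r in staged if is_unary(r)}
    result = set(kept)
    # A -> B is dropped; A inherits the non-unary rules of every
    # non-terminal reachable from A through a chain of unary rules
    for a in {l for l, _ in staged}:
        reachable, stack = set(), [a]
        while stack:
            x = stack.pop()
            for l, b in unary:
                if l == x and b not in reachable:
                    reachable.add(b)
                    stack.append(b)
        result |= {(a, r) for l, r in kept if l in reachable}
    return result


RULES = [
    ("A", ("B", "c")),
    ("A", ("B", "C", "D")),
    ("A", ("B",)),
    ("B", ("cat",)),
    ("B", ("dog",)),
]
cnf = to_cnf(RULES)
for lhs, rhs in sorted(cnf):
    print(lhs, "->", " ".join(rhs))
```

Computing the set of non-terminals reachable by unary chains, rather than rewriting unary rules one at a time, keeps the sketch from looping on cyclic unary rules such as A -> B, B -> A.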

The CYK Parsing Algorithm

  • Convert grammar to Chomsky Normal Form

  • Fill in a parse table, left to right, bottom to top

  • Use table to derive parse

  • S in top right corner of table -> success

  • Convert result back to original grammar

  • E.g.

    [Figure: example CYK parse table]

  • Retrieving the Parses

    • S in the top-right corner of parse table indicates success

    • To get parses, follow pointer back for each match:

      [Figure: following back-pointers through the parse table]

    • If multiple solutions are available, all of the trees are valid:

      [Figure: https://img-blog.csdnimg.cn/38dc4af0eae548febe7f0cdce3932513.png#pic_center]
  • Pseudo Code:
function CYK-Parse(words, grammar) returns table
    for j <- from 1 to LENGTH(words) do
        for all {A | A -> words[j] ∈ grammar} do
            table[j-1, j] <- table[j-1, j] ∪ {A}
        for i <- from j-2 down to 0 do
            for k <- i + 1 to j - 1 do
                for all {A | A -> B C ∈ grammar and B ∈ table[i, k] and C ∈ table[k, j]} do
                    table[i, j] <- table[i, j] ∪ {A}

    return table
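The pseudocode translates fairly directly into Python. The sketch below is a recognizer only (it fills the table but keeps no back-pointers), and the way the grammar is passed in, as a `lexical` dictionary for A -> word rules and a `binary` dictionary for A -> B C rules, is an assumption made for illustration. The grammar used is the simple English CFG from earlier, which is already in Chomsky Normal Form.

```python
from collections import defaultdict


def cyk_parse(words, lexical, binary):
    """Fill the CYK table for a CNF grammar.

    lexical: dict mapping a word to the set of A with A -> word
    binary:  dict mapping (B, C) to the set of A with A -> B C
    table[i, j] holds the non-terminals that span words[i:j].
    """
    table = defaultdict(set)
    for j in range(1, len(words) + 1):
        # Pseudocode words are 1-indexed; Python lists are 0-indexed
        table[j - 1, j] |= lexical.get(words[j - 1], set())
        for i in range(j - 2, -1, -1):
            for k in range(i + 1, j):
                # Merge a constituent over [i, k) with one over [k, j)
                for b in table[i, k]:
                    for c in table[k, j]:
                        table[i, j] |= binary.get((b, c), set())
    return table


LEXICAL = {"the": {"DT"}, "rat": {"NN"}, "cheese": {"NN"}, "ate": {"VBD"}}
BINARY = {("NP", "VP"): {"S"}, ("DT", "NN"): {"NP"}, ("VBD", "NP"): {"VP"}}

words = "the rat ate the cheese".split()
table = cyk_parse(words, LEXICAL, BINARY)
print("S" in table[0, len(words)])
```

The three nested loops over j, i and k give the cubic dependence on sentence length that makes CYK practical; retrieving parses would additionally require storing, for each A in a cell, which (k, B, C) produced it.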

Representing English with CFGs

From Toy Grammars to Real Grammars

  • Toy grammars with a handful of productions are good for demonstrations or extremely limited domains

  • For real texts, we need real grammars

  • These have many thousands of production rules

Key Constituents in Penn Treebank

  • Sentence S
  • Noun phrase NP
  • Verb phrase VP
  • Prepositional phrase PP
  • Adjective phrase AdjP
  • Adverbial phrase AdvP
  • Subordinate clause SBAR
  • E.g.

    [Figure: example of Penn Treebank constituents]

Basic English Sentence Structures

  • Declarative sentences S -> NP VP

    • The rat ate the cheese
  • Imperative sentences S -> VP

    • Eat the cheese
  • Yes/no questions S -> VB NP VP

    • Did the rat eat the cheese?
  • Wh-subject questions S -> WH VP

    • Who ate the cheese?
  • Wh-object questions S -> WH VB NP VP

    • What did the rat eat?

English Noun Phrases

  • Pre-modifiers:

    • DT, CD, ADJP, NNP, NN
    • E.g.: the two very best Philly cheese steaks
  • Post-modifiers:

    • PP, VP, SBAR
    • E.g.: A delivery from Bob coming today that I don't want to miss

English Verb Phrases

  • Auxiliaries

    • MD, AdvP, VB, TO
    • E.g.: should really have tried to wait
  • Arguments and adjuncts

    • NP, PP, SBAR, VP, AdvP
    • E.g.: told him yesterday that I was ready

Other Constituents

  • Prepositional phrase:

    • PP -> IN NP
    • E.g.: in the house
  • Adjective phrase:

    • AdjP -> (AdvP) JJ
    • E.g.: really nice
  • Adverb phrase:

    • AdvP -> (AdvP) RB
    • E.g.: not too well
  • Subordinate clause

    • SBAR -> (IN) S
    • E.g.: since I came here
  • Coordination

    • NP -> NP CC NP; VP -> VP CC VP
    • E.g.: Jack and Jill
  • Complex sentences

    • S -> S SBAR; S -> SBAR S
    • E.g.: if he goes, I'll go
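Several rules above use parentheses, as in AdjP -> (AdvP) JJ. Assuming the usual reading that a parenthesised symbol is optional, each such rule is shorthand for several plain productions. The sketch below expands the shorthand; the function name `expand_optional` and the string-based rule format are hypothetical choices for illustration.

```python
from itertools import product


def expand_optional(lhs, rhs):
    """Expand a rule whose optional symbols are written in parentheses."""
    choices = []
    for sym in rhs.split():
        if sym.startswith("(") and sym.endswith(")"):
            choices.append([sym[1:-1], None])  # symbol present or absent
        else:
            choices.append([sym])
    rules = []
    for combo in product(*choices):
        symbols = tuple(s for s in combo if s is not None)
        if symbols:  # skip the empty right-hand side
            rules.append((lhs, symbols))
    return rules


for lhs, rhs in expand_optional("AdjP", "(AdvP) JJ"):
    print(lhs, "->", " ".join(rhs))
```

The variant AdjP -> JJ is a unary rule, so a grammar written this way still needs the Chomsky Normal Form conversion before CYK can be applied.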
