Lecture 16 Dependency Grammar

news2024/11/26 7:27:54

Contents

    • Dependency Grammar
      • Dependency Grammar
      • Dependency Relations
      • Application: Question Answering
      • Application: Information Extraction
      • Dependency vs. Constituency
      • Properties of a Dependency Tree
      • Projectivity
      • Treebank Conversion
    • Transition-based Parsing
      • Dependency Parsing
      • Transition-Based Parsing
      • Parsing Model
    • Graph-based Parsing
      • Graph-based Parsing

Dependency Grammar

Dependency Grammar

  • Dependency grammar offers a simpler alternative to CFG:

    • Describe relations between pairs of words
    • Namely, between heads and dependents
  • Deals better with languages that are morphologically rich and have relatively free word order

    • A CFG needs a separate rule for each position in which a phrase can occur
  • Head-dependent relations are similar to semantic relations between words

    • This makes them more useful for applications such as coreference resolution and information extraction

Dependency Relations

  • Captures the grammatical relation between:

    • Head: central word
    • Dependent: supporting word
  • Grammatical relation: subject, direct object, …

  • Many dependency theories and taxonomies have been proposed for different languages

  • Universal Dependencies: a framework that defines a set of dependency relations that are computationally useful and cross-lingual
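
As a minimal sketch of the head/dependent/relation triple described above, the arcs of a sentence can be stored as plain records. The sentence "She reads books" and the `Arc` and `dependents_of` names are illustrative assumptions, not part of the lecture; the labels `nsubj` and `dobj` are the ones used later in these notes.

```python
from typing import NamedTuple


class Arc(NamedTuple):
    head: str        # central word
    dependent: str   # supporting word
    relation: str    # grammatical relation label


# Hand-written arcs for the illustrative sentence "She reads books".
arcs = [
    Arc(head="reads", dependent="She", relation="nsubj"),
    Arc(head="reads", dependent="books", relation="dobj"),
]


def dependents_of(head: str, arcs: list[Arc]) -> dict[str, str]:
    """Map each relation label to the dependent attached to `head`."""
    return {arc.relation: arc.dependent for arc in arcs if arc.head == head}
```

Looking up the dependents of the verb returns its subject and object, which is the information the applications below rely on.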


Application: Question Answering

  • A dependency tree more directly represents the core of the sentence: who did what to whom?
    • This is captured by the links incident on verb nodes


Application: Information Extraction

  • "Brasilia, the Brazilian capital, was founded in 1960." → capital(Brazil, Brasilia); founded(Brasilia, 1960)
  • The dependency tree captures these relations succinctly
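
The extraction above can be sketched as pattern matching over labeled arcs. The parse below is hand-written and assumed (UD-style labels), and `DEMONYMS`, `find`, and `extract_facts` are hypothetical names introduced only for illustration; a real system would take the arcs from a parser.

```python
# Assumed parse of "Brasilia, the Brazilian capital, was founded in 1960."
# Each entry is (dependent, relation, head); the labels are hand-written.
ARCS = [
    ("Brasilia", "nsubj:pass", "founded"),
    ("capital", "appos", "Brasilia"),
    ("the", "det", "capital"),
    ("Brazilian", "amod", "capital"),
    ("was", "aux:pass", "founded"),
    ("1960", "obl", "founded"),
    ("in", "case", "1960"),
]

# Hypothetical lookup from an adjective to the country it names.
DEMONYMS = {"Brazilian": "Brazil"}


def find(head, relation, arcs):
    """Return the first dependent of `head` with the given relation, or None."""
    for dep, rel, h in arcs:
        if h == head and rel == relation:
            return dep
    return None


def extract_facts(arcs):
    """Turn two simple arc patterns into (predicate, arg1, arg2) triples."""
    facts = []
    for dep, rel, head in arcs:
        # Passive verb with a subject and an oblique: verb(subject, oblique).
        if rel == "nsubj:pass":
            oblique = find(head, "obl", arcs)
            if oblique is not None:
                facts.append((head, dep, oblique))
        # Apposition "X, the ADJ N": N(country, X).
        if rel == "appos":
            modifier = find(dep, "amod", arcs)
            if modifier in DEMONYMS:
                facts.append((dep, DEMONYMS[modifier], head))
    return facts
```

Both facts are read off arcs that touch the verb or the appositive noun directly, which is why the dependency view is convenient here.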


Dependency vs. Constituency

  • Dependency tree:

    • Each node is a word token
    • One node is chosen as the root
    • Directed edges link heads and their dependents
  • Constituency tree:

    • Forms a hierarchical tree
    • Word tokens are the leaves
    • Internal nodes are constituent phrases
  • Both use part-of-speech (POS) tags

Properties of a Dependency Tree

  • Each word has a single head (parent)
  • There is a single root node
  • There is a unique path to each word from the root
  • All arcs should be projective
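
The first three properties can be checked mechanically. This is a minimal sketch under an assumed encoding that is not from the lecture: `heads[i]` holds the head of word `i+1`, and 0 stands for an artificial ROOT. Projectivity is treated separately in the next section.

```python
def is_valid_tree(heads):
    """heads[i] is the head of word i+1 (words are numbered from 1); 0 means ROOT.

    Storing one head per word enforces the single-head property by construction.
    """
    n = len(heads)
    if any(h < 0 or h > n for h in heads):
        return False
    # Exactly one word attaches to ROOT.
    if sum(1 for h in heads if h == 0) != 1:
        return False
    # Every word must reach ROOT by following heads without revisiting a word.
    for word in range(1, n + 1):
        seen = set()
        node = word
        while node != 0:
            if node in seen:
                return False  # cycle: no path from the root to this word
            seen.add(node)
            node = heads[node - 1]
    return True
```

For "She reads books" with `heads = [2, 0, 2]` the check passes; an assignment such as `[2, 1, 0]` fails because words 1 and 2 point at each other.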

Projectivity

  • An arc is projective if there is a path from the head to every word that lies between the head and the dependent


  • A dependency tree is projective if all of its arcs are projective. In other words, a dependency tree is projective if it can be drawn with no crossing edges

  • Most sentences are projective, but exceptions exist
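
The path-based definition above translates directly into a check. This sketch assumes the input is already a valid tree and uses an encoding that is not from the lecture: `heads[i]` is the head of word `i+1`, with 0 as an artificial ROOT placed to the left of the sentence.

```python
def is_descendant(word, ancestor, heads):
    """True if following heads upward from `word` reaches `ancestor` (0 is ROOT)."""
    node = word
    while node != 0:
        node = heads[node - 1]
        if node == ancestor:
            return True
    return False


def is_projective(heads):
    """Check every arc: all words strictly between head and dependent
    must be reachable from the head."""
    for dep, head in enumerate(heads, start=1):
        lo, hi = min(head, dep), max(head, dep)
        for between in range(lo + 1, hi):
            if not is_descendant(between, head, heads):
                return False
    return True
```

With `heads = [0, 1, 1, 2]` the arcs 1→3 and 2→4 cross, so the function reports a non-projective tree.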

Treebank Conversion

  • There are few dependency treebanks but many constituency treebanks
  • Some constituency treebanks can be converted into dependencies
  • Dependency trees generated from constituency trees are always projective
  • Main idea: identify head-dependent relations in the constituency structure and map them to the corresponding dependency relations
    • Use various heuristics
    • Often with manual correction
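
One common kind of heuristic is a table of head rules: for each phrase, pick a head child, and make the lexical heads of the other children depend on it. The sketch below is an assumption-laden illustration; `HEAD_RULES` is a tiny hypothetical table, not the rule set of any real treebank conversion, and words are used as identifiers, so it only works when no word repeats.

```python
# Hypothetical head rules: for each phrase label, child labels in priority order.
HEAD_RULES = {
    "S": ["VP", "NP"],
    "VP": ["VBZ", "VBD", "VB", "VP"],
    "NP": ["NN", "NNS", "NNP", "PRP", "NP"],
}


def find_head(tree, arcs):
    """Return the lexical head of `tree`, appending (head, dependent) pairs to `arcs`.

    A tree is either (pos_tag, word) for a leaf or (label, [subtrees]) for a phrase.
    """
    label, content = tree
    if isinstance(content, str):
        return content  # a leaf is its own head
    child_heads = [find_head(child, arcs) for child in content]
    child_labels = [child[0] for child in content]
    # Pick the head child using the rules; fall back to the rightmost child.
    head_index = len(content) - 1
    for candidate in HEAD_RULES.get(label, []):
        if candidate in child_labels:
            head_index = child_labels.index(candidate)
            break
    head = child_heads[head_index]
    # The lexical head of every non-head child depends on the phrase head.
    for i, dependent in enumerate(child_heads):
        if i != head_index:
            arcs.append((head, dependent))
    return head


# (S (NP (PRP She)) (VP (VBZ reads) (NP (NNS books))))
tree = ("S", [
    ("NP", [("PRP", "She")]),
    ("VP", [("VBZ", "reads"), ("NP", [("NNS", "books")])]),
])
arcs = []
root = find_head(tree, arcs)
```

Because each arc links words from adjacent sibling subtrees of one phrase, the resulting arcs cannot cross, which matches the projectivity claim above.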

Transition-based Parsing

Dependency Parsing

  • Find the best structure for a given input sentence

  • Two main approaches:

    • Transition-based: Bottom-up greedy method
    • Graph-based: encodes the problem as nodes and edges and uses graph-theoretic methods to find an optimal solution
  • Caveat:

    • Transition-based parsers can only handle projective dependency trees
    • Less applicable for languages where crossing dependencies are common

Transition-Based Parsing

  • Processes words from left to right

  • Maintains two data structures:

    • Buffer: input words yet to be processed
    • Stack: words that are currently being processed
  • At each step, performs one of 3 actions:

    • Shift: move a word from the buffer to the stack
    • Left-Arc: make the word on top of the stack the head of the word beneath it, then remove the dependent from the stack
    • Right-Arc: make the word beneath the top of the stack the head of the top word, then remove the dependent from the stack

  • For simplicity, labels on the dependency relations are omitted here. In practice, we parameterize left-arc and right-arc with dependency labels:

    • E.g. left-arc-nsubj or right-arc-dobj
  • This expands the set of actions to more than 3 types

  • Assume an oracle that tells us the correct action at every step. Given a gold dependency tree, the oracle's role is to generate the sequence of ground-truth actions.
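
The three actions and the oracle above can be sketched as follows. Several details are assumptions rather than something the notes specify: an artificial ROOT (index 0) starts on the stack, `heads[i]` is the gold head of word `i+1`, labels are omitted, and the names `oracle_actions` and `apply_actions` are illustrative.

```python
def oracle_actions(heads):
    """Generate the action sequence that rebuilds a projective gold tree."""
    n = len(heads)
    stack = [0]                      # ROOT starts on the stack (assumption)
    buffer = list(range(1, n + 1))   # words still to be processed
    # Number of dependents each word is still waiting for.
    pending = [0] * (n + 1)
    for h in heads:
        pending[h] += 1
    actions = []
    while buffer or len(stack) > 1:
        if len(stack) >= 2:
            top, below = stack[-1], stack[-2]
            # Left-Arc: the top word is the head of the word beneath it.
            if below != 0 and heads[below - 1] == top:
                actions.append("left-arc")
                pending[top] -= 1
                stack.pop(-2)
                continue
            # Right-Arc: the word beneath is the head of the top word,
            # and the top word has already collected all its dependents.
            if heads[top - 1] == below and pending[top] == 0:
                actions.append("right-arc")
                pending[below] -= 1
                stack.pop()
                continue
        if not buffer:
            raise ValueError("gold tree cannot be built: it is not projective")
        actions.append("shift")
        stack.append(buffer.pop(0))
    return actions


def apply_actions(n, actions):
    """Replay an action sequence on a sentence of n words and return the heads."""
    stack, buffer = [0], list(range(1, n + 1))
    heads = [None] * n
    for action in actions:
        if action == "shift":
            stack.append(buffer.pop(0))
        elif action == "left-arc":
            dependent = stack.pop(-2)
            heads[dependent - 1] = stack[-1]
        elif action == "right-arc":
            dependent = stack.pop()
            heads[dependent - 1] = stack[-1]
        else:
            raise ValueError(f"unknown action: {action}")
    return heads
```

For "She reads books" with gold heads `[2, 0, 2]`, the oracle yields shift, shift, left-arc, shift, right-arc, right-arc, and replaying those actions recovers the gold heads. A non-projective gold tree makes the oracle raise an error, which reflects the caveat stated earlier.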

Parsing Model

  • Train a supervised model to mimic the actions of the oracle

    • To learn at every step the correct action to take
    • At test time, the trained model can be used to parse a sentence to create the dependency tree
  • Parse as Classification:

    • Input:
      • Stack: top two elements: s1 and s2
      • Buffer: first element: b1
    • Output:
      • 3 classes: shift, left-arc, right-arc
    • Features:
      • word, part-of-speech
  • Traditionally, SVMs worked best. Nowadays, deep learning models are state of the art (SOTA)

  • Weakness: local classifier based on greedy search

  • Solution:

    • Beam search: keep track of the top-N best action sequences instead of a single one
    • Dynamic oracle: during training, use predicted actions occasionally
    • Graph-based parser
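
The classification view above (input s1, s2, b1; features word and part-of-speech) can be sketched as a feature extractor over a parser configuration. The function name, the feature keys, and the use of position 0 for ROOT are assumptions made for illustration; pairing these features with the oracle's action at each step would give the supervised training data.

```python
def extract_features(stack, buffer, words, tags):
    """Build features from the top two stack items (s1, s2) and the first
    buffer item (b1).

    stack and buffer hold word positions; words[i] and tags[i] give the token
    and its part-of-speech tag at position i, with position 0 reserved for ROOT.
    """
    def describe(position):
        if position is None:
            return "<none>", "<none>"
        return words[position], tags[position]

    s1 = stack[-1] if len(stack) >= 1 else None
    s2 = stack[-2] if len(stack) >= 2 else None
    b1 = buffer[0] if buffer else None
    features = {}
    for name, position in (("s1", s1), ("s2", s2), ("b1", b1)):
        word, tag = describe(position)
        features[f"{name}.word"] = word
        features[f"{name}.pos"] = tag
    return features


words = ["<ROOT>", "She", "reads", "books"]
tags = ["<ROOT>", "PRP", "VBZ", "NNS"]
features = extract_features([0, 1, 2], [3], words, tags)
```

Any multi-class classifier can consume such a dictionary once it is vectorized; the choice of model is independent of this step.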

Graph-based Parsing

Graph-based Parsing

  • Given an input sentence, construct a fully-connected, weighted, directed graph

  • Vertices: all words

  • Edges: head-dependent arcs

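  • Weight: score based on training data (relations that are frequently observed receive a higher score)

  • Objective: find the maximum spanning tree. Because the graph is directed, this is typically done with the Chu-Liu/Edmonds algorithm (Kruskal's algorithm applies to undirected graphs)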

  • Advantage

    • Can produce non-projective trees

    • Scores entire trees:

      • Avoids making greedy local decisions like transition-based parsers
      • Captures long-distance dependencies better

  • Caveat: picking the highest-scoring incoming arc for each word independently may produce cycles

    • Solution: a cleanup step is needed to remove the cycles
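
The objective and the cycle cleanup can be sketched with the Chu-Liu/Edmonds algorithm, the usual choice for directed graphs; the notes do not spell out the procedure, so this is one possible reading. It assumes integer node ids with 0 as ROOT and a fully connected graph, it does not force exactly one arc out of ROOT, and the arc scores in the example are made up for a three-word sentence.

```python
def find_cycle(heads):
    """Return the nodes of a cycle in a {dependent: head} map, or None."""
    for start in heads:
        path, node = [], start
        while node in heads and node not in path:
            path.append(node)
            node = heads[node]
        if node in path:
            return path[path.index(node):]
    return None


def max_spanning_tree(scores, root=0):
    """scores[(h, d)] is the weight of arc h -> d. Returns {dependent: head}."""
    nodes = {h for h, _ in scores} | {d for _, d in scores}
    # Step 1: every non-root node greedily picks its best incoming arc.
    best = {}
    for d in nodes - {root}:
        best[d] = max((w, h) for (h, dd), w in scores.items() if dd == d and h != d)[1]
    cycle = find_cycle(best)
    if cycle is None:
        return best
    # Step 2 (cleanup): contract the cycle into one new node and rescore arcs.
    in_cycle, new = set(cycle), max(nodes) + 1
    new_scores, origin = {}, {}
    for (h, d), w in scores.items():
        if d == root or (h in in_cycle and d in in_cycle):
            continue
        if d in in_cycle:
            # Entering the cycle at d replaces the cycle arc that pointed to d.
            key, w = (h, new), w - scores[(best[d], d)]
        elif h in in_cycle:
            key = (new, d)
        else:
            key = (h, d)
        if key not in new_scores or w > new_scores[key]:
            new_scores[key], origin[key] = w, (h, d)
    # Step 3: solve the smaller problem, then map arcs back to original nodes.
    result = {}
    for d, h in max_spanning_tree(new_scores, root).items():
        orig_head, orig_dep = origin[(h, d)]
        result[orig_dep] = orig_head
    # Keep the cycle's own arcs except the one broken by the entering arc.
    for d in cycle:
        result.setdefault(d, best[d])
    return result


# Made-up scores for a 3-word sentence; node 0 is ROOT.
scores = {
    (0, 1): 12, (0, 2): 4, (0, 3): 4,
    (1, 2): 5, (1, 3): 7,
    (2, 1): 6, (2, 3): 8,
    (3, 1): 5, (3, 2): 7,
}
tree = max_spanning_tree(scores)
```

In this example the greedy step makes words 2 and 3 choose each other as heads; contracting that cycle and rescoring the entering arcs yields the tree 0→1, 1→3, 3→2.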
