[Paper Reading, CIKM 2021] Learning Multiple Intent Representations for Search Queries

news2025/4/14 4:03:04

Table of Contents

- Original Paper
- Motivation
- Method
  - Task Description and Problem Formulation
  - NMIR Framework: A High-Level Overview
  - Model Implementation and Training
- Data

Original Paper

Learning Multiple Intent Representations for Search Queries

More related papers can be found at:

ShiyuNee/Awesome-Conversation-Clarifying-Questions-for-Information-Retrieval: Papers about Conversation and Clarifying Questions (github.com)

Motivation

The typical use of representation models has a major limitation in that they generate only a single representation for a query, which may have multiple intents or facets.

The authors propose NMIR (Neural Multiple Intent Representations), which supports multiple intent representations for each query.

Method

Task Description and Problem Formulation

training query set: $\{q_1,\cdots,q_n\}$
$D_i = \{d_{i1},\cdots,d_{im}\}$ is the set of top $m$ retrieved documents in response to the query $q_i$
$F_i=\{f_{i1},\cdots,f_{ik_i}\}$ denotes the set of all textual intent descriptions associated with the query $q_i$
- $k_i$ is the number of intents of the query $q_i$

NMIR Framework: A High-Level Overview

One straightforward solution:
- use an encoder-decoder architecture
  - input: query $q_i$
  - output: multiple intent descriptions for the query, obtained by taking the top $k_i$ most likely predictions
- drawback: these generations are often synonyms or refer to the same concept
Another straightforward solution:
- treat the task as a sequence-to-sequence problem
  - input: query $q_i$
  - output: all the query intent descriptions concatenated with each other (as in translation)
- drawbacks:
  - the different intent representations are not distinguishable in the last layer of the model
  - most existing effective text encoding models cannot represent long sequences of tokens, such as a concatenation of the top $m$ retrieved documents

NMIR Framework:

𝜙 (·) and 𝜓 (·) denote a text encoder and decoder pair

Step 1: NMIR assigns each learned document representation to one of the query intent descriptions $f_{ij} \in F_i$ using a document-intent matching algorithm 𝛾:

[Image: equation defining $C_i^*$; not preserved in the source]

$C_i^*$ is a collection of document sets, and each $C_{ij}^*$ is the set of documents from $D_i$ that are assigned to $f_{ij}$ by 𝛾.

Step 2: NMIR then transforms the encoded general query representation into its intent representations through a query intent encoder 𝜁.

The representation for the $j^{th}$ query intent is obtained using 𝜁($q_i$, $C_{ij}^*$; 𝜙).

Training: for a mini-batch 𝑏, the model is trained by gradient descent-based minimization of the following loss:

[Image: training objective over the mini-batch 𝑏; not preserved in the source]

$q_{ij}^*$ is the concatenation of the query string, the first $j-1$ intent descriptions, and $k_i - j$ mask tokens.
- The $j^{th}$ intent description is generated given the associated cluster $C_{ij}^*$ and the encoded query text plus the previous $j-1$ intent descriptions.
- This helps the model avoid regenerating previous intent representations and learn widely distributed representations.
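The construction of $q_{ij}^*$ described above can be sketched in a few lines. This is only an illustration: the `<mask>` string, the space separator, and the function name `build_training_input` are assumptions made for the example, not details taken from the paper.

```python
MASK = "<mask>"  # assumed mask token; the real one depends on the tokenizer


def build_training_input(query, intents, j, mask_token=MASK, sep=" "):
    """Build q*_ij for the j-th intent (1-indexed).

    The result is the query string, followed by the first j-1 intent
    descriptions, followed by k_i - j mask tokens.
    """
    k = len(intents)
    if not 1 <= j <= k:
        raise ValueError("j must be between 1 and the number of intents")
    previous = intents[: j - 1]      # first j-1 intent descriptions
    masks = [mask_token] * (k - j)   # k_i - j mask tokens
    return sep.join([query, *previous, *masks])


if __name__ == "__main__":
    intents = ["jaguar car", "jaguar animal", "jaguar os"]
    for j in range(1, len(intents) + 1):
        print(j, "->", build_training_input("jaguar", intents, j))
```

Note that the total number of slots after the query stays at $k_i - 1$ for every $j$: each step swaps one mask token for one already-seen intent description.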

where $L_{CE}$ is the cross-entropy loss

[Image: definition of $L_{CE}$; not preserved in the source]

$f_{ijt}$ is the $t^{th}$ token in the given intent description $f_{ij}$ .
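As a rough illustration of a token-level cross-entropy of this kind, the sketch below sums the negative log-probability that a decoder assigns to each target token $f_{ijt}$. It assumes the standard form of the loss; the exact equation in the paper may differ (for example in normalization), and the function name and the dictionary-based input format are hypothetical.

```python
import math


def token_cross_entropy(step_probs, target_tokens, eps=1e-12):
    """Sum of negative log-probabilities of the target tokens.

    step_probs[t] is a dict mapping each vocabulary token to the
    probability predicted at decoding step t (assumed input format).
    """
    if len(step_probs) != len(target_tokens):
        raise ValueError("one probability distribution is needed per target token")
    loss = 0.0
    for probs, token in zip(step_probs, target_tokens):
        p = probs.get(token, 0.0)
        loss -= math.log(max(p, eps))  # eps guards against log(0)
    return loss


if __name__ == "__main__":
    probs = [{"a": 0.5, "b": 0.5}, {"a": 0.25, "b": 0.75}]
    print(token_cross_entropy(probs, ["a", "b"]))
```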

Inference: the inputs $q_{ij}^*$ are constructed differently, because the ground-truth intent descriptions are not available.

First, feed "𝑞𝑖 …" to the model and apply beam search to the decoder's output to obtain the first intent description $f'_{i1}$.
Then use the model's output to create the input for the next step, "𝑞𝑖 $f'_{i1}$ …", and repeat this process iteratively.
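The iterative procedure can be sketched as a simple loop. Here `generate` is a hypothetical callable standing in for "encode the input, then run beam search on the decoder"; padding the remaining slots with `<mask>` tokens is an assumption that mirrors the training-time construction, and the number of intents `k` is assumed to be given.

```python
from typing import Callable, List


def generate_intents(query: str,
                     generate: Callable[[str], str],
                     k: int,
                     mask_token: str = "<mask>") -> List[str]:
    """Generate k intent descriptions one at a time.

    At step j the input is the query, the j-1 descriptions generated
    so far, and (assumed) k - j mask tokens for the remaining slots.
    """
    intents: List[str] = []
    for j in range(1, k + 1):
        masks = [mask_token] * (k - j)
        model_input = " ".join([query, *intents, *masks])
        intents.append(generate(model_input))  # model output feeds the next input
    return intents


if __name__ == "__main__":
    # Toy stand-in for the model: numbers the intent by counting earlier ones.
    def toy_generate(text: str) -> str:
        return "intent" + str(text.count("intent") + 1)

    print(generate_intents("jaguar", toy_generate, k=3))
```

Because each step conditions on the descriptions produced so far, an error in an early description propagates to all later inputs.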

Model Implementation and Training

[Image: Figure 1(a); not preserved in the source]

Figure 1(a) shows the model architecture.

𝜙 and 𝜓 are implemented with the Transformer encoder and decoder of a pre-trained BART model, respectively.

The intent encoding component 𝜁 uses $N'$ layers of the Guided Transformer model.

Guided Transformer influences an input representation under the guidance of some external information.
- Here, 𝜙($q_{ij}$) is the input representation and 𝜙($d$), $\forall d \in C_{ij}^*$, is the external information.

The document-intent matching component 𝛾 is built on a clustering algorithm.

It encodes all the top retrieved documents and creates $k_i$ clusters using a clustering algorithm (K-Means).
- $C_i = \{C_{i1},\cdots,C_{ik_i}\}$ denotes the set of clusters; each $C_{ij}$ contains all the documents in the $j^{th}$ cluster associated with the query $q_i$.
- $M_i=\{\mu_{i1},\cdots,\mu_{ik_i}\}$ is the set of all cluster centroids, such that $\mu_{ij}$ = centroid($C_{ij}$).
K-Means requires the number of clusters as input.
- Two cases are considered at inference time:
  - assume the number of clusters equals a tuned hyper-parameter $k^*$ for all queries
  - replace K-Means with a non-parametric version of K-Means
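To make the clustering step concrete, here is a minimal pure-Python K-Means (Lloyd's algorithm) over document vectors. It is a sketch under simplifying assumptions: the vectors stand in for the encoded documents 𝜙($d$), the initial centroids are sampled at random, and the function names are illustrative rather than taken from the paper's implementation.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iters=50, seed=0):
    """Cluster points into k groups; returns (assignments, centroids)."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    assign = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point goes to its nearest centroid.
        assign = [min(range(k), key=lambda c: sq_dist(p, centroids[c]))
                  for p in points]
        # Update step: each centroid becomes the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:
                new_centroids.append([sum(x) / len(members) for x in zip(*members)])
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # converged
        centroids = new_centroids
    return assign, centroids


if __name__ == "__main__":
    docs = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
    labels, centers = kmeans(docs, k=2)
    print(labels, centers)
```

In the fixed-$k^*$ case, `k` would simply be set to the tuned hyper-parameter for every query.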
Issue: the component 𝛾 requires a one-to-one assignment between the cluster centroids and the query intents in the training data; otherwise, all clusters may be assigned to a single most dominant query intent. The paper therefore uses an intent identification function I:
- my view: the problem is how to assign centroids to query intents after clustering.
output:
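The one-to-one requirement can be illustrated with a simple greedy matching: repeatedly pick the most similar (centroid, intent) pair among those still unassigned, then remove both from the pool. This is only a sketch of one way to enforce a one-to-one assignment and is not necessarily the paper's function I; the similarity matrix is assumed to be square and precomputed.

```python
def greedy_one_to_one(sim):
    """Greedy one-to-one matching of centroids to intents.

    sim[c][i] is the similarity between centroid c and intent i
    (assumed square). Returns a dict mapping centroid index to intent index.
    """
    n = len(sim)
    free_centroids = set(range(n))
    free_intents = set(range(n))
    pairs = {}
    while free_centroids:
        # Most similar pair among the still-unassigned centroids and intents.
        c, i = max(((c, i) for c in free_centroids for i in free_intents),
                   key=lambda ci: sim[ci[0]][ci[1]])
        pairs[c] = i
        free_centroids.remove(c)
        free_intents.remove(i)
    return pairs


if __name__ == "__main__":
    # Both centroids prefer intent 0, so a per-centroid argmax would
    # collapse them onto the same dominant intent.
    sim = [[0.9, 0.8],
           [0.7, 0.1]]
    print(greedy_one_to_one(sim))
```

In the example, a plain per-centroid argmax would map both clusters to intent 0, which is exactly the collapse onto a dominant intent that the one-to-one constraint is meant to prevent.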

𝛾 is not differentiable and cannot be part of the network for gradient descent-based optimization, so it is moved to an asynchronous process, as shown in Figure 1(b):

[Image: Figure 1(b); not preserved in the source]

Asynchronous training: because clustering the document representations is an efficiency bottleneck, an asynchronous training method is used to speed up training, as described in Figure 1(b).

Data

training data: a weak supervision solution based on the MIMICS-Click dataset, released by Zamani et al. in "MIMICS: A Large-Scale Data Collection for Search Clarification"
evaluation data: the Qulac dataset
