Table of Contents
- Qulac
- qulac.json:
- qulac_hist012_dict.tar.gz:
- MIMICS
- ClariQ
- ConvAI3 Data Challenge
- Stage1: initial dataset
- Stage2: human-in-the-loop
- ClariQ Dataset
- File Format
- train.tsv and dev.tsv
- test.tsv
- question_bank.tsv
- dev_synthetic.pkl.tar.gz & train_synthetic.pkl.tar.gz
- single_turn_train_eval.pkl & multi_turn_***_eval.pkl.tar.gz
- top10k_docs_dict.pkl.tar.gz
- train.qrel & dev.qrel
Qulac
aliannejadi/qulac: Qulac: A dataset on asking Questions for Lack of Clarity in open-domain information-seeking conversations. (github.com)
qulac.json:
qulac.json contains the topics, facets, questions, and answers. This is the main file of Qulac. However, it may not be very straightforward to use this file for experiments directly; that is why the repository provides some auxiliary data files, described below. In the qulac.json file you will find these fields:

- topic_id: the ID of the topic in the TREC Web Track.
- facet_id: the ID of the facet in the TREC Web Track.
- topic_facet_id: an ID corresponding to a topic and facet pair, in the format %d-%d. For example, 21-1 corresponds to the first facet (facet_id=1) of the 21st topic in the TREC Web Track data.
- topic_facet_question_id: an ID corresponding to a topic, facet, and question triplet, in the format %d-%d-%d. For example, 21-1-5 corresponds to the fifth question of the first facet of the 21st topic. Each row of the data is identified by this ID.
- topic: the TREC topic (query).
- topic_type: a str value indicating the type of a topic. Possible values are faceted and ambiguous.
- facet_type: a str value indicating the type of a facet. Possible values are inf (i.e., informational) and nav (i.e., navigational).
- topic_desc: a full description of the topic as it appears in the TREC Web Track data.
- facet_desc: a full description of the facet (information need) as it appears in the TREC Web Track data.
- question: a clarifying question that the system can pose to the user for the current topic and facet.
- answer: an answer to the clarifying question, assuming that the user is in the context of the current row (i.e., the user's initial query is topic, their information need is facet, and question has been posed to the user).
topic_id | facet_id | topic_facet_id | topic_facet_question_id | topic | topic_type | facet_type | topic_desc | facet_desc | question | answer |
---|---|---|---|---|---|---|---|---|---|---|
193 | 2 | 193-2 | 193-2-5 | dog clean up bags | faceted | inf | Can I order dog clean-up bags online? | Are there biodegradable products for the dispo… | are you looking for a way to dispose your dog … | im looking for dog waste bags that are biodegr… |
144 | 2 | 144-2 | 144-2-5 | trombone for sale | ambiguous | inf | information on where I could buy a new or used… | good places to sell a used trombone | are you looking for a place to sell a used tro… | yes |
78 | 3 | 78-3 | 78-3-7 | dieting | ambiguous | inf | Find “reasonable” dieting advice, that is no… | Find crash diet plans that promise quick weigh… | do you want to know if dieting is safe | i would like to know more on quick and safe di… |
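A minimal sketch of loading this file, assuming the JSON follows pandas' default column-oriented layout (the local path is also assumed):

```python
import pandas as pd

# Each row is one (topic, facet, question, answer) record of Qulac.
qulac = pd.read_json('qulac.json')

# Look up the first example from the table above by its unique row ID.
row = qulac[qulac['topic_facet_question_id'] == '193-2-5']
print(row[['topic', 'facet_desc', 'question', 'answer']])
```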
qulac_hist012_dict.tar.gz:
qulac_hist012_dict.tar.gz can be used for experiments involving multi-turn conversations. As mentioned in [1], the conversations are artificially generated from the data available in qulac.json. Hence, the structure of the dict is as follows (after decompression):
{ <record_id>:
{
'history_id': <the ID of conversation history (context)>,
'history_list': [
{ 'question': <question1 string>,
'answer': <answer1 string> },
{ 'question': <question2 string>,
'answer': <answer2 string> },
        ...
      ],
'query': <query (topic) string>,
'question': <current question string>,
'answer': <current answer string>
}
....
}
- Record ID: topic_id-facet_id-past_question_id_1-past_question_id_2-current_question_id-answer_flag
- answer_flag indicates whether the record refers to results obtained with (=1) or without (=0) the final answer.
'18-2-1-2-10-1': {
'history_id': '18-2-1-2',
'history_list': [{'answer': 'no i just want to find spreadsheets and templates',
'question': 'are you interested in a service for wedding budgeting'},
{'answer': 'yes i want to find some spreadsheets to help me budget',
'question': 'are you looking for advice on wedding budgeting'}],
'query': 'wedding budget calculator',
'question': 'what is your projected budget for your wedding',
'answer': 'i need to find a spreadsheet to figure it out'},
'25-1-3-8-1' : {
'history_id': '25-1-3',
'history_list': [{'answer': 'no i am looking for information on the greek mathematician euclid',
'question': 'do you need directions to euclid ave'}],
'query': 'euclid',
'question': 'do you want to know related people',
'answer': 'no i only want to know about one particular person'}
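A hedged sketch of working with this dict; the decompressed file name is an assumption:

```python
import pickle

# File name after extracting qulac_hist012_dict.tar.gz (assumed).
with open('qulac_hist012_dict.pkl', 'rb') as f:
    hist = pickle.load(f)

rec = hist['18-2-1-2-10-1']

# history_id encodes topic_id, facet_id, and the past question IDs.
topic_id, facet_id, *past_question_ids = rec['history_id'].split('-')

for turn in rec['history_list']:      # past (question, answer) turns
    print('Q:', turn['question'])
    print('A:', turn['answer'])
print('Q:', rec['question'])          # current question
print('A:', rec['answer'])            # current answer
```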
MIMICS
microsoft/MIMICS: MIMICS: A Large-Scale Data Collection for Search Clarification (github.com)
Each clarification in MIMICS consists of a clarifying question and up to five candidate answers:
query | headaches |
---|---|
question | What do you want to know about this medical condition? |
candidate answers (options) | symptom, treatment, causes, diagnosis, diet |
MIMICS contains three datasets:

- MIMICS-Click includes over 400k unique queries, their associated clarification panes, and the corresponding aggregated user interaction signals (i.e., clicks). Example rows:

['#HASH#value excel', 'What version of Excel are you looking for?', '2010', '2013', '2016', '', '', 'medium', '0', '0.0', '0.0', '0.0', '0.0', '0.0']
['%2f', 'What language are you looking for?', 'javascript', 'python', '', '', '', 'medium', '0', '0.0', '0.0', '0.0', '0.0', '0.0']
['.net', 'Select one to refine your search', 'powershell .net', 'iis .net', 'windows .net', 'sql .net', 'exchange .net', 'high', '0', '0.0', '0.0', '0.0', '0.0', '0.0']
['.net 3.5 framework', 'Select one to refine your search', 'windows', 'powershell', 'xml', 'azure', 'json', 'high', '3', '0.8571428571428572', '0.0', '0.0', '0.14285714285714285', '0.0']
- MIMICS-ClickExplore is an exploration dataset that includes aggregated user interaction signals for over 60k unique queries, each with multiple clarification panes.

Column(s) | Description |
---|---|
query (string) | The query text. |
question (string) | The clarifying question. |
option_1, …, option_5 (string) | Up to five candidate answers. |
impression_level (string) | A three-level impression label (i.e., low, medium, or high). |
engagement_level (integer) | A label in [0, 10] representing total user engagement. |
option_cctr_1, …, option_cctr_5 (real) | The conditional click probability on each candidate answer. |

Example rows (the same query can appear with multiple panes):

['0 degrees', 'Select one to refine your search', 'celsius', 'kelvin', 'fahrenheit', '', '', 'medium', '0', '0.0', '0.0', '0.0', '0.0', '0.0']
['0 degrees', 'Select one to refine your search', 'fahrenheit', 'celsius', 'kelvin', '', '', 'medium', '4', '1.0', '0.0', '0.0', '0.0', '0.0']
['0 degrees', 'Select one to refine your search', 'boots for 0 degrees', 'gloves for 0 degrees', '', '', '', 'medium', '0', '0.0', '0.0', '0.0', '0.0', '0.0']
- MIMICS-Manual includes over 2k unique real search queries. Each query-clarification pair in this dataset has been manually labeled by at least three trained annotators. It contains graded quality labels for the clarifying question, the candidate answer set, and the landing result page for each candidate answer.

Column(s) | Description |
---|---|
query (string) | The query text. |
question (string) | The clarifying question. |
option_1, …, option_5 (string) | Up to five candidate answers. |
question_label (integer) | A three-level quality label for the clarifying question. |
options_overall_label (integer) | A three-level quality label for the candidate answer set. |
option_label_1, …, option_label_5 (integer) | A three-level quality label for the landing result page of each candidate answer. |

Example rows:
['multiple system atrophy', 'What do you want to know about this medical condition?', 'symptom', 'treatment', 'causes', 'diagnosis', 'diet', '2', '2', '2', '2', '2', '2', '2']
['team fortress 2', 'What would you like to know about this game?', 'team fortress 2 steam', 'team fortress 2 mods', 'team fortress 2 gameplay', 'team fortress 2 cheats', '', '1', '2', '2', '2', '2', '2', '']
['google chrome exe', 'Select one to refine your search', '64 bit', '32 bit', '', '', '', '', '2', '2', '2', '', '', '']
['google chrome exe', 'Select one to refine your search', '32 bit', '64 bit', '', '', '', '', '2', '2', '2', '', '', '']
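All three datasets are distributed as tab-separated files, so they load with standard tooling. A minimal sketch (the file name follows the repository layout and should be treated as an assumption):

```python
import pandas as pd

click = pd.read_csv('MIMICS-Click.tsv', sep='\t')

# Keep only clarification panes that received some user engagement.
engaged = click[click['engagement_level'] > 0]
print(engaged[['query', 'question', 'option_1', 'engagement_level']].head())
```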
ClariQ
ConvAI3 Data Challenge
ClariQ is part of this challenge.
The challenge ran in two stages:
- Stage1: participants were provided with a static dataset consisting mainly of initial user requests, clarifying questions, and user answers
- Stage2: human-in-the-loop evaluation
Stage1: initial dataset
The dataset consists of:
- User Requests: initial user requests in conversational form, each with a label from 1 to 4 reflecting how much clarification is needed
  - 1: does not need any clarification
  - 4: needs clarification (a must)
- Clarifying questions: a set of possible clarifying questions
- User Answers: each question is supplied with a user answer
Stage2: human-in-the-loop
This stage enables the top-performing teams of the first stage to evaluate their models with the help of human evaluators. The performance of a system is evaluated in two aspects:
- how much the conversation helps a user find the information they are looking for
- how natural and realistic the conversation appears to a human evaluator
ClariQ Dataset
aliannejadi/ClariQ: ClariQ: SCAI Workshop data challenge on conversational search clarification. (github.com)
Feature | Value |
---|---|
# train (dev) topics | 187 (50) |
# faceted topics | 141 |
# ambiguous topics | 57 |
# single topics | 39 |
# facets | 891 |
# total questions | 3,929 |
# single-turn conversations | 11,489 |
# multi-turn conversations | ~ 1 million |
# documents | ~ 2 million |
File Format
train.tsv and dev.tsv
Both files have the same format and contain topics, facets, questions, answers, and clarification need labels. Each row has the following fields:
- topic_id: the ID of the topic (initial_request).
- initial_request: the query (text) that initiates the conversation.
- topic_desc: a full description of the topic as it appears in the TREC Web Track data.
- clarification_need: a label from 1 to 4, indicating how much the topic needs to be clarified.
- facet_id: the ID of the facet.
- facet_desc: a full description of the facet (information need) as it appears in the TREC Web Track data.
- question_id: the ID of the question as it appears in question_bank.tsv.
- question: a clarifying question that the system can pose to the user for the current topic and facet.
- answer: an answer to the clarifying question, assuming that the user is in the context of the current row (i.e., the user's initial query is initial_request, their information need is facet_desc, and question has been posed to the user).
topic_id | initial_request | topic_desc | clarification_need | facet_id | facet_desc | question_id | question | answer |
---|---|---|---|---|---|---|---|---|
14 | I’m interested in dinosaurs | I want to find information about and pictures of dinosaurs. | 4 | F0159 | Go to the Discovery Channel’s dinosaur site, which has pictures of dinosaurs and games. | Q00173 | are you interested in coloring books | no i just want to find the discovery channels website |
14 | I’m interested in dinosaurs | I want to find information about and pictures of dinosaurs. | 4 | F0159 | Go to the Discovery Channel’s dinosaur site, which has pictures of dinosaurs and games. | Q03021 | which dinosaurs are you interested in | im not asking for that i just want to go to the discovery channel dinosaur page |
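A minimal sketch of reading train.tsv (path assumed), e.g. to collect the candidate clarifying questions recorded for one topic:

```python
import pandas as pd

train = pd.read_csv('train.tsv', sep='\t')

# All clarifying questions asked for topic 14 ("I'm interested in dinosaurs").
questions = train.loc[train['topic_id'] == 14, 'question'].unique()
print(len(questions), questions[:3])
```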
test.tsv
contains only the list of test topics and their IDs.
topic_id | initial_request |
---|---|
201 | I would like to know more about raspberry pi |
202 | Give me information on uss carl vinson. |
question_bank.tsv
Contains all the questions in the collection. The TSV file has two columns: question_id and question (text).
question_id | question |
---|---|
Q00001 | |
Q02318 | what kind of medium do you want this information to be in |
Q02319 | what kind of penguin are you looking for |
Q02320 | what kind of pictures are you looking for |
Note: selecting Q00001 means selecting no question.
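A sketch of loading the question bank into an ID-to-text dict, assuming the TSV starts with a header row:

```python
import csv

with open('question_bank.tsv', newline='', encoding='utf-8') as f:
    reader = csv.DictReader(f, delimiter='\t')
    bank = {row['question_id']: row['question'] for row in reader}

print(bank['Q02319'])         # what kind of penguin are you looking for
print(repr(bank['Q00001']))   # empty text: the "no question" option
```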
dev_synthetic.pkl.tar.gz & train_synthetic.pkl.tar.gz
These files contain dict
s of synthetically built multi-turn conversations (up to three turns).
{<record_id>: {'topic_id': <int>,
'facet_id': <str>,
'initial_request': <str>,
'question': <str>,
'answer': <str>,
'conversation_context': [{'question': <str>,
'answer': <str>},
{'question': <str>,
'answer': <str>}],
'context_id': <int>},
...
}
where:

- <record_id> is an int indicating the ID of the current conversation record.
  - While in the dev set there exist multiple <record_id> values per <context_id>, in the test file there is only one.
- 'topic_id', 'facet_id', and 'initial_request' indicate the topic, facet, and initial request of the current conversation, according to the single-turn dataset.
- 'question': the current clarifying question that is being posed to the user.
- 'answer': the user's answer to the clarifying question.
- 'conversation_context' identifies the context of the current conversation. A context consists of the previous turns in the conversation. As we see, it is a list of 'question' and 'answer' items. This list tells us which questions have been asked in the conversation so far, and what the answers to them have been.
- 'context_id' is the ID of the conversation context. Basically, participants should predict the next utterance for each context_id.
2288: {'topic_id': 8,
'facet_id': 'F0969',
'initial_request': 'I want to know about appraisals.',
'question': 'are you looking for a type of appraiser',
'answer': 'yes jewelry',
'conversation_context': [],
'context_id': 969},
1570812: {'topic_id': 293,
'facet_id': 'F0729',
'initial_request': 'Tell me about the educational advantages of social networking sites.',
'question': 'which social networking sites would you like information on',
'answer': 'i don have a specific one in mind just overall educational benefits to social media sites',
'conversation_context': [{'question': 'what level of schooling are you interested in gaining the advantages to social networking sites',
'answer': 'all levels'},
{'question': 'what type of educational advantages are you seeking from social networking',
'answer': 'i just want to know if there are any'}],
'context_id': 976573}
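A sketch for iterating these records once the pickle is decompressed (the resulting file name is assumed), e.g. to find a conversation whose context has two past turns, like the second example above:

```python
import pickle

# File name after extracting dev_synthetic.pkl.tar.gz (assumed).
with open('dev_synthetic.pkl', 'rb') as f:
    synthetic = pickle.load(f)

for record_id, rec in synthetic.items():
    if len(rec['conversation_context']) == 2:  # two past turns + current turn
        print(record_id, rec['context_id'], rec['initial_request'])
        break
```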
single_turn_train_eval.pkl & multi_turn_***_eval.pkl.tar.gz
These files are dicts of pre-computed document relevance results after asking each question:
{ <evaluation_metric>:
    { <context_id>:
        { <question_id>:
            { 'no_answer': <float>,
              'with_answer': <float> },
          ... ,
          'MAX':
            { 'no_answer': <float>,
              'with_answer': <float> },
          'MIN':
            { 'no_answer': <float>,
              'with_answer': <float> }
        }
    }
  ...
}
- MAX and MIN refer to the maximum and minimum performance that the retrieval model achieves by asking the "best" and "worst" questions among the candidate questions.
top10k_docs_dict.pkl.tar.gz
A dict that maps each topic_id to a list of document IDs; it is useful for obtaining the top 10,000 documents of a topic as an initial ranking.
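Sketch (the decompressed file name, and integer topic IDs as keys, are assumptions):

```python
import pickle

with open('top10k_docs_dict.pkl', 'rb') as f:
    top10k = pickle.load(f)

initial_ranking = top10k[14]   # ranked ClueWeb document IDs for topic 14
print(len(initial_ranking), initial_ranking[:3])
```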
train.qrel & dev.qrel
These files contain the relevance assessments of the ClueWeb09 and ClueWeb12 collections for every facet in the train and dev sets, respectively. Each line has the format:
<facet_id> 0 <document_id> <relevance_score>
F0001 0 clueweb09-en0038-74-08250 1
F0001 0 clueweb09-enwp01-17-11113 1
F0002 0 clueweb09-en0001-02-21241 1
F0002 0 clueweb09-en0006-52-11056 1
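These files follow the standard TREC qrels layout, so they can be fed directly to tools such as trec_eval, or parsed by hand:

```python
from collections import defaultdict

# facet_id -> {document_id: relevance_score}
qrels = defaultdict(dict)
with open('train.qrel') as f:
    for line in f:
        facet_id, _, doc_id, relevance = line.split()
        qrels[facet_id][doc_id] = int(relevance)

print(len(qrels['F0001']))   # number of judged documents for facet F0001
```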